Overview
Create an E2E failsafe integration test that proves the ergonomic @CopilotTool + ToolDefinition.fromObject() API produces identical wire behavior to the low-level ToolDefinition.create() API, tested against the replay proxy.
Branch: edburns/1682-java-tool-ergonomics on upstream (⚠️ NOT main — PRs must target this branch)
Prerequisites
- Tasks 4.1–4.4 must be complete and merged to the branch.
- Before writing any code, read the entire implementation plan at 1682-java-tool-ergonomics-prompts-remove-before-merge/dd-3018003-ignorance-reduction-for-implementation-plan.md.
Relevant plan sections to carefully re-read
- Section 4.5: E2E integration test (the primary task description, includes test code outline).
- Phase 2 ✅: Verify the existing low-level path works in Java (context: LowLevelToolDefinitionIT is the reference).
- The existing skill file for creating Java E2E tests, .github/skills/new-java-e2e-test-yaml-and-test/SKILL.md. Read it for the exact patterns, conventions, and harness usage for creating new E2E tests.
Deliverables
Files to create
- test/snapshots/tools/ergonomic_tool_definition.yaml: replay proxy snapshot (may be identical to low_level_tool_definition.yaml since the wire format is the same; if so, a symlink or copy is acceptable).
- java/src/test/java/com/github/copilot/e2e/ErgonomicToolDefinitionIT.java: the failsafe IT class.
Test specification
The test must define a tools class using the ergonomic API:
    class MyTestTools {
        // Mutated by setCurrentPhase so the test can assert on state.
        String currentPhase;

        @CopilotTool("Sets the current phase of the agent")
        String setCurrentPhase(@Param("The phase to transition to") String phase) {
            currentPhase = phase;
            return "Phase set to " + phase;
        }

        @CopilotTool("Search for items by keyword")
        String searchItems(@Param("Search keyword") String keyword) {
            return "Found: item_alpha, item_beta";
        }

        // Registered under the built-in name "grep" and flagged as an override.
        @CopilotTool(value = "Custom grep override", name = "grep", overridesBuiltInTool = true)
        String grepOverride(@Param("Search query") String query) {
            return "CUSTOM_GREP: " + query;
        }
    }
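For orientation only: this task does not describe how ToolDefinition.fromObject() discovers tools. One plausible mechanism is reflection over annotated methods, sketched below. The annotations here are local stand-ins declared inside the sketch so that it compiles without the SDK. The DiscoveredTool record, the discover helper, and the fallback to the raw method name when name is unset are all assumptions for illustration, not the SDK's behavior.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ToolDiscoverySketch {

    // Stand-in for the SDK's @CopilotTool; attributes mirror the usage in MyTestTools.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface CopilotTool {
        String value();
        String name() default "";
        boolean overridesBuiltInTool() default false;
    }

    // Stand-in for the SDK's @Param.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    @interface Param {
        String value();
    }

    // Hypothetical flattened view of one discovered tool (requires Java 16+ for records).
    record DiscoveredTool(String name, String description, boolean overridesBuiltIn,
                          List<String> paramDescriptions) {}

    static List<DiscoveredTool> discover(Object tools) {
        List<DiscoveredTool> found = new ArrayList<>();
        for (Method m : tools.getClass().getDeclaredMethods()) {
            CopilotTool ann = m.getAnnotation(CopilotTool.class);
            if (ann == null) {
                continue; // not a tool method
            }
            // Assumption: an unset name falls back to the Java method name unchanged.
            String name = ann.name().isEmpty() ? m.getName() : ann.name();
            List<String> params = new ArrayList<>();
            for (Parameter p : m.getParameters()) {
                Param pa = p.getAnnotation(Param.class);
                params.add(pa == null ? "" : pa.value());
            }
            found.add(new DiscoveredTool(name, ann.value(), ann.overridesBuiltInTool(), params));
        }
        // getDeclaredMethods() has no guaranteed order, so sort for stable output.
        found.sort(Comparator.comparing(DiscoveredTool::name));
        return found;
    }

    static class MyTestTools {
        String currentPhase;

        @CopilotTool("Sets the current phase of the agent")
        String setCurrentPhase(@Param("The phase to transition to") String phase) {
            currentPhase = phase;
            return "Phase set to " + phase;
        }

        @CopilotTool("Search for items by keyword")
        String searchItems(@Param("Search keyword") String keyword) {
            return "Found: item_alpha, item_beta";
        }

        @CopilotTool(value = "Custom grep override", name = "grep", overridesBuiltInTool = true)
        String grepOverride(@Param("Search query") String query) {
            return "CUSTOM_GREP: " + query;
        }
    }

    public static void main(String[] args) {
        for (DiscoveredTool tool : discover(new MyTestTools())) {
            System.out.println(tool);
        }
    }
}
```

The sort matters for the wire-equivalence goal: if registration order reaches the wire, an unordered reflection scan could produce a different message from one run to the next.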
The test method:
- Creates a MyTestTools instance.
- Calls ToolDefinition.fromObject(tools) to get tool definitions.
- Creates a CopilotSession configured with the replay proxy URL and the tool definitions.
- Sends a prompt that triggers tool invocations.
- Asserts that:
  - Tools were invoked (via the tool handler callbacks).
  - The correct arguments were passed to each tool.
  - The session completed successfully.
  - The wire-level behavior is identical to LowLevelToolDefinitionIT.
Snapshot YAML
The snapshot YAML must match the exchange pattern from test/snapshots/tools/low_level_tool_definition.yaml. The tool schemas sent over the wire by the ergonomic API must be byte-for-byte identical to what the low-level API sends (proving the abstraction is lossless).
If the snapshot can be reused as-is (same tool names, same schemas), reference the existing file. If tool names differ, create a new snapshot with appropriate tool definitions.
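A byte-for-byte check can be sketched as follows, assuming the serialized schema from each API has already been captured as a string. How to capture it from the harness is not shown here, and the class name, helper name, and JSON literal are hypothetical. The sketch uses java.util.Arrays.mismatch, which returns -1 for identical arrays and otherwise the first differing offset.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class WireEquivalenceSketch {

    /**
     * Returns -1 when both payloads are byte-for-byte identical in UTF-8,
     * otherwise the offset of the first differing byte.
     */
    static int firstDifference(String ergonomicJson, String lowLevelJson) {
        byte[] a = ergonomicJson.getBytes(StandardCharsets.UTF_8);
        byte[] b = lowLevelJson.getBytes(StandardCharsets.UTF_8);
        return Arrays.mismatch(a, b);
    }

    public static void main(String[] args) {
        // Hypothetical schema payload; the real one comes from the captured wire traffic.
        String lowLevel = "{\"type\":\"object\",\"properties\":{\"phase\":{\"type\":\"string\"}}}";
        String ergonomic = lowLevel;

        int offset = firstDifference(ergonomic, lowLevel);
        if (offset != -1) {
            throw new AssertionError("Schemas diverge at byte offset " + offset);
        }
        System.out.println("schemas are byte-for-byte identical");
    }
}
```

Reporting the offset rather than a bare boolean makes a failure easier to diagnose, for example when only property ordering or whitespace differs.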
Gating tests and criteria
All of the following must pass before this task is considered complete:
- IT runs and passes: mvn verify -Dit.test=ErgonomicToolDefinitionIT passes.
- Wire equivalence: The tool definitions registered via fromObject() produce JSON-RPC tool registration messages identical to those from LowLevelToolDefinitionIT. Verify by comparing:
  - Tool names sent to the server.
  - Tool schemas (JSON Schema) sent to the server.
  - Tool invocation request/response format.
- Tool invocation verification: Assert that during the test session:
  - At least one tool was invoked by the model.
  - The tool handler received the correct arguments.
  - The tool handler's return value was sent back to the server.
  - State was mutated correctly (e.g., the currentPhase field was set).
- Override tool verification: If the snapshot exercises the grep override tool, verify it was invoked and returned "CUSTOM_GREP: ...".
- No regression: mvn clean verify passes (all existing ITs including LowLevelToolDefinitionIT still pass).
- Spotless format check: mvn spotless:check passes.
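The handler-level assertions in the criteria above (return value, state mutation, override output) can be rehearsed without the replay proxy by calling the tool methods directly. In this minimal sketch the annotations are omitted so that it compiles without the SDK, and the inputs "planning" and "needle" are hypothetical. In the real IT the arguments come from the snapshot exchange.

```java
public class HandlerAssertionsSketch {

    // Copy of the tools class with the SDK annotations left out.
    static class MyTestTools {
        String currentPhase;

        String setCurrentPhase(String phase) {
            currentPhase = phase;
            return "Phase set to " + phase;
        }

        String grepOverride(String query) {
            return "CUSTOM_GREP: " + query;
        }
    }

    // Tiny assertion helper so the sketch does not depend on JUnit or the -ea flag.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        MyTestTools tools = new MyTestTools();

        String result = tools.setCurrentPhase("planning");
        check("Phase set to planning".equals(result), "return value should echo the phase");
        check("planning".equals(tools.currentPhase), "currentPhase should be mutated");

        String grepResult = tools.grepOverride("needle");
        check(grepResult.startsWith("CUSTOM_GREP: "), "override should produce the custom prefix");

        System.out.println("all handler assertions passed");
    }
}
```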
Constraints
- ✅✅ YOU MUST run mvn spotless:apply before every commit.
- Follow the exact E2E test patterns established by LowLevelToolDefinitionIT and documented in .github/skills/new-java-e2e-test-yaml-and-test/SKILL.md.
- Use E2ETestContext for managing the replay proxy lifecycle.
- Test method names are converted to lowercase snake_case for snapshot filenames.
- Do NOT modify any files outside the java/ and test/snapshots/ directories.
- Do NOT modify LowLevelToolDefinitionIT or its snapshot.
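To pick a test method name that maps onto ergonomic_tool_definition.yaml, it helps to see the snake_case conversion spelled out. The sketch below is one straightforward camelCase rule, not the harness's actual code. The class name, helper name, and the example method name ergonomicToolDefinition are hypothetical, so confirm the real rule in SKILL.md or E2ETestContext.

```java
public class SnapshotNameSketch {

    /** Inserts '_' before each uppercase letter (except the first character) and lowercases it. */
    static String toSnakeCase(String methodName) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < methodName.length(); i++) {
            char c = methodName.charAt(i);
            if (Character.isUpperCase(c)) {
                if (i > 0) {
                    out.append('_');
                }
                out.append(Character.toLowerCase(c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "ergonomic_tool_definition.yaml"
        System.out.println(toSnakeCase("ergonomicToolDefinition") + ".yaml");
    }
}
```

This simple rule splits every uppercase letter, so an acronym such as URL would become u_r_l. Keeping acronyms out of the test method name avoids having to know how the harness treats them.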