build-mcpb/references/workflow/phase-3-implement-and-verify.md (51 changes: 0 additions & 51 deletions)
@@ -67,64 +67,13 @@
make check
```
This runs: `format:check` → `lint` → `typecheck` → `test`

### Integration tests

**Python:**
```bash
make test-integration # real API calls (needs <NAME>_API_KEY in .env)
```

Real API calls against the live service. The template scaffolds `tests-integration/test_core_tools.py` as a stub — you must replace it with real tests.

**How to write them:** Open `api_client.py` and list every public method (skip `__init__`, `close`, `_request`, `_ensure_session`, and dunder methods). For each method, write a test:

- **Read methods** (list, get, search): Call with minimal valid parameters, assert the response has the expected shape (list, dict, or model with expected keys).
- **Write methods** (create, update, delete): Create a test resource, verify it, then clean it up in a `finally` block. If no delete method exists, mark the resource as completed/archived and leave a comment.
- **Chained methods:** Some methods need an ID from a prior call (e.g., `list_workspaces` returns a GID needed by `search_tasks`). Chain them — call the list method first, use the first result's ID.
- **Tier-gated methods:** If the API has premium endpoints that may not be available on the user's plan, write a `has_<feature>_access` helper that probes the endpoint and returns `False` on 401/402/403. Use `pytest.skip()` in the test if access is unavailable.

See `references/PATTERNS.md` → "Integration Test Patterns (Python)" for concrete examples.

**How to run them:**

1. Ask the contributor to add their API key to `.env` (e.g., `ASANA_API_KEY=xxx`). The `.env` file is already in `.gitignore` and `.mcpbignore`. The contributor was asked for this key at the start of the process — they should have it ready.
2. Run `make test-integration`.
3. All tests should pass or skip (for tier-gated features). Fix any failures before proceeding.
4. If the contributor says auth setup is too complex for now (e.g., OAuth flows, multi-step app configuration), proceed — the tests are written and ready to run later. Do not skip writing the tests.

### LLM smoke tests

**Python:**
```bash
make test-llm # needs <NAME>_API_KEY + ANTHROPIC_API_KEY in .env
```

Verify Claude Haiku selects the correct tool given the skill resource. Requires both the service API key and `ANTHROPIC_API_KEY`.

**How to write them:** The template scaffolds `get_server_context()` and `get_anthropic_client()` — leave those as-is. Replace the commented-out test stub with 3–5 real tests, one per key tool. Extract a `call_llm()` helper to avoid repeating the system prompt construction across tests.

Each test sends a natural language prompt and asserts the LLM selected the expected tool. Include concrete values for any required parameters in the prompt (IDs, coordinates, dates) — without them, the LLM will ask for clarification instead of calling the tool.

See `references/PATTERNS.md` → "LLM smoke tests" for the `call_llm()` helper pattern and the concrete-identifiers rule.

**How to run them:**

1. Ask the contributor to add `ANTHROPIC_API_KEY` to `.env` alongside the service API key.
2. Run `make test-llm`.
3. All tests should pass. If a test fails because the LLM picked the wrong tool, adjust the prompt to be more specific before touching the SKILL.md.
4. If the contributor does not have an `ANTHROPIC_API_KEY`, proceed — the tests are written and ready to run later. Do not skip writing the tests.

## Gate

**Criteria:**
- [ ] All tool logic, models, and client code implemented
- [ ] Linting passes with no errors
- [ ] Type checking passes with no errors
- [ ] Unit tests pass (`make check`)
- [ ] Integration tests written with real assertions (not stubs or TODOs)
- [ ] Integration tests pass or skip (if an API key is available)
- [ ] LLM smoke tests written with real assertions (not stubs or TODOs)
- [ ] LLM smoke tests pass (if `ANTHROPIC_API_KEY` is available)

**If any criterion fails:** Fix the reported issues and re-run checks.

build-mcpb/references/workflow/phase-5-embed-skill.md (62 changes: 60 additions & 2 deletions)
@@ -116,13 +116,71 @@
The response should include a resource with `uri: "skill://<name>/usage"`.

> **Note:** The embedded skill is encouraged but not mandatory per the MCPB spec. If no meaningful workflows exist yet (e.g., the server has only 1-2 tools), it's acceptable to skip this phase and add the skill later.
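
If you do wire it, here is a minimal sketch of the resource registration, assuming the Python MCP SDK's FastMCP API (the server name, skill URI, and SKILL.md location are placeholders):

```python
# Hypothetical wiring sketch: server name and SKILL.md location are placeholders.
from pathlib import Path

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("asana")


@mcp.resource("skill://asana/usage")
def usage_skill() -> str:
    """Serve the embedded SKILL.md so it appears in resources/list."""
    return (Path(__file__).parent / "SKILL.md").read_text()
```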

## 5f: Integration & LLM Smoke Tests

### Integration tests

**Python:**
```bash
make test-integration # real API calls (needs <NAME>_API_KEY in .env)
```

Real API calls against the live service. The template scaffolds `tests-integration/test_core_tools.py` as a stub — you must replace it with real tests.

**How to write them:** Open `api_client.py` and list every public method (skip `__init__`, `close`, `_request`, `_ensure_session`, and dunder methods). For each method, write a test:

- **Read methods** (list, get, search): Call with minimal valid parameters, assert the response has the expected shape (list, dict, or model with expected keys).
- **Write methods** (create, update, delete): Create a test resource, verify it, then clean it up in a `finally` block. If no delete method exists, mark the resource as completed/archived and leave a comment.
- **Chained methods:** Some methods need an ID from a prior call (e.g., `list_workspaces` returns a GID needed by `search_tasks`). Chain them — call the list method first, use the first result's ID.
- **Tier-gated methods:** If the API has premium endpoints that may not be available on the user's plan, write a `has_<feature>_access` helper that probes the endpoint and returns `False` on 401/402/403. Use `pytest.skip()` in the test if access is unavailable.

See `references/PATTERNS.md` → "Integration Test Patterns (Python)" for concrete examples.
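
As an illustration of those four patterns, a hedged sketch follows (the `AsanaClient` methods, `ApiError` type, and import path are hypothetical stand-ins for your generated `api_client.py`, shown synchronously for brevity):

```python
# tests-integration/test_core_tools.py: illustrative sketch, not a drop-in file.
# AsanaClient, ApiError, and the import path are hypothetical placeholders.
import os

import pytest

from my_server.api_client import ApiError, AsanaClient  # hypothetical path


@pytest.fixture
def client():
    return AsanaClient(api_key=os.environ["ASANA_API_KEY"])


def test_list_workspaces(client):
    # Read method: minimal valid parameters, assert the response shape.
    workspaces = client.list_workspaces()
    assert isinstance(workspaces, list) and workspaces
    assert "gid" in workspaces[0]


def test_search_tasks_chained(client):
    # Chained method: the workspace GID comes from a prior list call.
    gid = client.list_workspaces()[0]["gid"]
    assert isinstance(client.search_tasks(workspace_gid=gid, text="test"), list)


def test_create_then_delete_task(client):
    # Write method: create a resource, verify it, clean up in a finally block.
    gid = client.list_workspaces()[0]["gid"]
    task = client.create_task(workspace_gid=gid, name="integration-test task")
    try:
        assert task["name"] == "integration-test task"
    finally:
        client.delete_task(task["gid"])


def has_portfolio_access(client) -> bool:
    # Tier-gate probe: premium endpoints return 401/402/403 on lower plans.
    try:
        client.list_portfolios()
        return True
    except ApiError as exc:
        if exc.status in (401, 402, 403):
            return False
        raise


def test_list_portfolios(client):
    if not has_portfolio_access(client):
        pytest.skip("portfolio endpoints not available on this plan")
    assert isinstance(client.list_portfolios(), list)
```

The probe-and-skip helper keeps tier-gated failures from blocking the suite while still exercising the endpoint on plans that allow it.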

**How to run them:**

1. Ask the contributor to add their API key to `.env` (e.g., `ASANA_API_KEY=xxx`). The `.env` file is already in `.gitignore` and `.mcpbignore`. The contributor was asked for this key at the start of the process — they should have it ready.
2. Run `make test-integration`.
3. All tests should pass or skip (for tier-gated features). Fix any failures before proceeding.
4. If the contributor says auth setup is too complex for now (e.g., OAuth flows, multi-step app configuration), proceed — the tests are written and ready to run later. Do not skip writing the tests.

### LLM smoke tests

**Python:**
```bash
make test-llm # needs <NAME>_API_KEY + ANTHROPIC_API_KEY in .env
```

Verify Claude Haiku selects the correct tool given the skill resource. Requires both the service API key and `ANTHROPIC_API_KEY`.

**How to write them:** The template scaffolds `get_server_context()` and `get_anthropic_client()` — leave those as-is. Replace the commented-out test stub with 3–5 real tests, one per key tool. Extract a `call_llm()` helper to avoid repeating the system prompt construction across tests.

Each test sends a natural language prompt and asserts the LLM selected the expected tool. Include concrete values for any required parameters in the prompt (IDs, coordinates, dates) — without them, the LLM will ask for clarification instead of calling the tool.

See `references/PATTERNS.md` → "LLM smoke tests" for the `call_llm()` helper pattern and the concrete-identifiers rule.
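
For the shape of the helper and the tests, a hedged sketch (the model ID, tool names, prompts, and helper module are placeholders; `get_server_context()` and `get_anthropic_client()` are the template's scaffolded functions, assumed here to return the system prompt plus tool definitions and an Anthropic client):

```python
# tests-llm/test_tool_selection.py: illustrative sketch.
# Tool names, prompts, and the helper import are hypothetical placeholders.
from smoke_helpers import get_anthropic_client, get_server_context


def call_llm(prompt: str) -> list[str]:
    """Send one prompt with the server's tools; return the tool names chosen."""
    client = get_anthropic_client()
    system, tools = get_server_context()  # assumed return shape
    response = client.messages.create(
        model="claude-3-5-haiku-latest",  # any Haiku-class model ID works
        max_tokens=1024,
        system=system,
        tools=tools,
        messages=[{"role": "user", "content": prompt}],
    )
    return [block.name for block in response.content if block.type == "tool_use"]


def test_selects_search_tasks():
    # Concrete identifiers (the workspace GID) keep the model from asking
    # for clarification instead of calling the tool.
    assert "search_tasks" in call_llm(
        "Find tasks mentioning 'roadmap' in workspace 12345."
    )


def test_selects_create_task():
    assert "create_task" in call_llm(
        "Create a task named 'Ship v1' in workspace 12345."
    )
```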

**How to run them:**

1. Ask the contributor to add `ANTHROPIC_API_KEY` to `.env` alongside the service API key.
2. Run `make test-llm`.
3. All tests should pass. If a test fails because the LLM picked the wrong tool, adjust the prompt to be more specific before touching the SKILL.md.
4. If the contributor does not have an `ANTHROPIC_API_KEY`, proceed — the tests are written and ready to run later. Do not skip writing the tests.

**Working through failures:** These tests can be challenging depending on the target API's auth method, plan-gated endpoints, and rate limits. Work through failures interactively with the contributor:
- If auth is complex (OAuth, multi-step), help the contributor get a working token and update the test fixtures
- If endpoints are plan-gated, use the tier-skip pattern (see PATTERNS.md) to gracefully skip inaccessible endpoints
- If the contributor doesn't have the required API keys or wants to move on, that's fine — these tests are recommended but **not blocking** for initial release

## Gate

**Criteria:**
- [ ] Contributor has approved the SKILL.md content
- [ ] Skill resource is wired in server code
- [ ] `resources/list` includes `skill://<name>/usage`
- [ ] Integration tests written with real assertions (not stubs or TODOs)
- [ ] Integration tests pass (recommended, not blocking)
- [ ] LLM smoke tests written with real assertions (not stubs or TODOs)
- [ ] LLM smoke tests pass (recommended, not blocking)

**If any criterion fails:** Revisit the relevant sub-step above.
**If any criterion fails:** Revisit the relevant sub-step above. For integration/LLM tests, discuss with the contributor whether to fix now or defer.

**When all pass:** Proceed to Phase 6.
**When all pass (or non-blocking items deferred):** Proceed to Phase 6.