feat(gmail): upgrade to SDK 2.0.0 with full unit + integration test coverage #325
Open
TheRealAgentK wants to merge 7 commits into
…on bump)
Source-side upgrade per the upgrading-sdk-v2 skill:
* Import ActionError alongside ActionResult
* Convert all 21 error returns from
`return ActionResult(data={"error": str(e)}, cost_usd=0.0)`
to
`return ActionError(message=str(e))`
* Drop redundant `"result": True` keys from every success-path
return — success/failure now lives on `result.type` (ACTION_SUCCESS
vs ACTION_ERROR) instead of a duplicated payload field
* Strip the matching `"error"` and `"result"` properties (and
their entries in `required`) from every action's output_schema
in config.json — 42 schema keys removed across 21 actions
* Harden the auth lookup (skill gotcha #9) so a missing access_token
surfaces as an upstream auth error rather than a KeyError:
`context.auth.get("credentials", {}).get("access_token", "")`
* Bump config.json version 0.1.0 → 2.0.0 (major bump for the
SDK breaking change)
* requirements.txt: autohive-integrations-sdk~=1.0.2 → ~=2.0.0
No `context.fetch()` work needed — gmail uses Google's
`googleapiclient.discovery.build()` directly, so the FetchResponse
breaking change has no impact on the source.
Validation:
✅ validate_integration.py — 0 errors, 1 warning (unused-scopes
false positive on gmail.modify)
✅ check_code.py — passed
✅ ruff check / ruff format — clean
Refs #324
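A minimal sketch of the hardened auth lookup described in the commit above, contrasted with a strict subscript lookup. The `SimpleNamespace` contexts and both function names are stand-ins invented for illustration; only the `.get(...)` chain itself comes from the commit message.

```python
from types import SimpleNamespace


def strict_token(context) -> str:
    # Pre-upgrade style (assumed): raises KeyError when a level is missing.
    return context.auth["credentials"]["access_token"]


def hardened_token(context) -> str:
    # The chain quoted in the commit message: falls back to "" instead of raising.
    return context.auth.get("credentials", {}).get("access_token", "")


# Stand-in contexts; the real SDK context object is not modelled here.
empty = SimpleNamespace(auth={})
full = SimpleNamespace(auth={"credentials": {"access_token": "example-token"}})

token_when_missing = hardened_token(empty)
token_when_present = hardened_token(full)
```

With an empty string as the token, the request reaches Google and fails there as an auth error, which is the behaviour the commit describes.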
Removes the legacy `tests/context.py` shim and the placeholder
`test_gmail.py`, replaces them with a `tests/conftest.py` that
matches the current writing-unit-tests skill conventions:
* sys.path setup so test files can use plain `from gmail import gmail`
* mock_context fixture override pre-loaded with the
PlatformOauth2 envelope Gmail expects, so every test in this
directory inherits credentials of the right shape
Refs #324
Adds gmail/tests/test_gmail_unit.py with 56 tests covering the full
integration surface:
* 7 helper tests for create_email_message:
- plain text body, plain text + attachments
- HTML body sanitisation (script tags stripped, javascript:
protocol blocked, allowed tags preserved)
- multipart/alternative structure with text+html parts
- multipart/mixed structure for HTML + attachments
* 1 service-build test (verifies hardened auth lookup does not
KeyError on missing access_token)
* 48 action tests across all 21 actions:
- mark_emails_as_read, mark_emails_as_unread, archive_emails
- get_user_info, read_email
- read_inbox (default scope, unread scope -> Gmail q filter,
pagination round-trip)
- read_all_mail
- send_email (text, HTML sanitised, exception path)
- reply_to_thread (with original message header lookup)
- list_labels, create_label
- add_labels_to_emails, remove_labels_from_emails
- list_emails_by_label
- get_thread_emails
- create_draft (new draft + reply-mode with thread/message id)
- update_draft, list_drafts, get_draft, send_draft, delete_draft
Each action gets at minimum:
* Happy path (verifies result.type == ResultType.ACTION + key data)
* Exception path (verifies result.type == ResultType.ACTION_ERROR
+ the original error message is preserved on the ActionError)
* Request-shape verification where it adds value (label add/remove
body, query filters, threading IDs)
Mock pattern note: Gmail uses googleapiclient.discovery.build directly
rather than context.fetch, so the standard mock_context.fetch fixture
is not the right tool. Tests use `patch("gmail.gmail.build")` to
inject a MagicMock service that mimics the chained
service.users().messages().<verb>().execute() pattern.
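The chained-mock pattern in the note above can be sketched with `unittest.mock` alone. `list_label_names` and `make_mock_service` are hypothetical helpers written for this example, not code from the integration; the real tests inject the mock through `patch("gmail.gmail.build")`, which is not reproduced here.

```python
from unittest.mock import MagicMock


def list_label_names(service) -> list:
    # Hypothetical caller that uses the chained googleapiclient style.
    response = service.users().labels().list(userId="me").execute()
    return [label["name"] for label in response.get("labels", [])]


def make_mock_service(labels: list) -> MagicMock:
    # Each call in the chain returns the previous mock's `return_value`,
    # so the canned payload is set at the end of the return_value chain.
    service = MagicMock()
    labels_api = service.users.return_value.labels.return_value
    labels_api.list.return_value.execute.return_value = {"labels": labels}
    return service


service = make_mock_service(
    [{"id": "INBOX", "name": "INBOX"}, {"id": "Label_1", "name": "Receipts"}]
)
names = list_label_names(service)

# Request-shape check: the list() call received the expected arguments.
service.users.return_value.labels.return_value.list.assert_called_once_with(userId="me")

# Exception path: make execute() raise instead of returning a payload.
failing = MagicMock()
failing.users.return_value.labels.return_value.list.return_value.execute.side_effect = (
    RuntimeError("quota exceeded")
)
```

Setting `side_effect` on the final `execute` mock is one way to drive the exception-path tests without touching the network.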
Validation:
✅ 56 tests pass
✅ ruff check / ruff format (with tooling ruff.toml) — clean
✅ validate_integration.py — 0 errors
✅ check_code.py — passed
Refs #324
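For readers unfamiliar with the MIME layouts that the `create_email_message` helper tests check, here is a sketch using only the standard library. `build_html_message` is a hypothetical function, not the integration's helper, and it omits the HTML sanitisation step entirely.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_html_message(text: str, html: str, attachments=None):
    # text + html alternatives: clients render the richest part they support.
    alternative = MIMEMultipart("alternative")
    alternative.attach(MIMEText(text, "plain"))
    alternative.attach(MIMEText(html, "html"))
    if not attachments:
        return alternative

    # With attachments, the alternative part is nested inside multipart/mixed.
    mixed = MIMEMultipart("mixed")
    mixed.attach(alternative)
    for filename, payload in attachments:
        part = MIMEApplication(payload)
        part.add_header("Content-Disposition", "attachment", filename=filename)
        mixed.attach(part)
    return mixed


plain_html = build_html_message("hi", "<p>hi</p>")
with_file = build_html_message("hi", "<p>hi</p>", [("notes.bin", b"\x00\x01")])
```

A structure test can then assert on `get_content_type()` of the root and of each entry in `get_payload()`.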
…verage
- 8 read-only tests (profile, labels, inbox/all-mail, drafts, threads)
- 5 destructive lifecycle tests (label CRUD, send/archive, draft CRUD, send-from-draft, reply-to-thread), all marked @pytest.mark.destructive with explicit docstrings describing what is created/modified
- Uses live_context fixture (PlatformOauth2 envelope); gmail uses googleapiclient directly, not context.fetch
- Supports GMAIL_TEST_THREAD_ID / GMAIL_TEST_MESSAGE_ID env overrides; otherwise picks most-recent inbox message dynamically
- .env.example documents GMAIL_ACCESS_TOKEN and optional overrides
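The env-override behaviour described above could look roughly like the following. `resolve_test_id` and the `discover` callable are assumptions made for this sketch; only the environment variable names come from the commit message.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_test_id(
    var_name: str,
    discover: Callable[[], Optional[str]],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    # Prefer an explicit override so a run can be pinned to a known message.
    override = env.get(var_name, "").strip()
    if override:
        return override
    # Otherwise fall back to dynamic discovery (e.g. most recent inbox message).
    return discover()


pinned = resolve_test_id(
    "GMAIL_TEST_MESSAGE_ID", lambda: "latest-id", env={"GMAIL_TEST_MESSAGE_ID": "abc123"}
)
discovered = resolve_test_id("GMAIL_TEST_MESSAGE_ID", lambda: "latest-id", env={})
```

Passing `env` explicitly keeps the helper testable without mutating the process environment.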
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cebb3f69c6
Addresses code-quality bot feedback on PR #325. Gmail uses googleapiclient directly, so FetchResponse is never referenced.
Addresses Codex review on PR #325. Prepending the gmail/ integration dir to sys.path could shadow the gmail package with the gmail.py module file, breaking 'from gmail.gmail import ...'. The repo-root conftest.py already puts the workspace root on sys.path and patches Integration.load() from the caller frame, so no per-integration sys.path tweaking is needed.
…-recipient bug
Addresses Codex review on PR #325.
- New module-level build_raw_email() consolidates raw RFC822 payload construction previously duplicated across SendEmail, ReplyToThread, CreateDraft, and UpdateDraft (3 near-identical _create_raw_email methods plus an inline block in ReplyToThread).
- Fixes a latent bug in ReplyToThread that assumed 'to' and 'cc' were always lists. With a string value, recipients.extend(inputs['to']) / ', '.join(inputs['cc']) silently character-split the address into per-letter recipients, producing malformed headers. The new helper routes both forms through _normalize_addresses so a string is always treated as a single recipient.
- Helper accepts extra_to (used by ReplyToThread to prepend the original sender), subject_override (for 'Re: ...'), and in_reply_to/references for threading. Callers compute the final References string so each existing behavior (append vs pass-through) is preserved verbatim.
- Adds a TestBuildRawEmail unit test class covering the regression (string vs list recipients), extra_to prepending, subject override, the 'me' From sentinel, and threading header behavior.
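The character-splitting bug described above is easy to reproduce in isolation. The `normalize_addresses` body below is an assumed implementation written for this sketch; the PR's `_normalize_addresses` may differ in detail, but the commit states the same contract (a string counts as one recipient).

```python
def normalize_addresses(value) -> list:
    # Assumed behaviour: None -> [], string -> single recipient, list -> as given.
    if value is None:
        return []
    if isinstance(value, str):
        return [value] if value.strip() else []
    return [addr for addr in value if addr]


# The bug: a bare string is iterated character by character.
buggy_recipients = []
buggy_recipients.extend("a@x.io")       # one entry per character
buggy_header = ", ".join("a@x.io")      # characters joined with ", "

# The fix: normalise first, then extend or join.
fixed_recipients = normalize_addresses("a@x.io")
fixed_header = ", ".join(normalize_addresses(["a@x.io", "b@y.io"]))
```

Routing every `to`/`cc` value through one normaliser before `extend` or `join` means callers no longer need to know which form the input arrived in.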
Closes #324
Summary
Brings the freshly-migrated `gmail/` integration into compliance with three skills:
- upgrading-sdk-v2
- writing-unit-tests
- writing-integration-tests
Commits (logical)
- `feat(gmail): upgrade to SDK 2.0.0`: bumps `requirements.txt` to `~=2.0.0` and `config.json` version to `2.0.0`. Switches all error returns from `ActionResult(data={"error": ...})` to `ActionError(message=...)`. Drops the `"result": True` success flag and removes `error`/`result` properties from every output schema (Option A: rely on the SDK envelope's `result.type` instead). Hardens auth lookup with `.get(...)` chains.
- `chore(gmail): replace tests scaffolding with conftest.py`: drops legacy `tests/context.py` + placeholder `test_gmail.py`, adds `tests/conftest.py` with the `mock_context` fixture using the PlatformOauth2 auth envelope.
- `test(gmail): add unit test suite covering all 21 actions`: 56 unit tests. Patches `gmail.gmail.build` to inject a chained MagicMock Gmail service (gmail uses `googleapiclient` directly, not `context.fetch`).
- `test(gmail): add integration test suite with destructive lifecycle coverage`: 8 read-only + 5 destructive lifecycle tests (label CRUD, send/archive, draft CRUD, send-from-draft, reply-to-thread). Destructive tests are marked `@pytest.mark.destructive` with explicit docstrings describing exactly what they create/modify. Uses dynamic discovery for thread/message IDs with `GMAIL_TEST_THREAD_ID` / `GMAIL_TEST_MESSAGE_ID` env overrides.
Notes
- Gmail does not use `context.fetch()`; it uses `googleapiclient.discovery.build()`. The SDK 2.0 `FetchResponse` breaking change therefore does not affect source code.
- `bleach` pinning is intentionally left as follow-up.
Validation
Running integration tests