Skip to content

feat: validate credentials after config init#1151

Open
dc-bytedance wants to merge 7 commits into
larksuite:mainfrom
dc-bytedance:feat/config-probe
Open

feat: validate credentials after config init#1151
dc-bytedance wants to merge 7 commits into
larksuite:mainfrom
dc-bytedance:feat/config-probe

Conversation

@dc-bytedance
Copy link
Copy Markdown

@dc-bytedance dc-bytedance commented May 28, 2026

Summary

After lark-cli config init saves credentials, validate them against the open platform so that users get immediate, hard-to-miss feedback when they typed the wrong App ID or App Secret, instead of discovering it later from a failed business request. The check is best-effort: it stays silent on anything ambiguous (network errors, server 5xx, unknown transport failures, timeout) so valid configurations are never disturbed by upstream noise.

Changes

  • Add internal/credential/tat_fetch.go::FetchTAT — a pure-HTTP TAT request helper (no config / no keychain access). Returns typed errs.* errors per the §984 contract: *errs.AuthenticationError for HTTP 401/403 or any non-zero TAT body code, *errs.NetworkError for transport failures and HTTP 5xx (with CauseKind: "5xx" + Retryable: true), *errs.InternalError for 4xx-non-401/403 / JSON parse failures / marshal failures / empty-token-on-success.
  • Refactor internal/credential/default_provider.go::doResolveTAT to delegate to FetchTAT — observable behaviour unchanged for existing TAT consumers; the underlying transport path now yields typed errors which propagate up the credential chain.
  • Add cmd/config/init_probe.go::runProbe — best-effort post-init validator, bounded by a single 3-second context.WithTimeout. When FetchTAT returns *errs.AuthenticationError (a deterministic credential-rejection signal — any non-zero TAT body code, or HTTP 401/403), runProbe returns a re-wrapped *errs.AuthenticationError so config init exits 3 with a structured authentication error envelope. The re-wrap carries a jargon-free message (configured credentials may be invalid: please verify the App ID and App Secret) + an actionable Hint, preserves the upstream Code, and chains the original via Cause. Every other outcome — ambiguous failures (NetworkError / InternalError: transport, 5xx, parse, timeout, http-client init) and probe-endpoint outcomes — returns nil, keeping config init silent and exit-0.
  • Wire runProbe into all four cmd/config/init.go paths (non-interactive, --new, interactive TUI, legacy readline) as if err := runProbe(...); err != nil { return err }. Probe is skipped when the user reused an existing secret (no fresh plaintext in memory). The saved config is intentionally not rolled back on rejection — the stdout envelope records what was saved, while exit 3 + stderr report that it does not authenticate.
  • Tests:
    • internal/credential/tat_fetch_test.go covers success, body-code rejection, HTTP 401/403, HTTP 5xx, non-401/403 4xx (400/404/415/429), transport error, parse error, empty-token-on-success, brand routing, context cancel — each asserts the specific typed error and key fields.
    • cmd/config/init_probe_test.go (15 black-box tests) covers the real-world TAT body codes (10014 "invalid app secret", 10003 "invalid param"), any-non-zero-body-code coverage (e.g. 99999 → still propagates), HTTP 401/403, silent-and-nil ambiguous cases (HTTP 5xx, transport error, parse failure, http-client init error, timeout), probe request shape (URL, method, Bearer header, JSON body with from: "lark-cli/<version>"), and brand routing. Each credential-rejection case asserts a propagated *errs.AuthenticationError with the expected Code; an invariant asserts runProbe writes nothing to stderr.

Test Plan

  • make unit-test passed (cmd/config incl. 15 TestRunProbe_* + internal/credential TestFetchTAT_*, under -race)
  • make build + go vet ./... + gofmt clean
  • E2E 7/7 in tests_e2e/config/ passed — verified via LARK_CLI_BIN=$PWD/lark-cli go test ./tests_e2e/config/ against the live endpoint, including TestConfigInit_CredentialRejection_ExitsThree (real endpoint rejects a fake App ID → exit 3 + typed authentication envelope). The Docker eval sandbox was attempted first but failed at its own auth status setup gate (environmental, unrelated to this change), so the E2E suite was run directly against the real binary instead.
  • acceptance-reviewer re-run for this exit-code reversal: 5/5 scenarios passed against the live endpoint. Verified: (1) invalid creds → exit 3 + typed type: authentication envelope with the jargon-free message, stdout still carries the saved-config JSON; (2) network-unreachable → silent, exit 0 (regression guard); (3) --brand lark routes to open.larksuite.com (confirmed at the CONNECT layer); (4) real App ID + wrong secret → exit 3 code 10014; (5) §2.6 vocabulary clean in the envelope and both source files. Reviewer judged exit-3 + typed envelope a clear improvement over a missable stderr warning, and consistent with the rest of the CLI's auth-failure contract (a token-rejected im +chat-list exits 3 with the same envelope shape).
  • manual verification against the live TAT endpoint:
    • App Secret wrong (real App ID + wrong secret): exit 3, stderr structured type: authentication envelope (code 10014), stdout still records the saved config JSON
    • App ID nonexistent (random ID + any secret): exit 3, type: authentication envelope (code 10003)
    • Network unreachable (HTTPS_PROXY=http://127.0.0.1:1): exit 0, stdout JSON, no error (transport failure stays silent per design)
    • --brand lark with wrong secret: exit 3, type: authentication envelope (lark host routing confirmed at the CONNECT layer)

Consumer note

When config init exits non-zero, key off the exit code, read stdout as the config record and stderr as the diagnostic. On a credential rejection, stdout still carries the saved-config JSON (the config was written) while stderr carries the typed error envelope and exit is 3 — these are not contradictory (saved ≠ authenticates). Do not parse stderr as a single JSON document: it may contain a leading OK: Configuration saved line before the error envelope.

Related Issues

N/A

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented May 28, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions Bot added the size/L Large or sensitive change across domains or core paths label May 28, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 28, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Centralizes tenant-access-token exchange in FetchTAT, adds runProbe (3s timeout) invoked after saving app credentials in init flows, returns an AuthenticationError for deterministic credential rejections, and otherwise silently attempts a probe POST. Also refactors provider to use FetchTAT and adds comprehensive tests.

Changes

Post-Config Credential Probing

Layer / File(s) Summary
TAT fetch core function and error mapping
internal/credential/tat_fetch.go
FetchTAT exchanges app credentials for tenant access tokens with comprehensive error typing: request/JSON errors become InternalError, transport failures become retryable NetworkError, HTTP 401/403 become AuthenticationError, HTTP 5xx become retryable NetworkError, and response decode/validation errors map to appropriate error subtypes. Supports brand-based URL routing.
TAT fetch test suite
internal/credential/tat_fetch_test.go
Tests cover successful token retrieval, request payload validation, authentication error classification (response codes and HTTP 401/403), network error handling, JSON parse failures, brand-specific routing, and context cancellation. Uses stubRoundTripper and urlRewriteRT test helpers.
Default provider TAT fetch refactor
internal/credential/default_provider.go
Refactors existing TAT resolution to use the new FetchTAT function, removing inline HTTP wiring and JSON decode logic while maintaining caching via sync.Once. Removes unused imports.
Probe invocation function and error detection
cmd/config/init_probe.go
runProbe performs TAT fetch within a timeout, re-wraps deterministic authentication rejections into a returned AuthenticationError with an actionable message/hint, and silently issues a brand-routed probe POST using the TAT as a bearer token; probe outcomes are ignored and other errors are silent.
Probe test suite with request verification
cmd/config/init_probe_test.go
Tests validate TAT error outcomes (API codes and HTTP 401/403 return AuthenticationError; 5xx and transport errors are silent), probe request contracts (HTTP POST, correct URL, Authorization header, from with build.Version), brand routing, client initialization failures, and timeout enforcement. Includes fakeRT, jsonResp, and fakeFactory helpers.
Config init flow integration
cmd/config/init.go
Adds runProbe calls after config save in four paths: non-interactive --app-id/--app-secret, --new app creation, interactive TUI (only when new secret), and legacy readline fallback (only when secret input non-empty).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

feature

Suggested reviewers

  • liangshuo-1
  • Roy-oss1

Poem

🐰 I hop through code where tokens gleam,
I fetch a TAT and softly scheme.
If creds are wrong I raise a bell,
If nets are flaky, I say well—
A quiet probe, then back to dream.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.27% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: adding credential validation after config init, which is the primary purpose of this PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The pull request description is comprehensive and well-structured, covering all required template sections: Summary clearly describes motivation, Changes lists all main modifications with technical detail, Test Plan documents thorough verification including unit tests and manual verification, and Related Issues is addressed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/config/init_probe_test.go`:
- Around line 63-79: The test builds a Factory directly in fakeFactory; replace
that with the repo-standard TestFactory to ensure proper test isolation: in the
shared setup call t.Setenv("LARKSUITE_CLI_CONFIG_DIR", t.TempDir()) and create
the factory via cmdutil.TestFactory(t, config) instead of manually constructing
&cmdutil.Factory and HttpClient; update fakeFactory (and the similar block at
the other occurrence around lines 255-266) to return the factory and error
buffer from cmdutil.TestFactory so tests use the canonical TestFactory and
isolated config dir.

In `@internal/credential/tat_fetch.go`:
- Around line 23-31: The comment block describing error classification in
internal/credential/tat_fetch.go is out of sync: update the line that currently
states “other non-2xx → *errs.NetworkError” to match the runtime behavior where
non-2xx responses (excluding 401/403 and 5xx) are returned as
*errs.InternalError (subtype "sdk_error"); keep the existing rules for
*errs.AuthenticationError (401/403 or body code via Problem.Code) and
*errs.NetworkError (5xx or transport failures wrapped via Cause) so callers see
the same contract as the code path implemented around the non-2xx handling in
the block that returns *errs.InternalError (see the code near the current return
sites for *errs.InternalError in the 112-119 region).
- Around line 137-147: When parsed.Code == 0 you must also validate
parsed.TenantAccessToken is non-empty; if TenantAccessToken is empty, return an
internal SDK error instead of a successful empty token. Update the success
branch in tat_fetch.go (the block that currently returns
parsed.TenantAccessToken) to check parsed.TenantAccessToken (and/or
parsed.TenantAccessToken == "") and return an errs.InternalError (populate
errs.Problem with Category: errs.CategoryInternal, an appropriate Subtype such
as errs.SubtypeSDK, and a clear Message like "missing tenant_access_token on TAT
API success") so callers fail fast with a clear cause.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d60a3a6b-dc9b-4969-b50f-cab448016cba

📥 Commits

Reviewing files that changed from the base of the PR and between b91f6a2 and 1af43d8.

📒 Files selected for processing (6)
  • cmd/config/init.go
  • cmd/config/init_probe.go
  • cmd/config/init_probe_test.go
  • internal/credential/default_provider.go
  • internal/credential/tat_fetch.go
  • internal/credential/tat_fetch_test.go

Comment thread cmd/config/init_probe_test.go
Comment thread internal/credential/tat_fetch.go Outdated
Comment thread internal/credential/tat_fetch.go
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 28, 2026

🚀 PR Preview Install Guide

🧰 CLI update

npm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@72dabf540f8f3583a1888bdd93f56e821b1d28c0

🧩 Skill update

npx skills add dc-bytedance/cli#feat/config-probe -y -g

Drop the ad-hoc *HTTPStatusError and classify TAT outcomes via the
canonical errs.* taxonomy:

  * HTTP 401/403 + non-zero TAT body code → *errs.AuthenticationError
    with Problem.Code carrying the upstream signal
  * HTTP 5xx                              → *errs.NetworkError with
                                             CauseKind "5xx" and
                                             Retryable=true
  * 4xx other than 401/403, JSON parse,
    or 2xx with empty tenant_access_token → *errs.InternalError with
                                             subtype "sdk_error"
  * Transport / DNS / TLS / context cancel → *errs.NetworkError wrapping
                                              the underlying cause

classifyTATError now extracts the typed signal via errors.As, removing
the parallel apiCode return.

Test factory switched to cmdutil.TestFactory(t, nil) with
LARKSUITE_CLI_CONFIG_DIR=t.TempDir() per repo guideline, then HttpClient
overridden with stubRoundTripper to keep per-test response control.
@codecov
Copy link
Copy Markdown

codecov Bot commented May 29, 2026

Codecov Report

❌ Patch coverage is 77.27273% with 30 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.82%. Comparing base (a2cc5e1) to head (72dabf5).

Files with missing lines Patch % Lines
internal/credential/tat_fetch.go 81.81% 14 Missing and 2 partials ⚠️
cmd/config/init.go 0.00% 6 Missing ⚠️
cmd/config/init_probe.go 83.33% 3 Missing and 3 partials ⚠️
internal/credential/default_provider.go 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1151      +/-   ##
==========================================
+ Coverage   68.77%   68.82%   +0.04%     
==========================================
  Files         628      630       +2     
  Lines       58670    58777     +107     
==========================================
+ Hits        40353    40455     +102     
  Misses      15021    15021              
- Partials     3296     3301       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Previously a deterministic credential-rejection signal from the post-init
probe was printed as a best-effort stderr warning with exit 0. An AI-agent
caller that keys off exit code and stdout could miss it entirely.

Propagate the *errs.AuthenticationError up through RunE instead: config init
now exits 3 with a structured authentication error envelope when the saved
App ID / App Secret are rejected by the open platform. The error is
re-wrapped with a jargon-free message and an actionable hint, preserving the
upstream code and chaining the original cause.

Scope is unchanged from the empirical-lane policy: only AuthenticationError
(any non-zero TAT body code or HTTP 401/403) propagates. Ambiguous failures
(transport, 5xx, JSON parse, timeout, http-client init) stay silent and
non-blocking, so valid configurations are never disturbed by upstream noise.

The saved config is intentionally not rolled back; the stdout envelope still
records what was saved, while exit 3 + stderr report that it does not
authenticate.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Large or sensitive change across domains or core paths

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants