feat(intelligence): capture User-Agent for traffic classification #128
Merged
…classification Without this we cannot distinguish developers from crawlers. User-Agent is not PII — it is a software identifier, not a person. Stored alongside hashed client_id (ftua:<ip>, same daily TTL), never with raw IP. Enables automated classification: curl/wget/python=script, ethers/viem/web3=developer, Mozilla=browser, *bot/spider=crawler. conversion_targets export now includes user_agent + classification. Captured once per IP per day (first call), so the hot path takes a single extra Redis write. Also adds docs/TRAFFIC_INTELLIGENCE_REPORT.md: volume-based classification of the current 354k calls/day across 2,204 IPs. Key finding: 78% of traffic is 6 crawlers (now 429 abuse-blocked); developers are unidentifiable until this UA field exists. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Satelink-Protocol added a commit that referenced this pull request on Jun 14, 2026
…r job (#129) * fix(security): npm audit fix — clear semver-safe vulns (32→29) Non-breaking npm audit fix. Cleared 2 high + 1 moderate (semver-compatible). Remaining 29 all require breaking major bumps, deliberately NOT applied: - ws via @ethersproject (needs ethers@5.8.0 major) — risky on Polygon mainnet - shell-quote/concurrently 2 criticals (needs concurrently@10 major) — dev-tooling only Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * feat(intelligence): capture User-Agent for traffic classification (#128) feat(intelligence): capture User-Agent in free-tier gate for traffic classification Without this we cannot distinguish developers from crawlers. User-Agent is not PII — it is a software identifier, not a person. Stored alongside hashed client_id (ftua:<ip>, same daily TTL), never with raw IP. Enables automated classification: curl/wget/python=script, ethers/viem/web3=developer, Mozilla=browser, *bot/spider=crawler. conversion_targets export now includes user_agent + classification. Captured once per IP per day (first call), so the hot path takes a single extra Redis write. Also adds docs/TRAFFIC_INTELLIGENCE_REPORT.md: volume-based classification of the current 354k calls/day across 2,204 IPs. Key finding: 78% of traffic is 6 crawlers (now 429 abuse-blocked); developers are unidentifiable until this UA field exists. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> * fix(settlement): correct epoch_ledger table and column names in anchor job Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Why
The free-tier gate currently stores only a per-IP daily counter (`ft:<ip>` → int). Method, timing, and User-Agent are captured nowhere, so of 354k calls/day across 2,204 IPs, ~92% of traffic is unclassifiable. We can spot crawlers by volume but cannot find the developers who can actually pay.
What
- Capture `User-Agent` in `free_tier_gate.js`, stored as `ftua:<ip>` with the same daily TTL as the counter. Written once per IP per day (first call only) → one extra Redis write on the hot path, not per-request.
- `classifyUserAgent()`: `curl`/`python`/`wget` = script, `ethers`/`viem`/`web3` = developer, `Mozilla` = browser, `*bot`/`spider` = crawler.
- `conversion_targets` export now includes `user_agent` + `classification`.
- User-Agent is stored alongside the hashed `client_id`, never with the raw IP.
Report
Adds `docs/TRAFFIC_INTELLIGENCE_REPORT.md`: a volume-based classification of current live traffic. Key findings:
- … single highest-ROI action, and it ships here.
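
The capture and classification steps listed under What could look roughly like the sketch below. It is not the code in `free_tier_gate.js`. The rule order, the `unknown` fallback, the 256-character cap, the `dailyCount` and `clientKey` parameters, and the ioredis-style `set(key, value, 'EX', ttl, 'NX')` call are all assumptions made for illustration, since the PR text does not say which Redis client or rule order is used.

```javascript
// Sketch only. Rule order, the 'unknown' label, the 256-char cap, and the
// ioredis-style SET arguments are assumptions, not taken from free_tier_gate.js.

// Crawler patterns are checked first because many bots also send a
// "Mozilla/5.0 (compatible; ...)" prefix and would otherwise look like browsers.
const UA_RULES = [
  { label: 'crawler', pattern: /bot|spider/i },
  { label: 'developer', pattern: /ethers|viem|web3/i },
  { label: 'script', pattern: /curl|wget|python/i },
  { label: 'browser', pattern: /mozilla/i },
];

// Map a raw User-Agent string to one of the classes named in the PR.
function classifyUserAgent(userAgent) {
  if (typeof userAgent !== 'string' || userAgent.length === 0) return 'unknown';
  const rule = UA_RULES.find((r) => r.pattern.test(userAgent));
  return rule ? rule.label : 'unknown';
}

// Store the User-Agent under ftua:<clientKey> on the first call of the day.
// dailyCount is assumed to be the value of the ft: counter after incrementing.
// Any later call returns early without touching Redis; NX additionally guards
// against overwriting a value that is already stored, and EX applies the TTL.
// Resolves to true only when this call stored the value.
async function captureUserAgentOnce(redis, clientKey, dailyCount, userAgent, ttlSeconds) {
  if (dailyCount !== 1) return false;
  const value = String(userAgent || '').slice(0, 256); // cap header size
  const reply = await redis.set(`ftua:${clientKey}`, value, 'EX', ttlSeconds, 'NX');
  return reply === 'OK';
}
```

Storing the raw string and classifying at export time, as sketched here, would let the rules change later without losing data; that split is a suggestion, not something the PR states.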
Verify after deploy
Tomorrow's `GET /system/free-tier` → `conversion_targets[]` entries will carry `user_agent` + `classification`. Re-run the report against classified data.

🤖 Generated with Claude Code