Retrieval-only backend for Arbitrum Stylus ecosystem research.
It indexes official docs, Stylus blog posts, and curated community repos, then returns
context + references + `agent_guidance` to downstream LLM consumers (MCP, IDE tools, web chat).
- Returns references-first context for Stylus questions.
- Emits `agent_guidance` that sets `code_generation=disallowed`.
- Does not synthesize contract/application code.
- For porting-auditor requests with a GitHub target URL, performs static Solidity signal extraction on the target repository/files and injects those findings into the returned context.
- `GET /health`
- `GET /skills`
- `POST /skills/{skill_id}/search`
- `POST /feedback` (thumbs up/down for a prompt + response; feeds logs and an optional RAG booster)
- `POST /platform-feedback` (captures general platform feedback entries and logs them for later review)
- `POST /openrouter/chat/completions` (server-side OpenRouter proxy; keeps the API key off the frontend)
- `GET /admin/platform-feedback` (requires admin token; streams the most recent platform feedback lines)
- `POST /admin/auth` (exchange the admin password for a short-lived bearer token)
- `GET /admin/logs/{request|ingestion|stats}/paginate` (paged log/text slice)
- `GET /admin/logs/{request|ingestion|stats}/stream` (stream the entire log file)
- Conversation capture endpoints:
  - `POST /conversations/start` -> returns `session_id`
  - `POST /conversations/{session_id}/turn` -> append prompt/response (+ optional rating/skill/metadata)
  - `GET /conversations/{session_id}` -> fetch thread
  - `GET /admin/conversations/export` (admin token) -> export rated turns for retraining
- Shortcut: user-facing search endpoints (`/stylus-chat`, `/stylus-porting-audit`, `/skills/{id}/search`) auto-create a session on the first call and return an `X-Session-Id` response header; clients should resend that header to keep appending turns.
- Rated turns (`rating=1`) are indexed into Chroma alongside feedback so retrieval can surface high-signal, user-approved answers; hits from the same `X-Session-Id` are boosted during ranking.
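The session shortcut above can be sketched client-side. This is a hypothetical helper (not part of this repo) showing the intended pattern: capture `X-Session-Id` from the first response and resend it on later calls; the transport callable is an assumption.

```python
class SessionTrackingClient:
    """Remembers the X-Session-Id returned by the first search call
    so later calls append turns to the same conversation."""

    def __init__(self, send):
        # send: callable(path, json_payload, headers) -> (resp_headers, body)
        self._send = send
        self.session_id = None

    def search(self, skill_id, prompt):
        headers = {}
        if self.session_id:
            headers["X-Session-Id"] = self.session_id
        resp_headers, body = self._send(
            f"/skills/{skill_id}/search", {"prompt": prompt}, headers
        )
        # First response carries the auto-created session id; keep it.
        self.session_id = resp_headers.get("X-Session-Id", self.session_id)
        return body
```

Any HTTP client can supply `send`; the helper only manages the header round-trip.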
Skill metadata contract (`GET /skills`):

- `system_prompt`: canonical prompt loaded from `skills/<id>/agents/openai.yaml#default_prompt`
- `prompt_source`: explicit source path for traceability
- `skill_doc_path`: path to the published skill instructions (`SKILL.md`)
- `behavior_hash`: SHA-256 fingerprint over the published skill behavior files

Consumers should use `system_prompt` from `/skills` (rather than frontend-local prompt text) to keep behavior consistent with published skills.
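A minimal consumer-side sketch of that rule, assuming the `/skills` payload is a list of objects keyed by `id` (the exact payload shape is an assumption):

```python
def resolve_system_prompt(skills_payload, skill_id):
    """Return the published system_prompt for skill_id from a /skills
    payload, instead of falling back to frontend-local prompt text."""
    for skill in skills_payload:
        if skill.get("id") == skill_id:
            prompt = skill.get("system_prompt")
            if prompt:
                return prompt
    raise KeyError(f"no published system_prompt for {skill_id!r}")
```

Failing loudly when the prompt is missing keeps a stale local prompt from silently diverging from the published skill.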
Compatibility aliases:
- `POST /stylus-chat` -> research skill
- `POST /stylus-porting-audit` -> porting auditor skill
Request:

```json
{ "prompt": "What tooling is current for Stylus testing?" }
```

Response (example):

```json
{
  "found": true,
  "as_of_date": "2026-02-25",
  "context": "Top references:\n1. ...",
  "chunks_used": 25,
  "query_mode": "tooling",
  "quality_signals": {
    "confidence": "high",
    "time_sensitive": false,
    "evidence_profile": {
      "official_count": 2,
      "community_count": 4,
      "canonical_count": 1,
      "unique_domains": 3
    }
  },
  "answer_contract": {
    "format": "direct_answer_why_links",
    "length_target_lines": "10-20",
    "uncertainty_mode": "state_uncertainty_plus_best_bet",
    "audience": "builder_engineer"
  },
  "recommended_answer_outline": {
    "direct_answer": "...",
    "why": ["..."],
    "links": [{ "title": "...", "url": "...", "source_type": "official" }],
    "caveats": []
  },
  "agent_guidance": {
    "behavior": "references_first",
    "code_generation": "disallowed"
  },
  "references": [{ "title": "...", "url": "..." }]
}
```

Note: the extended quality fields in this example are produced by `/skills/sift-stylus-research/search`; other skill endpoints may return only the core fields.
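A consumer honoring this response might look like the following sketch. Field names mirror the example above; the helper names themselves are illustrative.

```python
def may_generate_code(search_response):
    """True only if agent_guidance did not disallow code generation."""
    guidance = search_response.get("agent_guidance", {})
    return guidance.get("code_generation") != "disallowed"


def render_answer(search_response):
    """Prefer the recommended_answer_outline; fall back to raw context.
    Returns (answer_text, link_urls)."""
    outline = search_response.get("recommended_answer_outline")
    if outline and outline.get("direct_answer"):
        links = [l["url"] for l in outline.get("links", []) if "url" in l]
        return outline["direct_answer"], links
    return search_response.get("context", ""), []
```

Checking `agent_guidance` before any generation step is the point: the backend is retrieval-only, and downstream agents are expected to respect `code_generation=disallowed`.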
OpenRouter proxy request (example):

```json
{
  "model": "openai/gpt-4o-mini",
  "messages": [{ "role": "user", "content": "What are the newest Stylus tools?" }],
  "tools": [],
  "tool_choice": "auto"
}
```

- Endpoint: `POST /feedback`
- Payload: `prompt` (string), `response` (string), `rating` (-1 | 0 | 1), optional `skill` and `metadata` (dict).
- Side effects:
  - Appends every event to `logs/feedback_events.jsonl` (respects the `LOG_DIR` env override).
  - Positive ratings (`1`) are also added to the Chroma collection `stylus_feedback` for retrieval enrichment.
- Example:
```bash
curl -X POST http://localhost:8001/feedback \
  -H "content-type: application/json" \
  -d '{
    "prompt":"How do I test Stylus contracts?",
    "response":"Use cargo stylus test ...",
    "rating":1,
    "skill":"sift-stylus-research",
    "metadata":{"client":"cli"}
  }'
```

- Platform feedback: `POST /platform-feedback` lets clients submit free-form messages, optional categories, and metadata; entries append to `logs/platform_feedback.jsonl` (override the path with `PLATFORM_FEEDBACK_LOG_PATH`). Administrators can fetch those entries via `GET /admin/platform-feedback` when authenticated.
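For programmatic clients, a small payload builder enforcing the documented `/feedback` contract can be sketched as follows (the helper name is illustrative; the field names come from the payload description above):

```python
def build_feedback_payload(prompt, response, rating, skill=None, metadata=None):
    """Build a POST /feedback body, validating the documented rating range."""
    if rating not in (-1, 0, 1):
        raise ValueError("rating must be -1, 0, or 1")
    payload = {"prompt": prompt, "response": response, "rating": rating}
    if skill:
        payload["skill"] = skill
    if metadata:
        payload["metadata"] = dict(metadata)
    return payload
```

Validating the rating client-side keeps malformed events out of `logs/feedback_events.jsonl` before they are ever sent.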
```bash
# one-time setup
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# refresh data + rebuild Chroma
python src/run_all_data_ingestions.py

# serve the API
uvicorn main:app --app-dir src --host 0.0.0.0 --port 8001
```

Notes:

- Ingestion requires outbound internet access to GitHub, Arbitrum docs/blog, and OpenZeppelin docs.
- Logs land in `logs/ingestion_logs.log`; see `src/basic_logs.py`.
- Full pipeline details live in `src/ingestion/README.md`.
To enable the LLM proxy endpoint:

```bash
export OPENROUTER_API_KEY=...
```

Admin auth & protected logs:

- `POST /admin/auth` expects `{ "password": "..." }` and returns a signed bearer token with `expires_in` (seconds). Tokens are HMAC-SHA256 signed using `ADMIN_BEARER_TOKEN` and include an expiry set by `ADMIN_TOKEN_TTL_SECONDS` (default 3600).
- Store the password hash in env as `base64(SHA256(password))`. Quick helper:
```bash
python3 - <<'PY'
import hashlib, base64, getpass
p = getpass.getpass('Admin password: ')
print(base64.b64encode(hashlib.sha256(p.encode()).digest()).decode())
PY
```

- Log endpoints require `Authorization: Bearer <token>` and expose three sources:
  - `request` -> `logs/request_logs.log`
  - `ingestion` -> `logs/ingestion_logs.log`
  - `stats` -> `logs/ingestion_stats.json`
`.env.example` documents the runtime contract:

- `HOST` / `PORT`: API bind address
- `CORS_ORIGINS`: allowed frontend origins
- `OPENROUTER_API_KEY`: server-side LLM proxying
- `GITHUB_TOKEN`: ingestion scraping
- `ADMIN_HASHED_PASSWORD`: `base64(SHA256(...))` used by `/admin/auth`
- `ADMIN_BEARER_TOKEN`: signing secret for issued bearer tokens
- `ADMIN_TOKEN_TTL_SECONDS` (optional): validity window for issued admin tokens (default `3600`)
Runtime note:

- On startup, the backend auto-loads missing env vars from `.env` candidates (the current backend repo/worktree, the workspace root, and sibling `backend`/`frontend` repos/worktrees) without overriding already-exported shell variables.
Repo-level checks:

```bash
python -m pytest
```

`pytest` now runs with coverage reporting and an 80% fail-under gate for backend runtime modules (configured via `pytest.ini` + `.coveragerc`).
Workspace-level check (if using paired workspace scripts):

```bash
./scripts/qa-backend.sh setup-dev-env 8001
```

This runs:

- Python compile check
- `pytest` suite
- health probe
- `/skills/{skill_id}/search` smoke request
Run directly from this repo:

```bash
docker network create stylus-dev-net 2>/dev/null || true
docker compose up -d --build
```

Stop:

```bash
docker compose down --remove-orphans
```

Health checks:

```bash
curl http://localhost:8001/health
curl -X POST http://localhost:8001/openrouter/chat/completions \
  -H "content-type: application/json" \
  -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"ping"}]}'
```

- `src/run_all_data_ingestions.py` now rebuilds Chroma after ingestion.
- `src/debug_chroma_query.py` is a manual utility, not a pytest module.
This repo contains Codex skills under `skills/`:

- `sift-stylus-porting-auditor`
- `sift-stylus-research`
- `sift-stylus-code-helper`

Install all from one CLI command:

```bash
npx sift-stylus \
  --repo getFairAI/angel-stylus-coding-assistant
```

Install one skill only:

```bash
npx sift-stylus \
  --repo getFairAI/angel-stylus-coding-assistant \
  --skills sift-stylus-research
```

Installer package source: `tools/sift-stylus-skills-installer/`

See `docs/deployment-and-proxy.md` for architecture, security model, and deployment flow.