From Reddit/forum complaints → evidence-backed project ideas → build-ready Codex prompts.
TaskSignal is an AI-assisted engine that mines public developer and community discussions, detects concrete repetitive tasks people complain about, clusters similar pain signals, scores software opportunities, and generates Codex-ready MVP prompts.
TaskSignal is a portfolio-ready MVP built by Yurii Bakurov. It is designed for one local operator on their own machine: fixture data works out of the box, a local workspace profile stores that user's research defaults, and repeatable API-backed workflows can be enabled for supported public sources when credentials are provided.
Current public posture: TaskSignal is an early public application repository, not a widely adopted package. Its strongest evidence today is reproducibility, release hygiene, CI, security/privacy documentation, contributor issues, and a browser-verified demo flow. See the demo evidence snapshot and Codex for OSS evidence for the current review package.
Useful starting points:
- Product context
- Architecture
- API reference
- Demo evidence snapshot
- Deployment notes
- Data ethics
- Source limits and terms
- Model card
- Roadmap
- Threat model
- Maintainer automation plan
- Codex for OSS application evidence
- Changelog
- Contributing guide
- Security policy
Most idea lists are generic. TaskSignal is a task-replacement radar: it looks for specific repeated workflows people hate doing, such as exporting Stripe data into a spreadsheet every Friday and turning it into a client report.
TaskSignal is for maintainers, builders, indie hackers, developer-tool teams, and researchers who want a local-first way to review public pain signals before deciding what to build. It is not for scraping private communities, profiling individuals, spam, outreach automation, or replacing human product judgment.
- Loads demo fixture data with no API keys.
- Stores one local workspace profile with owner/focus/default research settings.
- Saves repeatable research projects with source, query, limit, labels, cadence, last run, next run, and run count.
- Reports integration readiness without exposing secret values.
- Records scan outcomes with found/saved items, detected signals, generated opportunities, and guidance when live data produces no ranked opportunity.
- Normalizes Reddit, Hacker News, GitHub Issues, Stack Exchange, and fixture-style records.
- Stores author hashes instead of raw usernames by default.
- Detects complaints, manual workflows, tool requests, workarounds, buying intent, and confusion.
- Generates local embeddings with sentence-transformers/all-MiniLM-L6-v2 when available.
- Falls back to deterministic local vectors when the model is unavailable.
- Clusters signals with a local thematic fallback by default, with optional DBSCAN when TASKSIGNAL_USE_SKLEARN_CLUSTERING=1.
- Scores opportunities using frequency, recency, pain, concreteness, buying intent, feasibility, and competition penalty.
- Generates opportunity cards, full Codex-ready build prompts, and richer Codex task packs.
- Optionally enhances generated prompts through OpenAI API or local Ollama when explicitly configured.
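The deterministic fallback vectors mentioned in the list above are not specified in this README. A minimal sketch of one way to build them, assuming a hashed bag-of-words approach, is shown here. The function names and the choice of 384 dimensions (the output size of all-MiniLM-L6-v2) are illustrative assumptions, not the project's confirmed implementation.

```python
import hashlib
import math
import re

# 384 matches the all-MiniLM-L6-v2 embedding size; whether the real
# fallback uses the same size is an assumption made for this sketch.
DIM = 384


def fallback_vector(text: str, dim: int = DIM) -> list[float]:
    """Hash each token into a signed bucket, then L2-normalize the result."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[bucket] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    # Empty or fully cancelled input stays a zero vector.
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))
```

Because the vector depends only on the text and a fixed hash, the same record always maps to the same vector, which keeps fixture runs reproducible without a model download.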
flowchart TD
A[Public sources and fixtures] --> B[Ingestion connectors]
B --> C[Normalizer and deduplicator]
C --> D[(PostgreSQL + pgvector)]
D --> E[Pain and task detector]
E --> F[Embedding service]
F --> G[Thematic fallback clustering / optional DBSCAN]
G --> H[Opportunity scoring]
H --> I[Prompt generator]
I --> J[FastAPI API]
J --> K[Next.js dashboard]
Frontend: Next.js, TypeScript, Tailwind CSS, TanStack Query, Recharts, React Markdown, Zod-ready types.
Backend: FastAPI, Pydantic v2, SQLAlchemy 2, Alembic, PostgreSQL, pgvector, pytest, ruff, scikit-learn.
ML/NLP: sentence-transformers with local-only load when the model cache exists, deterministic fallback vectors, optional DBSCAN clustering, rule-based signal detector.
Infra: Docker Compose, Makefile, GitHub Actions CI, scheduled ingestion template.
cp .env.example .env
make doctor
make up

Open the frontend at http://localhost:3000, go to Projects, save a research workflow, then run it. For a first proof path, go to Dashboard and click Process demo data. To use live public data, choose a source, query, and limit in Live source, then click Run scan.
If setup fails or a fresh checkout looks incomplete, run:
make doctor

make doctor checks the required files, local .env, Python, Node 20+, npm,
repo-local Python dev tools, fixture files, and whether generated files are
accidentally tracked. Docker is only required for the Compose quickstart.
API health check:
curl http://localhost:8000/health

Run the API and frontend separately:

cd apps/api
../../.venv/bin/uvicorn app.main:app --reload

cd apps/web
npm run dev

Run checks before publishing changes:
make test
make lint
make verify

The Makefile prefers repo-local Python tools in .venv/bin. On Apple Silicon
macOS it also prepends Homebrew Node 20 from /opt/homebrew/opt/node@20/bin
when available, matching the runtime required by the Next.js web app.
Run the release-readiness gate before tagging a release:
make release-check

Run the first-run smoke check to verify the credential-free fixture path against a temporary database, including dashboard route wiring and task-pack export:

make smoke

To also boot the Next.js dev server and request /dashboard, run:

apps/api/.venv/bin/python -u scripts/first_run_smoke.py --with-web-server

Use the local CLI for headless operation:
scripts/tasksignal_cli.py readiness
scripts/tasksignal_cli.py configure-workspace --owner "Local Builder" --goal "Find developer-tool opportunities" --source hackernews --query ask --cadence daily
scripts/tasksignal_cli.py create-project --name "Track CI/CD pain" --source hackernews --query ask --cadence daily
scripts/tasksignal_cli.py run-due
scripts/tasksignal_cli.py task-pack <opportunity-id> --output task-pack.md

TaskSignal does not require multi-user accounts for this local mode. The local workspace profile is a singleton in the app database and is meant for the person running the app on that machine.
TaskSignal is currently an application repository, not a published Python or npm library. Use the source checkout or Docker Compose workflow above. Reusable packages may be split out later if a stable library boundary emerges.
For a quick public review, inspect:
- Latest release
- Open contributor issues
- Release-readiness workflow
- Demo evidence snapshot
- Threat model
apps/api    FastAPI backend, ML pipeline, database models, tests
apps/web    Next.js dashboard, opportunity views, prompt export UI
data        Demo fixtures for local-first processing
docs        Architecture, API, deployment, ethics, and model notes
notebooks   Classifier training and evaluation workbooks
Fixture mode is the default. It loads records from data/fixtures, processes them end to end, and should generate at least five opportunity cards:
- AI-generated code audit tool
- Early-stage SaaS lead/community signal radar
- Simple onboarding drop-off analyzer
- GitHub Actions workflow debugging assistant
- Spreadsheet-to-report automation helper
Live scans use official APIs and keep the same local-first scoring/generation pipeline as fixture mode. The unauthenticated POST /api/scans endpoint is restricted to public API-safe sources (fixture and hackernews) so network callers cannot spend server-side credentials or retrieve data visible to server-side tokens.
Trusted operators can still configure the internal connector pipeline with source credentials when running controlled jobs outside the public endpoint:
- REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, REDDIT_USER_AGENT
- GITHUB_TOKEN
- STACK_EXCHANGE_KEY
Hacker News works without credentials through the public Firebase API. GitHub and Stack Exchange can run without keys at lower rate limits. Reddit requires OAuth credentials. No paid LLM key is required. LLM_PROVIDER=none is the default.
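The credential rules above can be expressed as a readiness check that reports only which variable names are unset, never their values. This is a hedged sketch: the environment variable names come from this README, while the source keys (such as "stackexchange"), the status labels, and the `readiness` function itself are hypothetical and may not match the project's readiness report.

```python
import os

# Variable names are taken from the README; source keys and status
# labels are illustrative assumptions.
SOURCE_ENV = {
    "hackernews": {"required": (), "optional": ()},
    "github": {"required": (), "optional": ("GITHUB_TOKEN",)},
    "stackexchange": {"required": (), "optional": ("STACK_EXCHANGE_KEY",)},
    "reddit": {
        "required": ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT"),
        "optional": (),
    },
}


def readiness(env=None) -> dict:
    """Report per-source status using only variable names, never values."""
    env = os.environ if env is None else env
    report = {}
    for source, spec in SOURCE_ENV.items():
        missing = [name for name in spec["required"] if not env.get(name)]
        unset_optional = [name for name in spec["optional"] if not env.get(name)]
        if missing:
            status = "missing_credentials"
        elif unset_optional:
            # Source still works, but at lower unauthenticated rate limits.
            status = "ready_unauthenticated"
        else:
            status = "ready"
        report[source] = {"status": status, "missing": missing + unset_optional}
    return report
```

Calling `readiness()` with no argument reads the real process environment; passing a plain dict makes the check easy to test.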
Connector credentials belong in environment variables, not source registry
records. Source registry write endpoints require OPERATOR_SCAN_TOKEN, reject
secret-like config_json keys, and read endpoints return redacted config so
local rows cannot expose token values.
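As a rough illustration of the reject-and-redact behavior described above, the sketch below flags secret-like keys in a config_json-style dictionary and produces a redacted copy. The key pattern, the placeholder string, and both function names are assumptions; the project's actual matching rules are not shown in this README.

```python
import re

# Hypothetical pattern for "secret-like" key names.
SECRET_KEY_PATTERN = re.compile(
    r"(token|secret|password|api[_-]?key|credential)", re.IGNORECASE
)


def find_secret_like_keys(config: dict, prefix: str = "") -> list[str]:
    """Return dotted paths of keys whose names look like secrets."""
    hits = []
    for key, value in config.items():
        path = f"{prefix}{key}"
        if SECRET_KEY_PATTERN.search(str(key)):
            hits.append(path)
        if isinstance(value, dict):
            hits.extend(find_secret_like_keys(value, prefix=f"{path}."))
    return hits


def redact_config(config: dict) -> dict:
    """Copy the config, replacing values under secret-like keys."""
    redacted = {}
    for key, value in config.items():
        if SECRET_KEY_PATTERN.search(str(key)):
            redacted[key] = "***"
        elif isinstance(value, dict):
            redacted[key] = redact_config(value)
        else:
            redacted[key] = value
    return redacted
```

A write endpoint could refuse any payload where `find_secret_like_keys` returns a non-empty list, while a read endpoint could pass stored rows through `redact_config` as a second line of defense.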
PUBLIC_SCAN_SOURCES can narrow the public endpoint further, for example to hackernews only. Credentialed sources such as GitHub, Reddit, and Stack Exchange stay reserved for trusted internal scan jobs.
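One way to read PUBLIC_SCAN_SOURCES so that it can only narrow the public set is sketched below. The comma-separated format and the `allowed_public_sources` helper are assumptions for illustration; only the variable name and the fixture/hackernews defaults come from this README.

```python
import os

# Sources this README describes as safe for the unauthenticated endpoint.
PUBLIC_SAFE_SOURCES = frozenset({"fixture", "hackernews"})


def allowed_public_sources(env=None) -> frozenset:
    """Intersect the requested sources with the safe set.

    Assumes PUBLIC_SCAN_SOURCES is a comma-separated list.
    """
    env = os.environ if env is None else env
    raw = env.get("PUBLIC_SCAN_SOURCES", "")
    requested = {part.strip().lower() for part in raw.split(",") if part.strip()}
    if not requested:
        return PUBLIC_SAFE_SOURCES
    # Intersection means the variable can narrow the set but never
    # add a credentialed source such as github or reddit.
    return PUBLIC_SAFE_SOURCES & requested
```

Using an intersection rather than a plain override means a misconfigured value like `hackernews,github` still cannot expose a credentialed source.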
Browser-triggered runs of credentialed sources are available through saved
research projects only when OPERATOR_SCAN_TOKEN is configured on the API and
the same token is entered locally in the Projects or Integrations page. This
keeps hosted deployments from silently spending server-side credentials while
still letting trusted local operators connect APIs.
Saved projects support manual, hourly, daily, weekly, and custom-hour cadences.
TaskSignal does not hide a scheduler inside the web process. Run due projects
from the Projects page, scripts/tasksignal_cli.py run-due, cron, GitHub
Actions, or another explicit worker.
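The cadence rules above can be sketched as a small due-check that an explicit worker would call. The cadence names follow this README; the function names, the treatment of never-run projects as due, and the `custom_hours` parameter are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

CADENCE_HOURS = {"hourly": 1, "daily": 24, "weekly": 24 * 7}


def next_run(cadence: str, last_run: datetime,
             custom_hours: Optional[int] = None) -> Optional[datetime]:
    """Return the next due time, or None for manual projects."""
    if cadence == "manual":
        return None
    if cadence == "custom":
        if not custom_hours or custom_hours <= 0:
            raise ValueError("custom cadence needs a positive hour interval")
        hours = custom_hours
    else:
        hours = CADENCE_HOURS[cadence]
    return last_run + timedelta(hours=hours)


def is_due(cadence: str, last_run: Optional[datetime], now: datetime,
           custom_hours: Optional[int] = None) -> bool:
    """Manual projects are never due; never-run projects are due (assumed)."""
    if cadence == "manual":
        return False
    if last_run is None:
        return True
    return now >= next_run(cadence, last_run, custom_hours)
```

A cron entry or CI job would load saved projects, call `is_due` with `datetime.now(timezone.utc)`, and run only the projects that return True.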
Optional prompt enhancement uses LLM_PROVIDER=openai plus OPENAI_API_KEY, or
LLM_PROVIDER=ollama plus a local Ollama server. Browser-triggered enhancement
requires OPERATOR_SCAN_TOKEN on the API and the matching
X-Operator-Scan-Token request header so network callers cannot spend
server-side model credentials. ChatGPT/Codex subscriptions do not provide
backend API credentials; TaskSignal supports subscription users by exporting
task packs they can open in their own signed-in Codex app, CLI, IDE extension,
or Codex web session.
Destructive fixture resets require DEMO_RESET_TOKEN and the matching X-Demo-Reset-Token request header. The normal dashboard demo-processing action is non-destructive by default.
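Both the X-Operator-Scan-Token and X-Demo-Reset-Token checks follow the same pattern: compare a request header to a configured secret. A minimal sketch of that comparison, assuming the check fails closed when no token is configured, could look like this. The `header_token_ok` helper is hypothetical and is not taken from the project's code.

```python
import hmac
from typing import Mapping, Optional


def header_token_ok(headers: Mapping[str, str], header_name: str,
                    configured: Optional[str]) -> bool:
    """Fail closed when no token is configured; else compare in constant time."""
    if not configured:
        return False
    # HTTP header names are case-insensitive, so match on lowercase.
    presented = next(
        (value for key, value in headers.items()
         if key.lower() == header_name.lower()),
        "",
    )
    return hmac.compare_digest(
        presented.encode("utf-8"), configured.encode("utf-8")
    )
```

`hmac.compare_digest` avoids leaking how many leading characters matched, which a plain `==` comparison can do through timing differences.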
Each opportunity can export:
- A generated Codex prompt.
- An evidence bundle.
- A Codex task pack with objective, suggested MVP, score, evidence, acceptance criteria, privacy constraints, and recommended Codex flow.
Task packs are designed for users who want to use their own signed-in Codex app,
CLI, IDE extension, or Codex web session. They do not spend ChatGPT/Codex plan
usage from the TaskSignal backend. A repo-local skill package is available at
skills/tasksignal-opportunity-builder for agents that can load Codex-style
skills.
The MVP uses transparent rules first. It scores pain phrases, repetition phrases, tool requests, buying intent, and task concreteness hints. Embeddings use sentence-transformers/all-MiniLM-L6-v2 only when locally available; otherwise deterministic vectors keep the demo working.
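To make the rules-first idea concrete, here is a small sketch of a phrase-based detector. The phrase lists, label names, and the `detect_signals` function are illustrative assumptions; the project's real rules and weights are not reproduced in this README.

```python
import re

# Illustrative patterns only; not the project's actual rule set.
RULES = {
    "complaint": [r"\bhate\b", r"\bfrustrat\w*", r"\bpain(ful)?\b", r"\btedious\b"],
    "manual_workflow": [
        r"\bevery (day|week|month|friday|monday)\b",
        r"\bmanually\b",
        r"\bcopy(ing)? and past(e|ing)\b",
    ],
    "tool_request": [r"\bis there (a|any) tool\b", r"\blooking for (a|an) \w+"],
    "buying_intent": [r"\bwould pay\b", r"\bhappy to pay\b", r"\bwilling to pay\b"],
}


def detect_signals(text: str) -> dict:
    """Return each matched label with the number of distinct rules that fired."""
    lowered = text.lower()
    found = {}
    for label, patterns in RULES.items():
        hits = [p for p in patterns if re.search(p, lowered)]
        if hits:
            found[label] = len(hits)
    return found
```

Rules like these are easy to audit: every detected signal can be traced back to the exact phrase that triggered it.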
opportunity_score =
0.25 * frequency_score
+ 0.20 * recency_score
+ 0.20 * pain_intensity_score
+ 0.15 * task_concreteness_score
+ 0.10 * buying_intent_score
+ 0.10 * feasibility_score
- 0.10 * competition_penalty
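The formula above translates directly into code. The sketch below assumes every component score is already normalized to the range 0 to 1, which this README does not state explicitly; the `OpportunityInputs` dataclass and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpportunityInputs:
    # Each field is assumed to be normalized to [0, 1].
    frequency: float
    recency: float
    pain_intensity: float
    task_concreteness: float
    buying_intent: float
    feasibility: float
    competition_penalty: float


# Positive weights from the formula; they sum to 1.00.
WEIGHTS = {
    "frequency": 0.25,
    "recency": 0.20,
    "pain_intensity": 0.20,
    "task_concreteness": 0.15,
    "buying_intent": 0.10,
    "feasibility": 0.10,
}
COMPETITION_WEIGHT = 0.10


def opportunity_score(x: OpportunityInputs) -> float:
    """Weighted sum of positive signals minus the competition penalty."""
    positive = sum(weight * getattr(x, name) for name, weight in WEIGHTS.items())
    return positive - COMPETITION_WEIGHT * x.competition_penalty
```

Under the normalization assumption, scores fall between -0.10 and 1.00. For example, inputs of 0.8, 0.6, 0.9, 0.7, 0.4, 0.5 with a competition penalty of 0.3 give 0.20 + 0.12 + 0.18 + 0.105 + 0.04 + 0.05 - 0.03 = 0.665.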
TaskSignal is designed for public-data research, product discovery, and learning. It does not store raw usernames by default, preserves source URLs for attribution, respects API boundaries, and should not be used for spam or harassment workflows.
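Storing an author hash instead of a raw username can be done with a keyed hash. The sketch below is one possible approach, not the project's confirmed scheme: the `author_hash` function, the per-install salt, and the lowercase normalization of usernames are all assumptions.

```python
import hashlib
import hmac


def author_hash(source: str, username: str, salt: bytes) -> str:
    """Map (source, username) to a stable pseudonymous ID.

    The salt is assumed to be a per-install secret, so hashes cannot be
    reversed by hashing a list of known usernames without it.
    """
    message = f"{source}:{username.strip().lower()}".encode("utf-8")
    return hmac.new(salt, message, hashlib.sha256).hexdigest()
```

The same author on the same source always gets the same ID, so frequency counts still work, while the stored value carries no readable username.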
Before enabling live connectors, review Data ethics, configure API credentials through environment variables or GitHub repository secrets, and avoid committing .env files or exported datasets.
Developers need clearer GitHub Actions failure diagnosis
Problem: teams spend repetitive time reading noisy CI logs, searching YAML errors, and guessing root causes.
Suggested MVP: a CI log summarizer and workflow linter that identifies likely YAML mistakes, dependency failures, and next fixes.
# Build Developers need clearer GitHub Actions failure diagnosis
You are a senior full-stack engineer. Build a working MVP...

This repository demonstrates full-stack engineering, API design, Python backend development, TypeScript frontend development, PostgreSQL/pgvector modeling, ML/NLP pipelines, clustering, product scoring, privacy-conscious design, Docker, CI/CD, tests, and technical writing.
- Publish and maintain tagged releases with changelog entries.
- Expand contributor-friendly fixtures, docs, and public issues.
- Add richer source scheduling and rate-limit state after privacy review.
- Add pgvector ANN search in production mode.
- Add reviewer workflow for human labels.
See Roadmap for maintainer tasks, security milestones, and longer-term ideas.
