A sophisticated qualitative analysis framework for conducting in-depth, systematic analysis of website content, structure, and user experience elements.
Website-Q-A is a comprehensive qualitative analysis platform designed to extract, process, and analyze website characteristics across multiple dimensions. The system enables researchers, UX specialists, and analysts to perform detailed qualitative assessments of websites through both automated processing and agent-based analysis workflows.
- 🕷️ BFS Crawler: Traverses entire websites or specific paths with depth control.
- 📊 Multi-Dimensional Analysis: Evaluates HTTP status, redirects, performance, and layout integrity (Desktop & Mobile).
- 🤖 AI-Powered QA: Uses Gemini Pro to perform intelligent form testing, content verification, and root cause diagnosis.
- 🚀 Production-Ready API: Enqueue long-running QA tasks and fetch reports asynchronously.
- 💾 Scalable Storage: Job results are stored on disk to optimize memory usage.
- ⚙️ Centralized Config: Manage everything from `config.yaml` or environment variables.
```
git clone https://github.com/AliHaSSan-13/Website-Q-A.git
cd Website-Q-A
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
playwright install
```

Create a `.env` file in the root directory:

```
GEMINI_API_KEY=your_gemini_api_key_here
```

Run a quick scan directly from your terminal:

```
python qa_tool.py --url https://example.com --max-pages 5 --max-depth 2 --headless
```

Arguments:

- `--url`: Starting URL (required).
- `--max-pages`: Maximum number of internal pages to crawl (default: unlimited).
- `--max-depth`: Maximum crawl depth (default: unlimited).
- `--headless` / `--no-headless`: Run the browser with or without a UI.
- `--interactive`: Manually provide form inputs during the crawl.
Start the API server:
```
python app.py
```

By default, the server runs on http://localhost:8000.
Submit a website for analysis. This is an asynchronous operation.
Endpoint: POST /api/run-qa
Example Request:
```
curl -X POST http://localhost:8000/api/run-qa \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://practice.qabrains.com/registration",
    "max_pages": 1,
    "max_depth": 0,
    "run_ai": true,
    "timeout_seconds": 300
  }'
```

Parameters:

- `url` (string, required): Target URL.
- `max_pages` (int, optional): Limit the number of internal pages crawled.
- `max_depth` (int, optional): Limit the crawl depth.
- `run_ai` (bool, optional): Enable AI analysis (default: true).
- `timeout_seconds` (int, optional): Maximum time for the job (0 for no timeout).
Response:
Returns a `job_id` used to track progress.
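From Python, submitting a job might look like the following sketch. It only uses the standard library and mirrors the curl example above; the base URL and the `job_id` key in the response follow this README, but the exact response shape is otherwise an assumption:

```python
import json
from urllib import request

def build_payload(url, max_pages=None, max_depth=None, run_ai=True, timeout_seconds=300):
    """Build the JSON body for POST /api/run-qa, omitting unset optional fields."""
    body = {"url": url, "run_ai": run_ai, "timeout_seconds": timeout_seconds}
    if max_pages is not None:
        body["max_pages"] = max_pages
    if max_depth is not None:
        body["max_depth"] = max_depth
    return body

def submit_job(base_url, **kwargs):
    """POST the payload and return the job_id from the response."""
    data = json.dumps(build_payload(**kwargs)).encode()
    req = request.Request(f"{base_url}/api/run-qa", data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["job_id"]

# Example (requires the API server to be running locally):
# job_id = submit_job("http://localhost:8000",
#                     url="https://practice.qabrains.com/registration",
#                     max_pages=1, max_depth=0)
```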
Poll this endpoint to see if the job is finished.
Endpoint: GET /api/status/<job_id>
Example Request:
```
curl http://localhost:8000/api/status/YOUR_JOB_ID_HERE
```

Response:

- `status: "pending"`: The job is in the queue.
- `status: "running"`: Crawling and analyzing.
- `status: "success"`: Data is ready (the response includes a `result` object).
- `status: "failed"`: Error details are included in the `error` field.
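A simple polling loop over this endpoint might look like the sketch below. The terminal states match the list above; the poll interval is an arbitrary choice, and the `fetch` parameter is only an illustrative hook for testing:

```python
import json
import time
from urllib import request

def poll_job(base_url, job_id, interval=5.0, fetch=None):
    """Poll GET /api/status/<job_id> until the job reaches a terminal state.

    `fetch` can be overridden (e.g. for tests); by default it issues a real HTTP GET.
    """
    if fetch is None:
        def fetch(url):
            with request.urlopen(url) as resp:
                return json.load(resp)
    while True:
        report = fetch(f"{base_url}/api/status/{job_id}")
        if report["status"] in ("success", "failed"):
            return report
        time.sleep(interval)

# Example (server must be running and YOUR_JOB_ID_HERE replaced):
# report = poll_job("http://localhost:8000", "YOUR_JOB_ID_HERE")
# if report["status"] == "success":
#     print(report["result"])
```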
Most settings are managed in `config.yaml`:
```yaml
gemini:
  model: "gemini-2.5-flash"
  max_steps: 15

browser:
  headless: true
  nav_timeout_ms: 30000
  viewports:
    desktop: { width: 1920, height: 1080 }
    mobile: { width: 375, height: 812 }

api:
  port: 8000
  results_dir: "job_results"  # Folder where JSON reports are saved
```

Railway is the recommended platform for this tool because it supports long-running background threads and persistent volumes.
- Link your GitHub repository to Railway.app.
- Railway will automatically detect the `Procfile` and `nixpacks.toml`.
- Add your `GEMINI_API_KEY` to the Variables tab in Railway.
By default, Railway's file system is temporary. To keep your results and job database after a restart:
- Go to your Railway project.
- Click + New -> Volume.
- Mount the volume to `/app/data`.
- Update `config.yaml` or set the environment variable `API__DB_PATH` to `/app/data/jobs.db`.
- (Optional) Create another volume for `/app/crawl_results` if you use the CLI.
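The `API__DB_PATH` variable above suggests a double-underscore convention for mapping environment variables onto nested config keys. A minimal sketch of such an overlay is shown below; this is an assumption about how overrides could be resolved, not the project's actual loader:

```python
import os

def apply_env_overrides(config, env=None):
    """Overlay environment variables onto a nested config dict.

    A variable like API__DB_PATH=/app/data/jobs.db sets config["api"]["db_path"].
    Illustrative sketch only -- not the project's real configuration code.
    """
    env = os.environ if env is None else env
    for name, value in env.items():
        parts = [p.lower() for p in name.split("__")]
        # Ignore variables without the double-underscore nesting marker
        # (or with empty path segments, e.g. names starting with "__").
        if len(parts) < 2 or not all(parts):
            continue
        node = config
        for key in parts[:-1]:
            node = node.setdefault(key, {})
        node[parts[-1]] = value
    return config
```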
The included `nixpacks.toml` ensures that Playwright and Chromium dependencies are correctly installed during the build process.
- Database Persistence: The API uses a SQLite database (`jobs.db`) to store job metadata and results, ensuring your data survives server restarts.
- Memory Management: Results are loaded from the database only when requested via the status API, keeping the server's idle memory usage low.
- Async Execution: Each QA job runs in a dedicated background thread to keep the API responsive.