CrawlerX is a ready-to-run Laravel 12 crawler service. It exposes one generic provider-based API, validates common crawl input, queues a crawl job, and lets provider crawler classes parse responses behind the shared crawling contract.
POST /api/v1/{provider}
Example:
curl -X POST http://127.0.0.1:8000/api/v1/onejav \
-H "Accept: application/json" \
-d "url=https://onejav.com/new" \
-d "callback_url=https://client-app.test/webhooks/crawlerx"
Queued response:
{
"success": true,
"message": "Crawl job queued.",
"data": {
"provider": "onejav",
"url": "https://onejav.com/new",
"status": "queued"
}
}
The job posts the completed or failed crawl result to callback_url. Crawl requests and results are not stored in a database, so callback_url is required: the callback is the only way to receive a result. Unsupported providers return a clean JSON error and do not dispatch a job. The provider is always read from the route path, not the request body.
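For a client written in Python, the same request as the curl example could be built roughly as follows. This is a minimal sketch using only the standard library; the helper name `build_crawl_request` is hypothetical, and the client-side check on `callback_url` merely mirrors the server-side requirement described above.

```python
import urllib.parse
import urllib.request


def build_crawl_request(base_url, provider, url, callback_url):
    """Build (but do not send) a form-encoded POST to /api/v1/{provider}."""
    # Fail early on the client: the service requires callback_url.
    if not callback_url:
        raise ValueError("callback_url is required")
    # The provider goes in the route path, never in the body.
    endpoint = "{}/api/v1/{}".format(
        base_url.rstrip("/"), urllib.parse.quote(provider, safe="")
    )
    body = urllib.parse.urlencode(
        {"url": url, "callback_url": callback_url}
    ).encode("ascii")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Accept": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen()` sends the request; the crawl result itself arrives later at `callback_url`, not in this response.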
POST /api/v1/{provider} -> CrawlerController -> CrawlRequest -> CrawlJob -> CrawlerService -> CrawlingResolver -> provider crawler -> AbstractBaseCrawling -> jooservices/client -> provider parse() -> CrawlingResultDto -> callback delivery.
The controller does not crawl or parse. The queued job performs the crawl and sends the callback. Provider classes own endpoint, options, site code, and parsing only.
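The split of responsibilities above can be illustrated with a small, language-neutral sketch in Python. It is not the service's PHP code: every name here (`PROVIDERS`, `handle_request`, `run_job`, `parse_example`) is hypothetical, and the error and callback payload shapes are assumptions, since this document only specifies the queued response.

```python
from typing import Callable, Dict, List, Tuple

Job = Tuple[str, str, str]  # (provider, url, callback_url)


def parse_example(html: str) -> List[str]:
    """Stand-in for a provider parse(): one item per non-empty line."""
    return [line.strip() for line in html.splitlines() if line.strip()]


# Stand-in for the provider registry (assumed shape).
PROVIDERS: Dict[str, Callable[[str], list]] = {"example": parse_example}


def handle_request(provider: str, url: str, callback_url: str, queue: List[Job]) -> dict:
    """Controller role: resolve the provider and queue a job; never crawl here."""
    if provider not in PROVIDERS:
        # Unsupported provider: return an error and dispatch nothing.
        return {"success": False, "message": "Unsupported provider: " + provider}
    queue.append((provider, url, callback_url))
    return {
        "success": True,
        "message": "Crawl job queued.",
        "data": {"provider": provider, "url": url, "status": "queued"},
    }


def run_job(job: Job, fetch: Callable[[str], str], deliver: Callable[[str, dict], None]) -> None:
    """Job role: fetch, parse, then post a completed or failed result."""
    provider, url, callback_url = job
    result = {"provider": provider, "url": url}
    try:
        result.update(status="completed", items=PROVIDERS[provider](fetch(url)))
    except Exception as exc:  # any crawl or parse failure becomes a failed result
        result.update(status="failed", error=str(exc))
    deliver(callback_url, result)
```

Injecting `fetch` and `deliver` keeps the sketch free of network calls, which is also what makes the job easy to exercise with fakes.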
PHP target version: PHP 8.5.
composer install
cp .env.example .env
php artisan key:generate
php artisan migrate
php artisan serve
Run the database queue worker:
php artisan queue:work --queue=crawlerx
Horizon is installed for queue monitoring. Horizon requires Redis-backed queues, so keep the default database queue for simple local development unless Redis is configured.
To use Horizon:
QUEUE_CONNECTION=redis
CRAWLERX_QUEUE=crawlerx
Then run:
php artisan horizon

composer lint
composer test
Never commit failing lint or tests.
Work on the current branch unless asked otherwise. Before committing:
git status
git branch --show-current
git config user.name "Viet Vu"
git config user.email "jooservices@gmail.com"
git config user.name
git config user.email
composer update
composer lint
composer test
Use short, meaningful commit messages and group commits by feature area. If composer update changes composer.lock, commit it with the relevant change. Completed work must be committed locally after successful checks; do not leave finished work in git status.
Detailed workflow docs:
- docs/01-development/04-git-workflow.md
- docs/01-development/05-dependency-policy.md
- .github/skills/git-workflow/SKILL.md
- .github/skills/dependency-and-package-policy/SKILL.md
- Create a provider crawler under app/Services/Crawling/Sites/.
- Extend AbstractBaseCrawling.
- Implement endpoint, options, site code, and parse().
- Register the provider in config/crawlerx.php.
- Add mocked client tests with fixture HTML.
- Update docs when the public API or contracts change.
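The idea behind testing a parser against fixture HTML, with no network access, can be shown with a short sketch. It is written in Python purely as an illustration; a real provider would be a PHP class with PHPUnit tests, and the fixture markup, the `title` class, and the `parse` function below are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical fixture: in a real test this would be a saved HTML file.
FIXTURE_HTML = """
<div class="card"><a class="title" href="/item/1">First</a></div>
<div class="card"><a class="title" href="/item/2">Second</a></div>
"""


class TitleLinkParser(HTMLParser):
    """Collect the text and href of every <a class="title"> element."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._href = None  # set while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("class") == "title":
            self._href = attributes.get("href")

    def handle_data(self, data):
        if self._href is not None and data.strip():
            self.items.append({"title": data.strip(), "href": self._href})
            self._href = None

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None


def parse(html):
    """Parse a page into a list of items; operates on a string only."""
    parser = TitleLinkParser()
    parser.feed(html)
    parser.close()
    return parser.items
```

Because `parse` takes a string and returns plain data, a test can feed it the fixture directly and compare the result, without mocking anything beyond the HTTP client that would normally supply the HTML.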
- No Laravel Modules.
- No persistence for crawl results unless explicitly requested.
- No repositories for crawl results unless explicitly requested.
- No result/status/history/retry endpoints.
- No provider-specific controllers.
- No page-specific response contracts.
- No primary item concept.
- No relation concept.