pxq is a command-line tool for managing job queues with support for local execution and RunPod GPU instances. It provides a simple interface for submitting, monitoring, and managing jobs with automatic pod lifecycle management.
- Job Queue Management: Submit, list, and monitor jobs with a simple CLI
- Local Execution: Run jobs locally for development and testing
- RunPod Integration: Execute jobs on GPU instances with automatic provisioning
- Managed Mode: Automatic pod stopping after job completion for cost savings
- Web Dashboard: Real-time job monitoring with auto-refresh
- SSH Access: Connect to running pods for debugging
# Install via uv
uv tool install pxq
# Or install via pip
pip install pxq
# Verify installation
pxq --help# Install directly from GitHub (no clone required)
uv tool install git+https://github.com/takeru1205/pxq.git
# Install a specific version/tag
uv tool install git+https://github.com/takeru1205/pxq.git@v0.1.0
# Verify installation
pxq --help# Clone the repository
git clone https://github.com/takeru1205/pxq.git
cd pxq
# Install in editable mode
uv tool install -e .
# Verify installation
pxq --helpAfter installation, you can use pxq directly:
pxq server
pxq add "python hello.py"
pxq ls
pxq status# Clone the repository
git clone https://github.com/takeru1205/pxq.git
cd pxq
# Install dependencies
uv sync
# Run using uv run (no installation needed)
uv run pxq --help-
Start the server:
pxq server
-
In another terminal, add a job:
pxq add "echo hello" -
Check job status:
pxq status
Start the pxq server. Required before using other commands.
pxq server [--port PORT] [--host HOST]Add a new job to the queue.
pxq add COMMAND [--provider local|runpod] [--gpu TYPE] [--managed]Examples:
# Local job
pxq add "python script.py"
# RunPod GPU job
pxq add "python train.py" --provider runpod --gpu "RTX4090:1"
# Managed RunPod job (auto-stop after completion)
pxq add "python train.py" --provider runpod --gpu "RTX4090:1" --managedFor complete reference, see docs/cli-reference.md.
List jobs in the queue.
pxq ls [--all]Check job status.
pxq status [JOB_ID] [--all]SSH into a running job's pod.
pxq ssh JOB_IDCancel a queued, provisioning, or uploading job.
pxq cancel JOB_IDExamples:
# Cancel a queued job
pxq cancel 5
# Cancel a provisioning job
pxq cancel 28Note: Running jobs cannot be cancelled. Use pxq stop to stop a running job.
Stop a running job. If JOB_ID is provided, stops that specific job. Otherwise, stops the single running job (exactly one must be RUNNING).
Examples:
# Stop the single running job
pxq stop
# Stop a specific job by ID
pxq stop 42Note: If no jobs are running or multiple jobs are running, an error is returned.
All configuration uses the PXQ_ prefix:
| Variable | Description | Default |
|---|---|---|
PXQ_RUNPOD_API_KEY |
RunPod API key | None |
PXQ_MAX_PARALLELISM |
Maximum parallel jobs | 4 |
PXQ_LOG_MAX_SIZE_MB |
Log rotation limit per job | 100 |
PXQ_PROVISIONING_TIMEOUT_MINUTES |
Pod provisioning timeout | 15 |
PXQ_SERVER_HOST |
Server host | 127.0.0.1 |
PXQ_SERVER_PORT |
Server port | 8765 |
PXQ_DB_PATH |
Database path | ~/.pxq/pxq.db |
Use a YAML config file with --config:
# config.yaml
provider: runpod
gpu_type: RTX4090:1
managed: true
volume: vol-abc123
env:
API_KEY: "{{ RUNPOD_SECRET_API_KEY }}"
DATABASE_URL: "{{ RUNPOD_SECRET_DATABASE_URL }}"pxq passes env values with {{ RUNPOD_SECRET_* }} placeholders directly to RunPod. RunPod expands these server-side at pod startup. Templates are NOT required for secrets.
Access the web dashboard at http://127.0.0.1:8765/ for real-time job monitoring.
Features:
- Job list with status
- Job detail view with logs
- Auto-refresh every 10 seconds
QUEUED -> PROVISIONING -> UPLOADING -> RUNNING -> SUCCEEDED -> STOPPING -> SUCCEEDED (success)
\-> FAILED -> STOPPING -> FAILED (failure)
Managed jobs automatically delete the pod after completion. Successful jobs end at SUCCEEDED, failed jobs at FAILED.
RUNNING (any job) -> STOPPING -> STOPPED
Manual stop via pxq stop always results in STOPPED status with pod deletion.
QUEUED -> PROVISIONING -> UPLOADING -> RUNNING (remains until pxq stop)
Non-managed jobs stay in RUNNING status after command completion, awaiting explicit pxq stop.
QUEUED -> PROVISIONING -> UPLOADING -> RUNNING -> SUCCEEDED/FAILED/STOPPED
\-> CANCELLED (terminal)
# Run tests
uv run pytest
# Type check
uv run pyright src/pxq
# Format code
uv run ruff format .MIT
Symptom: GET /jobs/{id} or GET /api/jobs/{id} returns 500 Internal Server Error.
Root cause: Fixed in recent update. The issue was in src/pxq/storage.py where sqlite3.Row.get() was used, but sqlite3.Row does not have a .get() method.
Fix: Changed to bracket notation with key check:
# Before (broken):
content_value = row.get("content") if "content" in row.keys() else None
# After (fixed):
content_value = row["content"] if "content" in row.keys() else NoneVerification:
curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:8765/api/jobs/{id}
# Expected: 200 (or 404 if job not found)# Check if server is running
uv run python -m pxq server status
# Verify listener on port 8765
lsof -nP -iTCP:8765 -sTCP:LISTENSymptom: Job list shows/hides intermittently when polling.
Diagnosis:
- Check server singleton:
uv run python -m pxq server status - Verify only one listener on port 8765
- Check server logs:
cat ~/.pxq/server.log | tail -50
Fixed: Regression tests added in tests/dashboard/test_polling_stability.py.