A PHP SDK for the RunPod Serverless API built with Saloon v4.
- Requirements
- Installation
- Quick Start
- Job Lifecycle
- API Reference
- RunPodResponse
- Advanced Configuration
- Error Handling
- Rate Limits
- Result Retention
- Laravel Integration
- Support & Security
- License
- PHP 8.2 or higher
- Composer
Saloon v4 is installed automatically as a dependency.
composer require marceloeatworld/runpod-serverless-php

use MarceloEatWorld\RunPod\RunPod;
// 1. Create the client with your API key
$runpod = new RunPod('your-api-key');
// 2. Target a specific endpoint
$endpoint = $runpod->endpoint('your-endpoint-id');
// 3. Submit a job
$result = $endpoint->run(['prompt' => 'A beautiful landscape']);
echo "Job submitted: " . $result->id; // e.g. "cb68890e-436f-4234-..."
echo "Status: " . $result->status; // "IN_QUEUE"

Your API key is available at runpod.io/console/user/settings. Your endpoint ID is the alphanumeric string visible in your endpoint's URL on the RunPod dashboard.
Every RunPod job goes through a state machine:
                              +---> COMPLETED
                              |
    IN_QUEUE ---> IN_PROGRESS +---> FAILED
       |               |      |
       |               |      +---> TIMED_OUT (executionTimeout exceeded)
       |               |
       +---> TIMED_OUT (TTL expired before pickup)
       |
       +---> CANCELLED (manual cancel)
| Status | Description |
|---|---|
| `IN_QUEUE` | Waiting for an available worker |
| `IN_PROGRESS` | Actively being processed |
| `COMPLETED` | Finished successfully, output available |
| `FAILED` | Worker returned an error |
| `CANCELLED` | Manually stopped via `cancel()` |
| `TIMED_OUT` | Expired (TTL in queue or executionTimeout during processing) |
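The status table can be turned into a pair of small predicates for code that only sees raw status strings (for example, a decoded webhook payload). This is a minimal sketch: the constant and function names are illustrative and are not part of the SDK, which already exposes methods such as isCompleted() on its response object.

```php
<?php
declare(strict_types=1);

// Status strings copied from the table; the names below are hypothetical helpers.
const TERMINAL_STATUSES = ['COMPLETED', 'FAILED', 'CANCELLED', 'TIMED_OUT'];
const PENDING_STATUSES  = ['IN_QUEUE', 'IN_PROGRESS'];

/** True once a job can no longer change state on its own. */
function isTerminalStatus(?string $status): bool
{
    return in_array($status, TERMINAL_STATUSES, true);
}

/** True for the states this README describes as eligible for retry(). */
function isRetryableStatus(?string $status): bool
{
    return in_array($status, ['FAILED', 'TIMED_OUT'], true);
}
```

A null status (the response exposes it as a nullable string) is treated as neither terminal nor retryable.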
Submits a job and returns immediately. You then poll status() or use a webhook.
$result = $endpoint->run(['prompt' => 'A futuristic city']);
echo $result->id; // "cb68890e-436f-4234-..."
echo $result->status; // "IN_QUEUE"

Payload limit: 10 MB.
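A payload can be measured before it is submitted so that oversized inputs are caught locally. The sketch below rests on two assumptions that the README does not confirm: that "MB" means 1024 * 1024 bytes, and that the limit applies to the JSON-encoded body with the input wrapped under an `input` key. The helper names are hypothetical.

```php
<?php
declare(strict_types=1);

// Assumed byte limits derived from the documented 10 MB (run) and 20 MB (runSync).
const RUN_LIMIT_BYTES     = 10 * 1024 * 1024;
const RUNSYNC_LIMIT_BYTES = 20 * 1024 * 1024;

/** Approximate size of the request body for a given input array. */
function payloadSizeBytes(array $input): int
{
    // Assumption: the body looks like {"input": ...}; adjust if the SDK differs.
    return strlen(json_encode(['input' => $input], JSON_THROW_ON_ERROR));
}

/** Whether the encoded input stays within the given byte limit. */
function fitsPayloadLimit(array $input, int $limitBytes): bool
{
    return payloadSizeBytes($input) <= $limitBytes;
}

// Usage idea: if (!fitsPayloadLimit($input, RUN_LIMIT_BYTES)) { /* pass data via S3 instead */ }
```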
Submits a job and waits for completion. Best for fast tasks (< 90 seconds).
$result = $endpoint->runSync(['prompt' => 'Hello world']);
if ($result->isCompleted()) {
$output = $result->getOutput();
}

If the job takes longer than ~90 seconds, `runSync` returns with status `IN_PROGRESS`. You must then fall back to polling `status()`.
Payload limit: 20 MB.
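The fallback from runSync to polling can be bounded so that a stuck job does not block the caller forever. The helper below is a sketch, not part of the SDK: `pollUntilTerminal` and its parameters are hypothetical, and it takes plain callables so it can be exercised without network access.

```php
<?php
declare(strict_types=1);

/**
 * Call $fetchStatus until it reports something other than IN_QUEUE / IN_PROGRESS,
 * or until $maxAttempts status fetches have been made.
 *
 * @param callable(): ?string      $fetchStatus returns the current status string
 * @param callable(int): mixed|null $wait        receives the delay in seconds (defaults to sleep)
 */
function pollUntilTerminal(
    callable $fetchStatus,
    int $maxAttempts = 30,
    int $delaySeconds = 2,
    ?callable $wait = null
): ?string {
    $wait ??= static fn (int $seconds) => sleep($seconds);
    $pending = ['IN_QUEUE', 'IN_PROGRESS'];

    $status = $fetchStatus();
    for ($attempt = 1; $attempt < $maxAttempts && in_array($status, $pending, true); $attempt++) {
        $wait($delaySeconds);
        $status = $fetchStatus();
    }

    // Still pending here means the attempt budget ran out, not that the job finished.
    return $status;
}

// With the SDK, the callback could wrap the status() call shown in this README:
// $final = pollUntilTerminal(fn () => $endpoint->status($result->id)->status);
```

If the returned status is still IN_QUEUE or IN_PROGRESS, the caller decides whether to keep waiting, cancel the job, or hand off to a webhook.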
$result = $endpoint->run(['prompt' => 'Generate something']);
// Poll until terminal state
while ($result->isInQueue() || $result->isInProgress()) {
sleep(2); // Wait 2 seconds between polls
$result = $endpoint->status($result->id);
}
// Handle terminal states
if ($result->isCompleted()) {
$output = $result->getOutput();
echo "Done! Worker: " . $result->getWorkerId();
echo "Execution time: " . $result->getExecutionTime() . " ms";
echo "Queue delay: " . $result->getDelayTime() . " ms";
} elseif ($result->isFailed()) {
echo "Error: " . $result->getError();
} elseif ($result->isTimedOut()) {
echo "Timed out, retrying...";
$result = $endpoint->retry($result->id);
} elseif ($result->isCancelled()) {
echo "Job was cancelled";
}

Retrieve the current state and results of a job.
$status = $endpoint->status('cb68890e-436f-4234-...');
echo $status->status; // "COMPLETED"
print_r($status->getOutput()); // The worker's output (may be an array, so print_r instead of echo)
echo $status->getExecutionTime(); // 2297 (ms)
echo $status->getDelayTime(); // 2188 (ms)
echo $status->getWorkerId(); // "smjcwth8e5sqvv"

Retrieve incremental results from a streaming job. The worker must support streaming.
$result = $endpoint->run(['prompt' => 'Write a story']);
// Wait a bit for the worker to start producing chunks
sleep(5);
$stream = $endpoint->stream($result->id);
$chunks = $stream->getStream(); // Array of stream chunks
foreach ($chunks as $chunk) {
echo $chunk['output'];
}

Streaming in RunPod is poll-based, not chunked transfer encoding. Each chunk is limited to 1 MB.
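Because streaming is poll-based, the caller usually assembles the chunks itself. The sketch below joins the chunk array returned by getStream() into one string. It assumes, as the loop above does, that each chunk is an array with an `output` key. `collectStreamOutput` is a hypothetical helper, and non-string outputs are JSON-encoded as a fallback.

```php
<?php
declare(strict_types=1);

/**
 * Concatenate the 'output' field of each stream chunk.
 *
 * @param array<int, array<string, mixed>> $chunks e.g. the result of getStream()
 */
function collectStreamOutput(array $chunks): string
{
    $buffer = '';
    foreach ($chunks as $chunk) {
        $piece = $chunk['output'] ?? '';
        // Workers may emit structured data; encode it rather than fail on concatenation.
        $buffer .= is_string($piece) ? $piece : json_encode($piece);
    }

    return $buffer;
}

// Usage idea: echo collectStreamOutput($endpoint->stream($result->id)->getStream() ?? []);
```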
Cancel a queued or running job.
$result = $endpoint->run(['prompt' => 'Something expensive']);
// Changed my mind
$cancelled = $endpoint->cancel($result->id);

Requeue a failed or timed-out job. RunPod re-uses the same job ID and original input.
$status = $endpoint->status($jobId);
if ($status->isFailed() || $status->isTimedOut()) {
$retry = $endpoint->retry($jobId);
echo "Retrying: " . $retry->id; // Same job ID
echo "Status: " . $retry->status; // "IN_QUEUE"
}

Get worker pool and job pipeline statistics.
$health = $endpoint->health();
// The raw data contains:
// {
// "jobs": { "completed": 367, "failed": 6, "inProgress": 0, "inQueue": 0, "retried": 0 },
// "workers": { "idle": 1, "initializing": 0, "ready": 1, "running": 0, "throttled": 0, "unhealthy": 0 }
// }
$data = $health->data;
echo "Workers ready: " . $data['workers']['ready'];
echo "Jobs in queue: " . $data['jobs']['inQueue'];
echo "Jobs failed: " . $data['jobs']['failed'];

Remove all pending jobs from the queue. Running jobs are not affected.
$endpoint->purgeQueue();

Use with caution. This is irreversible and has a strict rate limit (2 calls per 10 seconds).
Every method returns a RunPodResponse object wrapping the raw API JSON response.
$response->isCompleted(); // COMPLETED
$response->isInQueue(); // IN_QUEUE
$response->isInProgress(); // IN_PROGRESS
$response->isFailed(); // FAILED
$response->isCancelled(); // CANCELLED
$response->isTimedOut(); // TIMED_OUT

| Property / Method | Return Type | Description |
|---|---|---|
| `$response->id` | `?string` | Unique job identifier |
| `$response->status` | `?string` | Current job status |
| `$response->data` | `array` | Complete raw API response |
| `->getOutput()` | `mixed` | Worker's output (when COMPLETED) |
| `->getError()` | `mixed` | Error details (when FAILED) |
| `->getMetrics()` | `?array` | Execution metrics |
| `->getExecutionTime()` | `?int` | Active processing time in ms |
| `->getDelayTime()` | `?int` | Time spent waiting in queue in ms |
| `->getWorkerId()` | `?string` | ID of the worker that processed the job |
| `->getStream()` | `?array` | Array of stream chunks (from `stream()`) |
RunPodResponse implements JsonSerializable, so you can pass it directly to json_encode() or return it from a Laravel controller:
// Plain PHP
echo json_encode($response);
// Laravel
return response()->json($response);

The fluent methods withWebhook(), withPolicy(), and withS3Config() configure options on the endpoint resource. They are chainable and apply to the run() or runSync() calls that follow.

Note: These options are sticky on the resource instance. If you call `withWebhook()` once, subsequent `run()` calls on the same instance will continue sending that webhook. Create a new endpoint instance if you need different config.
Instead of polling status(), you can provide a webhook URL. RunPod will POST the complete response JSON to your URL when the job finishes.
$result = $endpoint
->withWebhook('https://your-site.com/api/runpod/callback')
->run(['prompt' => 'Your prompt']);
// No need to poll - RunPod will call your webhook
echo "Job submitted: " . $result->id;

Webhook behavior:
- RunPod POSTs the full response JSON on completion
- Your endpoint must return HTTP 200
- On failure, RunPod retries 2 more times with a 10 second delay between retries
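On the receiving side, the behavior listed above suggests a handler that answers HTTP 200 quickly and tolerates duplicate deliveries, since a retry may arrive after a slow or failed response. The following plain-PHP sketch assumes the POSTed JSON carries `id` and `status` fields like the response object described in this README. `handleRunPodWebhook` is a hypothetical name.

```php
<?php
declare(strict_types=1);

/**
 * Decode a webhook body and decide which HTTP status to answer with.
 *
 * @return array{0: int, 1: ?array} [HTTP status code, decoded payload or null]
 */
function handleRunPodWebhook(string $rawBody): array
{
    $payload = json_decode($rawBody, true);

    // Reject bodies that are not JSON objects or lack the assumed fields.
    if (!is_array($payload) || !isset($payload['id'], $payload['status'])) {
        return [400, null];
    }

    // Store or dispatch the result here, keyed by $payload['id'], so that a
    // repeated delivery of the same job overwrites instead of duplicating.
    return [200, $payload];
}

// Possible entry point in a standalone script (kept as a comment on purpose):
// [$code, $payload] = handleRunPodWebhook(file_get_contents('php://input'));
// http_response_code($code);
```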
Control job timeout and priority behavior.
$result = $endpoint
->withPolicy([
'executionTimeout' => 900000, // 15 min - max active runtime (ms)
'lowPriority' => false, // true = won't trigger worker autoscaling
'ttl' => 3600000, // 1 hour - total job lifespan from submission (ms)
])
->run(['prompt' => 'Your prompt']);

| Parameter | Default | Range | Description |
|---|---|---|---|
| `executionTimeout` | 600,000 ms (10 min) | 5 s - 7 days | Max time a job can actively run on a worker |
| `ttl` | 86,400,000 ms (24 h) | 10 s - 7 days | Total lifespan from submission (includes queue wait) |
| `lowPriority` | `false` | - | If true, the job won't trigger autoscaling of new workers |
`executionTimeout` vs `ttl`: TTL counts from when the job is submitted (including queue time). executionTimeout counts from when a worker starts processing the job. If TTL expires while a job is running, it's immediately removed.
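The interaction between the two timers can be made concrete with a little arithmetic. The sketch below builds a policy array for withPolicy() after checking the documented ranges, then estimates how long a job can actually run once queue time is subtracted from the TTL. `buildPolicy` and `maxRunTimeMs` are hypothetical helpers. The ranges are taken from the table in this section and the SDK itself may not validate them.

```php
<?php
declare(strict_types=1);

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000; // 604,800,000 ms

/** Build a policy array, rejecting values outside the documented ranges. */
function buildPolicy(int $executionTimeoutMs = 600000, int $ttlMs = 86400000, bool $lowPriority = false): array
{
    if ($executionTimeoutMs < 5000 || $executionTimeoutMs > SEVEN_DAYS_MS) {
        throw new \InvalidArgumentException('executionTimeout must be between 5 s and 7 days');
    }
    if ($ttlMs < 10000 || $ttlMs > SEVEN_DAYS_MS) {
        throw new \InvalidArgumentException('ttl must be between 10 s and 7 days');
    }

    return [
        'executionTimeout' => $executionTimeoutMs,
        'lowPriority'      => $lowPriority,
        'ttl'              => $ttlMs,
    ];
}

/**
 * Upper bound on active runtime for a job that waited $queueWaitMs before pickup:
 * whichever limit is hit first, executionTimeout or the remaining TTL.
 */
function maxRunTimeMs(array $policy, int $queueWaitMs): int
{
    return max(0, min($policy['executionTimeout'], $policy['ttl'] - $queueWaitMs));
}
```

For example, with a 15 minute executionTimeout and a 10 minute TTL, a job that waits 5 minutes in the queue has at most 5 minutes of runtime left, so the TTL is the binding limit.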
For large payloads exceeding the 10/20 MB limits, use S3 integration to pass data via object storage.
$result = $endpoint
->withS3Config([
'accessId' => 'your-access-key-id',
'accessSecret' => 'your-secret-access-key',
'bucketName' => 'your-bucket-name',
'endpointUrl' => 'https://your-s3-endpoint.com',
])
->run(['prompt' => 'Your prompt']);

All fluent methods are chainable:
$result = $endpoint
->withWebhook('https://your-site.com/callback')
->withPolicy(['executionTimeout' => 120000, 'ttl' => 600000])
->withS3Config(['accessId' => '...', 'accessSecret' => '...', 'bucketName' => '...', 'endpointUrl' => '...'])
->run(['prompt' => 'Your prompt']);

This client uses Saloon's AlwaysThrowOnErrors trait. Any HTTP 4xx/5xx response automatically throws an exception. Connection-level errors (DNS, timeout) are also thrown.
use Saloon\Exceptions\Request\RequestException;
use Saloon\Exceptions\Request\FatalRequestException;
use Saloon\Exceptions\Request\ClientException;
use Saloon\Exceptions\Request\ServerException;
try {
$result = $endpoint->run(['prompt' => 'test']);
} catch (FatalRequestException $e) {
// Connection-level errors: DNS failure, TLS error, timeout
echo "Connection failed: " . $e->getMessage();
} catch (ClientException $e) {
// 4xx errors
$status = $e->getResponse()->status();
// Print the message; a bare match expression would discard its result
echo match ($status) {
401 => 'Invalid API key',
404 => 'Endpoint not found or job TTL expired',
429 => 'Rate limit exceeded - implement backoff',
default => 'Client error: ' . $status,
};
} catch (ServerException $e) {
// 5xx errors
echo "RunPod server error: " . $e->getResponse()->status();
} catch (RequestException $e) {
// Catch-all for any other HTTP error
echo "Request failed: " . $e->getResponse()->status();
}

Exception hierarchy:

    SaloonException
        FatalRequestException (connection errors - always thrown)
        RequestException (HTTP errors)
            ServerException (5xx)
                InternalServerErrorException (500)
                ServiceUnavailableException (503)
                GatewayTimeoutException (504)
            ClientException (4xx)
                UnauthorizedException (401)
                ForbiddenException (403)
                NotFoundException (404)
                UnprocessableEntityException (422)
                TooManyRequestsException (429)
RunPod enforces per-endpoint rate limits:
| Endpoint | Max requests per 10 s | Max concurrent |
|---|---|---|
| `/run` | 1,000 | 200 |
| `/runsync` | 2,000 | 400 |
| `/status` | 2,000 | 400 |
| `/stream` | 2,000 | 400 |
| `/cancel` | 100 | 20 |
| `/purge-queue` | 2 | - |
Exceeding these limits returns HTTP 429. Implement exponential backoff with jitter when retrying.
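One way to implement exponential backoff with jitter is sketched below. It uses the "full jitter" variant, where each delay is drawn uniformly between zero and an exponentially growing ceiling. The function names, the 500 ms base, and the 30 s cap are illustrative choices, not SDK or RunPod defaults. The retry decision is passed in as a callable, so the caller chooses which exceptions count as retryable (for instance the 429 exception from the hierarchy above).

```php
<?php
declare(strict_types=1);

/** Delay for the given 1-based attempt: random in [0, min(cap, base * 2^(attempt-1))]. */
function backoffDelayMs(int $attempt, int $baseMs = 500, int $capMs = 30000): int
{
    $ceiling = (int) min($capMs, $baseMs * (2 ** ($attempt - 1)));

    return random_int(0, $ceiling);
}

/**
 * Run $operation, retrying while $shouldRetry($exception) is true.
 *
 * @param callable(): mixed          $operation
 * @param callable(\Throwable): bool $shouldRetry
 * @param callable(int): mixed|null  $sleepMs     receives the delay in ms (defaults to usleep)
 */
function retryWithBackoff(
    callable $operation,
    callable $shouldRetry,
    int $maxAttempts = 5,
    ?callable $sleepMs = null
): mixed {
    $sleepMs ??= static fn (int $ms) => usleep($ms * 1000);

    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation();
        } catch (\Throwable $e) {
            // Give up on the last attempt or on errors the caller marks as permanent.
            if ($attempt >= $maxAttempts || !$shouldRetry($e)) {
                throw $e;
            }
            $sleepMs(backoffDelayMs($attempt));
        }
    }
}

// Usage idea, with $endpoint from this README:
// $result = retryWithBackoff(
//     fn () => $endpoint->run(['prompt' => 'test']),
//     fn (\Throwable $e) => $e instanceof TooManyRequestsException
// );
```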
RunPod automatically deletes job results after a retention period:
| Mode | Retention After Completion |
|---|---|
| Async (`run`) | 30 minutes |
| Sync (`runSync`) | 1 minute (5 minutes max) |
Fetch your results within these windows, or use webhooks to receive results immediately.
Add to config/services.php:
'runpod' => [
'api_key' => env('RUNPOD_API_KEY'),
],

Add to your .env:
RUNPOD_API_KEY=your-api-key-here
Register as a singleton in AppServiceProvider (or a dedicated provider):
use MarceloEatWorld\RunPod\RunPod;
public function register(): void
{
$this->app->singleton(RunPod::class, function () {
return new RunPod(config('services.runpod.api_key'));
});
}

use MarceloEatWorld\RunPod\RunPod;
use Illuminate\Http\Request;
class AIController extends Controller
{
public function generate(RunPod $runpod, Request $request)
{
$endpoint = $runpod->endpoint('your-endpoint-id');
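// validated() only exists on FormRequest; validate() works on a plain Request.
// The rule below is an example, adapt it to your worker's input.
$input = $request->validate(['prompt' => 'required|string']);
$result = $endpoint->run($input);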
return response()->json([
'job_id' => $result->id,
'status' => $result->status,
]);
}
public function status(RunPod $runpod, string $jobId)
{
$endpoint = $runpod->endpoint('your-endpoint-id');
$status = $endpoint->status($jobId);
return response()->json($status); // Uses JsonSerializable
}
}

use Illuminate\Contracts\Queue\ShouldQueue;
use MarceloEatWorld\RunPod\RunPod;
class ProcessAITask implements ShouldQueue
{
public function __construct(
private string $endpointId,
private array $input,
) {}
public function handle(RunPod $runpod): void
{
$endpoint = $runpod->endpoint($this->endpointId);
$result = $endpoint->runSync($this->input);
if ($result->isCompleted()) {
// Store output...
}
}
}

For security issues, please email diagngo@gmail.com.
MIT License - see LICENSE