The chunk size is the most important performance lever:
| Workload | Recommended Chunk Size |
|---|---|
| Lightweight items (small DTOs) | 500 – 5000 |
| Medium items (with relations) | 100 – 500 |
| Heavy items (with file I/O) | 10 – 100 |
| Single-row tasks | 1 |
Rule of thumb: balance commit frequency (durability/restart granularity) against transaction overhead. Profile to find the sweet spot.
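As an illustration, a chunk-oriented step might be wired like the partition example later in this document. This is a hedged sketch only: `chunkSize()` and the `reader()`/`processor()`/`writer()` builder methods are assumptions, not confirmed API.

```php
// Hypothetical chunk-step wiring; check the library's actual step builder API.
$step = $stepBuilderFactory->get('importProducts')
    ->reader($reader)
    ->processor($processor)
    ->writer($writer)
    ->chunkSize(200) // medium items with relations (see table above)
    ->build();
```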
| Technique | Notes |
|---|---|
| Cursor-based readers | `PdoItemReader` uses unbuffered queries — O(1) memory |
| Streaming file readers | `CsvItemReader` reads line-by-line; never `file_get_contents()` |
| Generators in `IteratorItemReader` | Pass a generator to avoid loading all data into memory |
| Periodic `gc_collect_cycles()` | Call after every N chunks for very long-running jobs |
| Avoid object retention | Don't hold references to processed items — let GC reclaim them |
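A minimal sketch of the generator technique, assuming `IteratorItemReader` accepts any `iterable` in its constructor (verify against the actual signature):

```php
// The generator yields one row at a time, so memory stays flat
// regardless of result-set size.
$rows = (static function () use ($pdo): \Generator {
    $stmt = $pdo->query('SELECT id, sku, name FROM products');
    while (($row = $stmt->fetch(\PDO::FETCH_ASSOC)) !== false) {
        yield $row;
    }
})();

$reader = new IteratorItemReader($rows);
```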
For unbuffered queries (MySQL):

```php
$pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
```

For batch INSERTs, prefer prepared statements; for very high volumes consider `PdoBatchItemWriter`, which issues the same statement once per item with an optional update assertion.
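As a framework-independent illustration of the prepared-statement advice, prepare once and execute per item to keep statement parsing out of the hot loop (the table and column names here are made up for the example):

```php
// Prepare once, execute once per item; wrap the whole chunk in one transaction.
$stmt = $pdo->prepare(
    'INSERT INTO products (sku, name, price) VALUES (:sku, :name, :price)'
);

$pdo->beginTransaction();
foreach ($chunk as $item) {
    $stmt->execute([
        'sku'   => $item['sku'],
        'name'  => $item['name'],
        'price' => $item['price'],
    ]);
}
$pdo->commit();
```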
Each chunk is wrapped in a single transaction. Keep transactions short:

- `read()` is not inside the chunk transaction.
- Only the `write()` call is wrapped in a transaction.
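A conceptual sketch of that boundary (not the framework's actual source), assuming array-based chunks and a PDO-backed writer:

```php
/**
 * Conceptual illustration of the chunk transaction boundary only;
 * the framework's real implementation will differ.
 *
 * @param iterable<mixed>              $items
 * @param callable(mixed): mixed       $process
 * @param callable(array<mixed>): void $write
 */
function runChunk(\PDO $pdo, iterable $items, callable $process, callable $write, int $chunkSize): void
{
    $buffer = [];
    foreach ($items as $item) {          // read(): outside any transaction
        $buffer[] = $process($item);     // process(): outside any transaction
        if (\count($buffer) === $chunkSize) {
            $pdo->beginTransaction();
            $write($buffer);             // only write() runs inside the transaction
            $pdo->commit();
            $buffer = [];
        }
    }
    if ($buffer !== []) {                // flush the final partial chunk
        $pdo->beginTransaction();
        $write($buffer);
        $pdo->commit();
    }
}
```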
For CPU-bound or I/O-bound workloads, use `PartitionStep`:

```php
$partitionStep = $stepBuilderFactory->get('parallelImport')
    ->partitioner($partitioner)
    ->workerStep($workerStep)
    ->gridSize(8)
    ->build();
```

Choose the appropriate task executor (`FiberTaskExecutor`, `ProcessTaskExecutor`, `SimpleAsyncTaskExecutor`, `SyncTaskExecutor`) when configuring the partition handler (a selection sketch follows the table below).
| Executor | Best for |
|---|---|
| `FiberTaskExecutor` | I/O-bound (HTTP, DB, file) — light context switches |
| `ProcessTaskExecutor` | CPU-bound — true parallelism via `pcntl_fork` |
| `SimpleAsyncTaskExecutor` | Simple async wrapper with concurrency limit |
| Symfony Messenger | Distributed — each partition processed on a worker |
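The selection itself might look like the sketch below. Only the executor class names are taken from this document; the zero-argument constructors are an assumption.

```php
// Hypothetical constructors; consult the library's partitioning docs.
$ioBound = true; // e.g. partitions dominated by HTTP or DB calls

$executor = $ioBound
    ? new FiberTaskExecutor()    // light cooperative switching (PHP >= 8.1)
    : new ProcessTaskExecutor(); // real parallelism via pcntl_fork
```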
The metadata schema generated by `PdoJobRepositorySchema` includes the indexes required for the framework's queries. For very large execution histories you may benefit from additional indexes based on your monitoring queries (e.g. on `create_time` or `(job_instance_id, status)`).
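For example, a supplementary index can be created with plain PDO; the table name below is an assumption, so use the names actually emitted by `PdoJobRepositorySchema`:

```php
// Hypothetical table name; check the DDL generated by PdoJobRepositorySchema.
$pdo->exec('CREATE INDEX idx_exec_create_time ON batch_job_execution (create_time)');
$pdo->exec('CREATE INDEX idx_instance_status ON batch_job_execution (job_instance_id, status)');
```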
Schedule regular cleanup of old executions:

```bash
# Symfony
php bin/console batch:cleanup

# Laravel — add a custom command or use deleteJobExecution() directly
```

For long-running CLI batch jobs:

```ini
; php.ini
opcache.enable_cli=1
opcache.validate_timestamps=0
```

For jobs running longer than 30 minutes, prefer running them via a worker pool (Symfony Messenger / Laravel Queue) rather than direct CLI to avoid memory fragmentation.
Use `AsyncJobLauncher` to dispatch jobs to a queue:

```php
$env = BatchProcessing::asyncEnvironment(
    dispatcher: function (int $execId, string $jobName, JobParameters $params) use ($messageBus): void {
        $messageBus->dispatch(new RunJobMessage($execId, $jobName, $params));
    },
);
```

Workers process the actual execution, freeing the request thread.
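On the worker side, a Messenger handler consumes the message and resumes the execution. Everything below except `RunJobMessage` itself is an assumption about your application code; wire in whatever the async environment actually provides for resuming a dispatched execution.

```php
use Symfony\Component\Messenger\Attribute\AsMessageHandler;

#[AsMessageHandler]
final class RunJobMessageHandler
{
    // $jobRunner is a placeholder for the collaborator that resumes
    // the execution created by AsyncJobLauncher.
    public function __construct(private readonly object $jobRunner)
    {
    }

    public function __invoke(RunJobMessage $message): void
    {
        $this->jobRunner->run($message->execId, $message->jobName, $message->params);
    }
}
```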
- Keep listeners fast — they run synchronously inside the chunk loop.
- Aggregate metrics in the listener and emit at `afterStep`/`afterJob` rather than per item (see the sketch below).
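A sketch of the aggregation pattern, with hedged hook names: `onItemWritten()` is invented for this example, and `afterStep()` only mirrors the hook named above; adapt both to the library's actual listener interfaces.

```php
use Psr\Log\LoggerInterface;

final class ThroughputListener
{
    private int $written = 0;

    public function __construct(private readonly LoggerInterface $logger)
    {
    }

    // Per-item work stays cheap: just bump an in-memory counter, no I/O.
    public function onItemWritten(): void
    {
        $this->written++;
    }

    // Emit once per step instead of logging per item.
    public function afterStep(): void
    {
        $this->logger->info('step finished', ['itemsWritten' => $this->written]);
    }
}
```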
The repository ships with:

- PHPStan at the highest level (`phpstan analyse`)
- PHP-CS-Fixer for consistent style
- PHPUnit with high coverage requirements (`composer test`)
- Infection for mutation testing (`infection.json`)
- PHPMD for complexity checks (`phpmd.xml`)

```bash
composer stan # PHP-CS-Fixer + PHPStan
composer test # PHPUnit
```