Skip to content

Prevent ElasticSearchAliasInit jobs overlapping#1056

Merged
outdooracorn merged 1 commit intomainfrom
T416158
Feb 16, 2026
Merged

Prevent ElasticSearchAliasInit jobs overlapping#1056
outdooracorn merged 1 commit intomainfrom
T416158

Conversation

@outdooracorn
Copy link
Member

@outdooracorn outdooracorn commented Feb 13, 2026

Elasticsearch can only process cluster state changes (e.g. adding new aliases) sequentially on the master node. Running multiple instances of this job (e.g. with the EnsureElasticSearchAliases command) can overload the Elasticsearch master node causing Jobs to fail.

Prevent multiple ElasticSearchAliasInit jobs from running at the same time for each Elasticsearch cluster by using the WithoutOverlapping middleware 1

Bug: T416158

@outdooracorn outdooracorn self-assigned this Feb 13, 2026
use Illuminate\Queue\Middleware\WithoutOverlapping;
use Illuminate\Support\Facades\Log;

class ElasticSearchAliasInit extends Job {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also add $tries and $backoff class properties to this job so that any failed jobs are retried after a period of time. https://laravel.com/docs/12.x/queues#dealing-with-failed-jobs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't the worst idea. I think perhaps we could also add this with a follow-up if, when we run it, we see any failure. For now I'd leave it be

dati18
dati18 previously requested changes Feb 16, 2026
Copy link
Contributor

@dati18 dati18 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's beneficial to add a test function testMiddlewareConfiguration() in the test file.
Note: Something weird I noticed is that the test file is in /tests/Jobs/CirrusSearch/, but the job file is in /Jobs/ instead of /Jobs/CirrusSearch/, which is confusing for me when I was looking for the test file.

Elasticsearch can only process cluster state changes, such as adding new aliases, sequentially on the master node. Runing multiple instances of this job (e.g. with the `EnsureElasticSearchAliases` command) can overload the Elasticsearch master node causing Jobs to fail.

Prevent multiple `ElasticSearchAliasInit` jobs from runing at the same time by using the `WithoutOverlapping` middleware [1]

[1]: https://laravel.com/docs/10.x/queues#preventing-job-overlaps

Bug: T416158
@outdooracorn
Copy link
Member Author

outdooracorn commented Feb 16, 2026

I think it's beneficial to add a test function testMiddlewareConfiguration() in the test file.

I did think about that but decided against it as I would effectively be testing the Laravel framework, which already has a test suite.

Note: Something weird I noticed is that the test file is in /tests/Jobs/CirrusSearch/, but the job file is in /Jobs/ instead of /Jobs/CirrusSearch/, which is confusing for me when I was looking for the test file.

That would be an unrelated change to this PR. As it would be a small change, I suggest that you create a follow-up PR to address this (and put a Phab task in ready for review).

public function middleware(): array {
return [
// Only allow one job per ES host to run at a time to avoid DoSing the ES cluster with alias updates
new WithoutOverlapping("elasticsearch-alias-init-{$this->esHost}"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: nice, and good to do it per cluster rather than globally

@outdooracorn outdooracorn requested a review from dati18 February 16, 2026 14:43
@dati18 dati18 dismissed their stale review February 16, 2026 14:49

The test for middleware wiring is not worth the effort and could potentially bite back in the future or add more work.

@outdooracorn outdooracorn merged commit 33423a4 into main Feb 16, 2026
5 checks passed
@outdooracorn outdooracorn deleted the T416158 branch February 16, 2026 14:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants