Investigate Event-Driven Job Completion via S3 → SQS #252

@ajkiessl

Description

Background

We currently run a long-lived Ruby process that periodically scans the database for Job records in a processing state and checks whether their expected output file exists in S3.
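For reference, one pass of the scanner described above might look roughly like the sketch below. This is not the actual implementation: `PolledJob`, `poll_once`, and the `object_exists` callable are hypothetical stand-ins for the real Job model and the S3 existence check, used only to make the cost per pass visible (one DB scan plus one S3 check per processing job).

```ruby
# Hypothetical stand-in for the real Job model (assumption, not the actual schema).
PolledJob = Struct.new(:id, :state, :output_key)

# One polling pass: check the expected output key of every processing job.
# `object_exists` is any callable that returns true when the key is present
# in S3; in production it would wrap an S3 existence check.
def poll_once(jobs, object_exists)
  jobs.select { |job| job.state == :processing }.each do |job|
    job.state = :completed if object_exists.call(job.output_key)
  end
end

jobs = [
  PolledJob.new(1, :processing, "outputs/1.json"),
  PolledJob.new(2, :processing, "outputs/2.json"),
  PolledJob.new(3, :completed, "outputs/3.json")
]
present_keys = ["outputs/1.json"]

poll_once(jobs, ->(key) { present_keys.include?(key) })
jobs.each { |job| puts "job #{job.id}: #{job.state}" }
```

Every pass repeats the check for jobs that are still running, which is where the repeated DB reads and S3 calls come from.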

This polling approach is simple and works at our current scale, but it has costs:

  • Continuous DB reads
  • Repeated S3 checks
  • Latency between job completion and detection
  • Requires a perpetual scanning process

AWS S3 supports emitting events when objects are created, which can be delivered to SQS. This opens up the possibility of shifting from a polling-based model to an event-driven model.
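As a starting point for the investigation, the bucket-side setup could be sketched as a notification configuration like the one below. The queue ARN, bucket name, `outputs/` prefix, and configuration id are all placeholders, and the helper name is made up for illustration. The hash follows the shape that the `aws-sdk-s3` gem's `put_bucket_notification_configuration` accepts; the call itself is left as a comment so the snippet runs without AWS credentials.

```ruby
require "json"

# Builds a notification configuration asking S3 to send a message to an SQS
# queue whenever an object is created under the given key prefix.
def completion_notification_config(queue_arn:, prefix:)
  {
    queue_configurations: [
      {
        id: "job-output-created", # placeholder id
        queue_arn: queue_arn,
        events: ["s3:ObjectCreated:*"],
        filter: { key: { filter_rules: [{ name: "prefix", value: prefix }] } }
      }
    ]
  }
end

config = completion_notification_config(
  queue_arn: "arn:aws:sqs:us-east-1:123456789012:job-output-events", # placeholder
  prefix: "outputs/"                                                 # placeholder
)
puts JSON.pretty_generate(config)

# With the aws-sdk-s3 gem, this hash would be applied roughly as:
#   Aws::S3::Client.new.put_bucket_notification_configuration(
#     bucket: "example-job-bucket", notification_configuration: config
#   )
```

The SQS queue also needs an access policy that allows the S3 service to send messages to it, so the setup touches both the bucket and the queue.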


Goal

Investigate whether we should move from:

DB + S3 polling → detect completion

to:

S3 event → SQS message → worker updates Job state

This would allow the system to react to job completion instead of repeatedly checking for it.
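On the worker side, the first step would be turning an SQS message body into the object keys that were created. A minimal sketch, assuming the queue receives S3 notifications directly (not wrapped by SNS) and using a hypothetical helper name `created_objects`:

```ruby
require "json"
require "cgi"

# Returns [bucket, key] pairs for every object-created record in an SQS
# message body that carries an S3 event notification.
def created_objects(sqs_body)
  payload = JSON.parse(sqs_body)
  # S3 sends a test message without "Records" when notifications are first
  # configured, so a missing key is treated as "no records".
  Array(payload["Records"]).filter_map do |record|
    next unless record["eventName"].to_s.start_with?("ObjectCreated:")

    s3 = record.fetch("s3")
    # Object keys arrive URL-encoded in the notification (a space becomes "+").
    [s3.fetch("bucket").fetch("name"), CGI.unescape(s3.fetch("object").fetch("key"))]
  end
end

sample_body = {
  "Records" => [
    {
      "eventSource" => "aws:s3",
      "eventName" => "ObjectCreated:Put",
      "s3" => {
        "bucket" => { "name" => "example-job-bucket" },
        "object" => { "key" => "outputs/job+42.json" }
      }
    }
  ]
}.to_json

created_objects(sample_body).each do |bucket, key|
  puts "created: s3://#{bucket}/#{key}"
end
```

Mapping a key back to a Job record depends on how output paths are named, which is part of what this investigation should pin down.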


Questions to Explore

  • How complex is the S3 → SQS setup?

  • What changes would be required in our Ruby worker architecture?

  • How would we safely handle duplicate or out-of-order events?

  • Would this materially reduce:

    • DB load?
    • S3 API calls?
    • long-running process complexity?
  • Operational tradeoffs:

    • observability
    • retries
    • failure handling
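On the duplicate and out-of-order question: SQS standard queues deliver at least once, so the handler has to tolerate seeing the same event more than once. One option worth evaluating is making the state change idempotent, so a repeated event is a no-op. The sketch below uses an in-memory hash and a hypothetical `TrackedJob` struct in place of the real Job table; in the real worker the same guard would presumably be a conditional update on the job's state.

```ruby
# Hypothetical stand-in for the real Job model (assumption, not the actual schema).
TrackedJob = Struct.new(:id, :state, :output_key)

# Applies a completion event for `key`.
# Returns :completed on the first event, :duplicate if the job has already
# left the processing state, and :unknown_key if no job expects this key.
def apply_completion(jobs_by_key, key)
  job = jobs_by_key[key]
  return :unknown_key if job.nil?
  return :duplicate unless job.state == :processing

  job.state = :completed
  :completed
end

jobs_by_key = {
  "outputs/7.json" => TrackedJob.new(7, :processing, "outputs/7.json")
}

puts apply_completion(jobs_by_key, "outputs/7.json") # first delivery
puts apply_completion(jobs_by_key, "outputs/7.json") # redelivery of the same event
puts apply_completion(jobs_by_key, "outputs/other.json")
```

If only object-created events are consumed and completion is a terminal state, ordering between events for the same key should not change the outcome, but that assumption needs to be checked against how outputs are actually written.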

Non-Goals (for now)

This is not a commitment to implement — only an investigation.

We are not trying to prematurely optimize or redesign the pipeline without evidence.


Outcome

Document:

  • Pros / cons
  • Implementation complexity
  • Estimated effort
  • Recommendation: stay with polling vs move to event-driven

Motivation

If job volume increases, an event-driven approach may provide:

  • faster completion detection
  • lower infrastructure load
  • better scalability

Worth understanding before we need it.
