Skip to content

Conversation

@edandylytics
Copy link
Collaborator

@edandylytics edandylytics commented Jan 8, 2026

External API v1 for Programmatic Job Creation

This PR introduces a new versioned external API (/v1/...) that allows external systems to create and start jobs programmatically via JWT-authenticated endpoints, without requiring a user session.

Internal docs for the EA env

Major Changes

New Endpoints

  • POST /v1/jobs — Initialize a job. Returns the job uid and presigned S3 upload URLs for each required file.
  • POST /v1/jobs/:jobUid/start — Start a job after files have been uploaded. Validates that all expected files exist in S3 before starting.
  • POST /v1/token/verify — Test endpoint to verify token validity and inspect allowed partner scopes. Allows API consumers to make sure their token is valid without having to create a job.

The flow of (a) initializing a job, (b) uploading files to S3, and (c) starting the job is what we already use when submitting jobs via the Web UI. You could imagine collapsing these all into a single API call, but I'm deliberately not doing that. It would consolidate the API without simplifying it much: instead of three straightforward API calls, you'd have one complex one. The most important reason, though, is that it'd introduce the app as a point of failure in the S3 upload. There's increased risk that job records are created that are never run, but the app handles that fine. Plus, this mirrors what's in place for the web interaction today and I'd like to keep these code paths symmetrical.

I opted for a separate set of endpoints and DTOs for the external API instead of trying to reuse the endpoints the frontend uses. This is because I:

  • I want the external API to have stability guarantees and versioning whereas I don't want to encumber the web backend with those requirements
  • The requirements for auth and, in particular, partner- vs. tenant auth context are different for each context and mingling the logic could lead to errors down the road.

Instead, I try to share code at the service level.

For versioning, I plan to have major versions. The endpoint includes v1. I feel OK about being able to have stable interfaces per version, though the mechanics of how I'd implement those interfaces (e.g what parts of the controller/module/dto/etc would be shared vs. version-specific) I'm not entirely sure of yet. I think that's OK for now. But if there are points where the v1 interface feels brittle or not extensible, let's dig into those.

JWT Authentication (ExternalApiTokenGuard)

  • Tokens are validated against a JWKS endpoint discovered from the configured OIDC issuer (EXTERNAL_API_TOKEN_ISSUER).
  • Endpoints require specific scopes (e.g., create:jobs).
  • Partner-level access is enforced via partner:<partner-code> scopes—requests for resources outside the token's partner scope are rejected.

Job UID

  • Added uid (UUID) column to the Job table to provide an opaque identifier for external consumers, keeping internal integer IDs unexposed.

I intend to use this opaque ID in the web frontend, too, but that's for another day. With this change, at least, API consumers won't have to make any changes when the frontend eventually moves to the opaque ID, too.

Other Changes

  • Additional validation.
    • FileService.doFilesExist() to verify file uploads before job start.
    • Validates that the job input matches the bundle repo metadata.
    • Both of these changes apply to jobs initiated via the external API and the web frontend.
  • Frontend shows "API Initiated" for jobs created without a user.
    • This is the simple version of the change. There's probably a deeper update lurking that involves rethinking how the createdBy column is populated (currently by a postgres hook that assumes a request-scoped client).
  • Test infrastructure updates for mocking external API tokens with jose.

@edandylytics edandylytics marked this pull request as ready for review January 20, 2026 20:47
Copy link

@bhaugeea bhaugeea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I scrolled through haphazardly to start getting my head in the game, and these are the thoughts I had. Absolutely not a well-thought-out comprehensive review, but I'll give you what I have so far anyway.

Comment on lines +69 to +82
): Promise<
| { status: 'success'; job: Job & { files: JobFile[] } }
| {
status: 'error';
code:
| 'bundle_not_found'
| 'missing_required_params'
| 'invalid_param_values'
| 'unexpected_params'
| 'missing_required_files'
| 'unexpected_files';
message: string;
}
> {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same take-it-or-leave-it comment about verbose typings

Copy link

@bhaugeea bhaugeea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm submitting everything I have now, before I dig into the last piece which is the big new test file and my own playing around with test cases.

};
}

// ─── Validate params ────────────────────────────────────────────────────

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no spec anywhere detailing the schema for the bundles/registry is there? If so a link would be super helpful, if not that's okay.

Copy link

@bhaugeea bhaugeea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole thing looks great to me. Super thorough tests, and to the extent I could think of anything to poke at, it all holds up (e.g. remove the tampering from the invalidPayload and the parsing/serialization does work as promised and yields a 201). All the comments so far represent ideas or questions that IMO are worth sharing, but none rise to the level of a blocker. My ordering by most to least worthwhile is probably:

  1. transformers fix for the webpack config so DTOs populate correctly in the swagger doc
  2. Issuer missing from env.copyme, assuming I'm not mistaken about that and it should indeed be there
  3. HTTP status codes — not because they're that bad or important, but because if you are going to change them you should do it now before anyone is using the API
  4. Everything else (e.g. the as const typescript thing, or that the table cell override might not be necessary), which I intend as things that (maybe) end up in some personal todo list, but which definitely shouldn't get in the way of a release.

@edandylytics edandylytics requested a review from bhaugeea January 26, 2026 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants