Pipelines with large job graphs fail at GitHub workflow startup (4700+ dependency edges) #1456

@davidecaltagirone

Description

When cdk-pipelines-github generates a workflow that deploys across a large number of AWS accounts (in our case ~50), the resulting YAML (~9,000 lines, ~160 jobs, 4,700+ needs: dependency edges) fails immediately at workflow startup — before any job executes.

GitHub Support confirmed this is caused by an undocumented internal database storage limit on job dependency graphs. The constraint is based on the combined size of all job names and their needs references, not on job count alone.
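To get a rough idea of how close a generated workflow is to this limit, the quantities GitHub Support mentioned can be measured from the parsed jobs map. The following is a minimal sketch, not part of cdk-pipelines-github: `JobSpec`, `GraphStats`, and `measureJobGraph` are hypothetical names, `needs` is assumed to be already normalised to an array, and character count is used as a stand-in for whatever size GitHub actually stores. The real threshold is undocumented, so the numbers are only useful for comparing workflows against each other.

```typescript
// Hypothetical shape: only the field needed to measure the dependency graph.
// Assumes `needs` has been normalised to an array of job names.
interface JobSpec {
  needs?: string[];
}

interface GraphStats {
  jobs: number;      // number of jobs in the workflow
  edges: number;     // total number of needs references
  nameChars: number; // combined length of all job names and all needs references
}

function measureJobGraph(jobs: Record<string, JobSpec>): GraphStats {
  let edges = 0;
  let nameChars = 0;
  for (const [name, job] of Object.entries(jobs)) {
    nameChars += name.length;
    for (const dep of job.needs ?? []) {
      edges += 1;
      nameChars += dep.length;
    }
  }
  return { jobs: Object.keys(jobs).length, edges, nameChars };
}

// Illustrative job names only; they are not taken from the failing workflow.
const stats = measureJobGraph({
  "Build-Synth": {},
  "Assets-FileAsset1": { needs: ["Build-Synth"] },
  "Deploy-AccountA": { needs: ["Build-Synth", "Assets-FileAsset1"] },
});
console.log(stats);
```

Because every needs entry repeats a full job name, long generated job names multiply the stored size once per edge, which would be consistent with the limit depending on names and not only on job count.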

Error behaviour

  • Workflow run is created but immediately transitions to a failed state
  • No jobs execute
  • No error message is surfaced in the GitHub Actions UI — the failure happens before any runner is allocated

Root cause (per GitHub Support)

"Your workflow contains 160 jobs with approximately 4,700+ dependency edges across them, generating a job dependency graph that exceeds an internal database storage limit. This causes the run to fail at startup before any job executes."

This is a known issue tracked internally by GitHub Engineering. Their intended long-term fix is to gracefully handle oversized graphs (skip UI graph rendering rather than failing the run), but it has not shipped yet.

Context

  • Pipeline deploys to ~50 AWS accounts in a single generated workflow file
  • The workflow is entirely auto-generated by cdk-pipelines-github — the YAML cannot be edited directly
  • There is no documented GitHub limit for job count or dependency edges

Workaround suggested by GitHub Support

Split the generated workflow into multiple smaller workflow files and chain them using the workflow_run trigger to preserve stage ordering. For example:

on:
  workflow_run:
    workflows: ["InfraStage"]
    types: [completed]

Since the YAML is auto-generated, this change must be implemented in the CDK pipeline source.
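As a sketch of what the generator would have to emit, the function below builds the trigger section for an ordered list of workflow names. `ChainedWorkflow` and `chainWorkflows` are hypothetical names, and the push trigger on `main` for the first workflow is an assumption. One detail worth noting: `types: [completed]` fires whether the upstream run succeeded or failed, so each downstream job would also need a condition on `github.event.workflow_run.conclusion`.

```typescript
interface WorkflowRunTrigger {
  workflow_run: { workflows: string[]; types: string[] };
}

interface PushTrigger {
  push: { branches: string[] };
}

// Hypothetical description of one emitted workflow file.
interface ChainedWorkflow {
  name: string;
  on: WorkflowRunTrigger | PushTrigger;
  // `if:` expression each job in a downstream workflow would carry.
  jobCondition?: string;
}

function chainWorkflows(names: string[], branch = "main"): ChainedWorkflow[] {
  return names.map((name, i): ChainedWorkflow =>
    i === 0
      ? // Assumption: the first workflow keeps a plain push trigger.
        { name, on: { push: { branches: [branch] } } }
      : {
          name,
          // Each later workflow starts when the previous one completes.
          on: { workflow_run: { workflows: [names[i - 1]], types: ["completed"] } },
          // Without this guard the workflow also runs after a failed upstream run.
          jobCondition: "github.event.workflow_run.conclusion == 'success'",
        },
  );
}

const chain = chainWorkflows(["InfraStage", "AppStage"]);
console.log(JSON.stringify(chain, null, 2));
```

Two constraints from the GitHub documentation apply to this approach: a workflow_run trigger only fires when the workflow file exists on the default branch, and workflow_run cannot chain more than three levels of workflows, which caps how many files a pipeline can be split into this way.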

Proposed solution

Add support for splitting a pipeline across multiple workflow files, with stages or wave boundaries acting as split points. Possible approaches:

  • A splitWorkflowAt option (stage name or wave index) that emits separate .github/workflows/ files and wires them together via workflow_run triggers automatically
  • An automatic split when job count or estimated edge count exceeds a configurable threshold
  • At minimum, documentation of this limit and a manual escape hatch for users to define workflow boundaries
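To illustrate the automatic-split idea, here is a minimal sketch of a greedy split at wave boundaries. Nothing here exists in cdk-pipelines-github today: `WaveStats`, `splitWaves`, and `maxEdges` are hypothetical, and since the actual limit is undocumented the threshold would have to be chosen conservatively by the user.

```typescript
// Hypothetical per-wave summary; `edges` counts needs references inside the wave
// plus those pointing at earlier waves in the same workflow file.
interface WaveStats {
  name: string;
  edges: number;
}

// Packs consecutive waves into workflow files so that each file stays at or
// below `maxEdges`. A single wave larger than the threshold still gets its own
// file, because this sketch never splits inside a wave.
function splitWaves(waves: WaveStats[], maxEdges: number): WaveStats[][] {
  const groups: WaveStats[][] = [];
  let current: WaveStats[] = [];
  let currentEdges = 0;
  for (const wave of waves) {
    if (current.length > 0 && currentEdges + wave.edges > maxEdges) {
      groups.push(current);
      current = [];
      currentEdges = 0;
    }
    current.push(wave);
    currentEdges += wave.edges;
  }
  if (current.length > 0) groups.push(current);
  return groups;
}

// Illustrative numbers only.
const groups = splitWaves(
  [
    { name: "Build", edges: 100 },
    { name: "Wave1", edges: 900 },
    { name: "Wave2", edges: 1200 },
    { name: "Wave3", edges: 800 },
  ],
  2000,
);
console.log(groups.map((g) => g.map((w) => w.name)));
```

Splitting at wave boundaries keeps stage ordering simple: needs edges that would cross a file boundary are dropped and replaced by a single workflow_run trigger between the two files, which should also reduce the total edge count.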

Environment

  • ~50 AWS accounts / ~160 generated jobs / ~4,700 needs edges
  • Generated YAML: ~9,000 lines
  • GitHub confirmed the issue via Support ticket
