Deterministic execution-state analysis for multi-step LLM workflows.
This repository accompanies the paper:
Trajectory Drift and Execution Validity in Multi-Step LLM Workflows
The work introduces a deterministic framework for analyzing execution trajectories across multi-step workflows using replayable lexical and structural signals only.
The analysis focuses on:
- continuation behavior
- trajectory drift
- branching execution
- convergence behavior
- transition stability
- execution persistence over time
The framework intentionally avoids:
- embeddings
- semantic evaluators
- judge models
- probabilistic scoring
- learned continuation policies
The contribution is a deterministic structural analysis framework for multi-step execution behavior.
Multi-step LLM workflows can remain locally coherent while progressively diverging from their originating execution trajectory.
Adjacent execution steps may continue to appear structurally stable even as long-range trajectory persistence weakens with continuation depth.
The result is a measurable local-versus-global mismatch: each step stays close to the step before it, while alignment with the originating baseline declines.
Request-level telemetry alone does not expose whether iterative execution remains structurally aligned over time.
Long-running workflows may continue consuming:
- retries
- orchestration cycles
- tool calls
- latency budget
- infrastructure resources
while their structural persistence relative to the originating trajectory progressively weakens.
The paper frames this as a runtime execution analysis problem rather than purely:
- a token-efficiency problem
- a reasoning-compression problem
- or a semantic evaluation problem
This repository contains:
- final manuscript
- publication figures
- deterministic replay notes
- methodology documentation
- representative trajectory examples
- scope and boundary documentation
Internal capture infrastructure and unreleased experimental tooling are intentionally excluded from this release.
| Directory | Purpose |
|---|---|
| paper/ | Final manuscript (PDF/DOCX) |
| plots/ | Publication figures |
| docs/ | Methodology, replay, and scope documentation |
| examples/ | Representative trajectory examples |
The analysis separates four execution families:
| Family | Description |
|---|---|
| Continuation | Sustained refinement preserving trajectory direction |
| Drift | Progressive divergence from originating trajectory |
| Branching | Divergent execution paths from a shared origin |
| Convergence | Independent trajectories converging structurally over time |
These families exhibit distinguishable transition behaviors across deterministic replay analysis.
The framework derives the following deterministic runtime diagnostics from replayable structural signals:
- baseline alignment
- local continuity
- drift velocity
- transition stability
- branch divergence
- branch convergence
- redundancy accumulation
All measurements remain deterministic and replayable.
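
As an illustration of how diagnostics of this kind can be derived from lexical signals alone, here is a minimal sketch. It assumes token-set Jaccard overlap as the structural signal; that choice, and the function names `tokens`, `jaccard`, and `diagnostics`, are assumptions made for this example and are not taken from the paper or the released tooling.

```python
import re


def tokens(text: str) -> frozenset[str]:
    """Lowercased alphanumeric tokens: a purely lexical signal."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Set overlap in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def diagnostics(steps: list[str]) -> dict[str, list[float]]:
    """Compute three illustrative diagnostics over a list of step outputs.

    baseline_alignment[i]: overlap between step i and step 0.
    local_continuity[i-1]: overlap between step i-1 and step i.
    drift_velocity[i-1]:   drop in baseline alignment from step i-1 to i.
    (Hypothetical definitions, chosen for illustration only.)
    """
    if not steps:
        return {"baseline_alignment": [], "local_continuity": [], "drift_velocity": []}
    sets = [tokens(s) for s in steps]
    baseline = [jaccard(sets[0], s) for s in sets]
    local = [jaccard(sets[i - 1], sets[i]) for i in range(1, len(sets))]
    velocity = [baseline[i - 1] - baseline[i] for i in range(1, len(baseline))]
    return {
        "baseline_alignment": baseline,
        "local_continuity": local,
        "drift_velocity": velocity,
    }


if __name__ == "__main__":
    trace = ["a b c d", "a b c e", "a b e f", "a e f g"]
    for name, series in diagnostics(trace).items():
        print(name, [round(v, 3) for v in series])
```

Because the sketch uses only set operations on tokens, running it twice on the same trace yields identical values, which is the property the diagnostics above depend on.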
The corpus includes provider-backed replayable traces captured from:
- OpenAI models
- Anthropic models
The experimental corpora include an infrastructure validation corpus focused on:
- deterministic replay
- serialization stability
- branch isolation
- persistence guarantees
- capture integrity
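
The properties listed above can be checked with very little machinery. The sketch below shows one possible approach to serialization stability: serialize a trace canonically, hash it, and confirm the hash survives a round trip. The trace shape (a list of dictionaries) and the helper names `canonical_bytes`, `trace_digest`, and `replay_is_stable` are hypothetical; the excluded internal capture infrastructure may work differently.

```python
import hashlib
import json


def canonical_bytes(trace: list[dict]) -> bytes:
    """Serialize with sorted keys and fixed separators so that equal
    traces always produce byte-identical output."""
    text = json.dumps(trace, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return text.encode("utf-8")


def trace_digest(trace: list[dict]) -> str:
    """SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(trace)).hexdigest()


def replay_is_stable(trace: list[dict]) -> bool:
    """Round-trip the trace through serialization and confirm that the
    digest is unchanged."""
    restored = json.loads(canonical_bytes(trace).decode("utf-8"))
    return trace_digest(restored) == trace_digest(trace)


if __name__ == "__main__":
    trace = [{"step": 0, "output": "draft plan"}, {"step": 1, "output": "refined plan"}]
    print(trace_digest(trace))
    print(replay_is_stable(trace))
```

Sorting keys removes dependence on dictionary insertion order, so two captures of the same logical trace hash identically regardless of how their fields were assembled.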
The release also includes an extended trajectory corpus introducing:
- deeper continuation depth
- branch-separated analysis
- transition-resolution diagnostics
- family-separated execution analysis
No synthetic traces were used in persisted experimental artifacts.
This research establishes deterministic execution primitives that can later be used in continuation-aware runtime systems.
X-Ray extends the deterministic replay and trajectory-analysis primitives explored in this work into continuation-aware execution analysis for multi-step orchestration workflows.
Repository:
https://github.com/veloryn-intel/veloryn-xray
Local continuity can remain high while baseline alignment progressively weakens.
This means a workflow may continue to appear locally coherent even while its long-range trajectory persistence deteriorates over time.
The paper argues that:
continued execution is not sufficient evidence of continued trajectory persistence.
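
To make the local-versus-global mismatch concrete, the following sketch scans two precomputed score series and reports the first step at which local continuity is still high while baseline alignment has fallen low. The function name `first_mismatch_step` and both threshold defaults are hypothetical values picked for illustration; the paper does not specify them here.

```python
from typing import Optional, Sequence


def first_mismatch_step(
    baseline_alignment: Sequence[float],
    local_continuity: Sequence[float],
    local_floor: float = 0.5,        # hypothetical threshold
    baseline_ceiling: float = 0.25,  # hypothetical threshold
) -> Optional[int]:
    """Return the first step index where the step still resembles its
    predecessor (local continuity >= local_floor) but no longer resembles
    the origin (baseline alignment <= baseline_ceiling), or None.

    baseline_alignment[i] scores step i against step 0.
    local_continuity[i - 1] scores step i against step i - 1.
    """
    if len(local_continuity) != max(len(baseline_alignment) - 1, 0):
        raise ValueError("local_continuity must have one fewer entry than baseline_alignment")
    for step in range(1, len(baseline_alignment)):
        locally_coherent = local_continuity[step - 1] >= local_floor
        globally_drifted = baseline_alignment[step] <= baseline_ceiling
        if locally_coherent and globally_drifted:
            return step
    return None


if __name__ == "__main__":
    baseline = [1.0, 0.6, 0.33, 0.14]
    local = [0.6, 0.6, 0.6]
    print(first_mismatch_step(baseline, local))
```

In this example every adjacent pair of steps scores 0.6, so the run never looks unstable locally, yet by step 3 its alignment with the origin has dropped to 0.14.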
This work does not evaluate:
- semantic correctness
- factual accuracy
- hallucination detection
- reasoning quality
- task success
The framework analyzes structural execution evolution only.
This repository contains the experimental infrastructure and deterministic trajectory-analysis framework developed across two connected research directions:
- Efficiency Collapse in Multi-Step LLM Execution: An Empirical Study of Cost, Redundancy, and Phase Dynamics
  - execution redundancy
  - phase-transition behavior
  - continuation inefficiency
- Trajectory Drift and Execution Validity in Multi-Step LLM Workflows
  - execution-state evolution
  - local/global mismatch
  - deterministic trajectory analysis
  - continuation-state diagnostics
X-Ray is a deterministic execution-state analysis framework for multi-step LLM systems.
The current research focuses on:
- execution-state evolution
- trajectory drift
- continuation behavior
- branch divergence and convergence
- deterministic replayable analysis
The broader direction is the development of continuation-aware execution infrastructure and runtime control primitives for long-horizon AI systems.
Rather than treating continuation solely as a token-efficiency problem, the framework investigates how execution trajectories evolve structurally over time under iterative workflows.
If you use this work in research, runtime analysis, or execution instrumentation contexts, please cite:
@report{veloryn2026trajectorydrift,
  title={Trajectory Drift and Execution Validity in Multi-Step LLM Workflows},
  author={P., V.},
  year={2026},
  doi={10.5281/zenodo.20290421}
}

DOI: https://doi.org/10.5281/zenodo.20290421
Apache 2.0