veloryn-intel/trajectory-drift-execution-validity

Trajectory Drift and Execution Validity in Multi-Step LLM Workflows

Deterministic execution-state analysis for multi-step LLM workflows.

This repository accompanies the paper:

Trajectory Drift and Execution Validity in Multi-Step LLM Workflows

The work introduces a deterministic framework for analyzing execution trajectories across multi-step workflows using replayable lexical and structural signals only.

The analysis focuses on:

  • continuation behavior
  • trajectory drift
  • branching execution
  • convergence behavior
  • transition stability
  • execution persistence over time

The framework intentionally avoids:

  • embeddings
  • semantic evaluators
  • judge models
  • probabilistic scoring
  • learned continuation policies

The contribution is a deterministic structural analysis framework for multi-step execution behavior.


Core Finding

Multi-step LLM workflows can remain locally coherent while progressively diverging from their originating execution trajectory.

Adjacent execution steps may continue appearing structurally stable even as long-range trajectory persistence weakens across continuation depth.

This creates measurable local-versus-global mismatch regimes where execution appears locally coherent despite weakening baseline persistence over time.
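
The paper's exact signal definitions are not reproduced in this README, so the following is only a minimal sketch of the local-versus-global mismatch. It assumes Jaccard overlap of lowercased word tokens as the lexical signal; the function names `local_continuity` and `baseline_alignment` and the sample steps are hypothetical, chosen for illustration.

```python
import re


def tokens(text: str) -> frozenset[str]:
    """Lowercased alphanumeric word tokens (assumed lexical signal)."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Set overlap in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def local_continuity(steps: list[str]) -> list[float]:
    """Similarity of each step to its immediate predecessor."""
    sets = [tokens(s) for s in steps]
    return [jaccard(sets[i - 1], sets[i]) for i in range(1, len(sets))]


def baseline_alignment(steps: list[str]) -> list[float]:
    """Similarity of each later step to the originating (first) step."""
    sets = [tokens(s) for s in steps]
    return [jaccard(sets[0], s) for s in sets[1:]]


# Hypothetical trajectory: each step swaps one word of the previous step.
steps = [
    "plan database migration schema",
    "plan database migration rollback",
    "plan database rollback testing",
    "plan rollback testing coverage",
    "rollback testing coverage report",
]

for depth, (loc, base) in enumerate(
    zip(local_continuity(steps), baseline_alignment(steps)), start=1
):
    print(f"step {depth}: local={loc:.2f} baseline={base:.2f}")
```

In this toy trajectory every adjacent pair shares three of five distinct tokens, so local continuity stays flat, while overlap with the first step falls to zero by the last step.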


Runtime Relevance

Request-level telemetry alone does not expose whether iterative execution remains structurally aligned over time.

Long-running workflows may continue consuming:

  • retries
  • orchestration cycles
  • tool calls
  • latency budget
  • infrastructure resources

while progressively weakening in structural persistence relative to originating trajectory conditions.

The paper frames this as a runtime execution analysis problem rather than purely:

  • a token-efficiency problem
  • a reasoning-compression problem
  • or a semantic evaluation problem

Repository Contents

This repository contains:

  • final manuscript
  • publication figures
  • deterministic replay notes
  • methodology documentation
  • representative trajectory examples
  • scope and boundary documentation

Internal capture infrastructure and unreleased experimental tooling are intentionally excluded from this release.


Repository Structure

| Directory | Purpose |
| --- | --- |
| paper/ | Final manuscript (PDF/DOCX) |
| plots/ | Publication figures |
| docs/ | Methodology, replay, and scope documentation |
| examples/ | Representative trajectory examples |

Execution Families

The analysis separates four execution families:

| Family | Description |
| --- | --- |
| Continuation | Sustained refinement preserving trajectory direction |
| Drift | Progressive divergence from originating trajectory |
| Branching | Divergent execution paths from a shared origin |
| Convergence | Independent trajectories converging structurally over time |

These families exhibit distinguishable transition behaviors across deterministic replay analysis.
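
As a rough illustration of how branching and convergence could be told apart from lexical signals alone, the sketch below compares two branches step by step. It assumes branches are aligned by step index and that distance is `1 - Jaccard` over word tokens; `branch_distance`, `trend`, and the sample branches are hypothetical and not taken from the paper.

```python
import re


def tokens(text: str) -> frozenset[str]:
    """Lowercased alphanumeric word tokens (assumed lexical signal)."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Set overlap in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def branch_distance(branch_a: list[str], branch_b: list[str]) -> list[float]:
    """Per-depth lexical distance between two branches, aligned by index."""
    return [1.0 - jaccard(tokens(a), tokens(b)) for a, b in zip(branch_a, branch_b)]


def trend(distances: list[float]) -> str:
    """Label a distance series as diverging, converging, or mixed."""
    deltas = [later - earlier for earlier, later in zip(distances, distances[1:])]
    if deltas and all(d >= 0 for d in deltas) and any(d > 0 for d in deltas):
        return "diverging"
    if deltas and all(d <= 0 for d in deltas) and any(d < 0 for d in deltas):
        return "converging"
    return "mixed"


# Hypothetical branches that start from the same step and then separate.
branch_a = [
    "parse the input file",
    "parse input and validate fields",
    "validate fields and report errors",
]
branch_b = [
    "parse the input file",
    "parse input then cache results",
    "cache results in redis store",
]

distances = branch_distance(branch_a, branch_b)
print(distances, trend(distances))
```

Reading the same distance series in reverse order gives the convergence case: two independent trajectories whose per-depth distance shrinks over time.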


Runtime Diagnostics

The framework derives the following deterministic runtime diagnostics from replayable structural signals:

  • baseline alignment
  • local continuity
  • drift velocity
  • transition stability
  • branch divergence
  • branch convergence
  • redundancy accumulation

All measurements remain deterministic and replayable.
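
A few of these diagnostics can be sketched as pure functions over already-computed series. The definitions below are assumptions made for illustration, not the paper's formulas: drift velocity is taken as the per-step loss of baseline alignment, transition stability as one minus the population standard deviation of local continuity, and redundancy accumulation as the share of a step's tokens already seen in earlier steps.

```python
from statistics import pstdev


def drift_velocity(baseline: list[float]) -> list[float]:
    """Per-step loss of baseline alignment (positive means drifting away)."""
    return [prev - cur for prev, cur in zip(baseline, baseline[1:])]


def transition_stability(local: list[float]) -> float:
    """1 minus the spread of local continuity; steadier transitions score higher."""
    if not local:
        return 1.0
    return 1.0 - pstdev(local)


def redundancy_accumulation(steps: list[list[str]]) -> list[float]:
    """Share of each step's distinct tokens that appeared in any earlier step."""
    seen: set[str] = set()
    shares: list[float] = []
    for step_tokens in steps:
        unique = set(step_tokens)
        shares.append(len(unique & seen) / len(unique) if unique else 0.0)
        seen |= unique
    return shares


# Hypothetical inputs chosen so the arithmetic is easy to follow.
print(drift_velocity([1.0, 0.75, 0.5, 0.25]))
print(transition_stability([0.5, 0.5, 0.5]))
print(redundancy_accumulation([["a", "b"], ["a", "c"], ["a", "b", "c"]]))
```

Each function depends only on its inputs, uses no randomness, and calls no model, so replaying the same trace yields the same values.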


Experimental Scope

The corpus includes provider-backed replayable traces captured from:

  • OpenAI models
  • Anthropic models

The experimental corpora include an infrastructure validation corpus focused on:

  • deterministic replay
  • serialization stability
  • branch isolation
  • persistence guarantees
  • capture integrity
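
One common way to check deterministic replay and serialization stability is to hash a canonical serialization of each trace and compare digests. The sketch below assumes traces are JSON-compatible dictionaries; the field names (`provider`, `steps`, `index`, `output`) and the helper names are hypothetical, since the capture tooling is not part of this release.

```python
import hashlib
import json


def canonical_bytes(trace: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so key order cannot change the bytes."""
    text = json.dumps(trace, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def trace_digest(trace: dict) -> str:
    """SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(trace)).hexdigest()


def replay_matches(recorded: dict, replayed: dict) -> bool:
    """True when both traces serialize to identical canonical bytes."""
    return trace_digest(recorded) == trace_digest(replayed)


# Hypothetical traces: same content in a different key order, then altered content.
recorded = {"provider": "example", "steps": [{"index": 0, "output": "plan the migration"}]}
reordered = {"steps": [{"output": "plan the migration", "index": 0}], "provider": "example"}
altered = {"provider": "example", "steps": [{"index": 0, "output": "plan the rollback"}]}

print(replay_matches(recorded, reordered))
print(replay_matches(recorded, altered))
```

Sorting keys makes the digest insensitive to dictionary ordering while remaining sensitive to any change in step content or step order.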

The release also includes an extended trajectory corpus introducing:

  • deeper continuation depth
  • branch-separated analysis
  • transition-resolution diagnostics
  • family-separated execution analysis

No synthetic traces were used in persisted experimental artifacts.


Relationship to X-Ray

This research establishes deterministic execution primitives that continuation-aware runtime systems can later build on.

X-Ray extends the deterministic replay and trajectory-analysis primitives explored in this work into continuation-aware execution analysis for multi-step orchestration workflows.

Repository:

https://github.com/veloryn-intel/veloryn-xray

Key Observation

Primary empirical observation: local continuity can remain high while baseline alignment progressively weakens.

This means workflows may continue appearing locally coherent even while long-range trajectory persistence deteriorates over time.

The paper argues that:

continued execution is not sufficient evidence of continued trajectory persistence.
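
In a runtime setting, this observation could be turned into a simple deterministic check that flags steps where local continuity is still high but baseline alignment has fallen. The sketch below is an assumption-laden illustration: the function name `mismatch_steps`, both threshold defaults, and the sample series are hypothetical and would need calibrating against real traces.

```python
def mismatch_steps(
    local: list[float],
    baseline: list[float],
    local_min: float = 0.5,      # assumed threshold for "locally coherent"
    baseline_max: float = 0.25,  # assumed threshold for "weak baseline alignment"
) -> list[int]:
    """Return 1-based step indices that look locally coherent but globally drifted."""
    if len(local) != len(baseline):
        raise ValueError("local and baseline series must have the same length")
    return [
        step
        for step, (loc, base) in enumerate(zip(local, baseline), start=1)
        if loc >= local_min and base <= baseline_max
    ]


# Hypothetical series: steady local continuity, decaying baseline alignment.
local = [0.6, 0.6, 0.6, 0.6]
baseline = [0.6, 0.33, 0.14, 0.0]

print(mismatch_steps(local, baseline))
```

A check like this only reports structural mismatch; it says nothing about whether the drifted steps are semantically wrong.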


Scope Boundaries

This work does not evaluate:

  • semantic correctness
  • factual accuracy
  • hallucination detection
  • reasoning quality
  • task success

The framework analyzes structural execution evolution only.


Research Direction

This repository documents the deterministic trajectory-analysis framework developed across two connected research directions:

  1. Efficiency Collapse in Multi-Step LLM Execution: An Empirical Study of Cost, Redundancy, and Phase Dynamics
    • execution redundancy
    • phase-transition behavior
    • continuation inefficiency

  2. Trajectory Drift and Execution Validity in Multi-Step LLM Workflows
    • execution-state evolution
    • local/global mismatch
    • deterministic trajectory analysis
    • continuation-state diagnostics

Positioning

X-Ray is a deterministic execution-state analysis framework for multi-step LLM systems.

The current research focuses on:

  • execution-state evolution
  • trajectory drift
  • continuation behavior
  • branch divergence and convergence
  • deterministic replayable analysis

The broader direction is the development of continuation-aware execution infrastructure and runtime control primitives for long-horizon AI systems.

Rather than treating continuation solely as a token-efficiency problem, the framework investigates how execution trajectories evolve structurally over time under iterative workflows.


Citation

If you use this work in research, runtime analysis, or execution instrumentation contexts, please cite:

@report{veloryn2026trajectorydrift,
  title={Trajectory Drift and Execution Validity in Multi-Step LLM Workflows},
  author={P., V.},
  year={2026},
  doi={10.5281/zenodo.20290421}
}

DOI: https://doi.org/10.5281/zenodo.20290421

License

Apache 2.0
