Efficiency Collapse in Multi-Step LLM Execution

An Empirical Study of Cost, Redundancy, and Phase Dynamics


Overview

Multi-step interaction with large language models is widely used in:

  • iterative refinement
  • agent loops
  • retry chains
  • structured prompting workflows

A common assumption is that additional steps improve output quality.

This work examines that assumption at the execution level, focusing on how marginal contribution evolves across steps.


Core Insight

Across models, tasks, and prompt variations, we observe a consistent pattern:

  • early steps account for the majority of measured information gain
  • marginal contribution declines rapidly with continued execution
  • redundancy increases across steps
  • cost grows monotonically while per-step measured information gain declines

Execution can remain locally valid at each step while producing globally diminishing marginal contribution, with no intrinsic signal indicating when continuation ceases to be productive.


What This Paper Studies

This is an empirical study of execution behavior.

Instead of evaluating only final outputs, we analyze execution step-by-step using observable signals:

  • marginal output gain
  • incremental cost (tokens)
  • redundancy between steps

We define an efficiency signal representing marginal information gain per unit cost.

In this work, information gain (sometimes referred to as useful output) is approximated using redundancy-adjusted output and should be interpreted as a proxy for marginal textual contribution, not task-level utility.
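As a minimal illustration of what such a signal could look like, the sketch below computes redundancy via word n-gram overlap and divides the redundancy-adjusted gain by incremental token cost. The n-gram measure and all names here are assumptions for illustration, not the paper's definitions:

```python
# Illustrative sketch of an efficiency signal: redundancy-adjusted
# output gain per token of incremental cost. The n-gram redundancy
# proxy and all names are assumptions, not the paper's definitions.

def redundancy(step_text: str, prior_text: str, n: int = 3) -> float:
    """Fraction of the step's word n-grams already present in prior output."""
    def ngrams(text: str) -> set:
        words = text.split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    new = ngrams(step_text)
    if not new:
        return 1.0  # an empty or too-short step contributes nothing new
    return len(new & ngrams(prior_text)) / len(new)

def efficiency(step_text: str, prior_text: str, step_tokens: int) -> float:
    """Marginal information gain (redundancy-adjusted words) per token spent."""
    gain = len(step_text.split()) * (1.0 - redundancy(step_text, prior_text))
    return gain / max(step_tokens, 1)
```

A fully redundant step yields an efficiency of zero regardless of how much text it produces, which matches the proxy interpretation above: marginal textual contribution, not task-level utility.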


Key Results

Early efficiency peak

Initial steps account for the majority of measured information gain

Rapid efficiency decline

Later steps contribute progressively less new information

Cost–output divergence

Cost increases monotonically, while measured information gain grows sub-linearly

Redundancy accumulation

Later steps increasingly reuse prior content

Execution phase behavior

Execution is better described as a distribution over phases rather than fixed transitions

Early termination potential

In evaluated settings, stopping at intermediate steps (Step 2–3):

  • retains ~50–84% of measured information gain
  • reduces cost by ~50–70%
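The retained-gain versus cost-saved trade-off of truncating at step k can be sketched with a small helper; the per-step gain and cost values below are illustrative placeholders, not measurements from the paper:

```python
def truncation_tradeoff(gains, costs, k):
    """Fraction of total gain retained and cost saved by stopping after k steps."""
    retained = sum(gains[:k]) / sum(gains)
    saved = 1.0 - sum(costs[:k]) / sum(costs)
    return retained, saved

# Illustrative trajectory: rapidly declining per-step gain and
# monotonically growing per-step token cost (values are made up).
gains = [10.0, 4.0, 2.0, 1.0, 0.5]
costs = [100, 120, 140, 160, 180]
retained, saved = truncation_tradeoff(gains, costs, 2)
```

With this shape of trajectory, stopping after two steps retains most of the gain while saving well over half the cost, mirroring the qualitative pattern reported above.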

Why This Matters

Multi-step execution is widely assumed to improve outputs.

This work shows that beyond early convergence, continued execution often produces diminishing marginal contribution without any intrinsic signal indicating when to stop.

As a result:

  • systems continue execution with no observable signal of declining marginal contribution
  • cost accumulates without proportional gain
  • redundancy becomes dominant in later steps

This exposes a structural gap:

  • continuation decisions are not conditioned on observable execution state

Scope

This study focuses on:

  • linear, iterative execution
  • single-task continuation across steps
  • execution behavior beyond early convergence under continued iteration

It does not evaluate:

  • task correctness or factual accuracy
  • tool use or multi-stage pipelines
  • agent planning or branching workflows

Positioning

This work focuses on execution behavior, not model capability.

It identifies a structural limitation in current systems:

  • continuation decisions are not conditioned on execution state
  • cost constraints alone do not ensure productive execution

The results suggest that effective execution requires:

  • step-level evaluation of marginal contribution
  • trajectory-aware monitoring of execution behavior
  • state-aware continuation decisions based on observed signals
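A state-aware continuation decision of the kind listed above could be sketched as a simple rule over the observed per-step efficiency signal. The floor and window values are arbitrary illustrative choices, not thresholds from the paper:

```python
def should_continue(efficiencies, floor=0.2, window=2):
    """Continue only while the recent average of the per-step efficiency
    signal stays above a floor.

    `efficiencies` holds the signal observed so far (most recent last);
    `floor` and `window` are illustrative assumptions.
    """
    if len(efficiencies) < window:
        return True  # too few observations to judge the trend yet
    recent = efficiencies[-window:]
    return sum(recent) / window >= floor
```

This conditions continuation on observed execution state rather than on a fixed step budget, which is the structural gap the results point at.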

Repository Contents

/paper → full paper (PDF)
/figures → plots used in the paper


Citation

P., V. (2026).
Efficiency Collapse in Multi-Step LLM Execution:
An Empirical Study of Cost, Redundancy, and Phase Dynamics.
https://doi.org/10.5281/zenodo.19928793


Notes

  • Metrics are model-agnostic and proxy-based
  • Results reflect execution behavior, not outcome quality
  • Absolute values vary across models, but patterns are consistent

Related Direction

This work isolates the information-generation component of execution.

It motivates further work on:

  • detecting non-progressing execution during runtime
  • identifying trajectory-level degradation patterns
  • conditioning continuation on observable execution signals

License

CC-BY-4.0
