Skip to content

kvr06-ai/llm-program-equilibrium

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LLM-as-Program-Equilibrium Harness

A reproducible testbed for the partial-information program-equilibrium direction Caspar Oesterheld raised in AXRP Episode 49 (Feb 2026, 02:24:19):

"If my program is 'I prompt a particular language model' and then you know my prompt but you don't know all the weights of my language model... that is a sort of partial information program equilibrium. So I think that is another natural direction."

Each LLM agent is a program — a triple (model_id, system_prompt, temperature). Each agent receives the other's prompt and model id (but not weights), then both simulate each other up to ε-bounded depth on canonical 2-player mixed-motive games.

This is the open-source-game-playing extension that CoopEval (Tewolde, Zhang, Piedrahita, Conitzer, Jin; AAAI-26) names in §7 as a natural direction beyond its four-mechanism suite (repetition, reputation, mediation, contracts), implemented entirely on open-weight models (Llama 3.1, Llama 3.2, Phi-4, Mixtral — none of which appear in Oesterheld et al. 2026 on surrogate goals).

Provider-agnostic design

The harness talks to any OpenAI-compatible chat-completions endpoint. Pick the provider that fits — no local model download required:

Provider Cost Setup Notes
NVIDIA NIM (default) Free for developers API key from build.nvidia.com Llama 3.1, Llama 3.2, Phi-4, Mixtral all free; datacenter GPU latency
Cerebras Free tier API key from cloud.cerebras.ai Very fast inference
Groq Free tier API key from console.groq.com Very fast; rate-limited
Local Ollama Free ollama serve + ollama pull Fully offline, ~25 GB disk for full panel
Custom varies LLM_BASE_URL + LLM_API_KEY env vars Any other OpenAI-compatible endpoint (Together, OpenRouter, ...)

Quick start

Prerequisites

  • Python 3.9+
  • An API key for one of the providers above (or local Ollama)

Install

cd llm-program-equilibrium
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

Run smoke test (2 trials, ~30s on hosted)

With NVIDIA NIM (recommended):

export NVIDIA_API_KEY="nvapi-..."
python notebooks/headline_experiment.py --provider nvidia --grid smoke --trials 1

With local Ollama:

ollama serve
ollama pull llama3.1:8b-instruct  # one model is enough for the smoke grid
python notebooks/headline_experiment.py --provider ollama --grid smoke --trials 1

With a custom OpenAI-compatible provider:

export LLM_BASE_URL="https://api.together.xyz/v1"
export LLM_API_KEY="..."
python notebooks/headline_experiment.py --provider custom --grid smoke --trials 1

Run the headline grid (~30 min on NIM, longer locally)

python notebooks/headline_experiment.py --provider nvidia --grid headline --trials 10

Run the full grid (~1-3 hours on NIM with rate limits)

python notebooks/headline_experiment.py --provider nvidia --grid full --trials 20

Results are written to results/<grid>.json after every trial (incremental, safe to interrupt). A summary table is printed at the end.

Project layout

llm-program-equilibrium/
├── README.md
├── LICENSE                            # Apache-2.0
├── requirements.txt
├── src/
│   ├── program.py                     # Program = (model_id, system_prompt, temperature)
│   ├── games.py                       # PD, Stag Hunt, Chicken, BoS
│   ├── llm_client.py                  # Provider-agnostic OpenAI-compatible client
│   ├── simulator.py                   # εGroundedπBot recursive simulation
│   ├── experiment.py                  # Condition + TrialResult + run_grid
│   └── analysis.py                    # Cooperation rate, 95% CI, refusal rate
├── notebooks/
│   └── headline_experiment.py         # Entry point: provider × grid
├── tests/
│   └── test_smoke.py                  # Stub-LLM tests (no network)
├── results/                           # JSON outputs
├── writeup/
│   └── tech_report.md                 # 4-6 page tech report
└── notes/
    └── surrogate_goals_paper_notes.md # Pre-build study notes

What the harness measures

For each (game × program-pair × ε × max_depth) condition with N trials:

  • Cooperation rate — fraction of trials where the joint outcome is in the game's cooperative set.
  • 95% confidence interval1.96 × sample SD (Wald), matching the convention in Oesterheld et al. 2026.
  • LLM call count — total inference attempts per round; empirical version of the simulation cost in Oesterheld's compiler-optimization direction (AXRP 49, 01:09:25).
  • Refusal rate — proportion of LLM calls returning unparseable output or a network error. Watch this per (model, prompt) — the surrogate-goals paper hit ~46% on GPT-3.5 for a related task.

Run the tests

python -m pytest tests/ -v

Smoke tests use a stub LLM client and run in milliseconds; no provider credentials required.

Motivation

Program equilibrium (Tennenholtz, 2004; Oesterheld, 2019; Clift, Kovařík, Oesterheld, Conitzer, 2025) provides cooperation-supporting equilibria for agents that can read each other's source code. The natural LLM-agent specialization — where the "program" is a prompt and weights are not directly inspectable — was articulated in AXRP Episode 49 (Feb 2026) but lacked a public implementation. Independently, CoopEval (Tewolde et al., AAAI-26) §7 names "open-source game playing" as a natural extension to its four-mechanism cooperation suite. This harness fills both gaps with a reproducible empirical surface and a working definition of stochastic-program partial-information counterfactual.

Full method, definition, and discussion: writeup/tech_report.md.

License

Apache-2.0. See LICENSE.

About

Reproducible testbed for partial-information program equilibrium with LLM agents — εGroundedπBot, multi-game, multi-provider OpenAI-compatible (NVIDIA NIM / Cerebras / Groq / Ollama).

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages