Zack Fitch johnzfitch

SF Bay Area • Git Page • All icons from iconics

OpenAI Codex: Finding the Ghost in the Machine

Important

Solved a pre-main() (#[ctor::ctor]) environment-stripping bug that caused 11–300× GPU slowdowns and eluded OpenAI's debugging team for months. It was the main blocker to Codex spawning and controlling effective subagents. The regression often caused delayed CPU fallback or silent failures in ML-related tasks across all operating systems.

Proof: Issue #8945 | PR #8951 | Release notes (rust-v0.80.0)

Full Investigation Details

The Ghost

In October 2025, OpenAI assembled a specialized debugging team to investigate mysterious slowdowns affecting Codex. After a week of intensive investigation: nothing.

The bug was literally a ghost — pre_main_hardening() executed before main(), stripped critical environment variables (LD_LIBRARY_PATH, DYLD_LIBRARY_PATH), and disappeared without a trace. Standard profilers saw nothing. Users saw variables in their shell, but inside codex exec they vanished.
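The effect on a child process can be reproduced in a few lines. This is a minimal sketch, not the Codex code: it assumes a POSIX `sh` is on the PATH, the library path is a made-up value, and `child_ld_library_path` is a hypothetical helper name. The `env_remove` call stands in for the stripping step.

```rust
use std::process::Command;

/// Spawns `sh` and reports the child's view of LD_LIBRARY_PATH.
/// Returns None when the variable is absent in the child.
fn child_ld_library_path(strip_before_spawn: bool) -> Option<String> {
    let mut cmd = Command::new("sh");
    // The child prints the variable, or a sentinel if it is unset.
    cmd.args(["-c", "printf '%s' \"${LD_LIBRARY_PATH-__UNSET__}\""]);
    // Stand-in for the user's shell exporting the variable.
    // The path is an illustrative value, not a real install location.
    cmd.env("LD_LIBRARY_PATH", "/opt/example/lib64");
    if strip_before_spawn {
        // Stand-in for the stripping step: the variable is dropped
        // before the child starts, with no warning or error.
        cmd.env_remove("LD_LIBRARY_PATH");
    }
    let output = cmd.output().expect("failed to spawn sh");
    let seen = String::from_utf8_lossy(&output.stdout).into_owned();
    if seen == "__UNSET__" {
        None
    } else {
        Some(seen)
    }
}
```

Calling `child_ld_library_path(false)` returns the exported path, while `child_ld_library_path(true)` returns `None`: the parent's configuration looks correct, yet the child never receives it.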

The Hunt

Within 3 days of their announcement, I identified the problematic change, PR #4521, and contacted @tibo_openai.

But identification is not proof. I spent 2 months building an undeniable case.

Timeline

| Date | Event |
| --- | --- |
| Sept 30, 2025 | PR #4521 merges, enabling `pre_main_hardening()` in release builds |
| Oct 1, 2025 | `rust-v0.43.0` ships (first affected release) |
| Oct 6, 2025 | First “painfully slow” regression reports |
| Oct 1–29, 2025 | Spike in `env`/`PATH` inheritance issues across platforms |
| Oct 29, 2025 | Emergency `PATH` fix lands (does not address the root cause) |
| Late Oct 2025 | OpenAI’s specialized team investigates, finds no root cause, and attributes the reports to a change in user behavior |
| Jan 9, 2026 | My fix merged, credited in release notes |

Evidence Collected

| Platform | Issues | Failure Mode |
| --- | --- | --- |
| macOS | #6012, #5679, #5339, #6243, #6218 | `DYLD_*` stripping breaking dynamic linking |
| Linux/WSL2 | #4843, #3891, #6200, #5837, #6263 | `LD_LIBRARY_PATH` stripping → silent CUDA/MKL degradation |

Compiled evidence packages:

Platform-specific failure modes: Reproduction steps with quantifiable performance regressions (11–300×) and benchmarks
Pattern analysis: Cross-referenced 15+ scattered user reports over 3 months, traced process environment inheritance through fork/exec boundaries
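Tracing inheritance across a process boundary comes down to comparing two environment snapshots. This is a minimal sketch of that comparison, under the assumption that each snapshot is a plain name/value map. The `EnvSnapshot`, `snapshot`, and `stripped_vars` names are hypothetical, not taken from the investigation tooling.

```rust
use std::collections::BTreeMap;

/// An environment captured at one point in the process tree.
type EnvSnapshot = BTreeMap<String, String>;

/// Builds a snapshot from literal pairs (handy for tests and examples).
fn snapshot(pairs: &[(&str, &str)]) -> EnvSnapshot {
    pairs
        .iter()
        .map(|(name, value)| (name.to_string(), value.to_string()))
        .collect()
}

/// Names present in the parent but missing in the child, in sorted order.
fn stripped_vars(parent: &EnvSnapshot, child: &EnvSnapshot) -> Vec<String> {
    parent
        .keys()
        .filter(|name| !child.contains_key(*name))
        .cloned()
        .collect()
}
```

In practice the parent snapshot could come from `std::env::vars()` and the child snapshot from the output of `env` run inside the spawned process. Any name returned by `stripped_vars` was lost somewhere between the two.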

Comprehensive Technical Analysis
Investigation Methodology

Why Conventional Debugging Failed

The bug was invisible by construction:

Pre-main execution: Used #[ctor::ctor] to run before main(), before any logging or instrumentation
Silent stripping: No warnings, no errors — just missing environment variables
Distributed symptoms: Appeared as unrelated issues across different platforms and configurations
User attribution: Everyone assumed they misconfigured something (shell looked fine)
Wrong search space: Team was debugging post-main application code

[!NOTE] Standard debugging tools cannot see pre-main execution. Profilers start at main(). Log hooks are not initialized yet. The code executes, modifies the environment, and vanishes.

The Impact

OpenAI confirmed and merged the fix within 24 hours, explicitly crediting the investigation in v0.80.0 release notes:

"Codex CLI subprocesses again inherit env vars like LD_LIBRARY_PATH/DYLD_LIBRARY_PATH to avoid runtime issues. As explained in #8945, failure to pass along these environment variables to subprocesses that expect them (notably GPU-related ones), was causing 10×+ performance regressions! Special thanks to @johnzfitch for the detailed investigation and write-up in #8945."

Restored:

GPU acceleration	Internal ML/AI dev teams
CUDA/PyTorch	ML researchers
MKL/NumPy	Scientific computing users
Conda environments	Cross-platform compatibility
Enterprise drivers	Database connectivity

When the tools are blind, the system lies, and everyone else has stopped looking for it...

Recent Work

claude-cowork-linux ⭐69: Run Claude Desktop's Cowork mode natively on Linux by reverse-engineering macOS components for direct execution without a VM.
dota: Post-quantum secure secrets manager using hybrid ML-KEM-768 and X25519 encryption with a terminal UI for secure secret management.
claude-wiki: Comprehensive Markdown documentation mirror for Anthropic's Claude, featuring 2000+ articles on APIs, SDKs, agents, and integrations.
pyghidra-lite ⭐33: Token-efficient MCP server for Ghidra, enabling analysis of ELF, Mach-O, and PE binaries with Swift, Objective-C, and Hermes support.
filearchy: Wayland file manager forked from cosmic-files, enhancing workflows with custom MIME icons, extended archive support, and terminal app integration.
privacy-toggles: Control outbound telemetry on macOS with a web dashboard and menu bar app, featuring 61 toggles and DNS sinkhole blocking.
claude-warden ⭐39: Security hooks for Claude Code that optimize token usage, enforce budgets, and provide observability with OTEL tracing and SSRF protection.
llmx: Local-first codebase indexer utilizing BM25 and neural embeddings for efficient semantic search and chunk exports in-browser via WebGPU.

Selected Work

claude-wiki: Comprehensive Anthropic/Claude documentation wiki — 749+ docs across 24 categories
specHO: LLM watermark detection via phonetic/semantic analysis (The Echo Rule) — live demo at definitelynot.ai
codex-patcher: Automated Rust code patching tool leveraging tree-sitter for syntax-aware modifications and reliable LLM-generated updates.
htmx-docs: Curated HTMX documentation in Markdown, including API references, Big Sky repos, and relevant RFCs, organized for easy access.
filearchy: Wayland file manager forked from cosmic-files, enhancing workflows with custom MIME icons, extended archive support, and terminal integration.
nautilus-plus: Enhanced Nautilus file manager with sub-millisecond search, large animated thumbnail support, and crash prevention features.
indepacer: CLI tool for querying PACER, enabling case searches, docket downloads, and document retrieval from federal court records.

Self-hosting bare metal infrastructure (NixOS) with post-quantum cryptography, authoritative DNS, and containerized services.

Live Demos

Cosmic Code Cleaner @ definitelynot.ai: LLM paste sanitizer with vectorhit algorithm — fix curly quotes, invisible Unicode, confusable punctuation, dedent blocks
LLMX Ingestor @ llm.cat: WebAssembly codebase indexer — private, deterministic chunking and BM25 search for large folders
LINTENIUM FIELD @ internetuniverse.org: Terminal-based ARG experience — interactive mystery with audio visualizations
Observatory @ look.definitelynot.ai: WebGPU deepfake detection running 4 ML models in browser
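The kind of cleanup a paste sanitizer performs can be sketched as a character-level mapping. This is an illustration only and is not the vectorhit algorithm: the `sanitize_paste` name and the specific character set are assumptions chosen for the example, and dedenting is left out.

```rust
/// Replaces common "smart" punctuation with ASCII and drops invisible characters.
fn sanitize_paste(input: &str) -> String {
    input
        .chars()
        .filter_map(|c| match c {
            // Curly single quotes become a straight apostrophe.
            '\u{2018}' | '\u{2019}' => Some('\''),
            // Curly double quotes become a straight double quote.
            '\u{201C}' | '\u{201D}' => Some('"'),
            // En and em dashes become a plain hyphen.
            '\u{2013}' | '\u{2014}' => Some('-'),
            // A non-breaking space becomes a regular space.
            '\u{00A0}' => Some(' '),
            // Zero-width characters and the byte-order mark are removed.
            '\u{200B}' | '\u{200C}' | '\u{200D}' | '\u{FEFF}' => None,
            other => Some(other),
        })
        .collect()
}
```

A fixed mapping like this is deterministic and easy to audit, at the cost of missing confusables that are not listed explicitly.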

Featured

dota — Post-Quantum Secrets Manager

Defense of the Artifacts: A secrets manager engineered for cryptographic longevity. While current encryption remains secure, "harvest now, decrypt later" attacks mean secrets stored today may be vulnerable to quantum computers within their lifetime. dota addresses this with hybrid post-quantum encryption that provides security against both classical and quantum adversaries.

| Layer | Implementation | Why It Matters |
| --- | --- | --- |
| Key Encapsulation | ML-KEM-768 + X25519 hybrid | NIST-standardized lattice crypto with classical fallback; if either is broken, the other protects |
| Key Derivation | Argon2id (memory-hard) | Resists GPU/ASIC brute-force; tunable time/memory parameters |
| Storage | SQLCipher (AES-256-CBC) | Encrypted at rest with authenticated pages; survives partial file corruption |
| Hardware Auth | HMAC-SHA1 challenge-response | YubiKey/SoloKey required for unlock; no master password alone can decrypt |

The TUI (Ratatui) provides vim-style navigation, fuzzy search across entries, secure clipboard integration with auto-clear, and TOTP generation for 2FA codes.

Stack: Rust • pqcrypto (ML-KEM) • x25519-dalek • argon2 • SQLCipher • Ratatui

llmx — Codebase Indexer for Local Agents

Live Demo: llm.cat (WebAssembly — runs entirely in browser, no upload)

Local-first codebase indexing with real neural embeddings (Snowflake Arctic) running via WebGPU. No server, no API calls, no data leaving your machine. Hybrid search combines BM25 keyword ranking with vector similarity using RRF for best-of-both-worlds retrieval.

llmx index ~/projects/myapp           # Build trigram + BM25 index
llmx search "authentication middleware" --limit 20
llmx export --format md --max-tokens 8000   # Context-window-aware export
llmx serve --port 8080                # Local HTTP API for agents
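Reciprocal rank fusion (RRF), the fusion step named in the hybrid search description, can be sketched in a few lines. This is a generic RRF implementation under stated assumptions, not llmx's code: the `rrf_fuse` name is hypothetical, each input list is assumed to be ordered best-first, and `k = 60` is the constant commonly used with RRF rather than a value taken from llmx.

```rust
use std::collections::HashMap;

/// Fuses several best-first rankings with reciprocal rank fusion.
/// Each document scores the sum of 1 / (k + rank) over the lists it
/// appears in, where rank starts at 1.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (index, doc) in ranking.iter().enumerate() {
            let rank = index as f64 + 1.0;
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by document id for determinism.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

For a keyword ranking `["a", "b", "c"]` and a vector ranking `["b", "c", "d"]`, documents that appear in both lists (`b`, `c`) rise above documents that appear in only one, which is the behavior a hybrid search wants. RRF uses ranks only, so BM25 scores and cosine similarities never need to be put on a common scale.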

| Capability | Implementation |
| --- | --- |
| Neural Embeddings | Snowflake Arctic vectors with WebGPU acceleration; ~50ms inference, same quality as server-side |
| Hybrid Search | BM25 + vector similarity fused via RRF; handles exact matches and semantic similarity |
| Smart Chunking | Deterministic by file type: functions, headings, JSON keys; same input always yields identical chunks |
| Semantic Exports | Hierarchical outline format (`llm.md`) with function names and heading breadcrumbs for selective retrieval |
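Deterministic chunking is easiest to see for the heading case. This is a minimal sketch for Markdown input only, assuming that any line starting with `#` is a heading. The `chunk_by_headings` name is hypothetical and the real llmx chunker (which the stack line suggests is tree-sitter based) is not reproduced here.

```rust
/// Splits a Markdown document into (heading, body) chunks.
/// Text before the first heading is filed under "(preamble)".
/// Headings with an empty body produce no chunk.
fn chunk_by_headings(doc: &str) -> Vec<(String, String)> {
    let mut chunks = Vec::new();
    let mut title = String::from("(preamble)");
    let mut body = String::new();
    for line in doc.lines() {
        if line.starts_with('#') {
            // A new heading closes the chunk that was being collected.
            if !body.trim().is_empty() {
                chunks.push((title.clone(), body.trim().to_string()));
            }
            title = line.trim_start_matches('#').trim().to_string();
            body.clear();
        } else {
            body.push_str(line);
            body.push('\n');
        }
    }
    // Flush the final chunk.
    if !body.trim().is_empty() {
        chunks.push((title, body.trim().to_string()));
    }
    chunks
}
```

The function has no randomness, clock, or hash-order dependence, so the same input always yields the same chunk list. One known limitation of this sketch is that `#` comment lines inside code blocks would be treated as headings.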

Proof: 7,147 files (Apple HIG corpus) → 31 MB index → 1,625 tokens (99.98% savings)
Stack: Rust • tantivy • tree-sitter • WASM • WebGPU

claude-warden — Security Hooks for Claude Code

A defense-in-depth hook system for Claude Code that addresses token efficiency, security boundaries, and observability. Born from months of production use identifying failure modes in LLM coding agents.

The Problem: Claude Code's default behavior can burn tokens on verbose command output, leak internal network topology via SSRF, spawn unbounded subagents, and produce unobservable execution traces.

| Hook | Threat Model | Mitigation |
| --- | --- | --- |
| `quiet-overrides` | Token exhaustion from `npm install`, `cargo build`, `git log` | Injects `-q`/`--silent`/`--quiet` flags; caps output at configurable byte limit |
| `ssrf-protection` | Agent fetching `http://169.254.169.254` (cloud metadata) or internal services | Blocks RFC1918/link-local ranges; allowlist for legitimate internal APIs |
| `mcp-compression` | MCP tool outputs flooding context window | gzip + base64 for large payloads; configurable threshold |
| `subagent-budget` | Recursive agent spawning exhausting API quota | Per-session spawn limits; depth tracking; cost estimation |
| `otel-tracing` | Black-box execution; no audit trail | Exports spans to Grafana/Loki with tool calls, durations, token counts |
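The address check behind an SSRF guard can be sketched with the standard library alone. This is an illustration in Rust, not the shell/jq hook itself: `is_internal` and `should_block` are hypothetical names, the allowlist is assumed to be a list of exact addresses, and loopback is blocked here as an extra assumption beyond the RFC1918/link-local ranges named in the table.

```rust
use std::net::IpAddr;

/// True for private (RFC1918), link-local, loopback, and IPv6
/// unique-local addresses.
fn is_internal(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => v4.is_private() || v4.is_link_local() || v4.is_loopback(),
        IpAddr::V6(v6) => {
            let first = v6.segments()[0];
            v6.is_loopback()
                || (first & 0xffc0) == 0xfe80 // fe80::/10 link-local
                || (first & 0xfe00) == 0xfc00 // fc00::/7 unique local
        }
    }
}

/// Blocks internal targets unless they are explicitly allowlisted.
fn should_block(ip: IpAddr, allowlist: &[IpAddr]) -> bool {
    is_internal(ip) && !allowlist.contains(&ip)
}
```

The cloud metadata address `169.254.169.254` falls in the IPv4 link-local range, so it is blocked by default. A production guard would also need to resolve hostnames before checking and to handle IPv4-mapped IPv6 addresses, which this sketch does not.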

# Example: warden intercepts a verbose npm command and injects a quiet flag
$ claude "install dependencies"
# [warden] Intercepted: npm install → npm install --silent
# [warden] Output capped at 4096 bytes (was 847KB)
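Capping output at a byte limit has one subtlety: the cut must not land inside a multi-byte UTF-8 character. This is a minimal sketch of a safe cap, written in Rust for illustration (the hook itself is shell-based). The `cap_output` name is hypothetical.

```rust
/// Returns at most `max_bytes` bytes of `output`, backing up to the
/// nearest character boundary so the result is always valid UTF-8.
fn cap_output(output: &str, max_bytes: usize) -> &str {
    if output.len() <= max_bytes {
        return output;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !output.is_char_boundary(end) {
        end -= 1;
    }
    &output[..end]
}
```

With a limit of 4096 this keeps the head of a large build log and drops the rest. Keeping the tail instead is an equally reasonable choice when errors tend to appear at the end.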

Stack: Shell • jq • OpenTelemetry • Prometheus • Grafana/Loki

AI / ML / Agent Tooling

claude-wiki ⭐7 — Comprehensive Anthropic documentation wiki — 749+ docs across 24 categories
observatory — WebGPU deepfake detection with 4 ML models — live: look.definitelynot.ai
specHO — LLM watermark detection via phonetic/semantic analysis — live: definitelynot.ai
burn-plugin — Claude Code plugin for the Burn deep learning framework
raley-bot — Automated grocery assistant with F5 bot detection evasion, unit pricing across bizarre measurements, automatic coupon clipping, and MCP server for Claude Desktop

Infrastructure

Primary server: Dedicated bare-metal NixOS host (details available on request)

Security	Post-quantum SSH • Rosenpass VPN • `nftables` firewall
DNS	Unbound resolver with DNSSEC • ad/tracker blocking
Services	FreshRSS • Caddy (HTTPS/HTTP/3) • cPanel/WHM • Podman containers
Network	Local 10Gbps • Authoritative BIND9 with RFC 2136 ACME
