Skip to content

Spec B: theorem_searcher agent — theorem-statement artifacts for math-theory projects #114

@jeremymanning

Description

@jeremymanning

Parent: #111. Depends on: #113 (Spec A — TheoremSearch backend must land first). Status: DEFERRED — implement only once a math-theory project demonstrates it's needed.

Goal

A new theorem_searcher agent that returns theorem statements themselves — statement text + slogan + theorem type + source paper + source link + a verification flag — as a first-class artifact, written to a ## Related theorems subsection of a math-theory project's idea.md (parallel to the librarian's ## Search trail). This is a genuinely new output type the librarian doesn't produce: the librarian does paper-level citations; this does theorem-level prior-results tracking.

Why this is separate from Spec A

Spec A (#113) wires TheoremSearch as a backend — it uses theorem search to discover which papers contain relevant theorems, then resolves those to paper-level Candidates. Spec B uses TheoremSearch to surface the theorem statements as artifacts in their own right. Different granularity, different output contract, different consumer (a math-theory paper's "prior results" section, not a citations list). Spec A's output JSON is unchanged; Spec B adds a new artifact + a new agent + a new prompt.

Scope (in)

  1. theorem_searcher agent (src/llmxive/agents/theorem_searcher.py) — registered in agents/registry.yaml. Triggered by the librarian (or flesh_out) for field == \"mathematics\" theorem-shaped questions.
  2. Output artifact: a ## Related theorems subsection in the project's idea.md. Per entry:
    • statement_text (the theorem body, LaTeX)
    • slogan (TheoremSearch's plain-English gloss)
    • theorem_type (theorem / lemma / corollary / proposition / ...)
    • source — which paper/reference it's in (arXiv ID, DOI, or a non-arXiv reference URL — ProofWiki, Stacks Project, etc. are in scope here per maintainer answer The Binding Problem in LLMs: Implementing Synchronized Oscillations for Feature Integration #4 on integrate theoremsearch #111, with quality checks)
    • source_link
    • verified — did we fetch the source LaTeX (or the reference page) and confirm the statement text actually appears there? (Quality check — required for all sources, arXiv and non-arXiv alike.)
  3. TheoremSearch primitives: POST /statements, GET /statement/{statement_id}, GET /paper/{paper_id} (the OpenAPI surface at https://api.theoremsearch.com/openapi.json).
  4. Non-arXiv source handling (the part Spec A skips): ProofWiki / Stacks Project / etc. entries — surface them, but every one gets the verified quality check (fetch the reference page, confirm the statement appears). Per maintainer: "any source is worth exploring, with the caveat that quality needs checks in all cases."
  5. Prompt (agents/prompts/theorem_searcher.md) + tests (tests/phase2/test_theorem_searcher.py — parser, real-API smoke, the verification quality check, the ## Related theorems rendering).

Open design questions (resolve at /speckit-clarify time)

  • How does the librarian hand off to the theorem_searcher — does the librarian call it as a sub-agent (like it'll call the math classifier in Spec A), or is it a sibling the pipeline routes to in parallel with the librarian for math projects?
  • Verification depth for non-arXiv sources — fetch + string-match the statement, or something stricter (LaTeX-normalized comparison)?
  • Does the ## Related theorems artifact feed downstream (specifier / paper-writing agents), or is it idea-phase-only context?
  • Dedup vs. the librarian's ## Search trail — if a theorem's source paper is also a librarian-verified citation, do we cross-link?

Acceptance (provisional — finalize at /speckit-clarify)

  • A math-theory project's idea.md gains a ## Related theorems subsection with ≥1 entry whose verified flag is true (statement confirmed against the source).
  • Non-arXiv sources appear with the same verification rigor as arXiv sources.
  • No regression on the librarian (Spec A) or non-math projects.

🤖 Generated with Claude Code

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions