Skip to content

Add live x86/x64 disassembly and parser-error reporting to the API#33

Open
praydog wants to merge 5 commits into
masterfrom
mcp-improve-disassembler
Open

Add live x86/x64 disassembly and parser-error reporting to the API#33
praydog wants to merge 5 commits into
masterfrom
mcp-improve-disassembler

Conversation

@praydog

@praydog praydog commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

Disassembly (Zydis):

  • New GET /api/memory/disassemble endpoint and regenny_disassemble MCP tool. Reads target memory and decodes Intel-syntax instructions, with a compact string mode and a detailed JSON mode (bytes/mnemonic).
  • New [[code]] metadata tag on Variable and Pointer nodes renders inline disassembly of the pointed-to code in the memory view.
  • Process::is_64_bit() picks the decoder machine mode/stack width; the Windows override uses IsWow64Process so 32-bit targets decode correctly.
  • Zydis v4.0.0 pulled in via CPM and linked into regenny.

Parser status reporting:

  • ReGenny tracks m_last_parse_error and an atomic m_parse_generation, bumped on every parse attempt so callers can detect completion.
  • /api/genny/content and /api/genny/reload now request a reparse and block until it finishes, returning status ok|error|pending plus the parser diagnostic instead of a silent "ok". New /api/genny/parse_error endpoint and regenny_parse_status tool expose status without reparsing.

Fixes:

  • serialize_struct: parents are laid out at base_offset while a struct's own variables already carry absolute in-struct offsets; the old code folded parent sizes into base_offset, double-counting derived fields.
  • Lua print override now forwards all arguments (variadic, tab-separated) instead of dropping everything after the first value.
  • Expose Process ok() and is_64_bit() to the Lua process binding.

Docs/build:

  • AGENT.md documents disassembly, the [[code]] tag, and @N (absolute) vs +N (relative) field-offset operators.
  • POST_BUILD copies AGENT.md next to the executable so /api/help resolves.

Disassembly (Zydis):
- New GET /api/memory/disassemble endpoint and regenny_disassemble MCP
  tool. Reads target memory and decodes Intel-syntax instructions,
  with a compact string mode and a detailed JSON mode (bytes/mnemonic).
- New [[code]] metadata tag on Variable and Pointer nodes renders inline
  disassembly of the pointed-to code in the memory view.
- Process::is_64_bit() picks the decoder machine mode/stack width; the
  Windows override uses IsWow64Process so 32-bit targets decode correctly.
- Zydis v4.0.0 pulled in via CPM and linked into regenny.

Parser status reporting:
- ReGenny tracks m_last_parse_error and an atomic m_parse_generation,
  bumped on every parse attempt so callers can detect completion.
- /api/genny/content and /api/genny/reload now request a reparse and
  block until it finishes, returning status ok|error|pending plus the
  parser diagnostic instead of a silent "ok". New /api/genny/parse_error
  endpoint and regenny_parse_status tool expose status without reparsing.

Fixes:
- serialize_struct: parents are laid out at base_offset while a struct's
  own variables already carry absolute in-struct offsets; the old code
  folded parent sizes into base_offset, double-counting derived fields.
- Lua print override now forwards all arguments (variadic, tab-separated)
  instead of dropping everything after the first value.
- Expose Process ok() and is_64_bit() to the Lua process binding.

Docs/build:
- AGENT.md documents disassembly, the [[code]] tag, and @n (absolute)
  vs +N (relative) field-offset operators.
- POST_BUILD copies AGENT.md next to the executable so /api/help resolves.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends ReGenny’s embedded HTTP API + MCP toolset with (1) live x86/x64 disassembly backed by Zydis and (2) explicit parser-status/error reporting for .genny reparses, while also fixing a struct serialization offset bug and improving a few UX/binding behaviors.

Changes:

  • Add Zydis as a third-party dependency and introduce a new /api/memory/disassemble endpoint plus regenny_disassemble MCP tool.
  • Track and expose parse status via m_last_parse_error + m_parse_generation, update /api/genny/content and /api/genny/reload to wait for completion, and add /api/genny/parse_error + regenny_parse_status.
  • Add [[code]] metadata rendering in UI nodes (Variable/Pointer) to show inline disassembly of the pointed-to code.

Reviewed changes

Copilot reviewed 14 out of 17 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
third_party/CMakeLists.txt Adds Zydis dependency via CPM.
CMakeLists.txt Links Zydis and copies AGENT.md next to the built executable for /api/help.
src/ReGenny.hpp Adds parse status state (m_last_parse_error, m_parse_generation) and accessors.
src/ReGenny.cpp Implements parse error capture/generation bump; improves Lua print forwarding; exposes new Lua bindings (ok, is_64_bit).
src/Process.hpp Adds Process::is_64_bit() virtual for decoder selection and bindings.
src/arch/Windows.hpp / src/arch/Windows.cpp Implements Windows-specific bitness detection via IsWow64Process.
src/Api.cpp Adds /api/memory/disassemble; adds parse reparse-and-wait logic and /api/genny/parse_error; fixes serialize_struct parent/derived offset handling.
src/node/Variable.cpp Adds [[code]] metadata handling to render Zydis disassembly in the value string.
src/node/Pointer.cpp Adds [[code]] metadata handling to render Zydis disassembly for pointer nodes.
mcp-server/Tools.cs Adds regenny_disassemble + regenny_parse_status tools and updates docs for reparse status semantics.
AGENT.md Documents disassembly tooling, [[code]], and clarifies @N (absolute) vs +N (relative) offset operators.
src/Project.hpp / src/Project.cpp Formatting-only changes around tab serialization.
src/node/Undefined.cpp Formatting-only changes.
src/Main.cpp Formatting-only changes.
src/LoggerUi.hpp Formatting-only changes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/node/Pointer.cpp
Comment thread src/node/Pointer.cpp
Comment thread src/node/Variable.cpp
praydog added 2 commits June 2, 2026 22:41
The previous commit mixed cosmetic clang-format reformatting into the
disassembly feature. This reverts the whitespace-only changes (brace
expansion, line wrapping, trailing-space trimming) on lines unrelated to
the feature, leaving the meaningful changes intact.

Behavior is unchanged: the diff against the previous commit is empty under
`git diff -w`. Five files (LoggerUi.hpp, Main.cpp, Project.cpp, Project.hpp,
node/Undefined.cpp) were pure formatting and are fully restored.
The [[code]] disassembly path uses std::array<uint8_t, 300> but the file
only got <array> transitively. Variable.cpp already includes it directly;
match that so the include doesn't depend on header ordering.

Addresses a PR review comment. The accompanying suggestion to switch the
typed pointer reads to memcpy was intentionally not taken — see PR replies.
@praydog

praydog commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator Author

Thanks for the review. Disposition of the three comments (changes in 6e050f7, on top of 93ec95a):

  • Pointer.cpp missing <array> — fixed. Added the explicit #include <array> (it was only arriving transitively; Variable.cpp already includes it directly).
  • Use memcpy instead of typed pointer casts (Pointer.cpp) — not changed. The *(uintptr_t*)mem read is identical to two pre-existing reads in the same function, and the whole node layer uses typed casts on std::byte*. The mixed-bitness concern is a pre-existing node-layer assumption, and aliasing/alignment only matters off x86/x64 (the only thing Zydis decodes here).
  • Use memcpy instead of typed pointer casts (Variable.cpp) — not changed. Same consistency reason; and this path already handles pointer width via its m_size switch (4 → uint32_t, 8 → uint64_t), so the 32-bit case is already correct.

Per-thread replies inline. If you'd prefer the node layer to be strict-aliasing-clean, that's worth a separate file-wide pass rather than touching only the new lines.

Unrelated: an earlier follow-up commit 02f6d09 strips the clang-format whitespace churn that slipped into the first commit, so the feature diff is now clean.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Comment thread src/Api.cpp
Comment thread src/arch/Windows.cpp
Two issues from PR review:

is_64_bit(): IsWow64Process() reports FALSE both for a native 64-bit
process and for a 32-bit process on 32-bit Windows, so the old code
reported 64-bit for every target on a 32-bit OS (wrong Zydis decoder
mode). Check the OS architecture via GetNativeSystemInfo() first; only
then does WOW64==FALSE reliably mean a native 64-bit target.

request_reparse_and_wait(): it waited on parse_generation(), which bumps
on ANY parse_file() call -- including the 1s mtime auto-reload and other
UI-driven parses. An unrelated parse could satisfy the wait early, leave
m_reparse_requested set (spurious extra parse next frame), and report a
stale last_parse_error(). Replace it with an Api-owned completion counter
(m_reparse_completed) that ReGenny::update() bumps only after consuming
the API's reparse request and finishing parse_file(), so the caller
observes the result of its own reparse. parse_generation() is now unused
and removed (along with the <atomic> include it required).
@praydog

praydog commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator Author

Second review round addressed in 8186431 — both accepted:

  • is_64_bit() wrong on 32-bit WindowsIsWow64Process() reports FALSE for a native 64-bit process and a 32-bit process on a 32-bit OS. Now I check OS bitness via GetNativeSystemInfo() first, so WOW64==FALSE only means "native 64-bit" on a 64-bit OS.
  • request_reparse_and_wait() racing unrelated parses — it waited on parse_generation(), which bumps on any parse (incl. the 1s mtime auto-reload). Now it waits on an Api-owned completion counter that ReGenny::update() bumps only after consuming the API's own reparse request, so the caller observes the result of its parse. The now-unused parse_generation() was removed.

Good catches, thanks.

A process's bitness is fixed for its lifetime, but is_64_bit() ran
GetNativeSystemInfo + IsWow64Process on every call, and the disassembly
paths invoke it repeatedly. Compute it once in the constructor (after
OpenProcess succeeds) and store it in m_is_64_bit; is_64_bit() is now a
plain accessor. The member defaults to sizeof(void*) == 8, preserving the
old fallback when OpenProcess fails (the ctor returns before detection).

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.

Comment thread src/Api.cpp
Comment on lines +222 to +237
auto request_reparse_and_wait = [this, rg]() -> std::pair<bool, std::string> {
const auto completed_before = m_reparse_completed.load();
m_reparse_requested.store(true);

// Poll for up to ~3s for the main thread to consume the request and finish parsing.
for (int i = 0; i < 300; ++i) {
if (m_reparse_completed.load() != completed_before) {
std::shared_lock lk{rg->state_mtx()};
return {true, rg->last_parse_error()};
}
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}

std::shared_lock lk{rg->state_mtx()};
return {false, rg->last_parse_error()};
};
Comment thread src/Api.cpp
Comment on lines +589 to +592
char formatted_insn[256];
ZydisFormatterFormatInstruction(&formatter, &insn, operands, insn.operand_count_visible, formatted_insn,
sizeof(formatted_insn), addr + offset, ZYAN_NULL);

Comment thread src/node/Variable.cpp
Comment on lines +154 to +161
uintptr_t code_addr = 0;
if (m_size == 4) {
code_addr = *(uint32_t*)mem;
} else if (m_size == 8) {
code_addr = *(uint64_t*)mem;
} else {
code_addr = *(uintptr_t*)mem;
}
Comment thread src/node/Variable.cpp
Comment on lines +189 to +193
char formatted_insn[256];
ZydisFormatterFormatInstruction(&formatter, &insn, operands,
insn.operand_count_visible, formatted_insn, sizeof(formatted_insn),
code_addr + offset, ZYAN_NULL);

Comment thread src/node/Variable.cpp
Comment on lines +211 to +213
}
}
}
Comment thread src/node/Pointer.cpp
Comment on lines +230 to +233
char formatted_insn[256];
ZydisFormatterFormatInstruction(&formatter, &insn, operands, insn.operand_count_visible,
formatted_insn, sizeof(formatted_insn), code_addr + offset, ZYAN_NULL);

Comment thread src/node/Pointer.cpp
Comment on lines +249 to +253
} else {
m_value_str += fmt::format("-> instruction decode failed @ 0x{:X} ", code_addr);
}
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants