fix(c-parser): reject call sites inside function bodies; support nginx-style split-line defs (#331) #352
…t-line defs

Fixes #331. Two C parser bugs are addressed:

1. **False positives**: call sites like `fprintf(stderr, ...)` were indexed as function definitions when they appeared inside a function body.

   Fix: track brace depth (`c_brace_depth`) as we parse C/C++ lines. For C files, reject any function candidate at depth > 0; for C++ allow depth 1 (class/struct body). Brace counting uses `countBracesDelta`, which skips string and char literals, avoiding false increments from `const char *s = "fake() {"`.

2. **Recall gap (nginx-style)**: definitions where the return type is on the previous line (`ngx_int_t\nngx_http_init_connection(...)`) had `before_name.len == 0` and were dropped.

   Fix: when `before_name` is empty and `at_col0` is true, accept if the previous non-blank line looks like a type identifier (not a statement or brace).

New tests:

- `issue-331: C parser does not index indented call sites as functions`
- `issue-331: C parser finds nginx-style split-line definitions`

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Benchmark Regression Report

Thresholds: 10.00% and 50,000 ns absolute delta
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 246e072ea1
            if (extractIdent(std.mem.trimStart(u8, prev_trimmed, " \t*")) == null) return null;
            return name;
        }
        if (!at_col0 and !looksLikeCMethodDef(before_name)) return null;
Allow indented keyword-prefixed C/C++ definitions
Remove or relax this gate: it rejects any non-column-0 definition unless `looksLikeCMethodDef` passes, and that helper returns false when the first token is a C keyword like `static`, `inline`, `extern`, or `const`. As a result, indented definitions such as static/inline methods inside C++ class bodies (and indented static C functions) are no longer indexed, which is a regression in symbol recall introduced by this commit.
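One possible way to relax the gate, sketched here under assumptions: strip leading storage-class and qualifier keywords from `before_name` before handing the remainder to the existing check. `stripLeadingQualifiers` and the `qualifiers` list are hypothetical names introduced for illustration; they are not part of the PR, and the internals of `looksLikeCMethodDef` are not shown in this thread.

```zig
const std = @import("std");

// Hypothetical list of keywords to skip; taken from the review comment.
const qualifiers = [_][]const u8{ "static", "inline", "extern", "const" };

/// Returns `before_name` with any leading qualifier keywords removed.
/// A keyword only counts when it is followed by whitespace or end of input,
/// so identifiers such as `constant_t` are left alone.
fn stripLeadingQualifiers(before_name: []const u8) []const u8 {
    var rest = std.mem.trim(u8, before_name, " \t");
    outer: while (true) {
        for (qualifiers) |q| {
            const at_boundary = rest.len == q.len or
                (rest.len > q.len and (rest[q.len] == ' ' or rest[q.len] == '\t'));
            if (std.mem.startsWith(u8, rest, q) and at_boundary) {
                rest = std.mem.trim(u8, rest[q.len..], " \t");
                continue :outer;
            }
        }
        return rest;
    }
}

test "leading qualifiers are stripped" {
    try std.testing.expectEqualStrings("int", stripLeadingQualifiers("    static inline int "));
}
```

With a helper like this, the gate could test the stripped remainder instead of the raw prefix, so an indented `static inline int foo()` would be judged on `int` rather than rejected on `static`.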
        } else if (ch == '{') {
            delta += 1;
        } else if (ch == '}') {
            delta -= 1;
Ignore inline block comments in brace-depth accounting
Brace depth is updated by counting `{` and `}` directly on the parsed line, but only string/char literals are skipped; `/* ... */` spans are not. This causes braces inside inline block comments (for example `/* } */` within a function body) to corrupt `c_brace_depth`, which can incorrectly return to file scope and reintroduce false-positive function indexing for subsequent call sites.
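A minimal sketch of comment-aware brace counting, to make the suggestion concrete. It is not the PR's `countBracesDelta`; `ScanState` and `braceDelta` are hypothetical names, and the sketch assumes the caller keeps one `ScanState` per file so that a `/* ... */` span can continue across lines.

```zig
const std = @import("std");

/// Hypothetical scanner state carried from one line to the next.
const ScanState = struct {
    in_block_comment: bool = false,
};

/// Net brace delta for one line, ignoring braces inside string literals,
/// char literals, `//` comments, and `/* ... */` comments.
fn braceDelta(line: []const u8, state: *ScanState) i32 {
    var delta: i32 = 0;
    var i: usize = 0;
    while (i < line.len) {
        const ch = line[i];
        const next_is = i + 1 < line.len;
        if (state.in_block_comment) {
            // Skip everything until the closing `*/`.
            if (ch == '*' and next_is and line[i + 1] == '/') {
                state.in_block_comment = false;
                i += 2;
            } else {
                i += 1;
            }
            continue;
        }
        if (ch == '/' and next_is and line[i + 1] == '*') {
            state.in_block_comment = true;
            i += 2;
            continue;
        }
        // A line comment hides the rest of the line.
        if (ch == '/' and next_is and line[i + 1] == '/') break;
        if (ch == '"' or ch == '\'') {
            // Skip the literal, honoring backslash escapes.
            const quote = ch;
            i += 1;
            while (i < line.len and line[i] != quote) {
                const step: usize = if (line[i] == '\\') 2 else 1;
                i += step;
            }
            i += 1; // step past the closing quote
            continue;
        }
        if (ch == '{') delta += 1;
        if (ch == '}') delta -= 1;
        i += 1;
    }
    return delta;
}

test "braces inside an inline block comment are ignored" {
    var state = ScanState{};
    try std.testing.expectEqual(@as(i32, 1), braceDelta("void f(void) { /* } */", &state));
}
```

Keeping the comment flag in a small struct rather than a global makes it straightforward to reset between files.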
Summary
Fixes #331 — two C parser correctness bugs.
False positives: call sites indexed as functions
`fprintf(stderr, ...)`, `curl_easy_perform(curl)`, and similar call sites inside function bodies were being indexed as function definitions.

Fix: Track `c_brace_depth` as we scan C/C++ lines:

- C files: reject any candidate at `brace_depth > 0` (any depth ≥ 1 is inside a function body)
- C++ files: accept candidates up to `brace_depth == 1` (class/struct method bodies at depth 1 are valid definitions); reject at depth ≥ 2

Brace counting uses `countBracesDelta()`, which skips string and char literal contents to avoid false increments from lines like `const char *s = "fake() {"`.

Recall gap: nginx-style split-line definitions

Functions written with the return type on its own line (for example `ngx_int_t` on one line and `ngx_http_init_connection(...)` on the next) had `before_name.len == 0` and were silently dropped.

Fix: When `before_name` is empty and the function name is at column 0 (`at_col0`), accept the name if the previous non-blank line looks like a type identifier (not a statement, brace, semicolon, or call expression).

New tests

- `issue-331: C parser does not index indented call sites as functions`
- `issue-331: C parser finds nginx-style split-line definitions`

All 411 existing tests continue to pass (`zig build test` exits 0).

Test plan

- `zig build test`: all 411 tests pass

Generated with Devin
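The two rules described in the summary can be sketched roughly as follows. This is an illustration under assumptions, not the code in the PR: `Lang`, `depthAllowsDefinition`, and `looksLikeReturnTypeLine` are hypothetical names, and the previous-line check is deliberately crude (it accepts only identifier characters, whitespace, and `*`, so it would also accept a lone keyword such as `else`).

```zig
const std = @import("std");

const Lang = enum { c, cpp };

/// Depth gate: C accepts definitions only at file scope;
/// C++ additionally accepts depth 1 (class/struct body).
fn depthAllowsDefinition(lang: Lang, brace_depth: i32) bool {
    return switch (lang) {
        .c => brace_depth == 0,
        .cpp => brace_depth >= 0 and brace_depth <= 1,
    };
}

/// Split-line check: does the previous non-blank line look like a bare
/// return type, e.g. `ngx_int_t` or `static char *`? Any `;`, brace,
/// parenthesis, or operator makes the line look like a statement or call.
fn looksLikeReturnTypeLine(prev_line: []const u8) bool {
    const trimmed = std.mem.trim(u8, prev_line, " \t\r");
    if (trimmed.len == 0) return false;
    if (!(std.ascii.isAlphabetic(trimmed[0]) or trimmed[0] == '_')) return false;
    for (trimmed) |ch| {
        const ok = std.ascii.isAlphanumeric(ch) or
            ch == '_' or ch == ' ' or ch == '\t' or ch == '*';
        if (!ok) return false;
    }
    return true;
}

test "nginx-style return type line is accepted" {
    try std.testing.expect(depthAllowsDefinition(.c, 0));
    try std.testing.expect(looksLikeReturnTypeLine("ngx_int_t"));
}
```

A candidate at column 0 with an empty `before_name` would then be accepted only when both checks pass for the current depth and the previous non-blank line.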