gh-144759: Fix undefined behavior from NULL pointer arithmetic in lexer #144788
Merged
pablogsal merged 3 commits into python:main on Feb 15, 2026
…in lexer Guard against NULL pointer arithmetic in `_PyLexer_remember_fstring_buffers` and `_PyLexer_restore_fstring_buffers`. When `start` or `multi_line_start` are NULL (uninitialized in tok_mode_stack[0]), performing `NULL - tok->buf` is undefined behavior. Add explicit NULL checks to store -1 as sentinel and restore NULL accordingly.
Force-pushed from 588d391 to 0b18bc0
…tions Replace :c:func: references with double-backtick markup since these are internal functions without documentation entries.
Contributor
@raminfp Could you add a regression test? I suspect And please avoid force pushes to the PR so we preserve history.
…exer Add test_lexer_buffer_realloc_with_null_start to test_repl.py that exercises the code path where the lexer buffer is reallocated while tok_mode_stack[0] has NULL start/multi_line_start pointers. This triggers _PyLexer_remember_fstring_buffers and verifies the NULL checks prevent undefined behavior.
pablogsal approved these changes Feb 15, 2026
Member
Great catch!

Thanks @raminfp for the PR, and @pablogsal for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14.
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Feb 15, 2026
…in lexer (pythonGH-144788) Guard against NULL pointer arithmetic in `_PyLexer_remember_fstring_buffers` and `_PyLexer_restore_fstring_buffers`. When `start` or `multi_line_start` are NULL (uninitialized in tok_mode_stack[0]), performing `NULL - tok->buf` is undefined behavior. Add explicit NULL checks to store -1 as sentinel and restore NULL accordingly. Add test_lexer_buffer_realloc_with_null_start to test_repl.py that exercises the code path where the lexer buffer is reallocated while tok_mode_stack[0] has NULL start/multi_line_start pointers. This triggers _PyLexer_remember_fstring_buffers and verifies the NULL checks prevent undefined behavior. (cherry picked from commit e6110ef) Co-authored-by: Ramin Farajpour Cami <ramin.blackhat@gmail.com>
Sorry, @raminfp and @pablogsal, I could not cleanly backport this to

GH-144834 is a backport of this pull request to the 3.14 branch.
Fix undefined behavior in `_PyLexer_remember_fstring_buffers` and `_PyLexer_restore_fstring_buffers` caused by performing pointer arithmetic on `NULL` pointers (`NULL - tok->buf`).

When `tok_mode_stack[0]` is initialized, the `start` and `multi_line_start` fields are not explicitly set and remain `NULL` (from `PyMem_Calloc`). Later, when the lexer buffer is reallocated, the remember/restore functions perform `NULL - valid_pointer` and `valid_pointer + negative_offset`, both of which are undefined behavior in C.

The fix adds explicit `NULL` checks: store `-1` as a sentinel offset when the pointer is `NULL`, and restore `NULL` when the offset is negative.

Detected with `--with-undefined-behavior-sanitizer`.

Fixes #144759
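
For readers who want the pattern in isolation, here is a minimal standalone sketch of the sentinel approach described above. It is not the CPython patch: `demo_mode`, `remember_offset`, `restore_pointer`, and `demo_roundtrip` are hypothetical names invented for illustration, and plain `calloc`/`realloc` stand in for the lexer's own buffer handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one lexer mode-stack entry (not the real struct). */
typedef struct {
    const char *start;            /* may be NULL if never set */
    const char *multi_line_start; /* may be NULL if never set */
    ptrdiff_t start_offset;
    ptrdiff_t multi_line_start_offset;
} demo_mode;

/* Pointer -> offset. NULL is stored as the sentinel -1 instead of
 * evaluating NULL - buf, which would be undefined behavior. */
static ptrdiff_t
remember_offset(const char *p, const char *buf)
{
    return p == NULL ? -1 : p - buf;
}

/* Offset -> pointer. A negative offset restores NULL instead of
 * evaluating buf + negative_offset. */
static const char *
restore_pointer(ptrdiff_t offset, const char *buf)
{
    return offset < 0 ? NULL : buf + offset;
}

/* Round-trips one set and one unset pointer across a reallocation.
 * Returns 0 on success, 1 on allocation failure. */
static int
demo_roundtrip(void)
{
    size_t size = 16;
    char *buf = calloc(size, 1);
    if (buf == NULL) {
        return 1;
    }
    strcpy(buf, "f'{x}'");

    demo_mode mode = {0};   /* zero-initialized: both pointers are NULL */
    mode.start = buf + 2;   /* multi_line_start deliberately left NULL */

    /* Save offsets while the old buffer is still valid. */
    mode.start_offset = remember_offset(mode.start, buf);
    mode.multi_line_start_offset = remember_offset(mode.multi_line_start, buf);

    char *newbuf = realloc(buf, size * 4);
    if (newbuf == NULL) {
        free(buf);
        return 1;
    }
    buf = newbuf;

    /* Rebuild the pointers against the (possibly moved) buffer. */
    mode.start = restore_pointer(mode.start_offset, buf);
    mode.multi_line_start = restore_pointer(mode.multi_line_start_offset, buf);

    assert(mode.start == buf + 2);
    assert(strcmp(mode.start, "{x}'") == 0);
    assert(mode.multi_line_start == NULL);

    free(buf);
    return 0;
}
```

Calling `demo_roundtrip()` from a test `main` and building with `-fsanitize=undefined` should run without a sanitizer report, since the `NULL` pointer is never used in pointer arithmetic.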