@@ -0,0 +1,2 @@
Fix a race condition in :mod:`tracemalloc`: it no longer detaches the attached
thread state to acquire its internal lock. Patch by Victor Stinner.
24 changes: 17 additions & 7 deletions Python/tracemalloc.c
@@ -36,7 +36,7 @@ static int _PyTraceMalloc_TraceRef(PyObject *op, PyRefTracerEvent event,
the GIL held from PyMem_RawFree(). It cannot acquire the lock because it
would introduce a deadlock in _PyThreadState_DeleteCurrent(). */
#define tables_lock _PyRuntime.tracemalloc.tables_lock
#define TABLES_LOCK() PyMutex_Lock(&tables_lock)
#define TABLES_LOCK() PyMutex_LockFlags(&tables_lock, _Py_LOCK_DONT_DETACH)
#define TABLES_UNLOCK() PyMutex_Unlock(&tables_lock)


@@ -224,13 +224,20 @@ tracemalloc_get_frame(_PyInterpreterFrame *pyframe, frame_t *frame)
assert(PyStackRef_CodeCheck(pyframe->f_executable));
frame->filename = &_Py_STR(anon_unknown);

int lineno = PyUnstable_InterpreterFrame_GetLine(pyframe);
int lineno = -1;
PyCodeObject *code = _PyFrame_GetCode(pyframe);
// PyUnstable_InterpreterFrame_GetLine() cannot be used, since it uses
// a critical section which can trigger a deadlock.
Review comment by @kumaraditya303 (Contributor), Feb 14, 2026:

I think the problem is that critical sections require an active thread state, and this code can be called with a detached thread state, iirc.

Reply by the PR author (Member):

tracemalloc_get_frame() is called with an attached thread state, see the caller traceback_get_frames() which has the code:

    PyThreadState *tstate = _PyThreadState_GET();
    assert(tstate != NULL);

Example of a deadlock when running the #144763 (comment) reproducer on a free-threaded build:

  • Thread A: _PyCode_GetTLBC() => ... => tracemalloc_alloc(): TABLES_LOCK()
  • Thread B: _PyTraceMalloc_TraceRef() => ... => tracemalloc_get_frame() => PyUnstable_InterpreterFrame_GetLine() => PyCode_Addr2Line(): Py_BEGIN_CRITICAL_SECTION(co)

Locks:

  • Thread A is inside a Py_BEGIN_CRITICAL_SECTION(co) critical section (_PyCode_GetTLBC()) and waits for TABLES_LOCK().
  • Thread B holds TABLES_LOCK() and waits for the Py_BEGIN_CRITICAL_SECTION(co) critical section.

Thread A and thread B want to use Py_BEGIN_CRITICAL_SECTION(co) on the same code object (0x20000844810).

=> deadlock :-(
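
The cycle described above can be modelled outside CPython. The following is a minimal sketch, not CPython code: `thread_model`, `LOCK_TABLES`, `LOCK_CODE_CS`, `owner_of()` and `is_deadlocked()` are hypothetical names introduced only for illustration, and the model assumes each thread holds at most one lock and waits for at most one lock. It follows the wait-for chain from one thread and reports a deadlock if the chain comes back to that thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model (not CPython code): two locks standing in for
   TABLES_LOCK() and the per-code-object critical section. */
enum { LOCK_NONE = -1, LOCK_TABLES = 0, LOCK_CODE_CS = 1 };

typedef struct {
    int holds;      /* lock currently held, or LOCK_NONE */
    int waits_for;  /* lock the thread is blocked on, or LOCK_NONE */
} thread_model;

/* Return the index of the thread holding `lock`, or -1 if it is free. */
static int owner_of(const thread_model *threads, size_t n, int lock)
{
    for (size_t i = 0; i < n; i++) {
        if (threads[i].holds == lock) {
            return (int)i;
        }
    }
    return -1;
}

/* Follow the wait-for chain starting at thread `start`.
   Return true if the chain leads back to `start` (a cycle). */
static bool is_deadlocked(const thread_model *threads, size_t n, size_t start)
{
    assert(start < n);
    size_t cur = start;
    for (size_t steps = 0; steps < n; steps++) {
        int wanted = threads[cur].waits_for;
        if (wanted == LOCK_NONE) {
            return false;           /* this thread can make progress */
        }
        int owner = owner_of(threads, n, wanted);
        if (owner < 0) {
            return false;           /* the wanted lock is free */
        }
        if ((size_t)owner == start) {
            return true;            /* the chain closed on itself */
        }
        cur = (size_t)owner;
    }
    return false;
}
```

In this model, thread A is `{LOCK_CODE_CS, LOCK_TABLES}` and thread B is `{LOCK_TABLES, LOCK_CODE_CS}`, which forms a cycle. If thread B looks up the line number without entering the critical section, it becomes `{LOCK_TABLES, LOCK_NONE}` and the cycle disappears, which is the effect the patch aims for by switching to _PyCode_SafeAddr2Line().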

int lasti = _PyFrame_SafeGetLasti(pyframe);
if (lasti >= 0) {
lineno = _PyCode_SafeAddr2Line(code, lasti);
}
if (lineno < 0) {
lineno = 0;
}
frame->lineno = (unsigned int)lineno;

PyObject *filename = _PyFrame_GetCode(pyframe)->co_filename;
PyObject *filename = code->co_filename;
if (filename == NULL) {
#ifdef TRACE_DEBUG
tracemalloc_error("failed to get the filename of the code object");
@@ -863,7 +870,8 @@ _PyTraceMalloc_Stop(void)
TABLES_LOCK();

if (!tracemalloc_config.tracing) {
goto done;
TABLES_UNLOCK();
return;
}

/* stop tracing Python memory allocations */
@@ -880,10 +888,12 @@ _PyTraceMalloc_Stop(void)
raw_free(tracemalloc_traceback);
tracemalloc_traceback = NULL;

(void)PyRefTracer_SetTracer(NULL, NULL);

done:
TABLES_UNLOCK();

// Call it after TABLES_UNLOCK() since it calls _PyEval_StopTheWorldAll()
// which would lead to a deadlock with TABLES_LOCK() which doesn't release
// the GIL.
(void)PyRefTracer_SetTracer(NULL, NULL);
}
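
The reordering in _PyTraceMalloc_Stop() follows a general pattern: release the lock before calling anything that may need other threads to make progress. Below is a minimal sketch of that control flow under simplifying assumptions. It is not the CPython implementation: `toy_lock()`, `toy_unlock()`, `clear_tracer()` and `toy_stop()` are hypothetical stand-ins, and the "lock" is a plain flag in a single-threaded model so the ordering can be checked with assertions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model (not CPython code). */
static bool tables_locked = false;        /* stand-in for tables_lock */
static bool tracing = false;              /* stand-in for tracemalloc_config.tracing */
static int tracer_calls = 0;              /* how often clear_tracer() ran */
static int tracer_calls_under_lock = 0;   /* how often it ran with the lock held */

static void toy_lock(void)
{
    assert(!tables_locked);   /* the toy lock is not recursive */
    tables_locked = true;
}

static void toy_unlock(void)
{
    assert(tables_locked);
    tables_locked = false;
}

/* Stand-in for PyRefTracer_SetTracer(NULL, NULL): records whether it was
   called while the toy lock was still held. */
static void clear_tracer(void)
{
    tracer_calls++;
    if (tables_locked) {
        tracer_calls_under_lock++;
    }
}

/* Same shape as the patched function: early return unlocks explicitly,
   and the tracer is cleared only after the lock has been released. */
static void toy_stop(void)
{
    toy_lock();
    if (!tracing) {
        toy_unlock();
        return;
    }
    tracing = false;
    toy_unlock();

    clear_tracer();   /* runs without the lock held */
}
```

Calling `toy_stop()` while `tracing` is false leaves `tracer_calls` at 0. Calling it with `tracing` set to true clears the flag, calls `clear_tracer()` once, and leaves `tracer_calls_under_lock` at 0.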

