ci: use setup-python install so codspeed builds flamegraphs correctly#5997
Merging this PR will degrade performance by 32.76%

| | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| ❌ | dirty_attach | 3.7 µs | 4.2 µs | -12.34% |
| ❌ | clean_attach | 2.1 µs | 2.6 µs | -18.32% |
| ❌ | identify_object_type | 15.2 µs | 17.3 µs | -12.23% |
| ❌ | err_new_restore_and_fetch | 6.7 µs | 7.7 µs | -13.37% |
| ❌ | extract_bigint_extract_fail | 8.7 µs | 10.1 µs | -13.66% |
| ❌ | call | 526.6 µs | 598.2 µs | -11.97% |
| ❌ | call_0 | 157.7 µs | 223.3 µs | -29.37% |
| ❌ | extract_float_extract_fail | 8.1 µs | 9.8 µs | -17.03% |
| ❌ | call_method_0 | 572 µs | 712.4 µs | -19.71% |
| ❌ | call_1 | 190.4 µs | 264.1 µs | -27.92% |
| ❌ | call_method | 783.1 µs | 951.4 µs | -17.69% |
| ❌ | call_method_1 | 290.9 µs | 414.8 µs | -29.86% |
| ❌ | call_method_one_arg | 262 µs | 389.6 µs | -32.76% |
| ❌ | extract_int_extract_fail | 8.4 µs | 9.7 µs | -13.57% |
| ❌ | call_one_arg | 161.8 µs | 236.4 µs | -31.54% |
| ❌ | decimal_via_extract | 11.9 µs | 13.5 µs | -11.41% |
| ❌ | bench_str | 3.7 µs | 4.1 µs | -10.5% |
| ⚡ | drop_many_objects | 11 µs | 7.4 µs | +48.62% |
| ❌ | getattr_intern | 3.3 µs | 3.8 µs | -13.4% |
| ❌ | test_class_method | 14.5 µs | 18.7 µs | -22.29% |
| ... | ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
Comparing davidhewitt:codspeed-flamegraphs (b282a90) with main (4712a0a)
Footnotes

1. 1 benchmark was skipped, so the baseline result was used instead. If it was deleted from the codebase, click here and archive it to remove it from the performance reports. ↩
I'd be very interested in additional information about the limitation of the
My understanding from @GuillaumeLagrange was that codspeed's reliance on valgrind hits some limitation in valgrind's ability to understand call stacks deep inside CPython's main eval loop. It seemed like the fix was to exclude symbols from

That said, this call stack seems better than the one in the original OP (for the same benchmark); however, there's still some recursion going on which doesn't look right to me.
Hello @jjhelmus, and thanks for the ping @davidhewitt.

Without boring you all with the details: valgrind creates execution graphs that can contain cycles, instead of the top-down tree one would expect when thinking about profiling data, where the call stack only grows downwards (or upwards, depending on how you visualize it 🤓). The easy way around this is to provide an explicit list of

I'm not 100% sure forcing a non-statically-linked build out of python-standalone is the actual fix for this. We'd rather patch valgrind in a way that cycles do not outright make the profiling useless. We have plans to do so, and a few experiments on the way, but have not yet been able to produce something production-ready to tackle this.
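As a toy illustration of the cycle problem described above (this is a hypothetical example, not CodSpeed's actual pipeline): two mutually recursive functions produce call-graph *edges* in both directions, so the observed call graph contains a cycle and cannot be laid out as the strict top-down tree a flamegraph expects.

```python
import sys

edges = set()  # (caller, callee) pairs observed at runtime


def tracer(frame, event, arg):
    # Record an edge for every Python-level call.
    if event == "call" and frame.f_back is not None:
        edges.add((frame.f_back.f_code.co_name, frame.f_code.co_name))


def even(n):
    # even() calls odd() ...
    return True if n == 0 else odd(n - 1)


def odd(n):
    # ... and odd() calls even(), closing the cycle even -> odd -> even.
    return False if n == 0 else even(n - 1)


sys.setprofile(tracer)
even(4)
sys.setprofile(None)

# Both directions of the edge exist, i.e. the call graph has a cycle.
print(("even", "odd") in edges and ("odd", "even") in edges)  # True
```

A flamegraph builder that naively follows such edges can emit the endlessly repeating frames seen in the screenshots, which is why tools resolve this either by excluding symbols or by collapsing the cycles.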

I noticed some of our benchmarks have weirdly recursive traces like this:
I asked the codspeed team; apparently this is a limitation when using the
uv-provided Python installs, so instead let's use `setup-python` to install Python for the benchmarks.
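A minimal sketch of the change described above, as a GitHub Actions fragment. The job name, action versions, Python version, and benchmark command are assumptions for illustration, not taken from this PR:

```yaml
# Hypothetical benchmark job: install Python via actions/setup-python
# instead of a uv-managed (python-build-standalone) interpreter, so that
# valgrind/CodSpeed can resolve symbols and build flamegraphs correctly.
jobs:
  codspeed:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"   # assumed version
      - name: Run benchmarks
        uses: CodSpeedHQ/action@v3
        with:
          run: make bench          # assumed benchmark command
```

The key point is only the swap of the interpreter source; the rest of the benchmark job stays as it was.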