add `PyLong*` API (3.14+) by chirizxc · Pull Request #6016 · PyO3/pyo3

chirizxc · 2026-05-05T14:38:51Z

main (3.13):

into_u128_zero          time:   [33.711 ns 33.971 ns 34.252 ns]
into_u128_small         time:   [34.602 ns 34.859 ns 35.157 ns]
into_u128_u32_max       time:   [34.700 ns 35.231 ns 35.684 ns]
into_u128_u64_max       time:   [40.138 ns 40.795 ns 41.375 ns]
into_u128_max           time:   [39.668 ns 40.320 ns 40.891 ns]
into_i128_zero          time:   [33.628 ns 33.947 ns 34.335 ns]
into_i128_small_pos     time:   [34.595 ns 34.900 ns 35.250 ns]
into_i128_small_neg     time:   [35.660 ns 36.223 ns 36.743 ns]
into_i128_pos_max       time:   [39.891 ns 40.513 ns 41.041 ns]
into_i128_neg_min       time:   [43.970 ns 44.582 ns 45.093 ns]
extract_u128_zero       time:   [20.255 ns 20.341 ns 20.439 ns]
extract_u128_small      time:   [20.208 ns 20.330 ns 20.466 ns]
extract_u128_u32_max    time:   [26.269 ns 26.458 ns 26.674 ns]
extract_u128_u64_max    time:   [30.125 ns 30.322 ns 30.549 ns]
extract_u128_max        time:   [35.379 ns 35.644 ns 35.933 ns]
extract_i128_zero       time:   [20.521 ns 20.602 ns 20.702 ns]
extract_i128_small_pos  time:   [20.630 ns 20.736 ns 20.858 ns]
extract_i128_small_neg  time:   [20.357 ns 20.442 ns 20.539 ns]
extract_i128_pos_max    time:   [34.338 ns 34.568 ns 34.837 ns]
extract_i128_neg_min    time:   [49.427 ns 49.754 ns 50.169 ns]

this PR (3.14):

into_u128_zero          time:   [17.348 ns 17.475 ns 17.617 ns]
into_u128_small         time:   [18.758 ns 18.950 ns 19.163 ns]
into_u128_u32_max       time:   [27.941 ns 28.377 ns 28.762 ns]
into_u128_u64_max       time:   [33.655 ns 34.173 ns 34.633 ns]
into_u128_max           time:   [34.166 ns 34.739 ns 35.296 ns]
into_i128_zero          time:   [17.017 ns 17.154 ns 17.301 ns]
into_i128_small_pos     time:   [19.636 ns 19.818 ns 20.019 ns]
into_i128_small_neg     time:   [29.753 ns 30.247 ns 30.675 ns]
into_i128_pos_max       time:   [34.856 ns 35.374 ns 35.835 ns]
into_i128_neg_min       time:   [34.675 ns 35.254 ns 35.785 ns]
extract_u128_zero       time:   [18.302 ns 18.434 ns 18.571 ns]
extract_u128_small      time:   [18.654 ns 18.771 ns 18.904 ns]
extract_u128_u32_max    time:   [19.758 ns 19.915 ns 20.084 ns]
extract_u128_u64_max    time:   [23.769 ns 23.975 ns 24.193 ns]
extract_u128_max        time:   [23.988 ns 24.128 ns 24.284 ns]
extract_i128_zero       time:   [19.126 ns 19.261 ns 19.405 ns]
extract_i128_small_pos  time:   [19.004 ns 19.157 ns 19.316 ns]
extract_i128_small_neg  time:   [18.819 ns 18.933 ns 19.058 ns]
extract_i128_pos_max    time:   [24.424 ns 24.593 ns 24.776 ns]
extract_i128_neg_min    time:   [24.007 ns 24.147 ns 24.299 ns]

cmd:

cargo bench --bench bench_int128 -- --warm-up-time 3 --measurement-time 5 2>& | grep -E '^(into|extract)_[a-z0-9_]+\s+time:'

benchmark	3.13 (ns)	3.14 (ns)	speedup
`into_u128_zero`	33.97	17.48	1.94×
`into_u128_small`	34.86	18.95	1.84×
`into_u128_u32_max`	35.23	28.38	1.24×
`into_u128_u64_max`	40.80	34.17	1.19×
`into_u128_max`	40.32	34.74	1.16×
`into_i128_zero`	33.95	17.15	1.98×
`into_i128_small_pos`	34.90	19.82	1.76×
`into_i128_small_neg`	36.22	30.25	1.20×
`into_i128_pos_max`	40.51	35.37	1.15×
`into_i128_neg_min`	44.58	35.25	1.26×
`extract_u128_zero`	20.34	18.43	1.10×
`extract_u128_small`	20.33	18.77	1.08×
`extract_u128_u32_max`	26.46	19.91	1.33×
`extract_u128_u64_max`	30.32	23.98	1.26×
`extract_u128_max`	35.64	24.13	1.48×
`extract_i128_zero`	20.60	19.26	1.07×
`extract_i128_small_pos`	20.74	19.16	1.08×
`extract_i128_small_neg`	20.44	18.93	1.08×
`extract_i128_pos_max`	34.57	24.59	1.41×
`extract_i128_neg_min`	49.75	24.15	2.06×

ngoldbaum

Did a quick pass and spotted a couple issues.

ngoldbaum · 2026-05-06T16:24:39Z

I looked over the FFI changes.

While we're touching longobject.rs, I'd appreciate it if you could make sure that your updates are ordered following CPython's Include/cpython/longobject.h as much as possible. For example, there are comments about skipping PyUnstable_Long_IsCompact and PyUnstable_Long_CompactValue, but those comments should follow the definition for PyLong_FromUnicodeObject because that's how the corresponding CPython header is organized in 3.14. Our bindings are also missing Py_ASNATIVEBYTES_ALLOW_INDEX, PyLong_IsPositive, PyLong_IsNegative, and PyLong_IsZero, which we should add.

It also looks like the PyLongLayout and PyLongExport structs are in cpython/longintrepr.h, so the definitions you're adding here should go into a new cpython/longintrepr.rs rust module, along with the C API definitions that are in the corresponding header.

The principle you should keep in mind when touching PyO3's FFI bindings is that it should be organized exactly like the upstream headers, as of the newest Python we support (3.14 right now, but it'll be 3.15 in the next few weeks).

chirizxc · 2026-05-06T19:35:38Z

I looked over the FFI changes.

While we're touching longobject.rs, I'd appreciate it if you could make sure that your updates are ordered following CPython's Include/cpython/longobject.h as much as possible. For example, there are comments about skipping PyUnstable_Long_IsCompact and PyUnstable_Long_CompactValue, but those comments should follow the definition for PyLong_FromUnicodeObject because that's how the corresponding CPython header is organized in 3.14. Our bindings are also missing Py_ASNATIVEBYTES_ALLOW_INDEX, PyLong_IsPositive, PyLong_IsNegative, and PyLong_IsZero, which we should add.

It also looks like the PyLongLayout and PyLongExport structs are in cpython/longintrepr.h, so the definitions you're adding here should go into a new cpython/longintrepr.rs rust module, along with the C API definitions that are in the corresponding header.

The principle you should keep in mind when touching PyO3's FFI bindings is that it should be organized exactly like the upstream headers, as of the newest Python we support (3.14 right now, but it'll be 3.15 in the next few weeks).

In branch 3.14, PyLong_FromNativeBytes, Py_ASNATIVEBYTES_BIG_ENDIAN, etc. , were moved from cpython/Include/longobject.h to cpython/longobject.h (see: 3.13* | 3.14*). Should I move them as well? I mean, do I need to fully synchronize the current files with branch 3.14?

ngoldbaum · 2026-05-06T19:37:54Z

Should I move them as well? I mean, do I need to fully synchronize the current files with branch 3.14?

Yes, that's generally what we do. It's hard to keep the FFI bindings perfectly up-to-date, so it's a chore for whoever needs to touch them.

chirizxc · 2026-05-06T20:28:36Z

I split up some of the extern_libpython! blocks so that the order remains the same as in the *.h files

ngoldbaum

I spotted some issues, see below.

ngoldbaum · 2026-05-07T14:31:18Z

+                            value
+                        };
+                        let bits = 128 - abs.leading_zeros() as usize;
+                        let n_digits = if bits == 0 { 1 } else { bits.div_ceil(30) };


This hard-codes 30-bit digits, but CPython also supports 15 bit digits, it's just a non-default build configuration. I think if we added a 32 bit Windows or Linux build to CI, this test would fail.

If you don't want to implement a fast path for 15 bit digits, you cold also check e.g. sys.int_info.bits_per_digit or the C API equivalent at runtime and fall back to the 3.13 and earlier path which doesn't need to care about how many bits per digit CPython uses.

I think if we added a 32 bit Windows or Linux build to CI, this test would fail.

It seems we have compile Python yourself with the --enable-big-digits=15 flag

codspeed-hq · 2026-05-07T16:45:45Z

Merging this PR will improve performance by 14.53%

⚠️

Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 1 improved benchmark
✅ 104 untouched benchmarks
🆕 20 new benchmarks
⏩ 1 skipped benchmark¹

Performance Changes

	Benchmark	`BASE`	`HEAD`	Efficiency
🆕	`extract_i128_neg_min`	N/A	1.4 µs	N/A
🆕	`extract_i128_pos_max`	N/A	1.4 µs	N/A
🆕	`extract_i128_small_neg`	N/A	1.1 µs	N/A
🆕	`extract_i128_small_pos`	N/A	1.1 µs	N/A
🆕	`extract_i128_zero`	N/A	1.1 µs	N/A
🆕	`extract_u128_max`	N/A	1.4 µs	N/A
🆕	`extract_u128_small`	N/A	1.1 µs	N/A
🆕	`extract_u128_u32_max`	N/A	1.2 µs	N/A
🆕	`extract_u128_u64_max`	N/A	1.3 µs	N/A
🆕	`extract_u128_zero`	N/A	1.1 µs	N/A
🆕	`into_i128_neg_min`	N/A	1.6 µs	N/A
🆕	`into_i128_pos_max`	N/A	1.6 µs	N/A
🆕	`into_i128_small_neg`	N/A	1.3 µs	N/A
🆕	`into_i128_small_pos`	N/A	1.8 µs	N/A
🆕	`into_i128_zero`	N/A	1.8 µs	N/A
🆕	`into_u128_max`	N/A	1.6 µs	N/A
🆕	`into_u128_small`	N/A	1.8 µs	N/A
🆕	`into_u128_u32_max`	N/A	1.6 µs	N/A
🆕	`into_u128_u64_max`	N/A	1.6 µs	N/A
🆕	`into_u128_zero`	N/A	1.9 µs	N/A
...	...	...	...	...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

_{Comparing chirizxc:PyLongWriter (f35bb45) with main (f57bda7)}

1 benchmark was skipped, so the baseline result was used instead. If it was deleted from the codebase, click here and archive it to remove it from the performance reports. ↩

ngoldbaum

Overall looking much better. I have some comments inline.

Looking further ahead (does not need to be landed in this PR), it probably makes sense to add conversions using this API for BigInt, rust_decimal::Decimal, and big_decimal::BigDecimal, so IMO it might bear some effort to come up with a slightly more general API to avoid duplicating logic. Right now the logic is only duplicated twice between i128 and u128, but if we add more types there will be more duplication.

Consider defining two generic functions:

// builds an int from an iterator of 30-bit digits
pub(crate) fn pylong_from_digits<I: ExactSizeIterator<Item = u32>>(py: Python<'_>, negative: bool, digits: I) -> PyResult<Bound<'_, PyInt>>

// Visits 30-bit digits LSB-first and deals with freeing the export
pub(crate) fn pylong_visit_digits<R>(
    obj: Borrowed<'_, '_, PyAny>, 
    f: impl FnOnce(bool /*negative*/, i64 /*compact*/, Option<&[u32]> /*digits*/
) -> PyResult<R>

Also once this is merged you should open a followup issue to add more conversions that can benefit from this API.

ngoldbaum · 2026-05-08T21:09:31Z

+
+        *DIGITS.get_or_init(|| {
+            let layout = unsafe { &*ffi::PyLong_GetNativeLayout() };
+            layout.bits_per_digit == PYLONG_BITS_IN_DIGIT as u8


your implementation implicitly assumes digits_size=4, digits_order=-1, and digit_endianness set to native byteorder. You should check those invariants here, in case Python ever changes them in the future or a different Python implementation does things a different way.

ngoldbaum · 2026-05-08T21:10:56Z

+                                )?;
+                            }
+                            let long_export_ref = unsafe { long_export.assume_init_ref() };
+                            if long_export_ref.digits.is_null() {


For consistency you should call PyLong_FreeExport in this path as well.

ngoldbaum · 2026-05-08T21:20:50Z

+                                    ffi::PyLong_Export(
+                                        num.as_ptr().cast(),
+                                        long_export.as_mut_ptr(),
+                                    ),


Suggested change

ffi::PyLong_Export(

num.as_ptr().cast(),

long_export.as_mut_ptr(),

),

ffi::PyLong_Export(num.as_ptr(), long_export.as_mut_ptr()),

Calling cast() is unnecessary - as_ptr() returns a *mut PyObject.

ngoldbaum · 2026-05-08T21:23:35Z

 }

 #[cfg(all(Py_3_13, not(Py_LIMITED_API)))]
+#[allow(dead_code)]


Couldn't you change the cfg gating to avoid adding dead_code here? I'd prefer that.

chirizxc added 9 commits May 5, 2026 17:37

add PyLong* API (3.14+)

7f623ee

add newsfragments

59a6818

cargo fmt

dce9996

try fix

5e4018e

fix typo (inr -> int)

035b40e

try fix (x2)

f2cebdc

fix clippy

0fb6575

try fix clippy

05bfe1f

add bench int 128

ebbb7da

This comment was marked as resolved.

Sign in to view

add tests

003647a

chirizxc closed this May 5, 2026

chirizxc reopened this May 5, 2026

ngoldbaum reviewed May 5, 2026

View reviewed changes

Comment thread src/conversions/std/num.rs Outdated

Comment thread src/conversions/std/num.rs Outdated

chirizxc added 4 commits May 5, 2026 23:43

digits_ptr -> ptr

66b00a1

review

b8b9597

cargo fmt

a089d89

fix

e55d67e

sync 3.14 (python/cpython@af65a8b)

a937d91

add PyLong_Is*

5e049e5

ngoldbaum reviewed May 7, 2026

View reviewed changes

chirizxc added 5 commits May 7, 2026 17:52

remove PyUnstable_Long_IsCompact

9ad8e5d

bench new line

e2c2caa

PyLong_FromUnicodeObject

bf00a58

remove #[cfg(not(Py_LIMITED_API))]

14d54f3

review

64248a1

chirizxc added 2 commits May 7, 2026 19:15

add 6016.changed.md

884cd64

update

9618b36

chirizxc added 4 commits May 7, 2026 19:47

update

4d4a0b0

update

d96ca58

fix

9976b33

fix

f35bb45

ngoldbaum reviewed May 8, 2026

View reviewed changes

Conversation

chirizxc commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment was marked as resolved.

ngoldbaum left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ngoldbaum commented May 6, 2026

Uh oh!

chirizxc commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ngoldbaum commented May 6, 2026

Uh oh!

chirizxc commented May 6, 2026

Uh oh!

ngoldbaum left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ngoldbaum May 7, 2026

Choose a reason for hiding this comment

Uh oh!

chirizxc May 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codspeed-hq Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will improve performance by 14.53%

Performance Changes

Footnotes

Uh oh!

ngoldbaum left a comment

Choose a reason for hiding this comment

Uh oh!

ngoldbaum May 8, 2026

Choose a reason for hiding this comment

Uh oh!

ngoldbaum May 8, 2026

Choose a reason for hiding this comment

Uh oh!

ngoldbaum May 8, 2026

Choose a reason for hiding this comment

Uh oh!

ngoldbaum May 8, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

chirizxc commented May 5, 2026 •

edited

Loading

chirizxc commented May 6, 2026 •

edited

Loading

codspeed-hq Bot commented May 7, 2026 •

edited

Loading