Stabilize c-variadic function definitions#155697
Stabilize c-variadic function definitions#155697folkertdev wants to merge 1 commit intorust-lang:mainfrom
Conversation
This comment has been minimized.
This comment has been minimized.
d1f9d2c to
d7ccbf8
Compare
This comment has been minimized.
This comment has been minimized.
d7ccbf8 to
f5739d8
Compare
|
There's a typo above: "The VaList type implements Copy and Drop" should be "Clone and Drop" |
This comment has been minimized.
This comment has been minimized.
f5739d8 to
50f6221
Compare
|
|
There was a problem hiding this comment.
(Using a random file for a thread)
The rust implementation requires that the caller and callee agree on the exact type. This is slightly more conservative than C, which does allow e.g. conversion between signed and unsigned integers, or to/from void pointers.
I wonder, is this too strict? And does this restriction apply across FFI?
I'm thinking about a printf implementation where it is common and valid (as far as I know) to use %x with both signed and unsigned integers.
There was a problem hiding this comment.
It is too strict, and the reference PR does not state it this strongly.
There is a difference here between what C says (in section 7.16.1.1):
If type is not compatible with the type of the actual next argument (as promoted according to the default argument promotions), the behavior is undefined, except for the following cases:
- both types are pointers to qualified or unqualified versions of compatible types;
- one type is compatible with a signed integer type, the other type is compatible with the
corresponding unsigned integer type, and the value is representable in both types;- one type is pointer to qualified or unqualified void and the other is a pointer to a qualified or
unqualified character type;- or, the type of the next argument is nullptr_t and type is a pointer type that has the same representation and alignment requirements as a pointer to a character type
So signedness is irrelevant, and my interpretation of the other rules is that any pointer type is compatible with any other pointer type (in the same address space).
That is not what Miri currently implements, it instead uses strict type equality because @RalfJung did not have much appetite for (from memory, the third, but in any case) another notion of type equivalence. I went with the more restrictive formulation because we can always relax it later, the inverse is not true.
Perhaps T-opsem has suggestions for a better way to phrase this.
There was a problem hiding this comment.
If we can relax the signedness requirements to match C then I think that is preferable, so we don't need to worry about soundness across FFI. It makes sense that Miri should match up with whatever behavior is decided, pinging @rust-lang/miri for thoughts here.
Is "compatible types" in the context of varargs defined anywhere? My read is that if they mean va-compatible then you could consider int * and unsigned * to be compatible, and you could consider void * and char * to be compatible, but you couldn't consider int * and long * to be compatible. Though I don't think Ive ever seen anybody cast pointers to void before using %p.
There was a problem hiding this comment.
If it does mean va-compatible then unsafe impl<T> VaArgSafe for *mut T {} may not be correct
There was a problem hiding this comment.
Is "compatible types" in the context of varargs defined anywhere?
Hmm no, I think your read is correct.
If it does mean va-compatible then
unsafe impl<T> VaArgSafe for *mut T {}may not be correct
Based on the last rule I think *mut T always
has the same representation and alignment requirements as a pointer to a character type
but then the only valid value you can provide there is the NULL pointer... So then unsafe impl<T: VaArgSafe> VaArgSafe for *mut T {} might be more accurate.
That's really cumbersome, and I've similarly never seen people cast to void * when using %p. I don't really see a practical reason for it either.
There was a problem hiding this comment.
https://www.gnu.org/software/c-intro-and-ref/manual/html_node/Compatible-Types.html spells out compatible types in a bit clearer form. edit In particular:
In C, two different primitive types are never compatible
There was a problem hiding this comment.
Aha! In 7.21.6.
pThe argument shall be a pointer to void.
So the common use in C is UB. Nice.
Found via https://stackoverflow.com/questions/24867814/printfp-and-casting-to-void
| #[unstable( | ||
| feature = "c_variadic", | ||
| issue = "44930", | ||
| reason = "the `c_variadic` feature has not been properly tested on all supported platforms" | ||
| )] | ||
| mod va_list; | ||
| #[stable(feature = "c_variadic", since = "CURRENT_RUSTC_VERSION")] | ||
| pub use self::va_list::{VaArgSafe, VaList}; | ||
|
|
||
| #[unstable( | ||
| feature = "c_variadic", | ||
| issue = "44930", | ||
| reason = "the `c_variadic` feature has not been properly tested on all supported platforms" | ||
| )] | ||
| pub mod va_list; |
There was a problem hiding this comment.
It would be good to do any surface area change like the module in a separate PR so we see the effects in the nightly docs
There was a problem hiding this comment.
Right, I only noticed that the module was pub when making this PR. T-libs-api decided to not make the module pub but just export the type and trait from core::ffi.
| //@ check-pass | ||
|
|
||
| #![feature(c_variadic)] | ||
| #![feature(custom_inner_attributes)] |
There was a problem hiding this comment.
Any idea why custom_inner_attributes wasn't needed before?
There was a problem hiding this comment.
it's needed for top-level #![rustfmt::skip], there is an in-flight PR I believe to accept this (or anyway this got talked about recently)
The file is sensitive to formatting (the formatter deletes the attributes in some places), hence the rustfmt::skip.
There was a problem hiding this comment.
Ah so not directly related to the removal here. Makes sense, probably just worth a note
50f6221 to
5edeb59
Compare
|
☔ The latest upstream changes (presumably #155755) made this pull request unmergeable. Please resolve the merge conflicts. |
tracking issue: #44930
reference PR: rust-lang/reference#2177
There are some minor things still in flight, but I think we're far enough along now.
Stabilization report
Summary
In C, functions can use a variable argument list
...to accept an arbitrary number of untyped arguments. Rust is already able to call such functions (e.g.libc::printf), thec_variadicfeature adds the ability to define them.A rust c-variadic function looks like this:
This function accepts a variable arguments list
args: ..., from which it is able to read arguments using thenext_argmethod.The main goal of defining c-variadic functions in rust is interaction with C code. Therefore it is a design goal that the rust types map directly to their C counterparts. Additionally, we disallow interaction between c-variadic functions and certain rust features that don't make much sense in an FFI context (e.g.
extern "Rust" fnorasync fn).How variadics work in C
The authoritative source for how variadics (also known as "variable arguments") work in C is the C specification. In this document we'll use section 7.16 of the final draft of the C23 standard.
Earlier versions of C furthermore required that the
...argument is not the first argument of the parameter type list (so at least one other argument was required). Starting in C23 this requirement has been lifted.C API surface
va_list: an opaque type that stores the information needed to read variadic arguments, or copy the list of variadic arguments. Typically values of this type get the nameap.va_start: ava_listmust be initialized with theva_startmacro before it can be used.va_copy: theva_copymacro copies ava_list. The copy starts at the position in the argument list of the original (so not at the first variadic argument to the function), and both can be moved forward independently. This means the same argument can be read multiple times.va_arg: reads the next argument from theva_ist.va_end: deinitializes ava_list.Important notes
Not calling
va_endis UBSection 7.16.1
Section 7.16.1.3:
We believe that this behavior is this strict because some early C implementations chose to implement
va_listlike so (source):To our knowledge no remotely-modern implementation actually implements
va_endas anything but a no-op.A
va_listmay be movedSection 7.16
and
So
va_listcan be moved into another function, butva_endmust still run in the frame that initialized (withva_startorva_copy) theva_list.Representation of
va_listThe representation of
va_listis platform-specific. There are three flavors that are used by current rust targets:va_listis an opaque pointerva_listis a structva_listis a single-element array, containing a structThe opaque pointer approach is the simplest to implement: the pointer just points to an array of arguments on the caller's stack.
The struct and single-element array variants are more complex, but potentially more efficient because the additional state makes it possible to pass c-variadic arguments via registers.
array-to-pointer decay
If the
va_listis of the single-element array flavor, it is subject to array-to-pointer decay: in C, arrays are passed not by-value, but as pointers. Hence, from an FFI perspective, these two functions are equivalent.Indeed, they generate the same assembly, see https://godbolt.org/z/n8c4aq5hM.
other calling conventions
Both
clangandgccrefuse to compile a function that uses variadic arguments and a non-default calling convention. See also #141618, in particular #141618 (comment).Hence the calling convention of a c-variadic function definition implicitly uses the default C ABI on the current platform.
va_argand argument promotionWith some exceptions, the return type of a
va_argcall must match the type of the supplied argument:Section 7.16.1
These default argument promotions are specified in section 6.5.3.3:
There are a couple of additional conversions that are allowed, such as casting signed to/from unsigned integers.
A concrete example of what is not allowed is to use
va_argto read (signed or unsigned)charorshort, orfloatarguments. Reading such types is UB. See also #61275 (comment).How c-variadics work in rust
Rust API surface
The rust API has similar capabilities to C but uses rust names and concepts.
The
VaListstruct has a lifetime parameter that ensures that theVaListcannot outlive the function that created it. SemanticallyVaListcontains mutable references (to the caller's and/or callee's stack), so the lifetime is covariant.The
#[rustc_pass_indirectly_in_non_rustic_abis]attribute is applied to the definition ifva_listis a single-element array on the target platform. This attribute simulates the array-to-pointer decay that the Cva_listtype is subject to on that target. By using this attribute, Cva_listand RustVaListare FFI-compatible.This attribute works as desired because
VaListis a by-move type: in rust this is enforced by the type system, in C the specification requires it. Mutations to the passed object are not observable in the caller (because the value is semantically moved and hence inaccessible).On targets where
va_listis just a pointer or a struct the C and rust types are already FFI-compatible, so no special attribute is needed there.The
VaList::argmethod can be used to read the next argument. The implementation usesva_arg(though in most cases we re-implement the logic in rustc itself, see below). The return type is constrained byVaArgSafeso that only valid argument types can be read. In particular this mechanism prevents subtle issues around implicit numeric promotion in C. Reading an argument is unsafe because reading more arguments than were supplied is UB.The
VaArgSafetrait is guaranteed to be implemented for:c_int,c_longandc_longlongc_uint,c_ulongandc_ulonglongc_double*const Tand*mut TImplementations for other types are not guaranteed to be portable, so portable programs should not rely on e.g.
usizeorf64implementing this trait directly.C argument types are considered to have the rust type that corresponds to
core::ffi::*, so a Cintis mapped toc_intand so on. We don't consider the C_BitIntor_Float32types here, rather we (on most platforms) mapf64todoubleetc._BitInt(32)andintare distinct types._BitIntis furthermore special because it does not participate in integer promotion.The rust implementation requires that the caller and callee agree on the exact type. This is slightly more conservative than C, which does allow e.g. conversion between signed and unsigned integers, or to/from void pointers. Reading a value as a different type than it was passed as is undefined behavior.
The
VaListtype implementsCloneandDrop:The
Cloneimplementation can be used to duplicate aVaList. The copy has the same position as the original, but both can be incremented independently. While all current targets could also implementCopyforVaList, a future target might not, so for nowCopyis not implemented.The
Dropimplementation is a no-op. This choice is based on the assumption that in rust it is safe to not run a destructor. Additionally,va_endis a no-op for all current LLVM targets. In C,va_endmust run in the frame where theva_listwas initialized. BecauseVaListcan be moved (like the Cva_list), the frame in which aVaListis dropped may not be the frame in which it was initialized.The
VaListtype is available on all targets, even on targets that don't actually support c-variadic definitions. On such targets, it is impossible to get a validVaListvalue, because attempting to define a c-variadic function (using the...argument) will throw an error.Syntax
In rust, a C-variadic function looks like this:
The special
args: ...argument stands in for an arbitrary number of arguments that the caller may pass. The...argument must be the last argument in the parameter list of a function. Like in C23 and later,...may be the only argument. The...syntax is already stable in foreign functions,c_variadicadditionally allows it in function definitions.In function definitions, the
...argument must have a pattern. The argument can be ignored by using_: .... In foreign function declarations the pattern can be omitted.A function definition with a
...argument must be anunsafefunction. Passing an incorrect number of arguments, or arguments of the wrong type, is UB, and hence every call site has to satisfy the safety conditions. A special case is a function that ignores itsVaListentirely using_: ...: we may decide to allow such functions to be safe. At the time of writing we see insufficient benefits relative to the additional complexity that this entails.A function with the
...argument must be anextern "C"orextern "C-unwind"function. In the future we want to extend the set of accepted ABIs to include all ABIs for which we allow calling a c-variadic function (including e.g.sysv64andwin64).The
...argument can occur in definitions of functions, inherent methods, and trait methods. When any method on a trait uses a c-variadic argument, the trait is no longer dyn-compatible. The technical reason is that there is no sound way to generate aReifyShimthat passes on the c-variadic arguments.Desugaring
In a function like this:
The
args: ...is internally desugared into a call to LLVM'sva_startthat initializesargsas aVaList, and a call to LLVM'sva_endon every return path. TheVaListgets the lifetime of a local variable onfoo's stack, so that theVaListcannot outlive the function that created it.The desugaring will fail with an error when the current target does not support c-variadic definitions. Currently this is the case for
sprivandbpf:A note on LLVM
va_argThe LLVM
va_argintrinsic is known to silently miscompile. A comment in the implementation notes:Hence, like
clang,rustcimplementsva_argfor the vast majority of targets (specifically including all tier-1 targets) inva_arg.rs. Note that the match on the architecture is exhaustive. If no custom implementation is provided, the LLVM implementation is used as a fallback. This fallback is likely to work, but may silently miscompile the program.Future extensions
C-variadics and
const fnSupport for c-variadic
const fn(and by extension, support in Miri) is implemented in#150601 and gated by
const_c_variadic. Practical usage requiresconst_destructtoo becauseVaListhas a customDropimplementation.Naked variadic functions
Currently only
CandC-unwindare valid ABIs for all c-variadic function definitions. With naked functions it is possible to define e.g. awin64c-variadic function in a program wheresysv64is the default. This feature is tracked asc_variadic_naked_functions.C-variadics and coroutines
An
async fnor any other type of coroutine cannot be c-variadic. We see no reason to support this.Defining safe C-variadic functions
In
externblocks, it is valid to mark C-variadic functions as safe, under the assumption that the function completely ignores the variable arguments list:Normally, C-variadic function definitions must be unsafe, because calling the function with unexpected (in type or number) elements is UB. We could relax this constraint on C-variadic functions that ignore their C variable argument list, e.g.:
At the moment we don't have a good reason to add this behavior. It is completely backwards compatible, so if a need arises in the future we can revisit this.
Accepting more
va_argreturn typesDiscussed in #44930 (comment).
We only want to implement
VaArgSafefor types that have a clear counterpart in C. That rules out types likeOption<NonNull>orNonZeroI32. We might addMaybeUninit<c_int>and so on in the future if a use case comes up: this would map to a Cunion.The only planned extension right now is support for
i128andu128where they can be mapped to (unsigned)__int128. Howeverrustccurrently does not know on what targets this type is available, and hence we leave it out of the stabilization for now.Adding 128-bit support is in progress in #155429.
Multiple C-variadic ABIs in the same program
#141618
Both
clangandgccreject using...in functions with a non-default ABI for the target. That makes the layout ofVaListand expansion ofva_start,va_argetc. unambiguous. For now we impose a similar restriction for the rust implementation. This restriction could be lifted in the future, but this would requite thatVaListsomehow "stores" its ABI.One approach is to add a type parameter to
VaListthat default's to the platform's default ABI. Each c-variadic argument would then desugar to use the ABI of the c-variadic function that creates it.History
RFC 2137 proposes to "support defining C-compatible variadic functions in rust" in 2017, and it is still the core of the implementation today. The text lays out a basic rust API and highlights potential issues (e.g. some solution is needed to match C's array-to-pointer decay), but does not always provide concrete solutions.
In 2019 #59625 introduces a wrapper type to simulate array-to-pointer decay. With this API the C semantics can be matched, but doing so correctly takes a great deal of care. The
VaListtype also has two lifetime arguments in this version, which is inelegant.Then, little seems to have happened for 6 years, until the recent burst of activity that resulted in the current proposal.
c_variadicproposal #141524implementation history
The list of PRs is long, but they have all been labled with
F-c_variadic.Unresolved Questions
VaArgSafeand function pointers#153646
Currently
VaArgSafeis not implemented for function pointers, and doing so would be tricky. There is the practical issue of not being able to be generic over the number of arguments, but there are also some complex constraints on the signature, see https://www.gnu.org/software/c-intro-and-ref/manual/html_node/Compatible-Types.html.Handling niche tier-3 targets
See #44930 (comment).
There are some tier-3 targets where we're not 100% confident that the implementation is correct.
There are no known issues, and to our knowledge we match clang in all cases, but these targets are niche and hard to run (
no_std, compilers are not easily available, etc.), and the target maintainers have not confirmed that the current implementation works. How should we handle this?va_arg) and hope for the best. these are all tier-3 targets, so shrugbug!to trigger an ICE on those targetsc_variadic_experimental_archMy personal preference is the feature gate, because it unblocks users, we're actually quite confident in the implementation being correct, and to find any lingering issues we'd need to be able to try it.
Thanks
Many people have worked on this feature over the years, and many more have provided input. I'd like to credit here the people that have been especially involved in this push for stabilization: @workingjubilee, @RalfJung, @beetrees, @joshtriplett and @tgross35.
r? @tgross35