- Feature Name: repr_simple
- Start Date: 2021-02-15
- RFC PR: rust-lang/rfcs#0000
- Rust Issue: rust-lang/rust#0000

This RFC
- clarifies the behavior of repr(C),
- defines the term equivalent C type,
- soft-deprecates repr(packed) on repr(C) types,
- adds repr(pragma_pack) on repr(C) types, and
- adds repr(simple) whose behavior is the behavior of repr(C) today.

This is a breaking change because
- it changes the layout of certain repr(C) types on certain Windows targets and
- it removes support for repr(C, align) enums.

repr(C) was originally conceived to give structs the same layout as the
equivalent struct in C. In 2017, language was added to the reference that
specifies the behavior of repr(C) in terms of algorithms. This change was made
without an RFC and the behavior described by these algorithms is not the
behavior of MSVC. This violates the original intent of repr(C).
Since then, these algorithms have also found their way into the standard library
in the form of the Layout::extend function, which makes guarantees about
the layout of repr(C) structs.
The changes in this RFC restore the original design of repr(C), increasing the
portability of repr(C), allowing us to fix the behavior of repr(C) on MSVC
targets. For users that rely on the precise layout currently described in the
reference, it provides an upgrade path in the form of repr(simple).
This section describes the changes in terms of changes to the reference and standard library.
Changes to Layout::extend
All mentions of repr(C) in the doc comment are replaced by repr(simple).
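
To illustrate the guarantee in question, here is a minimal sketch that rebuilds the layout of a struct field by field with Layout::extend. The struct `Example` and the helper `example_layout` are hypothetical names, and `repr(C)` stands in for the proposed `repr(simple)`, which current compilers do not accept.

```rust
use std::alloc::{Layout, LayoutError};

// Hypothetical example type; `repr(C)` stands in for the proposed `repr(simple)`.
#[allow(dead_code)]
#[repr(C)]
struct Example {
    a: u8,
    b: u32,
    c: u16,
}

/// Computes the layout of `Example` and the offsets of its three fields,
/// one `extend` call per field after the first.
fn example_layout() -> Result<(Layout, [usize; 3]), LayoutError> {
    let layout = Layout::new::<u8>();
    let (layout, offset_b) = layout.extend(Layout::new::<u32>())?;
    let (layout, offset_c) = layout.extend(Layout::new::<u16>())?;
    // `extend` does not add trailing padding; `pad_to_align` finishes the record.
    Ok((layout.pad_to_align(), [0, offset_b, offset_c]))
}
```

Comparing the returned layout with `Layout::new::<Example>()` checks that the field-by-field computation and the compiler agree for this type.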
Changes to the current The C Representation section
The section The C Representation in the reference is renamed to The
simple Representation. The following changes are made to its body:
~~The C representation is designed for dual purposes. One purpose is for creating types that are interoperable with the C Language. The second purpose is to create types that you can soundly perform operations on that rely on data layout such as reinterpreting values as a different type.~~

~~Because of this dual purpose, it is possible to create types that are not useful for interfacing with the C programming language.~~

The layout of a container type annotated with repr(simple) depends only on the layout of its fields but not on the target platform or the Rust version.

This representation can be applied to structs, unions, and enums.
The exception is zero-variant enums for which the C representation is an error.
...
Note: This algorithm can produce zero-sized structs. In C, an empty struct declaration like struct Foo { } is illegal. However, both gcc and clang support options to enable such structs, and assign them size zero. C++, in contrast, gives empty structs a size of 1, unless they are inherited from or they are fields that have the [[no_unique_address]] attribute, in which case they do not increase the overall size of the struct.
~~A union declared with #[repr(C)] will have the same size and alignment as an equivalent C union declaration in the C language for the target platform. The union~~ A union declared with #[repr(simple)] will have a size of the maximum size of all of its fields rounded to its alignment, and an alignment of the maximum alignment of all of its fields. These maximums may come from different fields.
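
The union rule can be sketched as a small function over (size, alignment) pairs. The names `Example` and `union_layout` are hypothetical, and `repr(C)` is used as a stand-in because `repr(simple)` is only proposed here.

```rust
use std::mem::{align_of, size_of};

// Hypothetical union: the largest size comes from `bytes`,
// the largest alignment from `word`.
#[allow(dead_code)]
#[repr(C)]
union Example {
    bytes: [u8; 5],
    word: u32,
}

/// Applies the rule to a list of `(size, align)` pairs, one per field.
fn union_layout(fields: &[(usize, usize)]) -> (usize, usize) {
    let align = fields.iter().map(|&(_, a)| a).max().unwrap_or(1);
    let max_size = fields.iter().map(|&(s, _)| s).max().unwrap_or(0);
    // Round the largest field size up to a multiple of the alignment.
    let size = (max_size + align - 1) / align * align;
    (size, align)
}

/// Layout of `Example` as predicted by the rule.
fn predicted_example_layout() -> (usize, usize) {
    union_layout(&[
        (size_of::<[u8; 5]>(), align_of::<[u8; 5]>()),
        (size_of::<u32>(), align_of::<u32>()),
    ])
}
```

On a target where u32 has alignment 4, the prediction is a size of 8 and an alignment of 4, even though no single field has both properties.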
...
~~For [field-less enums], the C representation has the size and alignment of the default enum size and alignment for the target platform's C ABI.~~

For [field-less enums], the layout of the simple representation depends on the range of the discriminants. The layout is chosen from the layouts of the following types, choosing the first type that can represent all discriminants:
- i32, u32
- i64, u64
- i128, u128
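
A minimal sketch of this selection follows. It assumes the candidates are tried in the order i32, u32, i64, u64, i128, u128, and it models discriminants as i128 values, so discriminants above i128::MAX (the only ones that would need u128) are out of scope. The function name `base_type` is hypothetical.

```rust
/// Returns the name of the first candidate type that can represent every
/// discriminant. Assumption: discriminants fit in `i128`, so `u128` is
/// never needed in this sketch.
fn base_type(discriminants: &[i128]) -> &'static str {
    // True if every discriminant lies in the inclusive range `lo..=hi`.
    let fits = |lo: i128, hi: i128| discriminants.iter().all(|&d| lo <= d && d <= hi);

    if fits(i32::MIN as i128, i32::MAX as i128) {
        "i32"
    } else if fits(0, u32::MAX as i128) {
        "u32"
    } else if fits(i64::MIN as i128, i64::MAX as i128) {
        "i64"
    } else if fits(0, u64::MAX as i128) {
        "u64"
    } else {
        "i128"
    }
}
```

For example, discriminants 0 and 4_000_000_000 select u32, while adding -1 to that set forces i64, because no 32-bit type covers both ends of the range.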
NOTE: While repr(C) on field-less enums was designed to use the representation
used by C, the current implementation always uses i32 as the base type.
Note: The enum representation in C is implementation defined, so this is really a "best guess". In particular, this may be incorrect when the C code of interest is compiled with certain flags.
Warning: There are crucial differences between an enum in the C language and Rust's [field-less enums] with this representation. An enum in C is mostly a typedef plus some named constants; in other words, an object of an enum type can hold any integer value. For example, this is often used for bitflags in C. In contrast, Rust’s [field-less enums] can only legally hold the discriminant values; everything else is [undefined behavior]. Therefore, using a field-less enum in FFI to model a C enum is often wrong.

The representation of a ~~repr(C)~~ repr(simple) enum with fields is a ~~repr(C)~~ repr(simple) struct with two fields~~, also called a "tagged union" in C~~:
- a repr(simple) version of the enum with all fields removed ("the tag")
- a repr(simple) union of repr(simple) structs for the fields of each variant that had them ("the payload")

Note: Due to the representation of ~~repr(C)~~ repr(simple) structs and unions, if a variant has a single field there is no difference between putting that field directly in the union or wrapping it in a struct; any system which wishes to manipulate such an enum's representation may therefore use whichever form is more convenient or consistent for them.
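
The tag-plus-payload expansion can be written out by hand. The sketch below uses today's `repr(C)` as a stand-in for the proposed `repr(simple)`; all type names are hypothetical.

```rust
use std::mem::{align_of, size_of};

// The enum with fields.
#[allow(dead_code)]
#[repr(C)]
enum Shape {
    Circle(f32),
    Rect { w: u8, h: u64 },
    Empty,
}

// "The tag": the same enum with all fields removed.
#[allow(dead_code)]
#[repr(C)]
enum ShapeTag {
    Circle,
    Rect,
    Empty,
}

// One struct per variant that has fields.
#[allow(dead_code)]
#[repr(C)]
#[derive(Clone, Copy)]
struct CircleFields(f32);

#[allow(dead_code)]
#[repr(C)]
#[derive(Clone, Copy)]
struct RectFields {
    w: u8,
    h: u64,
}

// "The payload": a union of the per-variant structs.
#[allow(dead_code)]
#[repr(C)]
union ShapePayload {
    circle: CircleFields,
    rect: RectFields,
}

// The struct with two fields that the enum is laid out as.
#[allow(dead_code)]
#[repr(C)]
struct ShapeRepr {
    tag: ShapeTag,
    payload: ShapePayload,
}

/// Checks that the enum and its manual expansion agree in size and alignment.
fn layouts_match() -> bool {
    size_of::<Shape>() == size_of::<ShapeRepr>()
        && align_of::<Shape>() == align_of::<ShapeRepr>()
}
```

The `Empty` variant contributes nothing to the payload union, which matches the rule that only variants with fields appear there.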
...
NOTE: At the end of the body are examples. Here all mentions of repr(C) are
replaced by repr(simple) and references to C and C++ are removed.
A new repr(C) section is added in place of the old one. The body text is as
follows:
The C representation is designed for interoperability with the C Language, including various vendor extensions. The layout of a type annotated with repr(C) is defined in terms of its equivalent C type and the target the program is compiled for. For this purpose, for each target, we define a normative compiler whose layout algorithm will be used to determine the layout of the equivalent C type. Here, layout means the size and alignment of the type and, for record types, the offset and size of its fields.

NOTE: GCC, Clang, and MSVC allow the user to specify flags on the command line that change the default layout of types. Such custom default representations are currently not supported. (For example: -fshort-enums and /Zp1.)

Whether the equivalent C type exists depends on the normative compiler. The algorithm used to compute the equivalent C type is described in Appendix B. If a Rust type annotated with repr(C) does not have an equivalent C type, then the layout of the Rust type is unspecified.

This representation can be applied to structs, unions, and enums.

Warning: There are crucial differences between an enum in the C language and Rust's [field-less enums] with this representation. An enum in C is mostly a typedef plus some named constants; in other words, an object of an enum type can hold any integer value. For example, this is often used for bitflags in C. In contrast, Rust’s [field-less enums] can only legally hold the discriminant values; everything else is [undefined behavior]. Therefore, using a field-less enum in FFI to model a C enum is often wrong.

On structs and unions, the C representation can be combined with the following alignment modifiers:
- repr(align(N)): This is equivalent to __declspec(align(N)) and __attribute__((aligned(N))) in C.
- repr(pragma_pack(N)): This is equivalent to #pragma pack(N) in C.
- repr(packed): This is an alias for repr(pragma_pack(1)).
- repr(packed(N)): This is an alias for repr(pragma_pack(N)).

NOTE: Using repr(packed) or repr(packed(N)) on repr(C) types is deprecated. A future version of rustc will emit a warning when these annotations are used together.
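
As an illustration of how these modifiers affect layout, the sketch below uses only attributes that current compilers accept. `repr(pragma_pack(N))` is proposed by this RFC and not available today, so `repr(packed(N))`, which the list above describes as its alias, is used instead. The type and function names are hypothetical.

```rust
use std::mem::{align_of, size_of};

// Alignment raised to 16: the size is rounded up to a multiple of 16.
#[allow(dead_code)]
#[repr(C, align(16))]
struct Aligned {
    a: u8,
}

// Field alignment capped at 2: `b` is placed at offset 2 instead of 4.
#[allow(dead_code)]
#[repr(C, packed(2))]
struct Packed2 {
    a: u8,
    b: u32,
}

// Field alignment capped at 1: no padding at all.
#[allow(dead_code)]
#[repr(C, packed)]
struct Packed1 {
    a: u8,
    b: u32,
}

/// Returns `(size, align)` for the three example types.
fn layouts() -> [(usize, usize); 3] {
    [
        (size_of::<Aligned>(), align_of::<Aligned>()),
        (size_of::<Packed2>(), align_of::<Packed2>()),
        (size_of::<Packed1>(), align_of::<Packed1>()),
    ]
}
```

With these definitions, `layouts()` yields (16, 16), (6, 2) and (5, 1) on targets where u32 has an alignment of at least 2.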
Changes to the Primitive representations section
All mentions of repr(C) in this section are replaced by repr(simple).
References to C and C++ are removed.
Changes to the The alignment modifiers section
The following paragraph is added:
This section describes the behavior for representations other than the C representation. For the behavior of these modifiers for repr(C) types, see the section describing the C representation.
Changes to the The transparent Representation section
All mentions of repr(C) in this section are replaced by repr(simple).
TODO: Almost everything is already described in the section above. Here we add some details and talk about lints etc.
- This changes the layout of repr(C) types on some Windows targets.
- This adds an additional type representation.
- This adds a new alignment modifier on repr(C) types that behaves exactly like one of the existing modifiers.
- This defines the layout of repr(C) types in terms of specific C compiler implementations.

TODO
- In Clang and GCC there are two kinds of packing attributes:
  - __attribute__((packed))
  - #pragma pack

  These behave significantly differently. In anticipation of us adding support for __attribute__((packed)) in the future, we deprecate the ambiguously named repr(packed).

TODO
TODO
TODO
This appendix contains a map from Rust targets to their normative C compilers.

GCC

- aarch64-unknown-linux-gnu
- aarch64-unknown-linux-musl
- aarch64-wrs-vxworks
- arm-unknown-linux-gnueabi
- arm-unknown-linux-gnueabihf
- arm-unknown-linux-musleabi
- arm-unknown-linux-musleabihf
- armv4t-unknown-linux-gnueabi
- armv5te-unknown-linux-gnueabi
- armv5te-unknown-linux-musleabi
- armv5te-unknown-linux-uclibceabi
- armv7-unknown-linux-gnueabi
- armv7-unknown-linux-gnueabihf
- armv7-unknown-linux-musleabi
- armv7-unknown-linux-musleabihf
- armv7-wrs-vxworks-eabihf
- avr-unknown-gnu-atmega328
- i586-unknown-linux-gnu
- i586-unknown-linux-musl
- i686-pc-windows-gnu
- i686-unknown-linux-gnu
- i686-unknown-linux-musl
- i686-uwp-windows-gnu
- i686-wrs-vxworks
- mips64el-unknown-linux-gnuabi64
- mips64el-unknown-linux-muslabi64
- mips64-unknown-linux-gnuabi64
- mips64-unknown-linux-muslabi64
- mipsel-unknown-linux-gnu
- mipsel-unknown-linux-musl
- mipsel-unknown-linux-uclibc
- mipsisa32r6el-unknown-linux-gnu
- mipsisa32r6-unknown-linux-gnu
- mipsisa64r6el-unknown-linux-gnuabi64
- mipsisa64r6-unknown-linux-gnuabi64
- mips-unknown-linux-gnu
- mips-unknown-linux-musl
- mips-unknown-linux-uclibc
- powerpc64le-unknown-linux-gnu
- powerpc64le-unknown-linux-musl
- powerpc64-unknown-linux-gnu
- powerpc64-unknown-linux-musl
- powerpc64-wrs-vxworks
- powerpc-unknown-linux-gnu
- powerpc-unknown-linux-musl
- powerpc-wrs-vxworks
- riscv32gc-unknown-linux-gnu
- riscv64gc-unknown-linux-gnu
- s390x-unknown-linux-gnu
- sparc64-unknown-linux-gnu
- sparc-unknown-linux-gnu
- thumbv7neon-unknown-linux-gnueabihf
- thumbv7neon-unknown-linux-musleabihf
- x86_64-linux-kernel
- x86_64-pc-windows-gnu
- x86_64-unknown-linux-gnu
- x86_64-unknown-linux-gnux32
- x86_64-unknown-linux-musl
- x86_64-uwp-windows-gnu
- x86_64-wrs-vxworks

Clang

- aarch64-apple-darwin
- aarch64-apple-ios
- aarch64-apple-ios-macabi
- aarch64-apple-tvos
- aarch64-fuchsia
- aarch64-linux-android
- aarch64-unknown-freebsd
- aarch64-unknown-hermit
- aarch64-unknown-netbsd
- aarch64-unknown-none
- aarch64-unknown-none-softfloat
- aarch64-unknown-openbsd
- aarch64-unknown-redox
- armebv7r-none-eabi
- armebv7r-none-eabihf
- arm-linux-androideabi
- armv6-unknown-freebsd
- armv6-unknown-netbsd-eabihf
- armv7a-none-eabi
- armv7a-none-eabihf
- armv7-apple-ios
- armv7-linux-androideabi
- armv7r-none-eabi
- armv7r-none-eabihf
- armv7s-apple-ios
- armv7-unknown-freebsd
- armv7-unknown-netbsd-eabihf
- asmjs-unknown-emscripten
- hexagon-unknown-linux-musl
- i386-apple-ios
- i686-apple-darwin
- i686-linux-android
- i686-unknown-freebsd
- i686-unknown-haiku
- i686-unknown-netbsd
- i686-unknown-openbsd
- mipsel-sony-psp
- mipsel-unknown-none
- msp430-none-elf
- powerpc64-unknown-freebsd
- powerpc-unknown-linux-gnuspe
- powerpc-unknown-netbsd
- powerpc-wrs-vxworks-spe
- riscv32imac-unknown-none-elf
- riscv32imc-unknown-none-elf
- riscv32i-unknown-none-elf
- riscv64gc-unknown-none-elf
- riscv64imac-unknown-none-elf
- sparc64-unknown-netbsd
- sparc64-unknown-openbsd
- sparcv9-sun-solaris
- thumbv4t-none-eabi
- thumbv6m-none-eabi
- thumbv7em-none-eabi
- thumbv7em-none-eabihf
- thumbv7m-none-eabi
- thumbv7neon-linux-androideabi
- thumbv8m.base-none-eabi
- thumbv8m.main-none-eabi
- thumbv8m.main-none-eabihf
- wasm32-unknown-emscripten
- wasm32-unknown-unknown
- wasm32-wasi
- x86_64-apple-darwin
- x86_64-apple-ios
- x86_64-apple-ios-macabi
- x86_64-apple-tvos
- x86_64-fortanix-unknown-sgx
- x86_64-fuchsia
- x86_64-linux-android
- x86_64-pc-solaris
- x86_64-rumprun-netbsd
- x86_64-sun-solaris
- x86_64-unknown-dragonfly
- x86_64-unknown-freebsd
- x86_64-unknown-haiku
- x86_64-unknown-hermit
- x86_64-unknown-hermit-kernel
- x86_64-unknown-illumos
- x86_64-unknown-l4re-uclibc
- x86_64-unknown-netbsd
- x86_64-unknown-openbsd
- x86_64-unknown-redox

MSVC

- aarch64-pc-windows-msvc
- aarch64-uwp-windows-msvc
- i586-pc-windows-msvc
- i686-pc-windows-msvc
- i686-unknown-uefi
- i686-uwp-windows-msvc
- thumbv7a-pc-windows-msvc
- thumbv7a-uwp-windows-msvc
- x86_64-pc-windows-msvc
- x86_64-unknown-uefi
- x86_64-uwp-windows-msvc

This appendix describes the algorithm used to compute the equivalent C type of a Rust type. The equivalent C type is only used for layout computations.
Let R be a Rust type.

- If R is one of u8, i8, u16, i16, u32, i32, u64, i64, isize, or usize, then the equivalent C type is the first of the following C types that has the same size as R: char, short, int, long, long long.

  NOTE: On all targets supported by Rust, if two of these C types have the same size, then they also have the same alignment.

- If R is one of f32 or f64, then the equivalent C type is the first of the following C types that has the same size as R: float, double, long double.

  NOTE: On avr-unknown-unknown, double is a 32-bit type.

- If R is bool, then the equivalent C type is _Bool.

- If R is u128 or i128 and the __int128 type is supported by the normative compiler, then the equivalent C type is __int128.

- If R is a pointer to a sized type, or a reference to a sized type, or a function pointer, or an Option of a reference to a sized type, or an Option of a function pointer, then the equivalent C type is void *.

- If R is an array whose element type has an equivalent C type, then the equivalent C type of R is the array with the same size whose element type is the equivalent C type of R's element type.

  NOTE: In positions where the normative compilers accept both zero-sized array members and flexible array members, the containing records have the same layout.

- If R is a fieldless enum with a repr(C) annotation and without a repr(align) annotation:
  - Let E be the C enum with the same enumeration constants and constant expressions. (This is a purely syntactic construction as the resulting syntax need not be accepted by the normative compiler.)
  - If E is accepted by the normative compiler and the enumeration constants are the same as the discriminant values of R, then E is the equivalent C type.
  - Otherwise R has no equivalent C type.

  NOTE: MSVC truncates enumeration constants to int.

- If R is an enum with fields with a repr(C) annotation and without a repr(align) annotation:
  - Let F be the equivalent fieldless Rust enum.
  - Let U be the union containing one field for each variant of the enum that has at least one field. The type of the field is the tuple or braced struct having the same fields as the enum variant body.
  - Let E be the struct struct { d: F, u: U }.
  - The equivalent C type of the enum is the equivalent C type of E, if any.

  NOTE: There are no fields in U for variants without fields because the normative compiler might not accept structs without fields. Since R is an enum with fields, U itself has at least one field.

  NOTE: It is significant that the discriminant is stored outside the union. This can affect the layout of the overall type.

- If R is a tuple struct with a repr(C) annotation:
  - Let S be the braced struct with the same number of fields whose fields are called field1, field2, etc. and whose field types are the types of the fields in the tuple struct.
  - The equivalent C type is the equivalent C type of S, if any.

- If R is a braced struct or a unit-like struct or a union with a repr(C) annotation:
  - If one of the types of the fields does not have an equivalent C type, then R has no equivalent C type.
  - Let S be the C struct or union (depending on the type of R) with the same field names and equivalent C field types. (This is a purely syntactic construction as the resulting syntax need not be accepted by the normative compiler.)
  - If R is annotated with repr(align(N)):
    - If the normative compiler is MSVC, annotate S with __declspec(align(N)).
    - If the normative compiler is GCC or Clang, annotate S with __attribute__((aligned(N))).
  - If R is annotated with repr(pragma_pack(N)), annotate S with #pragma pack(N).
  - If S is accepted by the normative compiler, then S is the equivalent C type of R.

- Otherwise R has no equivalent C type.
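
The first rule of the algorithm, the integer case, can be sketched as follows. This is an approximation under stated assumptions: it takes the sizes of the C types from the aliases in std::os::raw rather than from a normative compiler, and the function name `equivalent_c_integer` is hypothetical.

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_int, c_long, c_longlong, c_short};

/// Returns the name of the first of `char`, `short`, `int`, `long`,
/// `long long` whose size equals `rust_size`, or `None` if there is none.
/// Assumption: the `std::os::raw` aliases reflect the sizes used by the
/// target's C compiler.
fn equivalent_c_integer(rust_size: usize) -> Option<&'static str> {
    let candidates = [
        ("char", size_of::<c_char>()),
        ("short", size_of::<c_short>()),
        ("int", size_of::<c_int>()),
        ("long", size_of::<c_long>()),
        ("long long", size_of::<c_longlong>()),
    ];
    candidates
        .iter()
        .find(|&&(_, size)| size == rust_size)
        .map(|&(name, _)| name)
}
```

Calling `equivalent_c_integer(size_of::<u64>())` gives "long" where long is 64 bits wide and "long long" where it is 32 bits wide, which is why the rule is phrased in terms of the first type with a matching size. A 16-byte input yields `None`, matching the separate rule for u128 and i128.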