In C++20 we will hopefully get bit_cast (see the proposal and the reference implementation). This utility should give us a simple and safe way to type pun.
The one issue I ran into is that it requires the To and From types to be the same size, and it also checks that both are trivially copyable. A static_assert version of these checks is below:
#include <type_traits>  // std::is_trivially_copyable

#define BIT_CAST_STATIC_ASSERTS(TO, FROM) do { \
    static_assert(sizeof(TO) == sizeof(FROM)); \
    static_assert(std::is_trivially_copyable<TO>::value); \
    static_assert(std::is_trivially_copyable<FROM>::value); \
} while (false)
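To make these checks concrete, here is a minimal sketch of how they could sit inside a memcpy-based cast. The names bit_cast_memcpy and as_unsigned are hypothetical and this is not the proposal's reference implementation. The extra is_trivially_constructible check is an assumption of this sketch only, because it default-constructs the result before copying into it.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper, not the reference implementation from the proposal.
template <typename To, typename From>
To bit_cast_memcpy(const From& from) noexcept {
    static_assert(sizeof(To) == sizeof(From),
                  "To and From must have the same size");
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    static_assert(std::is_trivially_copyable<From>::value,
                  "From must be trivially copyable");
    // Assumption of this sketch: we default-construct To before the copy.
    static_assert(std::is_trivially_constructible<To>::value,
                  "this sketch default-constructs To");

    To to;
    std::memcpy(&to, &from, sizeof(To));  // copy the object representation
    return to;
}

// Example use: view the bits of a signed 32-bit value as unsigned.
inline std::uint32_t as_unsigned(std::int32_t v) {
    return bit_cast_memcpy<std::uint32_t>(v);
}
```

A call with mismatched sizes, such as bit_cast_memcpy<std::uint32_t>('a'), fails the first static_assert at compile time. That is the restriction discussed next.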
This is not an unreasonable constraint, but there may be cases where we would like to type pun, say, an array of char into a primitive type such as unsigned int.
After discussing this with JF Bastien, author of the proposal as well as the reference implementation, one way around this restriction is to copy the chunk we want to pun into a struct with the same size as the primitive we are punning to. Let’s see how this would work.
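#include <cstring>  // std::memcpy

// Wrapper with the same size as unsigned int (assumes sizeof(unsigned int) == 4).
struct four_chars {
    unsigned char arr[4] = {};
};

// p must point to at least 4 readable bytes.
unsigned int foo(unsigned char *p) {
    four_chars f;
    std::memcpy(f.arr, p, 4);
    unsigned int result = bit_cast<unsigned int>(f);
    return result;
}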
What is great about this is that the optimizer is smart enough to recognize that the memcpy and bit_cast can be reduced to a single mov directly into a register (see godbolt):
foo(unsigned char*): # @foo(unsigned char*)
mov eax, dword ptr [rdi]
ret
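The four_chars trick can be generalized so the wrapper size always follows the target type. The sketch below is one way to do that. The names byte_block and load_as are hypothetical. It assumes a compiler that provides __builtin_bit_cast (recent GCC, Clang and MSVC do), used here as a stand-in for bit_cast so the example does not depend on a particular standard library.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical wrapper: a struct holding exactly N bytes.
template <std::size_t N>
struct byte_block {
    unsigned char bytes[N] = {};
};

// Hypothetical helper: read sizeof(To) bytes starting at p and pun them to To.
// p must point to at least sizeof(To) readable bytes.
template <typename To>
To load_as(const unsigned char* p) {
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    static_assert(sizeof(byte_block<sizeof(To)>) == sizeof(To),
                  "wrapper and target must have the same size");

    byte_block<sizeof(To)> block;
    std::memcpy(block.bytes, p, sizeof(To));
    // Assumption: the compiler provides __builtin_bit_cast.
    return __builtin_bit_cast(To, block);
}
```

With this, foo above becomes load_as<unsigned int>(p). The value produced still depends on the platform's byte order, just as with the original version.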
It is worth pointing out the interesting aspect of the constexpr case. First, this requires compiler support, likely via a builtin, since memcpy() is not marked constexpr and reinterpret_cast is not allowed in a constant expression.
This mainly works because the underlying assumption is that the type puns allowed by bit_cast can be implemented as a mov to a register. The feature is interesting because it allows type punning at compile time. It also means no undefined behavior in that context: undefined behavior is not allowed in a constant expression, so we expect any attempt to invoke UB to be caught at compile time.
One case where this could come up is a bit_cast from a value whose underlying representation corresponds to no value of the To type. For example, the standard does not specify the underlying representation of bool, so a bit_cast to bool could invoke undefined behavior, e.g.:
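As a rough illustration of compile-time punning, the sketch below wraps a compiler builtin in a constexpr function. The name bit_cast_constexpr is hypothetical. It assumes the compiler provides __builtin_bit_cast (recent GCC, Clang and MSVC do), and the expected bit pattern only holds where float is a 32-bit IEEE-754 type, which the assertion guards for.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical constexpr wrapper around the compiler builtin.
template <typename To, typename From>
constexpr To bit_cast_constexpr(const From& from) noexcept {
    static_assert(sizeof(To) == sizeof(From),
                  "To and From must have the same size");
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    static_assert(std::is_trivially_copyable<From>::value,
                  "From must be trivially copyable");
    // Assumption: the compiler provides __builtin_bit_cast.
    return __builtin_bit_cast(To, from);
}

// Evaluated entirely at compile time.
constexpr std::uint32_t one_bits = bit_cast_constexpr<std::uint32_t>(1.0f);

// 1.0f is 0x3F800000 in IEEE-754 single precision; skip the check elsewhere.
static_assert(!std::numeric_limits<float>::is_iec559 || one_bits == 0x3F800000u,
              "unexpected bit pattern for 1.0f");
```

Because one_bits is a constexpr variable, a cast that would invoke undefined behavior in its initializer is expected to be rejected by the compiler rather than silently accepted.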
bool b = bit_cast<bool>('a') ; // UBsan catches this case: https://wandbox.org/permlink/P7hlo7AZDx2t0PoY
// runtime error: load of value 97, which is not a valid value for type 'bool'
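When the intent is simply "nonzero means true", one way to sidestep this hazard is to convert the value instead of reinterpreting its bits. The helper name byte_to_bool below is hypothetical and the sketch only illustrates the distinction. It is not something the proposal prescribes.

```cpp
// Hypothetical helper: a value conversion, not a reinterpretation of bits.
// Every input byte maps to a valid bool, so no invalid representation
// can ever be loaded.
constexpr bool byte_to_bool(unsigned char byte) noexcept {
    return byte != 0;
}

static_assert(byte_to_bool('a'), "any nonzero byte converts to true");
static_assert(!byte_to_bool('\0'), "zero converts to false");
```

The comparison always yields true or false, whereas a bit_cast would try to load the byte 97 as if it were already a bool object.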
We also have unspecified behavior in cases where the To type could have multiple possible values for a given From value, e.g.:
bit_cast<char>(true) // We are not guaranteed any specific value here.
// Although we may have certain expectations.
bit_cast<uintptr_t>(nullptr) // We would expect zero but it is not specified what the underlying value is