Last active
February 7, 2022 02:29
Re-interpret `int` as four bytes without casts
#include <stdio.h>
#include <limits.h>
/**
 * A `union` is declared like a `struct`, but whereas every member of a
 * `struct` exists simultaneously, the members of a `union` share the same
 * storage. That is, every top-level member of a `union` defines a
 * separate way to "interpret" its bytes.
 *
 * In this particular example, there are two top-level members:
 *
 * - `int` whole, which interprets the entire thing as an `int`
 * - an anonymous `struct` (a C11 feature) containing the `unsigned char`s
 *   `c0`, `c1`, `c2`, and `c3`, each of which represents a single byte of
 *   the 4-byte `union`. This particular example is implementation-dependent,
 *   as the size of an `int` isn't fixed by the standard, but on my system,
 *   an `int` is 4 bytes.
 *
 * The `struct` is needed in this case because we want the four
 * `unsigned char`s to form one single top-level member of the `union`. If
 * we were to, instead, declare the `union` as:
 *
 *     union my_int {
 *         int whole;
 *         unsigned char c0;
 *         unsigned char c1;
 *         unsigned char c2;
 *         unsigned char c3;
 *     };
 *
 * then all four `unsigned char`s would overlap the same first byte, and the
 * `union` could only be interpreted as an `int` or as a single
 * `unsigned char`, as opposed to an `int` and a block (`struct`) containing
 * four `unsigned char`s, each of which occupies its own position within the
 * variable's bytes. Pretty neat!
 */
union my_int {
    int whole;
    struct {
        unsigned char c0;
        unsigned char c1;
        unsigned char c2;
        unsigned char c3;
    };
};
int main(void)
{
    // On a little-endian machine with a 4-byte `int`, `c0` is the least
    // significant byte, so printing c3..c0 shows the most significant first.

    // 00 00 00 01
    union my_int test = { 1 };
    printf("%d\n", test.whole);
    printf("%02x %02x %02x %02x\n", test.c3, test.c2, test.c1, test.c0);

    // 7f ff ff ff
    union my_int test2 = { INT_MAX };
    printf("%d\n", test2.whole);
    printf("%02x %02x %02x %02x\n", test2.c3, test2.c2, test2.c1, test2.c0);

    return 0;
}
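The snippet above hard-codes four bytes and a little-endian layout. As a hedged sketch of how the same idea could follow whatever size `int` has on the target, the byte view can be an array sized with `sizeof`. The names `int_bytes`, `is_little_endian`, and `byte_at` are made up for this illustration and are not part of the original gist; the code assumes a plain little- or big-endian layout with no padding bits.

```c
#include <stddef.h>

/* Hypothetical variant of `my_int`: the byte view is an array, so it is
 * always exactly as wide as `int` on the current platform. */
union int_bytes {
    int whole;
    unsigned char bytes[sizeof(int)];
};

/* Returns 1 if the least significant byte is stored at the lowest address. */
int is_little_endian(void)
{
    union int_bytes probe = { 1 };
    return probe.bytes[0] == 1;
}

/* Returns the byte of `value` with the given significance, where 0 is the
 * least significant byte. Assumes significance < sizeof(int) and a plain
 * little- or big-endian layout. */
unsigned char byte_at(int value, size_t significance)
{
    union int_bytes u = { value };
    size_t index = is_little_endian()
        ? significance
        : sizeof(int) - 1 - significance;
    return u.bytes[index];
}
```

Calling `byte_at(value, i)` for `i` from `sizeof(int) - 1` down to `0` yields the bytes from most to least significant, which is the order the `printf` calls in the gist aim for.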
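Actually, whether aliasing via a `union` is allowed depends on the language. In C++, reading a `union` member other than the one last written is undefined behavior, and it only "works" there because compilers such as GCC and Clang tolerate it. In C, type punning through a `union` has been permitted since C99, and inspecting an object's bytes through `unsigned char` does not violate the strict aliasing rule.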
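For anyone who would rather sidestep the aliasing question entirely, one possible alternative is to copy the bytes out with `memcpy`, which is well defined in both C and C++ and still needs no casts. This is only a sketch: the helper names `int_to_bytes` and `bytes_to_int` are invented here, and the bytes come out in memory order, so which one is most significant still depends on the platform's endianness.

```c
#include <string.h>

/* Copies the object representation of `value` into `out`, which must have
 * room for sizeof(int) bytes. Bytes are stored in memory order. */
void int_to_bytes(int value, unsigned char out[sizeof(int)])
{
    memcpy(out, &value, sizeof value);
}

/* Inverse of int_to_bytes: rebuilds an `int` from sizeof(int) bytes that
 * are in memory order. */
int bytes_to_int(const unsigned char in[sizeof(int)])
{
    int value;
    memcpy(&value, in, sizeof value);
    return value;
}
```

A round trip such as `int_to_bytes(x, buf)` followed by `bytes_to_int(buf)` should give back `x`. Compilers typically optimize a fixed-size `memcpy` like this into a plain load or store, so the copy rarely costs anything in practice.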