@Nihhaar
Forked from shakna-israel/LetsDestroyC.md
Created January 31, 2020 06:24

Revisions

  1. @shakna-israel shakna-israel created this gist Jan 30, 2020.
    230 changes: 230 additions & 0 deletions LetsDestroyC.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,230 @@
    # Let's Destroy C

    I have a pet project I work on, every now and then. [CNoEvil](https://git.sr.ht/~shakna/cnoevil3/).

    The concept is simple enough.

    What if, for a moment, we forgot all the rules we know? What if we ignored every good idea, and accepted all the terrible ones? What if nothing was off limits? Can we turn C into a new language? Can we do what Lisp and Forth let the over-eager programmer do, but in C?

    ---

    ## Some concepts

    We're going to point out some definitions in other files - they're too big to inline into a blog post.

    You can assume that all of these header definitions get collapsed into a single file, called `evil.h`.

    We won't dwell on many C features. If they're not obvious to you, there's a lot of information at your fingertips to explain them. The idea here isn't to explain how C has moved on. It's to abuse it.

    ---

    First of all, let's fix up a simple program:


    ```C

    #include <stdio.h>

    int main(int argc, char* argv[]) {
        printf("%s\n", "Hello, World!");
    }
    ```

    That's an awful lot of symbolic syntax.

    Let's try and get rid of a little of that.

    ## Format

    Format specifiers are incredibly useful in C. They let you specify how many decimal places to put after a float, where to use commas when outputting numbers, whether to use the locale to get the right `,` or `.` separator, and so on.

    But, for the general case, we don't need them. So we can make them disappear.

    We can do this thanks to a C11 feature called `_Generic`, which is sort of like a type-based switch. It selects the branch whose type is compatible with the type of its argument, and falls back to `default` if none is.

    If we define `display_format` as a `_Generic` switch, like you can see in `evil_io.h`, then we can replace our printf with a very simple set of defines:

        #define display(x) printf(display_format(x), x)
        #define displayln(x) printf(display_format(x), x); printf("%s", "\r\n")

    (This is the Windows flavour of `displayln`. On other platforms, `evil_io.h` ends the line with a plain `\n`.)

    Now we can rewrite our program like this:

    ```C
    #include "evil.h"
    int main(int argc, char* argv[]) {
        displayln("Hello, World!");
    }
    ```

    There. That's a lot more high level. And it works correctly for a whole bunch of things other than strings, too.
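To make the type switch a little less magical, here is a minimal sketch of the same idea, cut down to a handful of types. The names `fmt_of` and `show_into` are made up for this illustration and are not part of `evil.h`. It assumes a C11 compiler.

```c
#include <assert.h>
#include <stdio.h>

/* Cut-down cousin of display_format: pick a printf format from the
 * static type of x. There is deliberately no default branch here, so an
 * unsupported type is a compile error rather than a wrong format. */
#define fmt_of(x) _Generic((x), \
    int:          "%d", \
    unsigned int: "%u", \
    double:       "%f", \
    char *:       "%s", \
    const char *: "%s")

/* Hypothetical helper: format one value into a caller-supplied buffer.
 * Evaluates to snprintf's return value (the length it wanted to write).
 * The assert is an arbitrary choice for this sketch: it rejects an
 * empty buffer, which would otherwise silently drop the output. */
#define show_into(buf, size, x) \
    (assert((size) > 0), snprintf((buf), (size), fmt_of(x), (x)))
```

With a `char buf[32]`, `show_into(buf, sizeof buf, 42)` leaves `42` in the buffer and `show_into(buf, sizeof buf, "hi")` leaves `hi`. A character literal such as `'a'` has type `int` in C, so it would come out as `97` rather than `a`.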

    ## Main

    We've got a fairly typical `main` definition here. But we can do better. We can hide `argc` and `argv`, and just assume the programmer knows they're implicitly available. Because there is nothing worse than implicit values.

    In fact, we'll also silence any compiler warnings about leaving them unused, in case we never inspect the command-line flags.

        #define Main int main(int __attribute__((unused)) argc, char __attribute__((unused)) **argv

    Unfortunately, just defining our `Main` isn't enough. We need a couple more defines, which will come in extremely handy in the future. Just a couple symbol replacements.

        #define then ){
        #define end }

    Now. That's better. We can now rewrite our program:

    ```C
    #include "evil.h"

    Main then
        displayln("Hello, World!");
    end
    ```

    Brilliant. Now it doesn't look like C. It still compiles like C. In fact, it should compile without warnings.

    (Have a glance at `evil_flow.h` for a few more useful defines that mean we can escape the brace syntax and pretend that C works like Lua's syntax.)

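As a rough sketch of how those flow macros read in practice, here is a small function written with them. The macro definitions are copied from `evil_flow.h`; the `collatz_steps` function is invented for the example. Note that the standard header is included before the macros are defined, so words like `end` can't be rewritten inside it.

```c
#include <assert.h>

/* The same token swaps that evil_flow.h performs. */
#define then  ){
#define end   }
#define If    if(
#define Else  } else {
#define While while(

/* Hypothetical example: count the Collatz steps from n down to 1.
 * Only the function's own braces are left. */
int collatz_steps(long n)
{
    assert(n >= 1);          /* the sequence is only defined for positive n */
    int steps = 0;
    While n > 1 then         /* expands to: while( n > 1 ){ */
        If n % 2 == 0 then   /* expands to: if( n % 2 == 0 ){ */
            n = n / 2;
        Else                 /* expands to: } else { */
            n = 3 * n + 1;
        end
        steps++;
    end
    return steps;
}
```

Every `then` closes a parenthesis that `If` or `While` opened, which is why the conditions need no parentheses of their own.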

    ## High Level Constructs

    We've got a `Hello, World` that looks simple. It wasn't a hard path to get here. But we can do even better than that.

    We can add in things people don't expect to exist in C at all.

    Then we can start pretending our poor, abused little program is actually a higher level language than it is. And we haven't even broken any C syntax, which means we can still link against any other C library. Even a header-only library works, as long as none of its identifiers collide with our macros.

    ### Lambda

    With a pair of GNU extensions (statement expressions and nested functions, so in practice GCC only), we can easily write a `lambda`, and give C the ability to have anonymous functions. We still need to use C's function-pointer syntax, but that doesn't turn out too bad in practice.

        #define lambda(ret_type, _body) ({ ret_type _ _body _; })

    There! Simple, isn't it? Well, maybe not entirely obvious how it works. The statement expression defines a nested function named `_` with the given return type and body, then evaluates to `_`, which decays to a pointer to that function. (See `evil_lambda.h` for our full implementation.)

    ```C
    #define EVIL_LAMBDA
    #include "evil.h"
    Main then
        int (*max)(int, int) = lambda(int,
            (int x, int y) {
                return x > y ? x : y;
            });
        displayln(max(1, 2));
    end
    ```

    We create a function pointer called max, which returns an int, and takes two int arguments. The lambda assigned to it matches. It returns the bigger of the two values with a simple one-liner.

    You use it like you might expect, but `max` only exists inside main, and is ready to be passed to another function so you can start building up your functional tools.
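Since the `lambda` macro only builds under GCC, here is a portable sketch of what a pointer like `max` is good for: handing behaviour to another function. The named function `max_of` stands in for the lambda, and `fold_ints` is a made-up helper, not something from CNoEvil.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the lambda: same signature as the `max` pointer above. */
int max_of(int x, int y)
{
    return x > y ? x : y;
}

/* Hypothetical helper: combine all n values with f, starting from init. */
int fold_ints(const int *xs, size_t n, int init, int (*f)(int, int))
{
    assert(xs != NULL || n == 0);
    int acc = init;
    for (size_t i = 0; i < n; i++) {
        acc = f(acc, xs[i]);   /* f decides how two values are combined */
    }
    return acc;
}
```

Calling `fold_ints(xs, 3, xs[0], max_of)` on an array holding 3, 9 and 4 gives 9. Under GCC with `EVIL_LAMBDA`, the `max` pointer from the example should be usable as the last argument in the same way, as long as the lambda does not refer to any local variables.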

    ### Coroutines

    You can write proper coroutine systems for C. They tend to be big, complicated, and extremely helpful.

    But we're doing the wrong thing.

    So, apart from emitting some compile-time warnings, the crux of `evil_coroutine.h` is this magnificent madness:

    ```C
    // Original macro hack by Robert Elder (c) 2016. Used against their advice, but with their permission.
    #define coroutine() static int state=0; switch(state) { case 0:
    #define co_return(x) { state=__LINE__; return x; case __LINE__:; }
    #define co_end() }
    ```

    By storing state and using a switch as a computed `GOTO`, you can now write functions that appear to be resumable.

    Like so:

    ```C
    #define EVIL_COROUTINE
    #include "evil.h"
    int example() {
        static int i = 0;
        coroutine();
        While true then
            co_return(++i);
        end
        co_end();
        return i;
    }

    Main then
        displayln(example());
        displayln(example());
        displayln(example());
        displayln(example());
        displayln(example());
        displayln(example());
        displayln(example());
        displayln(example());
    end
    ```
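To see what the three macros are actually doing, here is a smaller sketch that yields a fixed sequence instead of looping. The macros are the ones from `evil_coroutine.h`; the `next_step` function and its values are invented for the illustration.

```c
/* Macros as defined in evil_coroutine.h (hack by Robert Elder). */
#define coroutine()  static int state = 0; switch (state) { case 0:
#define co_return(x) { state = __LINE__; return x; case __LINE__:; }
#define co_end()     }

/* Hypothetical generator: yields 10, 20, 30, then -1 on every later call.
 * Each co_return must sit on its own source line, because __LINE__ is
 * what keeps the case labels unique. */
int next_step(void)
{
    coroutine();      /* the first call enters at case 0 */
    co_return(10);    /* store this line number, return; the next call jumps back here */
    co_return(20);
    co_return(30);
    co_end();
    return -1;        /* sequence exhausted; state stays at the last label */
}
```

Because `state` is a `static` local, there is only one instance of the sequence for the whole program, and nothing about it is thread-safe.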

    It's dangerous, and poorly thought through if you're insane enough to put this anywhere near production code, but it does look like we have coroutines.

    Unfortunately, those damn braces are back again.

    ## Procs

    Technically speaking, C doesn't have functions. Because functions are pure and have no side-effects, and C is one giant stinking pile of a side-effect.

    What C has is properly known as `procedures`. So let's reflect that when we redefine how we make them, to get rid of the braces:
    ```C
    #define declare(_name, _ret, ...) _ret _name(__VA_ARGS__)
    #define proc(_name, _ret, ...) _ret _name(__VA_ARGS__){
    ```

    This fits in nicely with our existing `then` and `end` macros.

    We put the return type after the name and right before the argument list, which can make the definition or declaration easier to read.

    It lets us change the above example into this marvelous little beauty:

    ```C
    #define EVIL_COROUTINE
    #include "evil.h"

    declare(example, int);

    proc(example, int)
        static int i = 0;
        coroutine();
        While true then
            co_return(++i);
        end
        co_end();
        return i;
    end

    Main then
        displayln(example());
        displayln(example());
        displayln(example());
        displayln(example());
        displayln(example());
        displayln(example());
        displayln(example());
        displayln(example());
    end

    ```

    That's better. It looks more consistent with the rest of our syntax, whilst still not breaking how C works at all.

    We've practically abolished symbols in the final syntax. They're still there, but minimal. We haven't introduced any whitespace sensitivity, but we have simplified how it looks. Made it feel like a scripting language.

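The example above takes no arguments, so here is a sketch of `declare` and `proc` with a parameter list. The macros are copied from `evil_proc.h` and `evil_flow.h`; the `gcd` procedure is a made-up example.

```c
#define then  ){
#define end   }
#define While while(
#define declare(_name, _ret, ...) _ret _name(__VA_ARGS__)
#define proc(_name, _ret, ...)    _ret _name(__VA_ARGS__){

/* Expands to: unsigned gcd(unsigned a, unsigned b); */
declare(gcd, unsigned, unsigned a, unsigned b);

/* Hypothetical example: greatest common divisor by Euclid's algorithm.
 * proc supplies the opening brace, the final `end` supplies the closing one. */
proc(gcd, unsigned, unsigned a, unsigned b)
    While b != 0 then
        unsigned t = a % b;
        a = b;
        b = t;
    end
    return a;
end
```

For a procedure with no parameters, passing `void` explicitly, as in `declare(example, int, void)`, gives a real prototype and avoids relying on an empty `__VA_ARGS__`, which GCC and Clang accept but older C standards do not strictly allow.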

    ---

    [CNoEvil](https://git.sr.ht/~shakna/cnoevil3/) goes a lot further than this. It adds introspection, a new assert library with its own stacktrace format, hash routines and so on.

    But this is a taste of how well you can screw up the C language with just a handful of overpowered macros.

    20 changes: 20 additions & 0 deletions evil_coroutine.h
    @@ -0,0 +1,20 @@

    #ifdef EVIL_COROUTINE
    // This lovely hack makes use of switch statements,
    // And the __LINE__ C macro
    // It tracks the current state, and switches case.
    // But... I imagine awful things may happen with an extra semi-colon.
    // Which would be hard to debug.
    #if defined(EVIL_LAMBDA) && !defined(EVIL_NO_WARN)
    // And bad things happen with expression statements.
    #warning "Lambdas don't play well with coroutines. Avoid using them in the body of a coroutine."
    #endif
    #ifndef EVIL_NO_WARN
    #warning "Coroutines don't support nesting. It may work sometimes, other times it may explode."
    #endif

    // Original macro hack by Robert Elder (c) 2016. Used against their advice, but with their permission.
    #define coroutine() static int state=0; switch(state) { case 0:
    #define co_return(x) { state=__LINE__; return x; case __LINE__:; }
    #define co_end() }
    #endif
    14 changes: 14 additions & 0 deletions evil_flow.h
    @@ -0,0 +1,14 @@

    #ifndef EVIL_NO_FLOW
    // Included by default

    #define then ){
    #define end }
    #define If if(
    #define Else } else {
    #define For for(
    #define While while(
    #define Do do{
    #define Switch(x) switch(x){
    #define Case(x) case x:
    #endif
    183 changes: 183 additions & 0 deletions evil_io.h
    @@ -0,0 +1,183 @@
    #ifndef EVIL_NO_IO
    // The IO Module.
    // Included by default. To pretend C is high-level.


    // User wants IO, give them all the IO.
    #include <stdio.h>

    // Yes, Generics. (aka type-switch). It's C11 only,
    // but who cares.
    // stdint identifiers (inttypes.h) should be catered for by the below.
    // Original display_format macro by Robert Gamble, (c) 2012
    // used with permission.
    // Expanded upon to incorporate const, volatile and const volatile types,
    // as they don't get selected for. (static does for obvious reasons).

    // Whilst volatile types can change between accesses, technically using a
    // _Generic _shouldn't_ access it, but compile to the right choice.

    #define display_format(x) _Generic((x), \
    char: "%c", \
    signed char: "%hhd", \
    unsigned char: "%hhu", \
    signed short: "%hd", \
    unsigned short: "%hu", \
    signed int: "%d", \
    unsigned int: "%u", \
    long int: "%ld", \
    unsigned long int: "%lu", \
    long long int: "%lld", \
    unsigned long long int: "%llu", \
    float: "%f", \
    double: "%f", \
    long double: "%Lf", \
    char *: "%s", \
    void *: "%p", \
    volatile char: "%c", \
    volatile signed char: "%hhd", \
    volatile unsigned char: "%hhu", \
    volatile signed short: "%hd", \
    volatile unsigned short: "%hu", \
    volatile signed int: "%d", \
    volatile unsigned int: "%u", \
    volatile long int: "%ld", \
    volatile unsigned long int: "%lu", \
    volatile long long int: "%lld", \
    volatile unsigned long long int: "%llu", \
    volatile float: "%f", \
    volatile double: "%f", \
    volatile long double: "%Lf", \
    volatile char *: "%s", \
    volatile void *: "%p", \
    const char: "%c", \
    const signed char: "%hhd", \
    const unsigned char: "%hhu", \
    const signed short: "%hd", \
    const unsigned short: "%hu", \
    const signed int: "%d", \
    const unsigned int: "%u", \
    const long int: "%ld", \
    const unsigned long int: "%lu", \
    const long long int: "%lld", \
    const unsigned long long int: "%llu", \
    const float: "%f", \
    const double: "%f", \
    const long double: "%Lf", \
    const char *: "%s", \
    const void *: "%p", \
    const volatile char: "%c", \
    const volatile signed char: "%hhd", \
    const volatile unsigned char: "%hhu", \
    const volatile signed short: "%hd", \
    const volatile unsigned short: "%hu", \
    const volatile signed int: "%d", \
    const volatile unsigned int: "%u", \
    const volatile long int: "%ld", \
    const volatile unsigned long int: "%lu", \
    const volatile long long int: "%lld", \
    const volatile unsigned long long int: "%llu", \
    const volatile float: "%f", \
    const volatile double: "%f", \
    const volatile long double: "%Lf", \
    const volatile char *: "%s", \
    const volatile void *: "%p", \
    default: "%d")

    // The main printing function.
    #define display(x) printf(display_format(x), x)
    #define displayf(f, x) fprintf(f, display_format(x), x)

    // Windows has a different line ending.
    #if defined(_WIN32) || defined(__WIN32) || defined(WIN32) || defined(__WIN32__) || defined(_WIN64) || defined(__WIN64) || defined(WIN64) || defined(__WIN64__) || defined(__WINNT) || defined(__WINNT__) || defined(WINNT)
    #define displayln(x) printf(display_format(x), x); printf("%s", "\r\n")
    #define displayfln(f, x) fprintf(f, display_format(x), x); fprintf(f, "%s", "\r\n")
    #else
    #define displayln(x) printf(display_format(x), x); printf("%c", '\n')
    #define displayfln(f, x) fprintf(f, display_format(x), x); fprintf(f, "%c", '\n')
    #endif

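    // Basically a _Generic that names the type of x.
    // The (0,x) comma expression makes arrays decay to pointers before matching.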
    #define repr_type(x) _Generic((0,x), \
    char: "char", \
    signed char: "signed char", \
    unsigned char: "unsigned char", \
    signed short: "signed short", \
    unsigned short: "unsigned short", \
    signed int: "signed int", \
    unsigned int: "unsigned int", \
    long int: "long int", \
    unsigned long int: "unsigned long int", \
    long long int: "long long int", \
    unsigned long long int: "unsigned long long int", \
    float: "float", \
    double: "double", \
    long double: "long double", \
    char *: "char pointer", \
    void *: "void pointer", \
    volatile char: "volatile char", \
    volatile signed char: "volatile signed char", \
    volatile unsigned char: "volatile unsigned char", \
    volatile signed short: "volatile signed short", \
    volatile unsigned short: "volatile unsigned short", \
    volatile signed int: "volatile signed int", \
    volatile unsigned int: "volatile unsigned int", \
    volatile long int: "volatile long int", \
    volatile unsigned long int: "volatile unsigned long int", \
    volatile long long int: "volatile long long int", \
    volatile unsigned long long int: "volatile unsigned long long int", \
    volatile float: "volatile float", \
    volatile double: "volatile double", \
    volatile long double: "volatile long double", \
    volatile char *: "volatile char pointer", \
    volatile void *: "volatile void pointer", \
    const char: "const char", \
    const signed char: "const signed char", \
    const unsigned char: "const unsigned char", \
    const signed short: "const signed short", \
    const unsigned short: "const unsigned short", \
    const signed int: "const signed int", \
    const unsigned int: "const unsigned int", \
    const long int: "const long int", \
    const unsigned long int: "const unsigned long int", \
    const long long int: "const long long int", \
    const unsigned long long int: "const unsigned long long int", \
    const float: "const float", \
    const double: "const double", \
    const long double: "const long double", \
    const char *: "const char pointer", \
    const void *: "const void pointer", \
    const volatile char: "const volatile char", \
    const volatile signed char: "const volatile signed char", \
    const volatile unsigned char: "const volatile unsigned char", \
    const volatile signed short: "const volatile signed short", \
    const volatile unsigned short: "const volatile unsigned short", \
    const volatile signed int: "const volatile signed int", \
    const volatile unsigned int: "const volatile unsigned int", \
    const volatile long int: "const volatile long int", \
    const volatile unsigned long int: "const volatile unsigned long int", \
    const volatile long long int: "const volatile long long int", \
    const volatile unsigned long long int: "const volatile unsigned long long int", \
    const volatile float: "const volatile float", \
    const volatile double: "const volatile double", \
    const volatile long double: "const volatile long double", \
    const volatile char *: "const volatile char pointer", \
    const volatile void *: "const volatile void pointer", \
    default: "Unknown")


    // endl, just a symbol that can be used to produce the normal
    // line ending.
    // endlf can take a file to print to.
    // e.g. ```display(x); display(y); endl;```
    // ```endlf(FILE* x);```
    // Windows has a different line ending.
    #if defined(_WIN32) || defined(__WIN32) || defined(WIN32) || defined(__WIN32__) || defined(_WIN64) || defined(__WIN64) || defined(WIN64) || defined(__WIN64__) || defined(__WINNT) || defined(__WINNT__) || defined(WINNT)
    #define endl printf("%s", "\r\n")
    #define endlf(f) fprintf(f, "%s", "\r\n")
    #else
    #define endl printf("%c", '\n')
    #define endlf(f) fprintf(f, "%c", '\n')
    #endif

    #endif
    20 changes: 20 additions & 0 deletions evil_lambda.h
    @@ -0,0 +1,20 @@

    #ifdef EVIL_LAMBDA
    // This requires nested functions to be allowed.
    // GCC supports them. Clang does not, hence the guard below.
    #if defined(__clang__) || !defined(__GNUC__)
    #error "Lambda requires a GNU compiler."
    #endif
    // A cleaner, but slightly more cumbersome lambda:
    #define lambda(ret_type, _body) ({ ret_type _ _body _; })
    // e.g. int (*max)(int, int) = lambda (int, (int x, int y) { return x > y ? x : y; });
    // Pros:
    // * Woot, easier to pass, as the user has to know the signature anyway.
    // * Name not part of lambda definition. More lambda-y.
    // * Body of function inside macro, feels more like a lambda.
    // * Uses a statement expression (a GNU extension), which creates a properly constructed function pointer.
    // Cons:
    // * The signature isn't constructed for the user, they have to both know and understand it.
    #endif
    6 changes: 6 additions & 0 deletions evil_main.h
    @@ -0,0 +1,6 @@


    #ifndef EVIL_NO_MAIN
    // Included by default
    #define Main int main(int __attribute__((unused)) argc, char __attribute__((unused)) **argv
    #endif
    7 changes: 7 additions & 0 deletions evil_proc.h
    @@ -0,0 +1,7 @@

    #ifndef EVIL_NO_PROC
    // Included by default

    #define declare(_name, _ret, ...) _ret _name(__VA_ARGS__)
    #define proc(_name, _ret, ...) _ret _name(__VA_ARGS__){
    #endif