A jugaadu C23 parser
/*
* Creator: Naman Dixit
* Notice: © Copyright 2024 Naman Dixit
* License: BSD Zero Clause License
* SPDX: 0BSD (https://spdx.org/licenses/0BSD.html)
*/
/*
* This is a single-header, zero-dependency, easily embeddable C23 parser and associated pretty-printer.
*
* Characteristics:
* 1. Designed for meta-programming: The parser is designed for the purpose of meta-programming (including but not limited to
* code generation, reflection, etc.). Thus, the parser is more liberal than it needs to be and accepts constructs that
* would otherwise be considered non-compliant. In other words, false positives are okay (unless too catastrophic) but
* false negatives are a bug.
* 2. Not production ready: Some corners were cut to get this done ASAP. So, each token must fit in a fixed 108-byte buffer, memory is
* only allocated but never freed (use arenas), etc.
* 3. Not performant: For simplicity's sake, the simplest data structures and algorithms were used. So, instead of a hash table,
* we linearly scan a linked list, etc. No optimizations whatsoever.
* 4. Not well-tested: There is a small test harness (which depends on SDL3, yeah I know), but no exhaustive testing or fuzzing.
* 5. Missing Features: There is no preprocessor and all pre-processing directives are treated as comments. Additionally, some
* small features regarding types like _Atomic, typeof, etc. are missing (pretty easy to add, I just never found the time).
* Search "TODO" for more info.
*
* Usage:
* HC_Parser* hcParserCreate (char *src, void *mem_userdata, void *err_userdata)
* Create a parser object with the source code, the userdata for the memory allocator (if any) and the userdata for the error handler (if any).
*
* void hcParserAddGlobalType (HC_Parser *parser, char *typename)
* Add any pre-defined types (int, float, etc.) into the type table (due to C's infamous ambiguous syntax around types).
*
* HC_Syntax_Tree hcParse (HC_Parser *parser)
* Parse the source code provided to hcParserCreate() and return an AST.
*
* void hcPrintSyntaxTree (HC_Syntax_Tree syntax_tree, void *fmt_userdata, void *err_userdata)
* Pretty-print the syntax tree back into C code (potentially after some modifications for meta-programming).
* fmt_userdata is passed to the HC_SOURCE_OUTPUR_PRINTF macro (see below)
*
* Configuration:
* Required: These macros need to be defined in one translation unit
* HYPERC_IMPLEMENTATION
* HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, format_string, ...)
* HC_MALLOC(mem_userdata, size)
* Optional:
* HC_ERROR_EXIT(err_userdata, err_code)
* HC_ERROR_PRINTF(format, ...)
* HC_STRLEN(str)
* HC_MEMCMP(lhs_ptr, rhs_ptr, byte_count)
* HC_MEMSET(dst_ptr, byte, byte_count)
* HC_MEMCPY(dst_ptr, src_ptr, byte_count)
* HC_FUNC_DECORATOR: defaults to static inline
* HC_BOOL, HC_TRUE, HC_FALSE: Boolean type and values (since bool is not always available). Define all three or none; a partial set is replaced by char/1/0
* HYPERC_TEST: Run tests (adds a main function, only useful during development)
*/
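/*
Example (not part of the original header): the notes above recommend arenas, since memory is allocated
but never freed. What follows is only a minimal sketch of one way to back HC_MALLOC with a bump arena.
The Example_Arena type and exampleArenaAlloc() are hypothetical names invented for this illustration,
and the sketch assumes the caller-owned buffer is itself suitably aligned.

```c
#include <stddef.h>

// Hypothetical bump arena: hands out slices of one caller-owned buffer.
typedef struct Example_Arena {
    unsigned char *base;
    size_t capacity;
    size_t used;
} Example_Arena;

// Returns size bytes at a 16-byte-aligned offset, or NULL when the buffer is exhausted.
static void* exampleArenaAlloc (void *mem_userdata, size_t size)
{
    Example_Arena *arena = mem_userdata;
    size_t start = (arena->used + 15u) & ~(size_t)15u; // Round the offset up to a multiple of 16
    if ((start > arena->capacity) || (size > (arena->capacity - start))) return NULL;
    arena->used = start + size;
    return arena->base + start;
}

// Assumed wiring: define this before including the header with HYPERC_IMPLEMENTATION.
#define HC_MALLOC(mem_userdata, size) exampleArenaAlloc((mem_userdata), (size))
```

Under these assumptions, a pointer to the Example_Arena would be passed as mem_userdata to
hcParserCreate(), and the whole buffer released in one go once the syntax tree is no longer needed.
*/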
#if 0 || defined(HYPERC_TEST)
/* Use the following command to build and run the parser tests (requires SDL3):
clang -x c -DHYPERC_TEST assets/code/hyperc.h --std=c23 -g3 -Wall -Wextra -I assets/code/ -I provisions/sources/ -o artifacts/code/win64/hyperc.test.exe -L provisions/binaries/win64/ -lSDL3 && artifacts\code\win64\hyperc.test.exe
*/
#include "SDL3/SDL.h"
typedef struct SDL_strformat_Data {char *str;} SDL_strformat_Data;
static inline SDL_PRINTF_VARARG_FUNC(2) void SDL_strformat (SDL_strformat_Data *data, SDL_PRINTF_FORMAT_STRING char *fmt, ...);
# define HYPERC_IMPLEMENTATION
# define HC_ERROR_EXIT(...) SDL_assert_release(false)
# define HC_ERROR_PRINTF(...) SDL_Log(__VA_ARGS__)
# define HC_SOURCE_OUTPUR_PRINTF(_userdata, ...) SDL_strformat(_userdata, __VA_ARGS__)
# define HC_MALLOC(_, _size) SDL_malloc(_size)
# define HC_BOOL bool
# define HC_TRUE true
# define HC_FALSE false
#else
#pragma once
#endif
#if !defined(HC_ERROR_EXIT)
# include <stdlib.h>
# define HC_ERROR_EXIT(...) abort()
#endif
#if !defined(HC_ERROR_PRINTF)
# include <stdio.h>
# define HC_ERROR_PRINTF(...) fprintf(stderr, __VA_ARGS__)
#endif
#if !defined(HC_STRLEN)
# include <string.h>
# define HC_STRLEN(...) strlen(__VA_ARGS__)
#endif
#if !defined(HC_MEMCMP)
# include <string.h>
# define HC_MEMCMP(...) memcmp(__VA_ARGS__)
#endif
#if !defined(HC_MEMSET)
# include <string.h>
# define HC_MEMSET(...) memset(__VA_ARGS__)
#endif
#if !defined(HC_MEMCPY)
# include <string.h>
# define HC_MEMCPY(...) memcpy(__VA_ARGS__)
#endif
#if !defined(HC_FUNC_DECORATOR)
# define HC_FUNC_DECORATOR static inline
#endif
#if !defined(HC_BOOL) || !defined(HC_TRUE) || !defined(HC_FALSE)
# if defined(HC_BOOL)
# undef HC_BOOL
# endif
# define HC_BOOL char
# if defined(HC_TRUE)
# undef HC_TRUE
# endif
# define HC_TRUE 1
# if defined(HC_FALSE)
# undef HC_FALSE
# endif
# define HC_FALSE 0
#endif
#include <stddef.h> // size_t, NULL
#include <stdint.h> // uint32_t, uint64_t
typedef enum HC_Token_Kind {
HC_Token_Kind_EOF = 0,
// NOTE(naman): Single-character non-alphanumeric tokens (like '+', '{', etc.) with ASCII value <= 127 are stored as their own kind.
HC_Token_Kind_EQUALITY = 128,
HC_Token_Kind_NOTEQUAL,
HC_Token_Kind_LESSEQ,
HC_Token_Kind_GREATEQ,
HC_Token_Kind_LOGICAL_OR,
HC_Token_Kind_LOGICAL_AND,
HC_Token_Kind_SHIFT_LEFT,
HC_Token_Kind_SHIFT_RIGHT,
HC_Token_Kind_ADD_ASSIGN,
HC_Token_Kind_SUB_ASSIGN,
HC_Token_Kind_MULTIPLY_ASSIGN,
HC_Token_Kind_DIVIDE_ASSIGN,
HC_Token_Kind_MODULUS_ASSIGN,
HC_Token_Kind_XOR_ASSIGN,
HC_Token_Kind_OR_ASSIGN,
HC_Token_Kind_AND_ASSIGN,
HC_Token_Kind_SHIFT_LEFT_ASSIGN,
HC_Token_Kind_SHIFT_RIGHT_ASSIGN,
HC_Token_Kind_INCREMENT,
HC_Token_Kind_DECREMENT,
HC_Token_Kind_PTR_DEREF,
HC_Token_Kind_ELLIPSIS,
HC_Token_Kind_ATTRIBUTE_NAMESPACE,
HC_Token_Kind_ATTRIBUTE_BEGIN,
HC_Token_Kind_ATTRIBUTE_END,
HC_Token_Kind_CONCATENATE,
HC_Token_Kind_IDENTIFIER = 175,
HC_Token_Kind_CONSTANT_INTEGER,
HC_Token_Kind_CONSTANT_FLOATING,
HC_Token_Kind_CONSTANT_CHARACTER,
HC_Token_Kind_CONSTANT_STRING,
HC_Token_Kind_ALIGNAS = 190,
HC_Token_Kind_ALIGNOF,
HC_Token_Kind_AUTO,
HC_Token_Kind_BOOL,
HC_Token_Kind_BREAK,
HC_Token_Kind_CASE,
HC_Token_Kind_CHAR,
HC_Token_Kind_CONST,
HC_Token_Kind_CONSTEXPR,
HC_Token_Kind_CONTINUE,
HC_Token_Kind_DEFAULT,
HC_Token_Kind_DO,
HC_Token_Kind_DOUBLE,
HC_Token_Kind_ELSE,
HC_Token_Kind_ENUM,
HC_Token_Kind_EXTERN,
HC_Token_Kind_FALSE,
HC_Token_Kind_FLOAT,
HC_Token_Kind_FOR,
HC_Token_Kind_GOTO,
HC_Token_Kind_IF,
HC_Token_Kind_INLINE,
HC_Token_Kind_INT,
HC_Token_Kind_LONG,
HC_Token_Kind_NULLPTR,
HC_Token_Kind_REGISTER,
HC_Token_Kind_RESTRICT,
HC_Token_Kind_RETURN,
HC_Token_Kind_SHORT,
HC_Token_Kind_SIGNED,
HC_Token_Kind_SIZE_OF,
HC_Token_Kind_STATIC,
HC_Token_Kind_STATIC_ASSERT,
HC_Token_Kind_STRUCT,
HC_Token_Kind_SWITCH,
HC_Token_Kind_THREAD_LOCAL,
HC_Token_Kind_TRUE,
HC_Token_Kind_TYPEDEF,
HC_Token_Kind_TYPEOF,
HC_Token_Kind_TYPEOF_UNQUAL,
HC_Token_Kind_UNION,
HC_Token_Kind_UNSIGNED,
HC_Token_Kind_VOID,
HC_Token_Kind_VOLATILE,
HC_Token_Kind_WHILE,
HC_Token_Kind_GENERIC,
HC_Token_Kind_NORETURN,
HC_Token_Kind_TOTAL,
} HC_Token_Kind;
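/*
Example (not part of the original header): the enum above reserves values below 128 for single-character
tokens and starts multi-character operators at 128. The sketch below illustrates that numbering scheme
in isolation. EX_KIND_* and exOperatorKind() are hypothetical names, and only two operators are
covered; it is not the tokenizer's actual dispatch code.

```c
// Hypothetical mirror of the first two multi-character kinds.
enum { EX_KIND_EQUALITY = 128, EX_KIND_NOTEQUAL };

// Two-character operators get a dedicated kind; anything else is its own ASCII value.
static int exOperatorKind (const char *s)
{
    if ((s[0] == '=') && (s[1] == '=')) return EX_KIND_EQUALITY;
    if ((s[0] == '!') && (s[1] == '=')) return EX_KIND_NOTEQUAL;
    return (unsigned char)s[0];
}
```

One consequence of this layout is that a caller can compare a kind directly against a character
literal such as '{' without needing a separate enumerator for every punctuator.
*/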
typedef struct HC_Token {
char str[108];
HC_Token_Kind kind;
uint32_t line, column;
size_t length;
} HC_Token;
typedef struct HC_Syntax_Attr {
HC_Token namespace;
HC_Token name;
struct HC_Syntax_Attr *child;
struct HC_Syntax_Attr *next;
} HC_Syntax_Attr;
typedef struct HC_Syntax_Decl_Specifiers {
struct {
HC_BOOL s_auto;
HC_BOOL s_constexpr;
HC_BOOL s_extern;
HC_BOOL s_register;
HC_BOOL s_static;
HC_BOOL s_thread_local;
} storage;
struct {
HC_BOOL f_inline;
HC_BOOL f_noreturn;
} function;
} HC_Syntax_Decl_Specifiers;
typedef struct HC_Syntax_Decl HC_Syntax_Decl;
typedef struct HC_Syntax_Stmt HC_Syntax_Stmt;
typedef struct HC_Syntax_Expr HC_Syntax_Expr;
typedef struct HC_Syntax_Enumerator {
struct HC_Syntax_Enumerator *next;
HC_Token name;
HC_Syntax_Expr *value;
HC_Syntax_Attr *attr;
} HC_Syntax_Enumerator;
typedef enum HC_Syntax_Type_Kind {
HC_Syntax_Type_Kind_BASE,
HC_Syntax_Type_Kind_PTR,
HC_Syntax_Type_Kind_ARRAY,
HC_Syntax_Type_Kind_FUNC,
HC_Syntax_Type_Kind_STRUCT,
HC_Syntax_Type_Kind_UNION,
HC_Syntax_Type_Kind_ENUM,
} HC_Syntax_Type_Kind;
typedef struct HC_Syntax_Type {
HC_Token name;
HC_Syntax_Type_Kind kind;
HC_Syntax_Attr *attr;
union {
struct HC_Syntax_Type_Array_Pointer {
struct HC_Syntax_Type *deref;
HC_Syntax_Expr *elems;
} array_ptr;
struct HC_Syntax_Type_Struct_Union {
HC_Syntax_Decl *members;
} struct_union;
struct HC_Syntax_Type_Enumeration {
struct HC_Syntax_Type *type;
HC_Syntax_Enumerator *enumerators;
} enumeration;
struct HC_Syntax_Type_Function {
HC_Syntax_Decl *args;
struct HC_Syntax_Type *return_type;
} function;
};
struct {
HC_BOOL s_signed;
HC_BOOL s_unsigned;
} specifier;
struct {
HC_BOOL q_const;
HC_BOOL q_restrict;
HC_BOOL q_volatile;
} qualifier;
HC_Syntax_Decl_Specifiers _decl_spec; // Temporary storage, to be later stored in the HC_Syntax_Decl->spec
} HC_Syntax_Type;
struct HC_Syntax_Decl {
HC_Token name;
HC_Syntax_Type *type;
HC_Syntax_Attr *attr;
HC_Syntax_Expr *bitfield_size;
struct HC_Syntax_Decl *next;
HC_Syntax_Decl_Specifiers spec;
};
typedef struct HC_Syntax_Dsig { // Designated Initializer
HC_Token dotted;
HC_Syntax_Expr *indexed;
struct HC_Syntax_Dsig *next;
} HC_Syntax_Dsig;
typedef enum HC_Syntax_Expr_Kind {
HC_Syntax_Expr_Kind_LITERAL,
HC_Syntax_Expr_Kind_INFIX,
HC_Syntax_Expr_Kind_PREFIX,
HC_Syntax_Expr_Kind_POSTFIX,
HC_Syntax_Expr_Kind_CALL,
HC_Syntax_Expr_Kind_INDEX,
HC_Syntax_Expr_Kind_TERNARY,
HC_Syntax_Expr_Kind_DESIG_INIT,
HC_Syntax_Expr_Kind_TYPE_CAST,
HC_Syntax_Expr_Kind_SIZE_OF,
} HC_Syntax_Expr_Kind;
typedef struct HC_Syntax_Expr {
HC_Syntax_Expr_Kind kind;
struct HC_Syntax_Expr *next_sibling;
union {
struct { // HC_Syntax_Expr_Kind_LITERAL
HC_Token token;
} literal;
struct { // HC_Syntax_Expr_Kind_INFIX
struct HC_Syntax_Expr *lhs, *rhs;
HC_Token op;
} infix;
struct { // HC_Syntax_Expr_Kind_PREFIX
struct HC_Syntax_Expr *rhs;
HC_Token op;
} prefix;
struct { // HC_Syntax_Expr_Kind_POSTFIX
struct HC_Syntax_Expr *lhs;
HC_Token op;
} postfix;
struct { // HC_Syntax_Expr_Kind_CALL
HC_Syntax_Expr *func;
struct HC_Syntax_Expr *first_arg;
} call;
struct { // HC_Syntax_Expr_Kind_INDEX
struct HC_Syntax_Expr *array;
struct HC_Syntax_Expr *index;
} index;
struct { // HC_Syntax_Expr_Kind_TERNARY
struct HC_Syntax_Expr *cond;
struct HC_Syntax_Expr *then;
struct HC_Syntax_Expr *or_else;
} ternary;
struct { // HC_Syntax_Expr_Kind_DESIG_INIT
HC_Syntax_Type *type;
struct HC_Syntax_Expr *first_init;
HC_Syntax_Dsig *desig;
struct HC_Syntax_Expr *init;
} desig_init;
struct { // HC_Syntax_Expr_Kind_TYPE_CAST
HC_Syntax_Type *type;
struct HC_Syntax_Expr *expr;
} type_cast;
struct { // HC_Syntax_Expr_Kind_SIZE_OF
HC_Syntax_Type *arg_type;
struct HC_Syntax_Expr *arg_expr;
} size_of;
};
} HC_Syntax_Expr;
typedef enum HC_Syntax_Stmt_Kind {
HC_Syntax_Stmt_Kind_TOP_LEVEL,
HC_Syntax_Stmt_Kind_STMT_EMPTY,
HC_Syntax_Stmt_Kind_STMT_BLOCK,
HC_Syntax_Stmt_Kind_STMT_RETURN,
HC_Syntax_Stmt_Kind_STMT_IF,
HC_Syntax_Stmt_Kind_STMT_FOR,
HC_Syntax_Stmt_Kind_STMT_WHILE,
HC_Syntax_Stmt_Kind_STMT_FUNDECL,
HC_Syntax_Stmt_Kind_STMT_VARDECL,
HC_Syntax_Stmt_Kind_STMT_TYPEDEF,
HC_Syntax_Stmt_Kind_STMT_EXPR,
} HC_Syntax_Stmt_Kind;
struct HC_Syntax_Stmt {
HC_Syntax_Stmt_Kind kind;
HC_Syntax_Attr *attribute;
struct HC_Syntax_Stmt *next_sibling;
union {
struct HC_Syntax_Stmt_Top_Level { // HC_Syntax_Stmt_Kind_TOP_LEVEL
struct HC_Syntax_Stmt *first_stmt;
} top_level;
struct HC_Syntax_Stmt_Block { // HC_Syntax_Stmt_Kind_STMT_BLOCK
struct HC_Syntax_Stmt *first_stmt;
} stmt_block;
struct HC_Syntax_Stmt_Return { // HC_Syntax_Stmt_Kind_STMT_RETURN
struct HC_Syntax_Expr *expr;
} stmt_return;
struct HC_Syntax_Stmt_If { // HC_Syntax_Stmt_Kind_STMT_IF
struct HC_Syntax_Stmt *first_branch;
struct HC_Syntax_Expr *cond;
struct HC_Syntax_Stmt *body;
} stmt_if;
struct HC_Syntax_Stmt_For { // HC_Syntax_Stmt_Kind_STMT_FOR
// Only one init will be non-NULL
struct HC_Syntax_Expr *init_expr;
struct HC_Syntax_Stmt *init_decl;
struct HC_Syntax_Expr *cond;
struct HC_Syntax_Expr *updt;
struct HC_Syntax_Stmt *body;
} stmt_for;
struct HC_Syntax_Stmt_While { // HC_Syntax_Stmt_Kind_STMT_WHILE
struct HC_Syntax_Expr *cond;
struct HC_Syntax_Stmt *body;
} stmt_while;
struct HC_Syntax_Stmt_Fundecl { // HC_Syntax_Stmt_Kind_STMT_FUNDECL
HC_Syntax_Decl *decl;
struct HC_Syntax_Stmt *body;
} stmt_fundecl;
struct HC_Syntax_Stmt_Vardecl { // HC_Syntax_Stmt_Kind_STMT_VARDECL
struct HC_Syntax_Stmt *first_var;
HC_Syntax_Decl *decl;
struct HC_Syntax_Expr *expr;
} stmt_vardecl;
struct HC_Syntax_Stmt_Typedef { // HC_Syntax_Stmt_Kind_STMT_TYPEDEF
HC_Syntax_Decl *first_def;
} stmt_typedef;
struct HC_Syntax_Stmt_Expr { // HC_Syntax_Stmt_Kind_STMT_EXPR
struct HC_Syntax_Expr *expr;
} stmt_expr;
};
};
typedef struct HC_Syntax_Tree {
HC_Syntax_Stmt *top_level; // Kind should be HC_Syntax_Stmt_Kind_TOP_LEVEL
} HC_Syntax_Tree;
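/*
Example (not part of the original header): statements and expressions above are chained through
next_sibling pointers, so meta-programming passes mostly reduce to list walks. The sketch below
counts the statements of one kind in such a chain. Ex_Stmt, the EX_STMT_* values and exCountKind()
are hypothetical stand-ins, and it assumes that sibling lists end with a NULL pointer.

```c
#include <stddef.h>

// Hypothetical, cut-down stand-in for HC_Syntax_Stmt.
enum { EX_STMT_FUNDECL, EX_STMT_EXPR };
typedef struct Ex_Stmt {
    int kind;
    struct Ex_Stmt *next_sibling;
} Ex_Stmt;

// Walk the sibling chain once and count the nodes whose kind matches.
static size_t exCountKind (const Ex_Stmt *first, int kind)
{
    size_t count = 0;
    for (const Ex_Stmt *s = first; s != NULL; s = s->next_sibling) {
        if (s->kind == kind) count++;
    }
    return count;
}
```

With the real types, the same loop shape would presumably start from the top-level node's
first_stmt and compare against an HC_Syntax_Stmt_Kind value.
*/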
typedef struct HC_Parser HC_Parser;
HC_FUNC_DECORATOR HC_Parser* hcParserCreate (char *src, void *mem_userdata, void *err_userdata);
HC_FUNC_DECORATOR void hcParserAddGlobalType (HC_Parser *parser, char *typename);
HC_FUNC_DECORATOR HC_Syntax_Tree hcParse (HC_Parser *parser);
HC_FUNC_DECORATOR void hcPrintSyntaxTree (HC_Syntax_Tree syntax_tree, void *fmt_userdata, void *err_userdata);
#if defined(HYPERC_IMPLEMENTATION)
// Macro would be called with HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, format_string, ...)
# if !defined(HC_SOURCE_OUTPUR_PRINTF)
# error HC_SOURCE_OUTPUR_PRINTF needs to be defined
# endif
// Macro would be called with HC_MALLOC(mem_userdata, size)
# if !defined(HC_MALLOC)
# error HC_MALLOC needs to be defined
# endif
HC_FUNC_DECORATOR
void* hcMalloc (void* mem_userdata, size_t size)
{
(void)mem_userdata;
void *mem = HC_MALLOC(mem_userdata, size);
if (mem != NULL) HC_MEMSET(mem, 0, size); // Callers rely on zero-initialised memory; skip the clear if allocation failed
return mem;
}
/* ***********************************************
* Tokens ****************************************
* ***********************************************/
HC_FUNC_DECORATOR
void hcTokenError (HC_Token tok, char *msg, void *err_userdata)
{
(void)tok;
(void)msg;
(void)err_userdata;
HC_ERROR_PRINTF("ERROR @ [%u-%u]: %s {Found Token = %s}\n", (unsigned)tok.line, (unsigned)tok.column, msg, tok.str);
HC_ERROR_EXIT(err_userdata, -1);
}
HC_FUNC_DECORATOR
HC_Token hcTokenMake (HC_Token_Kind kind, char *begin, size_t len, uint32_t line, uint32_t column, void *err_userdata)
{
HC_Token tok = {
.kind = kind,
.line = line,
.column = column,
.length = len,
};
if (len >= (sizeof(tok.str) - 1)) {
hcTokenError(tok, "Token length too long to fit in the HC_Token type", err_userdata);
return tok; // Only reached if HC_ERROR_EXIT returns; skip the copy instead of overflowing tok.str
}
HC_MEMCPY(tok.str, begin, len);
return tok;
}
HC_FUNC_DECORATOR
HC_Token_Kind hcTokenKind (HC_Token tok)
{
return tok.kind;
}
HC_FUNC_DECORATOR
HC_BOOL hcTokenIsKind (HC_Token tok, HC_Token_Kind kind)
{
return tok.kind == kind;
}
HC_FUNC_DECORATOR
HC_BOOL hcTokenIsStrL (HC_Token tok, char *str, size_t slen)
{
if (slen != tok.length) return HC_FALSE;
HC_BOOL eq = HC_MEMCMP((char*)tok.str, str, slen) == 0;
return eq;
}
HC_FUNC_DECORATOR
HC_BOOL hcTokenIsStr (HC_Token tok, char *str)
{
size_t slen = HC_STRLEN(str);
HC_BOOL eq = hcTokenIsStrL(tok, str, slen);
return eq;
}
HC_FUNC_DECORATOR
HC_BOOL hcTokensAreEqual (HC_Token tok1, HC_Token tok2)
{
if (tok1.length != tok2.length) return HC_FALSE;
return hcTokenIsStrL(tok1, tok2.str, tok2.length);
}
/* ***********************************************
* Typer *****************************************
* ***********************************************/
typedef struct HC_Syntax_Typer_List {
HC_Token name;
struct HC_Syntax_Typer_List *next;
} HC_Syntax_Typer_List;
typedef struct HC_Syntax_Typer_Scope {
HC_Syntax_Typer_List *first_type;
struct HC_Syntax_Typer_Scope *prev_scope;
} HC_Syntax_Typer_Scope;
typedef struct HC_Syntax_Typer {
void *mem_userdata;
void *err_userdata;
HC_Syntax_Typer_Scope *top_scope;
HC_Syntax_Typer_Scope *global_scope;
} HC_Syntax_Typer;
HC_FUNC_DECORATOR void hcTyperPushTypeScope (HC_Syntax_Typer *typer);
HC_FUNC_DECORATOR
void hcTyperMake (HC_Syntax_Typer *typer, void *mem_userdata, void *err_userdata)
{
*typer = (HC_Syntax_Typer) {
.mem_userdata = mem_userdata,
.err_userdata = err_userdata,
};
hcTyperPushTypeScope(typer);
typer->global_scope = typer->top_scope;
}
HC_FUNC_DECORATOR
void hcTyperAddTypeToScope (void *mem_userdata, HC_Syntax_Typer_Scope *scope, HC_Token typename)
{
HC_Syntax_Typer_List *type = hcMalloc(mem_userdata, sizeof(*type));
*type = (HC_Syntax_Typer_List){.name = typename};
type->next = scope->first_type;
scope->first_type = type;
}
HC_FUNC_DECORATOR
void hcTyperAddGlobalType (HC_Syntax_Typer *typer, char *typename)
{
HC_Token tok = hcTokenMake(HC_Token_Kind_IDENTIFIER, typename, HC_STRLEN(typename), 0, 0, typer->err_userdata);
hcTyperAddTypeToScope(typer->mem_userdata, typer->global_scope, tok);
}
HC_FUNC_DECORATOR
void hcTyperAddType (HC_Syntax_Typer *typer, HC_Token typename)
{
hcTyperAddTypeToScope(typer->mem_userdata, typer->top_scope, typename);
}
HC_FUNC_DECORATOR
void hcTyperPushTypeScope (HC_Syntax_Typer *typer)
{
HC_Syntax_Typer_Scope *scope = hcMalloc(typer->mem_userdata, sizeof(*scope));
scope->prev_scope = typer->top_scope;
typer->top_scope = scope;
}
HC_FUNC_DECORATOR
void hcTyperPopTypeScope (HC_Syntax_Typer *typer)
{
typer->top_scope = typer->top_scope->prev_scope;
}
HC_FUNC_DECORATOR
HC_Syntax_Typer_Scope* hcTyperIterRevScopesInParser (HC_Syntax_Typer *typer, HC_Syntax_Typer_Scope *scope)
{
if (scope == NULL) {
return typer->top_scope;
} else {
return scope->prev_scope;
}
}
HC_FUNC_DECORATOR
HC_Syntax_Typer_List* hcTyperIterTypesInScope (HC_Syntax_Typer_Scope *sc, HC_Syntax_Typer_List *tl)
{
if (tl == NULL) {
return sc->first_type;
} else {
return tl->next;
}
}
HC_FUNC_DECORATOR
HC_Token hcTyperGetName (HC_Syntax_Typer_List *tl)
{
return tl->name;
}
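/*
Example (not part of the original header): the typer above keeps a stack of scopes, each holding a
linked list of type names. The sketch below shows how such a structure can answer "is this identifier
a type name?" by scanning scopes innermost-first. Ex_Type, Ex_Scope and exIsTypeName() are
hypothetical, cut-down stand-ins that use plain C strings instead of HC_Token; the header's own
lookup routine is not shown in this part of the file and may differ.

```c
#include <stddef.h>
#include <string.h>

typedef struct Ex_Type {
    const char *name;
    struct Ex_Type *next;
} Ex_Type;

typedef struct Ex_Scope {
    Ex_Type *first_type;
    struct Ex_Scope *prev_scope;
} Ex_Scope;

// Linear scan: walk from the innermost scope outwards, checking every name in each scope.
static int exIsTypeName (const Ex_Scope *top_scope, const char *name)
{
    for (const Ex_Scope *sc = top_scope; sc != NULL; sc = sc->prev_scope) {
        for (const Ex_Type *t = sc->first_type; t != NULL; t = t->next) {
            if (strcmp(t->name, name) == 0) return 1;
        }
    }
    return 0;
}
```

This is the linear-scan trade-off mentioned in the notes at the top of the file: lookups cost
time proportional to the number of visible type names, in exchange for very little code.
*/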
/* ***********************************************
* Tokenizer *************************************
* ***********************************************/
typedef struct HC_Tokenizer {
HC_Token read_token;
char *src;
size_t src_len;
uint32_t line;
uint32_t column;
uint64_t cursor;
void *err_userdata;
} HC_Tokenizer;
HC_FUNC_DECORATOR void hcTokenizerMoveForward (HC_Tokenizer *toker);
HC_FUNC_DECORATOR
void hcTokenizerMake (HC_Tokenizer *toker, char *src, void *err_userdata)
{
*toker = (HC_Tokenizer) {
.src = src,
.src_len = HC_STRLEN(src),
.line = 1,
.column = 1,
.err_userdata = err_userdata,
};
hcTokenizerMoveForward(toker);
}
HC_FUNC_DECORATOR
void hcTokenenizerError (HC_Tokenizer *toker, char *msg)
{
HC_ERROR_PRINTF("ERROR @ [%u-%u]: %s\n", (unsigned)toker->line, (unsigned)toker->column, msg);
HC_ERROR_EXIT(toker->err_userdata, -1);
}
HC_FUNC_DECORATOR
void hcTokenizercharAdvance (HC_Tokenizer *toker, uint32_t count)
{
for (size_t i = 0; i < count; i++) {
if (toker->src[toker->cursor] == '\n') {
toker->line++;
toker->column = 1;
} else {
toker->column++;
}
toker->cursor++;
}
}
HC_FUNC_DECORATOR
void hcTokenizerEatWhitespace (HC_Tokenizer *toker)
{
HC_BOOL ate_something_this_loop = HC_TRUE;
while (ate_something_this_loop) {
ate_something_this_loop = HC_FALSE;
while ((toker->src[toker->cursor] == ' ') ||
(toker->src[toker->cursor] == '\t') ||
(toker->src[toker->cursor] == '\r') ||
(toker->src[toker->cursor] == '\n')) {
ate_something_this_loop = HC_TRUE;
hcTokenizercharAdvance(toker, 1);
}
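if ((toker->src[toker->cursor] == '/') && (toker->src[toker->cursor+1] == '/')) {
ate_something_this_loop = HC_TRUE;
// Skip to the end of the line; a backslash immediately before the newline continues the comment.
// Stop at the terminating NUL so that a comment on the last line cannot run past the buffer.
while (toker->src[toker->cursor] != '\0') {
if ((toker->src[toker->cursor] == '\n') && (toker->src[toker->cursor-1] != '\\')) break;
hcTokenizercharAdvance(toker, 1);
}
if (toker->src[toker->cursor] == '\n') hcTokenizercharAdvance(toker, 1);
}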
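if ((toker->src[toker->cursor] == '/') && (toker->src[toker->cursor+1] == '*')) {
ate_something_this_loop = HC_TRUE;
hcTokenizercharAdvance(toker, 2);
while (!((toker->src[toker->cursor] == '*') && (toker->src[toker->cursor+1] == '/'))) {
if (toker->src[toker->cursor] == '\0') { // Never read past the end of the source
hcTokenenizerError(toker, "Unterminated block comment");
return;
}
hcTokenizercharAdvance(toker, 1);
}
hcTokenizercharAdvance(toker, 2);
}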
// TODO(naman): Replace this with a real pre-processor
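if (toker->src[toker->cursor] == '#') {
ate_something_this_loop = HC_TRUE;
// Skip the directive; a backslash immediately before the newline continues it onto the next line.
// Stop at the terminating NUL so that a directive on the last line cannot run past the buffer.
while (toker->src[toker->cursor] != '\0') {
if ((toker->src[toker->cursor] == '\n') && (toker->src[toker->cursor-1] != '\\')) break;
hcTokenizercharAdvance(toker, 1);
}
if (toker->src[toker->cursor] == '\n') hcTokenizercharAdvance(toker, 1);
}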
}
}
HC_FUNC_DECORATOR
void hcTokenizerMoveForward (HC_Tokenizer *toker)
{
hcTokenizerEatWhitespace (toker);
uint64_t begin = toker->cursor;
HC_Token_Kind kind = 0;
switch (toker->src[toker->cursor]) {
case '\0': {
toker->read_token = (HC_Token){0};
return;
} break;
case '0': case '1': case '2': case '3': case '4':
case '5': case '6': case '7': case '8': case '9':
case '.': {
if ((toker->src[toker->cursor] == '.') && !((toker->src[toker->cursor+1] <= '9') && (toker->src[toker->cursor+1] >= '0'))) {
hcTokenizercharAdvance(toker, 1);
if ((toker->src[toker->cursor + 0] == '.') && (toker->src[toker->cursor + 1] == '.')) {
hcTokenizercharAdvance(toker, 2);
kind = HC_Token_Kind_ELLIPSIS;
} else {
kind = '.';
}
} else {
HC_BOOL floating = HC_FALSE;
{
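size_t cursor = toker->cursor; // Look ahead from the first character so that literals like ".5" are classified as floating
if ((toker->src[cursor] == '0') && ((toker->src[cursor+1] == 'x') || (toker->src[cursor+1] == 'X'))) {
cursor += 2; // Skip the "0x" prefix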
while (((toker->src[cursor] <= '9') && (toker->src[cursor] >= '0')) ||
((toker->src[cursor] <= 'F') && (toker->src[cursor] >= 'A')) ||
((toker->src[cursor] <= 'f') && (toker->src[cursor] >= 'a')) ||
(toker->src[cursor] == '\'')) {
cursor++;
}
if ((toker->src[cursor] == '.') || (toker->src[cursor] == 'p') || (toker->src[cursor] == 'P')) floating = HC_TRUE;
} else {
while (((toker->src[cursor] <= '9') && (toker->src[cursor] >= '0')) ||
(toker->src[cursor] == '\'')) {
cursor++;
}
if ((toker->src[cursor] == '.') || (toker->src[cursor] == 'e') || (toker->src[cursor] == 'E')) floating = HC_TRUE;
}
}
if (floating) {
kind = HC_Token_Kind_CONSTANT_FLOATING;
hcTokenizercharAdvance(toker, 1);
if ((toker->src[toker->cursor] == 'x') || (toker->src[toker->cursor] == 'X')) {
hcTokenizercharAdvance(toker, 1);
while (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
((toker->src[toker->cursor] <= 'F') && (toker->src[toker->cursor] >= 'A')) ||
((toker->src[toker->cursor] <= 'f') && (toker->src[toker->cursor] >= 'a')) ||
(toker->src[toker->cursor] == '\'')) {
hcTokenizercharAdvance(toker, 1);
}
if (toker->src[toker->cursor] == '.') {
hcTokenizercharAdvance(toker, 1);
while (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
((toker->src[toker->cursor] <= 'F') && (toker->src[toker->cursor] >= 'A')) ||
((toker->src[toker->cursor] <= 'f') && (toker->src[toker->cursor] >= 'a')) ||
(toker->src[toker->cursor] == '\'')) {
hcTokenizercharAdvance(toker, 1);
}
}
if ((toker->src[toker->cursor] == 'p') || (toker->src[toker->cursor] == 'P')) {
hcTokenizercharAdvance(toker, 1);
if ((toker->src[toker->cursor] == '-') || (toker->src[toker->cursor] == '+')) {
hcTokenizercharAdvance(toker, 1);
}
while (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
(toker->src[toker->cursor] == '\'')) {
hcTokenizercharAdvance(toker, 1);
}
}
} else {
while (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
(toker->src[toker->cursor] == '\'')) {
hcTokenizercharAdvance(toker, 1);
}
if (toker->src[toker->cursor] == '.') {
hcTokenizercharAdvance(toker, 1);
while (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
(toker->src[toker->cursor] == '\'')) {
hcTokenizercharAdvance(toker, 1);
}
}
if ((toker->src[toker->cursor] == 'e') || (toker->src[toker->cursor] == 'E')) {
hcTokenizercharAdvance(toker, 1);
if ((toker->src[toker->cursor] == '-') || (toker->src[toker->cursor] == '+')) {
hcTokenizercharAdvance(toker, 1);
}
while (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
(toker->src[toker->cursor] == '\'')) {
hcTokenizercharAdvance(toker, 1);
}
}
}
/* */ if (toker->src[toker->cursor] == 'f') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'l') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'F') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'L') {
hcTokenizercharAdvance(toker, 1);
} else if ((toker->src[toker->cursor] == 'd') && (toker->src[toker->cursor+1] == 'f')) {
hcTokenizercharAdvance(toker, 2);
} else if ((toker->src[toker->cursor] == 'd') && (toker->src[toker->cursor+1] == 'd')) {
hcTokenizercharAdvance(toker, 2);
} else if ((toker->src[toker->cursor] == 'd') && (toker->src[toker->cursor+1] == 'l')) {
hcTokenizercharAdvance(toker, 2);
} else if ((toker->src[toker->cursor] == 'D') && (toker->src[toker->cursor+1] == 'F')) {
hcTokenizercharAdvance(toker, 2);
} else if ((toker->src[toker->cursor] == 'D') && (toker->src[toker->cursor+1] == 'D')) {
hcTokenizercharAdvance(toker, 2);
} else if ((toker->src[toker->cursor] == 'D') && (toker->src[toker->cursor+1] == 'L')) {
hcTokenizercharAdvance(toker, 2);
}
} else {
kind = HC_Token_Kind_CONSTANT_INTEGER;
if (toker->src[toker->cursor] == '0') {
hcTokenizercharAdvance(toker, 1);
if ((toker->src[toker->cursor] == 'x') || (toker->src[toker->cursor] == 'X')) {
hcTokenizercharAdvance(toker, 1);
while (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
((toker->src[toker->cursor] <= 'F') && (toker->src[toker->cursor] >= 'A')) ||
((toker->src[toker->cursor] <= 'f') && (toker->src[toker->cursor] >= 'a')) ||
(toker->src[toker->cursor] == '\'')) {
hcTokenizercharAdvance(toker, 1);
}
} else if ((toker->src[toker->cursor] == 'b') || (toker->src[toker->cursor] == 'B')) {
hcTokenizercharAdvance(toker, 1);
while ((toker->src[toker->cursor] == '0') || (toker->src[toker->cursor] == '1') || (toker->src[toker->cursor] == '\'')) {
hcTokenizercharAdvance(toker, 1);
}
} else {
while (((toker->src[toker->cursor] <= '7') && (toker->src[toker->cursor] >= '0')) ||
(toker->src[toker->cursor] == '\'')) {
hcTokenizercharAdvance(toker, 1);
}
}
} else {
hcTokenizercharAdvance(toker, 1);
while (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
(toker->src[toker->cursor] == '\'')) {
hcTokenizercharAdvance(toker, 1);
}
}
if ((toker->src[toker->cursor] == 'u') || (toker->src[toker->cursor] == 'U')) {
hcTokenizercharAdvance(toker, 1);
if (toker->src[toker->cursor] == 'l') {
hcTokenizercharAdvance(toker, 1);
if (toker->src[toker->cursor] == 'l') {
hcTokenizercharAdvance(toker, 1);
}
} else if (toker->src[toker->cursor] == 'L') {
hcTokenizercharAdvance(toker, 1);
if (toker->src[toker->cursor] == 'L') {
hcTokenizercharAdvance(toker, 1);
}
} else if (toker->src[toker->cursor] == 'w') {
hcTokenizercharAdvance(toker, 1);
if (toker->src[toker->cursor] == 'b') {
hcTokenizercharAdvance(toker, 1);
}
} else if (toker->src[toker->cursor] == 'W') {
hcTokenizercharAdvance(toker, 1);
if (toker->src[toker->cursor] == 'B') {
hcTokenizercharAdvance(toker, 1);
}
}
} else if (toker->src[toker->cursor] == 'l') {
hcTokenizercharAdvance(toker, 1);
if (toker->src[toker->cursor] == 'l') {
hcTokenizercharAdvance(toker, 1);
}
if ((toker->src[toker->cursor] == 'u') || (toker->src[toker->cursor] == 'U')) {
hcTokenizercharAdvance(toker, 1);
}
} else if (toker->src[toker->cursor] == 'L') {
hcTokenizercharAdvance(toker, 1);
if (toker->src[toker->cursor] == 'L') {
hcTokenizercharAdvance(toker, 1);
}
if ((toker->src[toker->cursor] == 'u') || (toker->src[toker->cursor] == 'U')) {
hcTokenizercharAdvance(toker, 1);
}
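} else if (((toker->src[toker->cursor] == 'w') && (toker->src[toker->cursor+1] == 'b')) ||
((toker->src[toker->cursor] == 'W') && (toker->src[toker->cursor+1] == 'B'))) {
// C23 bit-precise suffix without a leading 'u' (e.g. 123wb, 123wbu)
hcTokenizercharAdvance(toker, 2);
if ((toker->src[toker->cursor] == 'u') || (toker->src[toker->cursor] == 'U')) {
hcTokenizercharAdvance(toker, 1);
}
}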
}
}
} break;
case 'A': case 'a': case 'B': case 'b': case 'C': case 'c':
case 'D': case 'd': case 'E': case 'e': case 'F': case 'f':
case 'G': case 'g': case 'H': case 'h': case 'I': case 'i':
case 'J': case 'j': case 'K': case 'k': case 'L': case 'l':
case 'M': case 'm': case 'N': case 'n': case 'O': case 'o':
case 'P': case 'p': case 'Q': case 'q': case 'R': case 'r':
case 'S': case 's': case 'T': case 't': case 'U': case 'u':
case 'V': case 'v': case 'W': case 'w': case 'X': case 'x':
case 'Y': case 'y': case 'Z': case 'z': case '_':
case '\'': case '"': {
if ((toker->src[toker->cursor] == '\'') ||
((toker->src[toker->cursor] == 'u') && (toker->src[toker->cursor+1] == '8') && (toker->src[toker->cursor+2] == '\'')) ||
((toker->src[toker->cursor] == 'u') && (toker->src[toker->cursor+1] == '\'')) ||
((toker->src[toker->cursor] == 'U') && (toker->src[toker->cursor+1] == '\'')) ||
((toker->src[toker->cursor] == 'L') && (toker->src[toker->cursor+1] == '\''))) {
kind = HC_Token_Kind_CONSTANT_CHARACTER;
if ((toker->src[toker->cursor] == 'u') && (toker->src[toker->cursor+1] == '8')) {
hcTokenizercharAdvance(toker, 2);
} else if (toker->src[toker->cursor] == 'u') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'U') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'L') {
hcTokenizercharAdvance(toker, 1);
}
if (toker->src[toker->cursor] != '\'') {
hcTokenenizerError(toker, "Opening quote missing in character literal");
} else {
hcTokenizercharAdvance(toker, 1); // '
}
if (toker->src[toker->cursor] == '\'') {
hcTokenenizerError(toker, "Character literal cannot be empty");
}
if (toker->src[toker->cursor] == '\n') {
hcTokenenizerError(toker, "Character literal cannot contain a newline");
}
if (toker->src[toker->cursor] == '\\') {
hcTokenizercharAdvance(toker, 1);
/* */ if (toker->src[toker->cursor] == '\'') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == '"') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == '?') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == '\\') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'a') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'b') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'f') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'n') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'r') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 't') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'v') {
hcTokenizercharAdvance(toker, 1);
} else if ((toker->src[toker->cursor] <= '7') && (toker->src[toker->cursor] >= '0')) {
for (size_t i = 0; i < 3; i++) {
if ((toker->src[toker->cursor] <= '7') && (toker->src[toker->cursor] >= '0')) {
hcTokenizercharAdvance(toker, 1);
}
}
} else if (toker->src[toker->cursor] == 'x') {
hcTokenizercharAdvance(toker, 1);
while (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
((toker->src[toker->cursor] <= 'F') && (toker->src[toker->cursor] >= 'A')) ||
((toker->src[toker->cursor] <= 'f') && (toker->src[toker->cursor] >= 'a'))) {
hcTokenizercharAdvance(toker, 1);
}
} else if (toker->src[toker->cursor] == 'u') {
hcTokenizercharAdvance(toker, 1);
for (size_t i = 0; i < 4; i++) {
if (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
((toker->src[toker->cursor] <= 'F') && (toker->src[toker->cursor] >= 'A')) ||
((toker->src[toker->cursor] <= 'f') && (toker->src[toker->cursor] >= 'a'))) {
hcTokenizercharAdvance(toker, 1);
} else {
hcTokenenizerError(toker, "Universal character incomplete, needs four hex digits");
}
}
} else if (toker->src[toker->cursor] == 'U') {
hcTokenizercharAdvance(toker, 1);
for (size_t i = 0; i < 8; i++) {
if (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
((toker->src[toker->cursor] <= 'F') && (toker->src[toker->cursor] >= 'A')) ||
((toker->src[toker->cursor] <= 'f') && (toker->src[toker->cursor] >= 'a'))) {
hcTokenizercharAdvance(toker, 1);
} else {
hcTokenenizerError(toker, "Universal character incomplete, needs eight hex digits");
}
}
} else {
hcTokenenizerError(toker, "Unknown escape sequence in character");
}
} else {
hcTokenizercharAdvance(toker, 1);
}
if (toker->src[toker->cursor] != '\'') {
hcTokenenizerError(toker, "Closing quote missing in character literal");
} else {
hcTokenizercharAdvance(toker, 1); // '
}
} else if ((toker->src[toker->cursor] == '\"') ||
((toker->src[toker->cursor] == 'u') && (toker->src[toker->cursor+1] == '8') && (toker->src[toker->cursor+2] == '"')) ||
((toker->src[toker->cursor] == 'u') && (toker->src[toker->cursor+1] == '"')) ||
((toker->src[toker->cursor] == 'U') && (toker->src[toker->cursor+1] == '"')) ||
((toker->src[toker->cursor] == 'L') && (toker->src[toker->cursor+1] == '"'))) {
kind = HC_Token_Kind_CONSTANT_STRING;
if ((toker->src[toker->cursor] == 'u') && (toker->src[toker->cursor+1] == '8')) {
hcTokenizercharAdvance(toker, 2);
} else if (toker->src[toker->cursor] == 'u') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'U') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'L') {
hcTokenizercharAdvance(toker, 1);
}
if (toker->src[toker->cursor] != '"') {
hcTokenenizerError(toker, "Open quote missing in string literal");
} else {
hcTokenizercharAdvance(toker, 1); // "
}
if (toker->src[toker->cursor] == '"') {
hcTokenizercharAdvance(toker, 1); // "
} else {
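// Also stop at the terminating NUL so an unterminated string cannot scan past the end of the source.
while ((toker->src[toker->cursor] != '"') && (toker->src[toker->cursor] != '\0')) {
if (toker->src[toker->cursor] == '\n') {
hcTokenenizerError(toker, "Newline not allowed in a string literal");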
}
if (toker->src[toker->cursor] == '\\') {
hcTokenizercharAdvance(toker, 1);
/* */ if (toker->src[toker->cursor] == '\'') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == '"') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == '?') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == '\\') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'a') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'b') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'f') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'n') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'r') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 't') {
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == 'v') {
hcTokenizercharAdvance(toker, 1);
} else if ((toker->src[toker->cursor] <= '7') && (toker->src[toker->cursor] >= '0')) {
for (size_t i = 0; i < 3; i++) {
if ((toker->src[toker->cursor] <= '7') && (toker->src[toker->cursor] >= '0')) {
hcTokenizercharAdvance(toker, 1);
}
}
} else if (toker->src[toker->cursor] == 'x') {
hcTokenizercharAdvance(toker, 1);
while (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
((toker->src[toker->cursor] <= 'F') && (toker->src[toker->cursor] >= 'A')) ||
((toker->src[toker->cursor] <= 'f') && (toker->src[toker->cursor] >= 'a'))) {
hcTokenizercharAdvance(toker, 1);
}
} else if (toker->src[toker->cursor] == 'u') {
hcTokenizercharAdvance(toker, 1);
for (size_t i = 0; i < 4; i++) {
if (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
((toker->src[toker->cursor] <= 'F') && (toker->src[toker->cursor] >= 'A')) ||
((toker->src[toker->cursor] <= 'f') && (toker->src[toker->cursor] >= 'a'))) {
hcTokenizercharAdvance(toker, 1);
} else {
hcTokenenizerError(toker, "Universal character in string incomplete, needs four hex digits");
}
}
} else if (toker->src[toker->cursor] == 'U') {
hcTokenizercharAdvance(toker, 1);
for (size_t i = 0; i < 8; i++) {
if (((toker->src[toker->cursor] <= '9') && (toker->src[toker->cursor] >= '0')) ||
((toker->src[toker->cursor] <= 'F') && (toker->src[toker->cursor] >= 'A')) ||
((toker->src[toker->cursor] <= 'f') && (toker->src[toker->cursor] >= 'a'))) {
hcTokenizercharAdvance(toker, 1);
} else {
hcTokenenizerError(toker, "Universal character in string incomplete, needs eight hex digits");
}
}
} else {
hcTokenenizerError(toker, "Unknown escape sequence in string");
}
} else {
hcTokenizercharAdvance(toker, 1);
}
}
if (toker->src[toker->cursor] != '"') {
hcTokenenizerError(toker, "Closing quote missing in string literal");
} else {
hcTokenizercharAdvance(toker, 1); // "
}
}
} else {
hcTokenizercharAdvance(toker, 1);
char *ident = toker->src + toker->cursor;
size_t count = 0;
while (HC_TRUE) {
char c = ident[count+0];
if ((c >= 'a') && (c <= 'z')) { count++; continue; }
if ((c >= 'A') && (c <= 'Z')) { count++; continue; }
if ((c >= '0') && (c <= '9')) { count++; continue; }
if (c == '_') { count++; continue; }
break;
}
for (size_t i = 0; i < count; i++) {
hcTokenizercharAdvance(toker, 1);
}
HC_Token token = hcTokenMake(kind, toker->src + begin, toker->cursor - begin, toker->line, toker->column, toker->err_userdata);
/* */ if (hcTokenIsStr(token, "alignas")) {
kind = HC_Token_Kind_ALIGNAS;
} else if (hcTokenIsStr(token, "alignof")) {
kind = HC_Token_Kind_ALIGNOF;
} else if (hcTokenIsStr(token, "auto")) {
kind = HC_Token_Kind_AUTO;
} else if (hcTokenIsStr(token, "break")) {
kind = HC_Token_Kind_BREAK;
} else if (hcTokenIsStr(token, "bool")) {
kind = HC_Token_Kind_BOOL;
} else if (hcTokenIsStr(token, "case")) {
kind = HC_Token_Kind_CASE;
} else if (hcTokenIsStr(token, "char")) {
kind = HC_Token_Kind_CHAR;
} else if (hcTokenIsStr(token, "const")) {
kind = HC_Token_Kind_CONST;
} else if (hcTokenIsStr(token, "constexpr")) {
kind = HC_Token_Kind_CONSTEXPR;
} else if (hcTokenIsStr(token, "continue")) {
kind = HC_Token_Kind_CONTINUE;
} else if (hcTokenIsStr(token, "default")) {
kind = HC_Token_Kind_DEFAULT;
} else if (hcTokenIsStr(token, "do")) {
kind = HC_Token_Kind_DO;
} else if (hcTokenIsStr(token, "double")) {
kind = HC_Token_Kind_DOUBLE;
} else if (hcTokenIsStr(token, "else")) {
kind = HC_Token_Kind_ELSE;
} else if (hcTokenIsStr(token, "enum")) {
kind = HC_Token_Kind_ENUM;
} else if (hcTokenIsStr(token, "extern")) {
kind = HC_Token_Kind_EXTERN;
} else if (hcTokenIsStr(token, "false")) {
kind = HC_Token_Kind_FALSE;
} else if (hcTokenIsStr(token, "float")) {
kind = HC_Token_Kind_FLOAT;
} else if (hcTokenIsStr(token, "for")) {
kind = HC_Token_Kind_FOR;
} else if (hcTokenIsStr(token, "goto")) {
kind = HC_Token_Kind_GOTO;
} else if (hcTokenIsStr(token, "if")) {
kind = HC_Token_Kind_IF;
} else if (hcTokenIsStr(token, "inline")) {
kind = HC_Token_Kind_INLINE;
} else if (hcTokenIsStr(token, "int")) {
kind = HC_Token_Kind_INT;
} else if (hcTokenIsStr(token, "long")) {
kind = HC_Token_Kind_LONG;
} else if (hcTokenIsStr(token, "nullptr")) {
kind = HC_Token_Kind_NULLPTR;
} else if (hcTokenIsStr(token, "register")) {
kind = HC_Token_Kind_REGISTER;
} else if (hcTokenIsStr(token, "restrict")) {
kind = HC_Token_Kind_RESTRICT;
} else if (hcTokenIsStr(token, "return")) {
kind = HC_Token_Kind_RETURN;
} else if (hcTokenIsStr(token, "short")) {
kind = HC_Token_Kind_SHORT;
} else if (hcTokenIsStr(token, "signed")) {
kind = HC_Token_Kind_SIGNED;
} else if (hcTokenIsStr(token, "sizeof")) {
kind = HC_Token_Kind_SIZE_OF;
} else if (hcTokenIsStr(token, "static")) {
kind = HC_Token_Kind_STATIC;
} else if (hcTokenIsStr(token, "static_assert")) {
kind = HC_Token_Kind_STATIC_ASSERT;
} else if (hcTokenIsStr(token, "struct")) {
kind = HC_Token_Kind_STRUCT;
} else if (hcTokenIsStr(token, "switch")) {
kind = HC_Token_Kind_SWITCH;
} else if (hcTokenIsStr(token, "thread_local")) {
kind = HC_Token_Kind_THREAD_LOCAL;
} else if (hcTokenIsStr(token, "true")) {
kind = HC_Token_Kind_TRUE;
} else if (hcTokenIsStr(token, "typedef")) {
kind = HC_Token_Kind_TYPEDEF;
} else if (hcTokenIsStr(token, "typeof")) {
kind = HC_Token_Kind_TYPEOF;
} else if (hcTokenIsStr(token, "typeof_unqual")) {
kind = HC_Token_Kind_TYPEOF_UNQUAL;
} else if (hcTokenIsStr(token, "union")) {
kind = HC_Token_Kind_UNION;
} else if (hcTokenIsStr(token, "unsigned")) {
kind = HC_Token_Kind_UNSIGNED;
} else if (hcTokenIsStr(token, "void")) {
kind = HC_Token_Kind_VOID;
} else if (hcTokenIsStr(token, "volatile")) {
kind = HC_Token_Kind_VOLATILE;
} else if (hcTokenIsStr(token, "while")) {
kind = HC_Token_Kind_WHILE;
} else if (hcTokenIsStr(token, "_Generic")) {
kind = HC_Token_Kind_GENERIC;
} else if (hcTokenIsStr(token, "_Noreturn")) {
kind = HC_Token_Kind_NORETURN;
} else {
kind = HC_Token_Kind_IDENTIFIER;
}
}
} break;
case '(': case ')':
case '{': case '}': case '~': case '?':
case ';': case ',': {
kind = toker->src[toker->cursor];
hcTokenizercharAdvance(toker, 1);
} break;
#define TOKEN_CASE_1_2(_ch0, _ch1, _kind1) \
_ch0: { \
hcTokenizercharAdvance(toker, 1); \
if (toker->src[toker->cursor] == _ch1) { \
kind = _kind1; \
hcTokenizercharAdvance(toker, 1); \
} else { \
kind = _ch0; \
} \
} break
case TOKEN_CASE_1_2('=', '=', HC_Token_Kind_EQUALITY);
case TOKEN_CASE_1_2('!', '=', HC_Token_Kind_NOTEQUAL);
case TOKEN_CASE_1_2('#', '#', HC_Token_Kind_CONCATENATE);
case TOKEN_CASE_1_2(':', ':', HC_Token_Kind_ATTRIBUTE_NAMESPACE);
case TOKEN_CASE_1_2('*', '=', HC_Token_Kind_MULTIPLY_ASSIGN);
case TOKEN_CASE_1_2('/', '=', HC_Token_Kind_DIVIDE_ASSIGN);
case TOKEN_CASE_1_2('%', '=', HC_Token_Kind_MODULUS_ASSIGN);
case TOKEN_CASE_1_2('^', '=', HC_Token_Kind_XOR_ASSIGN);
case TOKEN_CASE_1_2('[', '[', HC_Token_Kind_ATTRIBUTE_BEGIN);
case TOKEN_CASE_1_2(']', ']', HC_Token_Kind_ATTRIBUTE_END);
#undef TOKEN_CASE_1_2
#define TOKEN_CASE_1_2_2(_ch0, _ch1, _kind1, _ch2, _kind2) \
_ch0: { \
hcTokenizercharAdvance(toker, 1); \
if (toker->src[toker->cursor] == _ch1) { \
kind = _kind1; \
hcTokenizercharAdvance(toker, 1); \
} else if (toker->src[toker->cursor] == _ch2) { \
kind = _kind2; \
hcTokenizercharAdvance(toker, 1); \
} else { \
kind = _ch0; \
} \
} break
case TOKEN_CASE_1_2_2('|', '|', HC_Token_Kind_LOGICAL_OR, '=', HC_Token_Kind_OR_ASSIGN);
case TOKEN_CASE_1_2_2('&', '&', HC_Token_Kind_LOGICAL_AND, '=', HC_Token_Kind_AND_ASSIGN);
case TOKEN_CASE_1_2_2('+', '+', HC_Token_Kind_INCREMENT, '=', HC_Token_Kind_ADD_ASSIGN);
#undef TOKEN_CASE_1_2_2
case '-': {
hcTokenizercharAdvance(toker, 1);
if (toker->src[toker->cursor] == '>') {
kind = HC_Token_Kind_PTR_DEREF;
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == '-') {
kind = HC_Token_Kind_DECREMENT;
hcTokenizercharAdvance(toker, 1);
} else if (toker->src[toker->cursor] == '=') {
kind = HC_Token_Kind_SUB_ASSIGN;
hcTokenizercharAdvance(toker, 1);
} else {
kind = '-';
}
} break;
#define TOKEN_CASE_1_2_2_3(_ch0, _ch1, _kind1, _ch2, _kind2, _ch3, _kind3) \
_ch0: { \
hcTokenizercharAdvance(toker, 1); \
if (toker->src[toker->cursor] == _ch1) { \
hcTokenizercharAdvance(toker, 1); \
kind = _kind1; \
} else if (toker->src[toker->cursor] == _ch2) { \
hcTokenizercharAdvance(toker, 1); \
if (toker->src[toker->cursor] == _ch3) { \
hcTokenizercharAdvance(toker, 1); \
kind = _kind3; \
} else { \
kind = _kind2; \
} \
} else { \
kind = _ch0; \
} \
} break
case TOKEN_CASE_1_2_2_3('<', '=', HC_Token_Kind_LESSEQ, '<', HC_Token_Kind_SHIFT_LEFT, '=', HC_Token_Kind_SHIFT_LEFT_ASSIGN);
case TOKEN_CASE_1_2_2_3('>', '=', HC_Token_Kind_GREATEQ, '>', HC_Token_Kind_SHIFT_RIGHT, '=', HC_Token_Kind_SHIFT_RIGHT_ASSIGN);
#undef TOKEN_CASE_1_2_2_3
default: {
hcTokenizercharAdvance(toker, 1);
hcTokenenizerError(toker, "Invalid token");
} break;
}
HC_Token token = hcTokenMake(kind, toker->src + begin, toker->cursor - begin, toker->line, toker->column, toker->err_userdata);
toker->read_token = token;
}
/* ***********************************************
* Parser ****************************************
* ***********************************************/
struct HC_Parser {
void *mem_userdata;
void *err_userdata;
HC_Tokenizer toker;
HC_Syntax_Typer typer;
};
HC_FUNC_DECORATOR
HC_Syntax_Expr* hcParserCreateExpr (HC_Parser *parser, HC_Syntax_Expr_Kind kind)
{
(void)parser;
HC_Syntax_Expr *result = HC_MALLOC(parser->mem_userdata, sizeof(*result));
*result = (HC_Syntax_Expr){0};
result->kind = kind;
return result;
}
HC_FUNC_DECORATOR
HC_Syntax_Stmt* hcParserCreateStmt (HC_Parser *parser, HC_Syntax_Stmt_Kind kind)
{
(void)parser;
HC_Syntax_Stmt *result = HC_MALLOC(parser->mem_userdata, sizeof(*result));
*result = (HC_Syntax_Stmt){0};
result->kind = kind;
return result;
}
HC_FUNC_DECORATOR
HC_Syntax_Decl* hcParserCreateDecl (void *mem_userdata, HC_Syntax_Type *type, HC_Syntax_Attr *attr)
{
HC_Syntax_Decl *decl = hcMalloc(mem_userdata, sizeof(*decl));
*decl = (HC_Syntax_Decl) {
.type = type,
.spec = type->_decl_spec,
.attr = attr,
};
return decl;
}
HC_FUNC_DECORATOR
HC_Token hcParserGetToken (HC_Parser *p)
{
return p->toker.read_token;
}
HC_FUNC_DECORATOR
void hcParserSwallowToken (HC_Parser *p)
{
if (!hcTokenIsKind(hcParserGetToken(p), HC_Token_Kind_EOF)) {
hcTokenizerMoveForward(&p->toker);
}
}
// If match, return HC_TRUE; otherwise, HC_FALSE;
HC_FUNC_DECORATOR
HC_BOOL hcParserCheckToken (HC_Parser *p, HC_Token_Kind kind)
{
if (hcTokenIsKind(hcParserGetToken(p), kind)) {
return HC_TRUE;
} else {
return HC_FALSE;
}
}
// If match, move forward and return HC_TRUE; otherwise, return HC_FALSE;
HC_FUNC_DECORATOR
HC_BOOL hcParserConsumeToken (HC_Parser *p, HC_Token_Kind kind)
{
if (hcParserCheckToken(p, HC_Token_Kind_EOF)) {
if (kind == HC_Token_Kind_EOF) {
// We return HC_TRUE right here since we don't want to call hcTokenizerMoveForward once
// the end of stream has been reached.
return HC_TRUE;
} else {
return HC_FALSE;
}
}
if (hcParserCheckToken(p, kind)) {
hcTokenizerMoveForward(&p->toker);
return HC_TRUE;
} else {
return HC_FALSE;
}
}
HC_FUNC_DECORATOR
HC_Token hcParserExpectToken (HC_Parser *p, HC_Token_Kind kind)
{
HC_Token tok = hcParserGetToken(p);
if (!hcParserConsumeToken(p, kind)) {
hcTokenError(hcParserGetToken(p), "Expected token not found", p->err_userdata);
}
return tok;
}
HC_FUNC_DECORATOR
void hcParserAddGlobalType (HC_Parser *p, char *typename)
{
hcTyperAddGlobalType(&p->typer, typename);
}
/* ***********************************************
* Parse *****************************************
* ***********************************************/
HC_FUNC_DECORATOR
HC_Syntax_Type* hcGetPtrOfType (HC_Syntax_Typer *typer, HC_Syntax_Type *base)
{
HC_Syntax_Type *type = base;
if (type) {
HC_Syntax_Type *ptrtype = hcMalloc(typer->mem_userdata, sizeof(*ptrtype));
*ptrtype = (HC_Syntax_Type) {
.kind = HC_Syntax_Type_Kind_PTR,
.array_ptr = {
.deref = type,
},
};
type = ptrtype;
} else {
type = hcMalloc(typer->mem_userdata, sizeof(*type));
*type = (HC_Syntax_Type) {
.kind = HC_Syntax_Type_Kind_PTR,
};
}
return type;
}
HC_FUNC_DECORATOR
HC_Syntax_Type* hcGetArrayOfType (HC_Syntax_Typer *typer, HC_Syntax_Type *base, HC_Syntax_Expr *elems)
{
HC_Syntax_Type *type = base;
if (type) {
HC_Syntax_Type *arrtype = hcMalloc(typer->mem_userdata, sizeof(*arrtype));
*arrtype = (HC_Syntax_Type) {
.kind = HC_Syntax_Type_Kind_ARRAY,
.array_ptr = {
.elems = elems,
.deref = type,
},
};
type = arrtype;
} else {
type = hcMalloc(typer->mem_userdata, sizeof(*type));
*type = (HC_Syntax_Type) {
.kind = HC_Syntax_Type_Kind_ARRAY,
.array_ptr = {
.elems = elems,
},
};
}
return type;
}
HC_FUNC_DECORATOR
HC_Syntax_Type* hcGetFunctionThatReturnsType (HC_Syntax_Typer *typer, HC_Syntax_Type *return_type)
{
HC_Syntax_Type *funtype = hcMalloc(typer->mem_userdata, sizeof(*funtype));
*funtype = (HC_Syntax_Type) {
.kind = HC_Syntax_Type_Kind_FUNC,
.function = {
.return_type = return_type,
},
};
return funtype;
}
HC_FUNC_DECORATOR HC_Syntax_Type* hcParseBasicType (HC_Parser *p);
HC_FUNC_DECORATOR HC_Syntax_Decl* hcParseDeclarator (HC_Parser *p, HC_Syntax_Type *base_type, HC_Syntax_Attr *attr);
HC_FUNC_DECORATOR HC_Syntax_Expr* hcParseExpr (HC_Parser *p);
HC_FUNC_DECORATOR HC_Syntax_Expr* hcParseExprWithoutComma (HC_Parser *p);
HC_FUNC_DECORATOR HC_Syntax_Stmt* hcParseStmt (HC_Parser *p);
HC_FUNC_DECORATOR HC_Syntax_Attr *hcParseAttrHelper_Recursive (HC_Parser *p, HC_BOOL toplevel);
HC_FUNC_DECORATOR
HC_Syntax_Attr* hcParserAttrHelper_Nested (HC_Parser *p)
{
HC_Syntax_Attr *attr = NULL;
HC_Syntax_Attr *last_child = NULL;
if (hcParserCheckToken(p, '(') ||
hcParserCheckToken(p, '{') ||
hcParserCheckToken(p, '[')) {
attr = hcParseAttrHelper_Recursive(p, HC_FALSE);
} else {
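// Stop at EOF too: hcParserSwallowToken never advances past it, so an unterminated attribute would otherwise loop forever.
while (!hcParserCheckToken(p, HC_Token_Kind_EOF) &&
!hcParserCheckToken(p, '(') &&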
!hcParserCheckToken(p, ')') &&
!hcParserCheckToken(p, '{') &&
!hcParserCheckToken(p, '}') &&
!hcParserCheckToken(p, '[') &&
!hcParserCheckToken(p, ']')) {
HC_Syntax_Attr *child = hcParseAttrHelper_Recursive(p, HC_FALSE);
if (attr == NULL) {
attr = child;
}
if (last_child) {
last_child->next = child;
}
last_child = child;
}
}
return attr;
}
HC_FUNC_DECORATOR
HC_Syntax_Attr *hcParseAttrHelper_Recursive (HC_Parser *p, HC_BOOL toplevel)
{
HC_Syntax_Attr *attr = hcMalloc(p->mem_userdata, sizeof(*attr));
if (toplevel) {
HC_Token ft = hcParserExpectToken(p, HC_Token_Kind_IDENTIFIER);
if (hcParserConsumeToken(p, HC_Token_Kind_ATTRIBUTE_NAMESPACE)) {
HC_Token st = hcParserExpectToken(p, HC_Token_Kind_IDENTIFIER);
attr->namespace = ft;
attr->name = st;
} else {
attr->name = ft;
}
} else if (!hcParserCheckToken(p, '(') &&
!hcParserCheckToken(p, ')') &&
!hcParserCheckToken(p, '{') &&
!hcParserCheckToken(p, '}') &&
!hcParserCheckToken(p, '[') &&
!hcParserCheckToken(p, ']')) {
attr->name = hcParserGetToken(p);
hcParserSwallowToken(p);
}
if (hcParserConsumeToken(p, '(')) {
attr->child = hcParserAttrHelper_Nested(p);
hcParserExpectToken(p, ')');
} else if (hcParserConsumeToken(p, '[')) {
attr->child = hcParserAttrHelper_Nested(p);
hcParserExpectToken(p, ']');
} else if (hcParserConsumeToken(p, '{')) {
attr->child = hcParserAttrHelper_Nested(p);
hcParserExpectToken(p, '}');
}
return attr;
}
// NOTE(naman): Attributes are parsed in opposite order, to enable easy parsing of pointer and member attributes.
HC_FUNC_DECORATOR
HC_Syntax_Attr* hcParseAttr (HC_Parser *p)
{
if (!hcParserConsumeToken(p, HC_Token_Kind_ATTRIBUTE_BEGIN)) {
return NULL;
}
HC_Syntax_Attr *last_attr = NULL;
do {
HC_Syntax_Attr *attrnext = hcParseAttrHelper_Recursive(p, HC_TRUE);
if (last_attr == NULL) {
last_attr = attrnext;
} else {
attrnext->next = last_attr;
last_attr = attrnext;
}
} while (hcParserConsumeToken(p, ','));
hcParserExpectToken(p, HC_Token_Kind_ATTRIBUTE_END);
return last_attr;
}
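/* Example of the ordering described above (hand-traced from hcParseAttr and its helpers,
* not tested output; the attribute names are only illustrative):
*
*     [[deprecated, gnu::aligned(8)]]
*
* hcParseAttr returns the list `aligned -> deprecated`, because each newly parsed attribute is
* linked in front of the ones parsed before it. For `gnu::aligned(8)`, `namespace` holds the
* token `gnu`, `name` holds `aligned`, and `child` points to a nested attribute whose `name`
* is the token `8`.
*/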
HC_FUNC_DECORATOR
HC_Syntax_Attr* hcParseAndPrependAttribute (HC_Parser *p, HC_Syntax_Attr *parent)
{
HC_Syntax_Attr *attr = hcParseAttr(p);
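if (attr != NULL) {
// hcParseAttr can return several chained attributes; link the parent after the last one so none are dropped.
HC_Syntax_Attr *tail = attr;
while (tail->next != NULL) {
tail = tail->next;
}
tail->next = parent;
return attr;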
} else {
return parent;
}
}
HC_FUNC_DECORATOR
HC_Syntax_Type* hcParseBasicTypeHelper_Typedefed (HC_Parser *p)
{
HC_Token tok = hcParserGetToken(p);
HC_BOOL is_long = hcTokenIsStr(tok, "long");
HC_Syntax_Typer_Scope *sc = NULL;
while ((sc = hcTyperIterRevScopesInParser(&p->typer, sc))) {
HC_Syntax_Typer_List *tl = NULL;
while ((tl = hcTyperIterTypesInScope(sc, tl))) {
if (hcTokensAreEqual(tok, hcTyperGetName(tl))) {
HC_Syntax_Type *type = hcMalloc(p->typer.mem_userdata, sizeof(*type));
if (is_long) {
hcParserExpectToken(p, HC_Token_Kind_LONG);
if (hcParserConsumeToken(p, HC_Token_Kind_LONG)) {
*type = (HC_Syntax_Type) {
.name = hcTokenMake(HC_Token_Kind_IDENTIFIER, "long long", HC_STRLEN("long long"), 0, 0, p->err_userdata),
.kind = HC_Syntax_Type_Kind_BASE,
};
return type;
} else if (hcParserConsumeToken(p, HC_Token_Kind_DOUBLE)) {
*type = (HC_Syntax_Type) {
.name = hcTokenMake(HC_Token_Kind_IDENTIFIER, "long double", HC_STRLEN("long double"), 0, 0, p->err_userdata),
.kind = HC_Syntax_Type_Kind_BASE,
};
return type;
}
} else {
hcParserSwallowToken(p);
}
*type = (HC_Syntax_Type) {
.name = hcTyperGetName(tl),
.kind = HC_Syntax_Type_Kind_BASE,
};
return type;
}
}
}
return NULL;
}
/* NOTE(naman): Declarator Parser
* ************ ********** ******
* Warning: None of this will make sense, herein lies naught but madness, abandon all hope ye who enter here.
*
* C declarators are the most difficult thing to parse in the language (well, that and the lexer hack).
* The "conventional wisdom" is to do this using the "spiral method": https://c-faq.com/decl/spiral.anderson.html
* But this is wrong, as documented by none other than Linus Torvalds himself:
* https://web.archive.org/web/20141218085356/https://plus.google.com/+gregkroahhartman/posts/1ZhdNwbjcYF
* The correct way is to use the "forward-and-backward" method as documented here:
* - http://unixwiz.net/techtips/reading-cdecl.html
* - https://cseweb.ucsd.edu/~ricko/rt_lt.rule.html
* - http://www.ericgiguere.com/articles/reading-c-declarations.html
* But doing this with a recursive descent parser requires infinite lookahead and cannot be done with a
* simple LL(1) parser like ours.
*
* But fear not, Messrs. Kernighan and Ritchie arrive to the rescue. Well, kind of.
* In K&R2 (Section 5.12), they write a deceptively simple program called cdecl that converts
* C declarations into a human-readable string (similar to the methods above). This webpage is an updated version
* of that program: https://cdecl.org/
*
* Unfortunately, the program simply spits out strings instead of creating an AST. Because of this,
* they don't have to keep any kind of metadata and can simply afford to simulate infinite lookahead
* by writing those strings in a side-buffer and then printing them out at the right moment.
* While we can't do the same due to the requirement of returning an AST, we can use the stack
* as a side-buffer and write out functions in a way that the growing and shrinking of the call stack
* corresponds with the moments when the correct tree manipulation can be done to splice
* intermediate nodes together into a fully formed tree.
*
* To do this, we create the idea of an "inverse-transform" AST node (or "xform"), which is either:
* 1. a pointer that points to a variable of a non-available type, or
* 2. an array that holds some number of elements of a non-available type, or
* 3. a function that returns a value of a non-available type.
* This xform is then "applied" to the fully formed type (almost as a program) to get the
* actual type. This allows us to reify the lookahead into a data transformation problem
* that can be solved with some reckless pointer munching and careful ordering of operations.
*/
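/* Worked example (a hand trace of the helpers below, offered as a sketch rather than tested output;
* <hole> is shorthand used here for the "non-available type" of an xform):
*
*     int (*x)[3];
*
* 1. hcParseDeclaratorHelper_Pointer(p, int, decl) sees no '*' and no identifier, so it consumes '('
*    and calls itself with a NULL base type.
* 2. The inner call consumes '*' and builds the xform "pointer to <hole>", records the name `x`,
*    finds no suffix, and returns that xform (stored as xform_parens).
* 3. Back in the outer call, ')' is consumed and hcParseDeclaratorHelper_Suffix turns "[3]" into
*    the xform "array of 3 <hole>" (stored as xform_suffix).
* 4. Applying xform_parens to xform_suffix gives "pointer to (array of 3 <hole>)".
* 5. Applying that combined xform to `int` fills the hole from the inside out:
*    "array of 3 int" first, then "pointer to array of 3 int".
*
* Without the parentheses, `int *x[3];` takes the other path: '*' wraps the base type immediately
* ("pointer to int"), and the "[3]" suffix is applied on top, giving "array of 3 pointers to int".
*/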
HC_FUNC_DECORATOR
HC_Syntax_Type* hcParseDeclaratorHelper_ApplyInverseTransformToType (HC_Syntax_Typer *typer, HC_Syntax_Type *real, HC_Syntax_Type *xform)
{
if (xform == NULL) {
return real;
}
if ((xform->kind == HC_Syntax_Type_Kind_ARRAY) || (xform->kind == HC_Syntax_Type_Kind_PTR)) {
real = hcParseDeclaratorHelper_ApplyInverseTransformToType(typer, real, xform->array_ptr.deref);
} else if (xform->kind == HC_Syntax_Type_Kind_FUNC) {
real = hcParseDeclaratorHelper_ApplyInverseTransformToType(typer, real, xform->function.return_type);
}
switch (xform->kind) {
case HC_Syntax_Type_Kind_PTR: real = hcGetPtrOfType(typer, real); break;
case HC_Syntax_Type_Kind_ARRAY: real = hcGetArrayOfType(typer, real, xform->array_ptr.elems); break;
case HC_Syntax_Type_Kind_FUNC: {
real = hcGetFunctionThatReturnsType(typer, real);
real->function.args = xform->function.args;
} break;
case HC_Syntax_Type_Kind_BASE: case HC_Syntax_Type_Kind_STRUCT:
case HC_Syntax_Type_Kind_UNION: case HC_Syntax_Type_Kind_ENUM:
default: break;
}
real->attr = xform->attr;
real->specifier = xform->specifier;
real->qualifier = xform->qualifier;
real->_decl_spec = xform->_decl_spec;
return real;
}
HC_FUNC_DECORATOR
HC_Syntax_Type* hcParseDeclaratorHelper_Suffix (HC_Parser *p)
{
HC_Syntax_Type *xform = NULL;
while (hcParserCheckToken(p, '(') || hcParserCheckToken(p, '[')) {
if (hcParserConsumeToken(p, '(')) {
HC_Syntax_Type *func_xform = hcGetFunctionThatReturnsType(&p->typer, NULL);
HC_Syntax_Decl *last_param = NULL;
do {
HC_Syntax_Attr *param_attr = hcParseAttr(p);
HC_Syntax_Type *param_base_type = hcParseBasicType(p);
param_attr = hcParseAndPrependAttribute(p, param_attr);
HC_Syntax_Decl *param_decl = NULL;
if (param_base_type != NULL) {
if (hcTokenIsStr(param_base_type->name, "void")) {
param_decl = hcParserCreateDecl(p->mem_userdata, param_base_type, param_attr);
} else {
param_decl = hcParseDeclarator(p, param_base_type, param_attr);
}
if (func_xform->function.args == NULL) {
func_xform->function.args = param_decl;
}
if (last_param != NULL) {
last_param->next = param_decl;
}
last_param = param_decl;
}
} while (hcParserConsumeToken(p, ','));
hcParserExpectToken(p, ')');
HC_Syntax_Type *rest = hcParseDeclaratorHelper_Suffix(p);
xform = hcParseDeclaratorHelper_ApplyInverseTransformToType(&p->typer, rest, func_xform);
} else {
HC_Syntax_Expr *expr = NULL;
hcParserExpectToken(p, '[');
if (!hcParserConsumeToken(p, ']')) {
expr = hcParseExpr(p);
hcParserExpectToken(p, ']');
}
HC_Syntax_Type *rest = hcParseDeclaratorHelper_Suffix(p);
xform = hcGetArrayOfType(&p->typer, rest, expr);
}
}
return xform;
}
HC_FUNC_DECORATOR
HC_Syntax_Type* hcParseDeclaratorHelper_Pointer (HC_Parser *p, HC_Syntax_Type *base_type, HC_Syntax_Decl *decl)
{
HC_Syntax_Type *type = base_type;
while (hcParserConsumeToken(p, '*')) {
type = hcGetPtrOfType(&p->typer, type);
type->attr = hcParseAndPrependAttribute(p, NULL);
HC_BOOL stop_loop = HC_FALSE;
while (!stop_loop) {
stop_loop = HC_TRUE;
if (hcParserConsumeToken(p, HC_Token_Kind_CONST)) { type->qualifier.q_const = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_RESTRICT)) { type->qualifier.q_restrict = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_VOLATILE)) { type->qualifier.q_volatile = HC_TRUE; stop_loop = HC_FALSE; }
}
}
HC_Syntax_Type *xform_parens = NULL;
if (hcParserCheckToken(p, HC_Token_Kind_IDENTIFIER)) {
decl->name = hcParserExpectToken(p, HC_Token_Kind_IDENTIFIER);
decl->attr = hcParseAndPrependAttribute(p, decl->attr);
} else if (hcParserConsumeToken(p, '(')) {
xform_parens = hcParseDeclaratorHelper_Pointer(p, NULL, decl);
if (!hcParserConsumeToken(p, ')')) {
hcTokenError(hcParserGetToken(p), "Closing parenthesis expected in declarator", p->err_userdata);
}
}
HC_Syntax_Type *xform_suffix = hcParseDeclaratorHelper_Suffix(p);
HC_Syntax_Type *xform = hcParseDeclaratorHelper_ApplyInverseTransformToType(&p->typer, xform_suffix, xform_parens);
type = hcParseDeclaratorHelper_ApplyInverseTransformToType(&p->typer, type, xform);
return type;
}
HC_FUNC_DECORATOR
HC_Syntax_Decl* hcParseDeclarator (HC_Parser *p, HC_Syntax_Type *base_type, HC_Syntax_Attr *attr)
{
HC_Syntax_Decl *decl = hcParserCreateDecl(p->mem_userdata, base_type, attr);
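// Keep the type, spec and attr that hcParserCreateDecl just stored; zeroing the decl here would discard the caller's attributes.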
decl->type = hcParseDeclaratorHelper_Pointer(p, base_type, decl);
decl->attr = hcParseAndPrependAttribute(p, decl->attr);
return decl;
}
// If token is type, return type; otherwise, NULL;
HC_FUNC_DECORATOR
HC_Syntax_Type* hcParseBasicType (HC_Parser *p)
{
HC_Syntax_Type *type = NULL;
if (hcParserConsumeToken(p, HC_Token_Kind_STRUCT)) {
type = hcMalloc(p->mem_userdata, sizeof(*type));
type->kind = HC_Syntax_Type_Kind_STRUCT;
type->attr = hcParseAttr(p);
if (hcParserCheckToken(p, HC_Token_Kind_IDENTIFIER)) {
type->name = hcParserGetToken(p);
hcParserSwallowToken(p);
}
if (hcParserConsumeToken(p, '{')) {
HC_Syntax_Decl *last_submember = NULL;
while (!hcParserConsumeToken(p, '}')) {
HC_Syntax_Attr *member_attr = hcParseAttr(p);
HC_Syntax_Type *member_base_type = hcParseBasicType(p);
if (!member_base_type) {
HC_Token tok = hcParserGetToken(p);
HC_ERROR_PRINTF("ERROR @ [%d-%d]: Found Unknown Type = %s\n", tok.line, tok.column, tok.str);
HC_ERROR_EXIT(p->err_userdata, -1);
}
if (!hcParserConsumeToken(p, ';')) {
do {
HC_Syntax_Decl *submember_decl = hcParseDeclarator(p, member_base_type, member_attr);
submember_decl->attr = hcParseAndPrependAttribute(p, submember_decl->attr);
submember_decl->spec = member_base_type->_decl_spec;
if (hcParserConsumeToken(p, ':')) {
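// A bit-field width is a constant expression; parsing without the comma operator keeps `int a:3, b:4;` as two members.
submember_decl->bitfield_size = hcParseExprWithoutComma(p);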
}
if (type->struct_union.members == NULL) {
type->struct_union.members = submember_decl;
}
if (last_submember != NULL) {
last_submember->next = submember_decl;
}
last_submember = submember_decl;
} while (hcParserConsumeToken(p, ','));
hcParserExpectToken(p, ';');
}
}
}
} else if (hcParserConsumeToken(p, HC_Token_Kind_UNION)) {
type = hcMalloc(p->mem_userdata, sizeof(*type));
type->kind = HC_Syntax_Type_Kind_UNION;
type->attr = hcParseAttr(p);
if (hcParserCheckToken(p, HC_Token_Kind_IDENTIFIER)) {
type->name = hcParserGetToken(p);
hcParserSwallowToken(p);
}
if (hcParserConsumeToken(p, '{')) {
HC_Syntax_Decl *last_submember = NULL;
while (!hcParserConsumeToken(p, '}')) {
HC_Syntax_Attr *member_attr = hcParseAttr(p);
HC_Syntax_Type *member_base_type = hcParseBasicType(p);
if (!member_base_type) {
HC_Token tok = hcParserGetToken(p);
HC_ERROR_PRINTF("ERROR @ [%d-%d]: Found Unknown Type = %s\n", tok.line, tok.column, tok.str);
HC_ERROR_EXIT(p->err_userdata, -1);
}
if (!hcParserConsumeToken(p, ';')) {
do {
HC_Syntax_Decl *submember_decl = hcParseDeclarator(p, member_base_type, member_attr);
submember_decl->attr = hcParseAndPrependAttribute(p, submember_decl->attr);
submember_decl->spec = member_base_type->_decl_spec;
if (hcParserConsumeToken(p, ':')) {
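// A bit-field width is a constant expression; parsing without the comma operator keeps `int a:3, b:4;` as two members.
submember_decl->bitfield_size = hcParseExprWithoutComma(p);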
}
if (type->struct_union.members == NULL) {
type->struct_union.members = submember_decl;
}
if (last_submember != NULL) {
last_submember->next = submember_decl;
}
last_submember = submember_decl;
} while (hcParserConsumeToken(p, ','));
hcParserExpectToken(p, ';');
}
}
}
} else if (hcParserConsumeToken(p, HC_Token_Kind_ENUM)) {
type = hcMalloc(p->mem_userdata, sizeof(*type));
type->kind = HC_Syntax_Type_Kind_ENUM;
type->attr = hcParseAttr(p);
if (hcParserCheckToken(p, HC_Token_Kind_IDENTIFIER)) {
type->name = hcParserGetToken(p);
hcParserSwallowToken(p);
}
if (hcParserConsumeToken(p, ':')) {
type->enumeration.type = hcParseBasicType(p);
}
if (hcParserConsumeToken(p, '{')) {
HC_Syntax_Enumerator *last_member = NULL;
while (!hcParserConsumeToken(p, '}')) {
HC_Syntax_Enumerator *member = hcMalloc(p->mem_userdata, sizeof(*member));
member->name = hcParserGetToken(p);
hcParserExpectToken(p, HC_Token_Kind_IDENTIFIER);
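// C23 places enumerator attributes between the name and '='; attributes after the value are still accepted.
member->attr = hcParseAttr(p);
if (hcParserConsumeToken(p, '=')) {
member->value = hcParseExprWithoutComma(p);
}
member->attr = hcParseAndPrependAttribute(p, member->attr);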
if (type->enumeration.enumerators == NULL) {
type->enumeration.enumerators = member;
}
if (last_member != NULL) {
last_member->next = member;
}
last_member = member;
if (!hcParserConsumeToken(p, ',')) {
hcParserExpectToken(p, '}');
break;
}
}
}
} else {
HC_Syntax_Type *temp_type = hcMalloc(p->mem_userdata, sizeof(*temp_type));
HC_BOOL stop_loop = HC_FALSE;
while (!stop_loop) {
stop_loop = HC_TRUE;
if (type == NULL) {
HC_Syntax_Type *new_type = hcParseBasicTypeHelper_Typedefed(p);
if (new_type) {
type = new_type;
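// Carry over signed/unsigned seen before the type name (e.g. `unsigned int`), along with the qualifiers.
type->specifier = temp_type->specifier;
type->qualifier = temp_type->qualifier;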
type->_decl_spec = temp_type->_decl_spec;
stop_loop = HC_FALSE;
temp_type = type;
}
}
if (hcParserConsumeToken(p, HC_Token_Kind_SIGNED)) { temp_type->specifier.s_signed = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_UNSIGNED)) { temp_type->specifier.s_unsigned = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_CONST)) { temp_type->qualifier.q_const = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_RESTRICT)) { temp_type->qualifier.q_restrict = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_VOLATILE)) { temp_type->qualifier.q_volatile = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_AUTO)) { temp_type->_decl_spec.storage.s_auto = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_CONSTEXPR)) { temp_type->_decl_spec.storage.s_constexpr = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_EXTERN)) { temp_type->_decl_spec.storage.s_extern = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_REGISTER)) { temp_type->_decl_spec.storage.s_register = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_STATIC)) { temp_type->_decl_spec.storage.s_static = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_THREAD_LOCAL)) { temp_type->_decl_spec.storage.s_thread_local = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_INLINE)) { temp_type->_decl_spec.function.f_inline = HC_TRUE; stop_loop = HC_FALSE; }
if (hcParserConsumeToken(p, HC_Token_Kind_NORETURN)) { temp_type->_decl_spec.function.f_noreturn = HC_TRUE; stop_loop = HC_FALSE; }
}
}
return type;
}
HC_FUNC_DECORATOR
HC_Parser* hcParserCreate (char *src, void *mem_userdata, void *err_userdata)
{
HC_Parser *parser = hcMalloc(mem_userdata, sizeof(*parser));
*parser = (HC_Parser) {
.mem_userdata = mem_userdata,
.err_userdata = err_userdata,
};
hcTokenizerMake(&parser->toker, src, err_userdata);
hcTyperMake(&parser->typer, mem_userdata, err_userdata);
hcParserAddGlobalType(parser, "void");
hcParserAddGlobalType(parser, "bool");
hcParserAddGlobalType(parser, "char");
hcParserAddGlobalType(parser, "short");
hcParserAddGlobalType(parser, "int");
hcParserAddGlobalType(parser, "long");
hcParserAddGlobalType(parser, "long long");
hcParserAddGlobalType(parser, "float");
hcParserAddGlobalType(parser, "double");
hcParserAddGlobalType(parser, "long double");
// TODO(naman): Add all other "type-specifier"s from Annex A of https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf
// _BitInt(expr)
// _Complex
// _Decimal32
// _Decimal64
// _Decimal128
// _Atomic(type)
// _Atomic type // specifier
// typeof(expr | type)
// typeof_unqual(expr | type)
return parser;
}
/* ************
* Pratt Parser
* ************
* Reference:
* Top Down Operator Precedence
* Vaughan R. Pratt (1973)
* https://dl.acm.org/doi/pdf/10.1145/512927.512931
*/
typedef struct HC_Parse_Expr_Stickiness {
/* Stickiness is a quantification of precedence. It can be thought of as defining how deep
* in the AST a node appears, or equivalently how early it is evaluated.
*/
/* `prefix_stickiness` only comes into play for unary (prefix) operators. It denotes how
* hard the operator "sticks" to the element immediately to its right. If this stickiness is
* higher than the left_stickiness of the operator that follows that element, the element
* sticks to this operator instead of to the following one.
*
* In the reference, this is the argument provided to parse() as rbp in nud.
*/
uint32_t prefix_stickiness;
/* `left_stickiness` is for binary and postfix operators. It denotes how hard the operator
* sticks to the element just to its left.
*
* This essentially signifies the "chance" that the operand before this operator
* will stick to this operator. For example, in "1+f(3)", we want "f" to stick with
* "(" and not "+". This means that the left_stickiness of "(" has to be higher
* than the right_stickiness of "+".
*
* In the reference, this is the lbp.
*/
uint32_t left_stickiness;
/* `right_stickiness` is for binary operators. It denotes how hard the operator sticks to
* the element just to its right.
*
* In the reference, this is the argument provided to parse() as rbp in led.
*/
uint32_t right_stickiness;
} HC_Parse_Expr_Stickiness;
// NOTE(naman): MINIMUM STICKINESS OF OPERATORS IS 100 and NOT 0.
// Literals and identifiers have their prefix_stickiness set to 50.
// Fields that a token never passes on as base_stickiness (prefix_stickiness or right_stickiness)
// are set to 0.
// Operators that are only prefix or unary have only prefix_stickiness set.
// Operators that are only postfix have only left_stickiness set.
// Operators that are only binary have both left_stickiness and right_stickiness set.
// Operators that are both unary and binary have them all set.
static const HC_Parse_Expr_Stickiness HC_GLOBAL__parse_expr_stickiness[HC_Token_Kind_TOTAL] = {
#define PLVL(x) ((16 - (x)) * 1000)
// Precedence Table: https://en.cppreference.com/w/c/language/operator_precedence
/* Operator Prefix Left Right */
[ HC_Token_Kind_INCREMENT ] = { PLVL(2), PLVL(1), 0, }, // ++
[ HC_Token_Kind_DECREMENT ] = { PLVL(2), PLVL(1), 0, }, // --
[ '(' ] = { 100, PLVL(1), 0, }, // Prefix is 100 (minimum) to denote precedence of grouping parenthesis.
[ '[' ] = { 0, PLVL(1), 0, }, // Array subscripting
[ '.' ] = { 0, PLVL(1), PLVL(1), },
[ HC_Token_Kind_PTR_DEREF ] = { 0, PLVL(1), PLVL(1), }, // ->
[ '{' ] = { PLVL(1), 0, 0, }, // (type){list}
[ '+' ] = { PLVL(2), PLVL(4), PLVL(4), },
[ '-' ] = { PLVL(2), PLVL(4), PLVL(4), },
[ '!' ] = { PLVL(2), 0, 0, },
[ '~' ] = { PLVL(2), 0, 0, },
[ '*' ] = { PLVL(2), PLVL(3), PLVL(3), },
[ '&' ] = { PLVL(2), PLVL(8), PLVL(8), },
[ HC_Token_Kind_SIZE_OF ] = { PLVL(2), 0, 0, }, // sizeof(expr) or sizeof(type)
[ HC_Token_Kind_ALIGNOF ] = { PLVL(2), 0, 0, }, // alignof(expr) or alignof(type)
[ '/' ] = { 0, PLVL(3), PLVL(3), },
[ '%' ] = { 0, PLVL(3), PLVL(3), },
[ HC_Token_Kind_SHIFT_LEFT ] = { 0, PLVL(5), PLVL(5), }, // <<
[ HC_Token_Kind_SHIFT_RIGHT ] = { 0, PLVL(5), PLVL(5), }, // >>
[ '<' ] = { 0, PLVL(6), PLVL(6), },
[ '>' ] = { 0, PLVL(6), PLVL(6), },
[ HC_Token_Kind_LESSEQ ] = { 0, PLVL(6), PLVL(6), }, // <=
[ HC_Token_Kind_GREATEQ ] = { 0, PLVL(6), PLVL(6), }, // >=
[ HC_Token_Kind_EQUALITY ] = { 0, PLVL(7), PLVL(7), }, // ==
[ HC_Token_Kind_NOTEQUAL ] = { 0, PLVL(7), PLVL(7), }, // !=
[ '^' ] = { 0, PLVL(9), PLVL(9), }, // Bitwise XOR binds tighter than bitwise OR
[ '|' ] = { 0, PLVL(10), PLVL(10), },
[ HC_Token_Kind_LOGICAL_AND ] = { 0, PLVL(11), PLVL(11), }, // &&
[ HC_Token_Kind_LOGICAL_OR ] = { 0, PLVL(12), PLVL(12), }, // ||
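// NOTE: The conditional operator is right-to-left associative, so its left stickiness is
// slightly higher than its right stickiness ("a ? b : c ? d : e" is "a ? b : (c ? d : e)").
[ '?' ] = { 0, PLVL(13)+1, PLVL(13), },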
// NOTE(naman): Left stickiness is slightly higher due to Right-to-left associativity
[ '=' ] = { 0, PLVL(14)+1, PLVL(14), },
[ HC_Token_Kind_ADD_ASSIGN ] = { 0, PLVL(14)+1, PLVL(14), }, // +=
[ HC_Token_Kind_SUB_ASSIGN ] = { 0, PLVL(14)+1, PLVL(14), }, // -=
[ HC_Token_Kind_MULTIPLY_ASSIGN ] = { 0, PLVL(14)+1, PLVL(14), }, // *=
[ HC_Token_Kind_DIVIDE_ASSIGN ] = { 0, PLVL(14)+1, PLVL(14), }, // /=
[ HC_Token_Kind_MODULUS_ASSIGN ] = { 0, PLVL(14)+1, PLVL(14), }, // %=
[HC_Token_Kind_SHIFT_LEFT_ASSIGN ] = { 0, PLVL(14)+1, PLVL(14), }, // <<=
[HC_Token_Kind_SHIFT_RIGHT_ASSIGN] = { 0, PLVL(14)+1, PLVL(14), }, // >>=
[ HC_Token_Kind_AND_ASSIGN ] = { 0, PLVL(14)+1, PLVL(14), }, // &=
[ HC_Token_Kind_XOR_ASSIGN ] = { 0, PLVL(14)+1, PLVL(14), }, // ^=
[ HC_Token_Kind_OR_ASSIGN ] = { 0, PLVL(14)+1, PLVL(14), }, // |=
// NOTE(naman): Directly inside a function call and designated initializers, comma is parsed
// as a separator while it is a combiner for expressions everywhere else.
// To work around this, we assign the stickiness 1000 (i.e. PLVL(15)) to comma as a combiner, but pass 1500
// as base_stickiness when parsing argument expressions so that the comma doesn't get parsed
// as a combiner.
[ ',' ] = { 0, PLVL(15), PLVL(15), },
[ HC_Token_Kind_CONSTANT_INTEGER ] = { 50, 0, 0, },
[ HC_Token_Kind_CONSTANT_FLOATING] = { 50, 0, 0, },
[HC_Token_Kind_CONSTANT_CHARACTER] = { 50, 0, 0, },
[ HC_Token_Kind_CONSTANT_STRING ] = { 50, 0, 0, },
[ HC_Token_Kind_IDENTIFIER ] = { 50, 0, 0, },
#undef PLVL
};
HC_FUNC_DECORATOR
HC_Syntax_Expr* hcParseExprHelper_Pratt (HC_Parser *p, uint32_t base_stickiness)
{
HC_Token t1 = hcParserGetToken(p);
if (hcTokenIsKind(t1, HC_Token_Kind_EOF)) {
hcTokenenizerError(&p->toker, "Source code ended abruptly while parsing expression");
}
hcParserSwallowToken(p);
HC_Parse_Expr_Stickiness t1s = HC_GLOBAL__parse_expr_stickiness[hcTokenKind(t1)];
if (t1s.prefix_stickiness == 0) {
hcTokenError(t1, "Unexpected token", p->err_userdata);
}
HC_Syntax_Expr *left = hcParserCreateExpr(p, HC_Syntax_Expr_Kind_LITERAL);
/* This parses
* 1. Operators that only apply to what comes after them (unary prefix operators and
* the opening parenthesis)
* 2. Literals and identifiers
* 3. Type casts and sizeof
* 4. Compound literals and initializer lists
*/
/*
* "Null Denotation" (NUD) is a function associated with each token that is used to handle
* things that have no association to the object to their left.
* In the examples, parseExpr refers to hcParseExprHelper_Pratt().
*
* Example:
* 1. -5
* In this, the "-" has nothing to its left. As such, the Null Denotation function
* is called, which in turn calls parseExpr again to parse "5".
* 2. -5 + 6
* In this, since the unary "-" has higher precedence than binary "+", the NUD
* just parses "5" and returns. We can also say that the prefix_stickiness of prefix
* "-" is higher than the left_stickiness of "+".
* 3. -(5+6)
* In this, the NUD calls parseExpr to parse "(5+6)" with the very high prefix_stickiness
* of "-" passed in as the base_stickiness. Then, the NUD of "(" gets called, which in turn
* calls parseExpr again with a very low prefix_stickiness (the lowest, in fact).
* parseExpr doesn't return until it finds an object of lower left_stickiness than
* the passed base_stickiness. This won't happen until we find ")". At this point,
* that parseExpr returns and we are back in the parseExpr that was called by prefix "-".
* The next object's left_stickiness is lower than the base_stickiness since
* prefix "-" has one of the highest prefix_stickiness values. This means that this call
* returns too, and we get a properly parsed AST of "-(5+6)".
*/
HC_Token_Kind kind1 = hcTokenKind(t1);
switch ((int)kind1) {
case HC_Token_Kind_CONSTANT_INTEGER: case HC_Token_Kind_CONSTANT_FLOATING:
case HC_Token_Kind_CONSTANT_CHARACTER: case HC_Token_Kind_CONSTANT_STRING:
case HC_Token_Kind_IDENTIFIER: {
left->kind = HC_Syntax_Expr_Kind_LITERAL;
left->literal.token = t1;
} break;
case '(': case '{': {
HC_Syntax_Type *base_type = NULL;
HC_BOOL parse_init_list = HC_FALSE;
if (hcTokenIsKind(t1, '(')) {
base_type = hcParseBasicType(p);
if (base_type) {
HC_Syntax_Decl *decl = hcParseDeclarator(p, base_type, NULL);
base_type = decl->type;
hcParserExpectToken(p, ')');
if (hcParserConsumeToken(p, '{')) {
parse_init_list = HC_TRUE;
} else {
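left->kind = HC_Syntax_Expr_Kind_TYPE_CAST;
left->type_cast.type = base_type;
// NOTE: A cast binds like the other unary operators (level 2 of the precedence table, i.e. 14000),
// so "(int)a + b" is parsed as "((int)a) + b" and not as "(int)(a + b)".
left->type_cast.expr = hcParseExprHelper_Pratt(p, 14000);
if (left->type_cast.expr == NULL) return NULL;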
}
} else {
left = hcParseExprHelper_Pratt(p, t1s.prefix_stickiness);
if (left == NULL) return NULL;
if (!hcParserConsumeToken(p, ')')) {
hcTokenError(hcParserGetToken(p), "Expected token ')'", p->err_userdata);
}
}
} else if (hcTokenIsKind(t1, '{')) {
parse_init_list = HC_TRUE;
}
if (parse_init_list) {
left->kind = HC_Syntax_Expr_Kind_DESIG_INIT;
left->desig_init.type = base_type;
HC_Syntax_Expr *last_di = NULL;
while (true) {
if (hcParserConsumeToken(p, '}')) break;
HC_Syntax_Expr *di = hcParserCreateExpr(p, HC_Syntax_Expr_Kind_DESIG_INIT);
HC_BOOL designation_seen = HC_FALSE;
HC_Syntax_Dsig *last_desig = NULL;
while (hcParserCheckToken(p, '.') || hcParserCheckToken(p, '[')) {
designation_seen = HC_TRUE;
HC_Syntax_Dsig *desig = hcMalloc(p->mem_userdata, sizeof(*desig));
*desig = (HC_Syntax_Dsig){0}; // Only one of dotted/indexed is set below, and next may stay unset
if (hcParserConsumeToken(p, '.')) {
desig->dotted = hcParserExpectToken(p, HC_Token_Kind_IDENTIFIER);
} else if (hcParserConsumeToken(p, '[')) {
desig->indexed = hcParseExprHelper_Pratt(p, 50);
hcParserExpectToken(p, ']');
}
if (di->desig_init.desig == NULL) {
di->desig_init.desig = desig;
}
if (last_desig != NULL) {
last_desig->next = desig;
}
last_desig = desig;
}
if (designation_seen) {
hcParserExpectToken(p, '=');
}
// NOTE(naman): This is so that the comma between values is parsed as a separator
// and not a combiner. See more detail in the comment next to the entry for comma
// in the precedence table above.
di->desig_init.init = hcParseExprHelper_Pratt(p, 1500);
if (di->desig_init.init == NULL) return NULL;
if (left->desig_init.first_init == NULL) {
left->desig_init.first_init = di;
}
if (last_di != NULL) {
last_di->next_sibling = di;
}
last_di = di;
if (!hcParserConsumeToken(p, ',')) {
hcParserExpectToken(p, '}');
break;
}
}
}
} break;
case '+': case '-': case '!': case '~':
case '*': case '&':
case HC_Token_Kind_INCREMENT: case HC_Token_Kind_DECREMENT: {
left->kind = HC_Syntax_Expr_Kind_PREFIX;
left->prefix.op = t1;
left->prefix.rhs = hcParseExprHelper_Pratt(p, t1s.prefix_stickiness);
if (left->prefix.rhs == NULL) return NULL;
} break;
case HC_Token_Kind_SIZE_OF: {
left->kind = HC_Syntax_Expr_Kind_SIZE_OF;
HC_BOOL paren = HC_FALSE;
if (hcParserConsumeToken(p, '(')) paren = HC_TRUE;
HC_Syntax_Type *arg_type = hcParseBasicType(p);
if (arg_type) {
HC_Syntax_Decl *arg_decl = hcParseDeclarator(p, arg_type, NULL);
arg_type = arg_decl->type;
left->size_of.arg_type = arg_type;
if (left->size_of.arg_type == NULL) return NULL;
} else {
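// NOTE: Inside parentheses any expression is allowed (e.g. "sizeof(a + b)"), so parse with the
// stickiness of a grouping parenthesis; without them, only a unary expression follows.
left->size_of.arg_expr = hcParseExprHelper_Pratt(p, paren ? 100 : t1s.prefix_stickiness);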
if (left->size_of.arg_expr == NULL) return NULL;
}
if (paren) hcParserExpectToken(p, ')');
} break;
default: {
hcTokenError(t1, "Token can not be parsed as a prefix operator, literal or name", p->err_userdata);
} break;
}
HC_Token t2 = hcParserGetToken(p);
HC_Parse_Expr_Stickiness t2s = HC_GLOBAL__parse_expr_stickiness[hcTokenKind(t2)];
while (!hcTokenIsKind(t2, HC_Token_Kind_EOF) && (base_stickiness < t2s.left_stickiness)) {
hcParserSwallowToken(p);
HC_Syntax_Expr *right = hcParserCreateExpr(p, HC_Syntax_Expr_Kind_LITERAL);
/* This parses:
* 1. Infix operators
* 2. Postfix operators
* 3. Function calls
* 4. Array indexing
* 5. The conditional (ternary) operator
*/
/*
* "Left Denotation" (LED) is a function associated with each token that is
* used to handle things that do have an association to the left.
*
* Example:
* 1. 2+3
* In this, when we parse "+", we call its LED. That calls parseExpr back with
* base_stickiness equal to the right_stickiness of "+". That call to parseExpr
* parses "3" and returns.
* 2. 2+3*4
* Here, when the LED of "+" calls parseExpr, the base_stickiness is equal to the
* right_stickiness of "+" which is lower than the left_stickiness of "*". That
* means that the whole "3*4" gets parsed before parseExpr returns it to be
* associated with "+".
*/
HC_Token_Kind kind2 = hcTokenKind(t2);
switch ((int)kind2) {
case HC_Token_Kind_INCREMENT: case HC_Token_Kind_DECREMENT: {
right->kind = HC_Syntax_Expr_Kind_POSTFIX;
right->postfix.op = t2;
// NOTE: A postfix operator applies to the expression already parsed on its left;
// there is nothing to parse after it.
right->postfix.lhs = left;
} break;
case '+': case '-': case '*': case '/': case '%':
case '&': case '|': case '^':
case '.': case HC_Token_Kind_PTR_DEREF:
case '<': case '>':
case HC_Token_Kind_LESSEQ: case HC_Token_Kind_GREATEQ:
case HC_Token_Kind_EQUALITY: case HC_Token_Kind_NOTEQUAL:
case HC_Token_Kind_SHIFT_LEFT: case HC_Token_Kind_SHIFT_RIGHT:
case HC_Token_Kind_LOGICAL_AND: case HC_Token_Kind_LOGICAL_OR:
case '=':
case ',': // Comma as a combiner; only reached when base_stickiness is below 1000
case HC_Token_Kind_ADD_ASSIGN: case HC_Token_Kind_SUB_ASSIGN:
case HC_Token_Kind_MULTIPLY_ASSIGN: case HC_Token_Kind_DIVIDE_ASSIGN:
case HC_Token_Kind_MODULUS_ASSIGN: case HC_Token_Kind_SHIFT_LEFT_ASSIGN:
case HC_Token_Kind_SHIFT_RIGHT_ASSIGN: case HC_Token_Kind_AND_ASSIGN:
case HC_Token_Kind_XOR_ASSIGN: case HC_Token_Kind_OR_ASSIGN: {
right->kind = HC_Syntax_Expr_Kind_INFIX;
right->infix.lhs = left;
right->infix.op = t2;
right->infix.rhs = hcParseExprHelper_Pratt(p, t2s.right_stickiness);
if (right->infix.rhs == NULL) return NULL;
} break;
case '(': {
right->kind = HC_Syntax_Expr_Kind_CALL;
right->call.func = left;
HC_Syntax_Expr *last_arg = NULL;
if (!hcParserConsumeToken(p, ')')) {
do {
HC_Syntax_Expr *arg = hcParseExprHelper_Pratt(p, 1500);
if (arg == NULL) return NULL;
if (right->call.first_arg == NULL) {
right->call.first_arg = arg;
}
if (last_arg) {
last_arg->next_sibling = arg;
}
last_arg = arg;
} while (hcParserConsumeToken(p, ','));
hcParserExpectToken(p, ')');
}
} break;
case '[': {
right->kind = HC_Syntax_Expr_Kind_INDEX;
right->index.array = left;
right->index.index = hcParseExprHelper_Pratt(p, t2s.right_stickiness);
if (right->index.index == NULL) return NULL;
hcParserExpectToken(p, ']');
} break;
case '?': {
right->kind = HC_Syntax_Expr_Kind_TERNARY;
right->ternary.cond = left;
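// NOTE: The middle operand is delimited by '?' and ':', so it may be any expression
// (including assignment and comma), just like the contents of a parenthesis.
right->ternary.then = hcParseExprHelper_Pratt(p, 50);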
if (right->ternary.then == NULL) return NULL;
hcParserExpectToken(p, ':');
right->ternary.or_else = hcParseExprHelper_Pratt(p, t2s.right_stickiness);
if (right->ternary.or_else == NULL) return NULL;
} break;
default: {
hcTokenError(t2, "Token is not an operator", p->err_userdata);
} break;
}
t1 = t2;
t1s = HC_GLOBAL__parse_expr_stickiness[hcTokenKind(t1)];
t2 = hcParserGetToken(p);
t2s = HC_GLOBAL__parse_expr_stickiness[hcTokenKind(t2)];
left = right;
}
return left;
}
HC_FUNC_DECORATOR
HC_Syntax_Expr* hcParseExpr (HC_Parser *p)
{
return hcParseExprHelper_Pratt(p, 50);
}
HC_FUNC_DECORATOR
HC_Syntax_Expr* hcParseExprWithoutComma (HC_Parser *p)
{
// NOTE(naman): This is so that the comma after expression is parsed as a separator
// and not a combiner. See more detail in the comment next to the entry for comma
// in the precedence table above.
return hcParseExprHelper_Pratt(p, 1500);
}
HC_FUNC_DECORATOR
HC_Syntax_Stmt* hcParseDecl (HC_Parser *p, HC_Syntax_Type *base_type, HC_Syntax_Attr *attr)
{
HC_Syntax_Stmt *last_lvalue = NULL;
HC_Syntax_Decl *type_decl = hcParseDeclarator(p, base_type, attr);
HC_Syntax_Stmt *decl = NULL;
if (type_decl->type->kind == HC_Syntax_Type_Kind_FUNC) {
decl = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_FUNDECL);
decl->stmt_fundecl.decl = type_decl;
decl->stmt_fundecl.decl->attr = hcParseAndPrependAttribute(p, decl->stmt_fundecl.decl->attr);
decl->stmt_fundecl.decl->spec = base_type->_decl_spec;
if (hcParserCheckToken (p, '{')) {
decl->stmt_fundecl.body = hcParseStmt(p);
} else {
hcParserExpectToken(p, ';');
}
} else {
HC_Syntax_Stmt *lvalue = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_VARDECL);
lvalue->stmt_vardecl.decl = type_decl;
lvalue->stmt_vardecl.decl->attr = hcParseAndPrependAttribute(p, type_decl->attr);
lvalue->stmt_vardecl.decl->spec = base_type->_decl_spec;
decl = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_VARDECL);
do {
if (lvalue == NULL) {
lvalue = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_VARDECL);
lvalue->stmt_vardecl.decl = hcParseDeclarator(p, base_type, attr);
lvalue->stmt_vardecl.decl->attr = hcParseAndPrependAttribute(p, lvalue->stmt_vardecl.decl->attr);
lvalue->stmt_vardecl.decl->spec = base_type->_decl_spec;
}
if (hcParserConsumeToken(p, '=')) {
lvalue->stmt_vardecl.expr = hcParseExprWithoutComma(p);
}
if (hcParserCheckToken(p, ',')) {
hcParserSwallowToken(p);
}
if (decl->stmt_vardecl.first_var == NULL) {
decl->stmt_vardecl.first_var = lvalue;
}
if (last_lvalue != NULL) {
last_lvalue->next_sibling = lvalue;
}
last_lvalue = lvalue;
if (lvalue == NULL) {
hcTokenError(hcParserGetToken(p), "No variable parsed in declaration", p->err_userdata);
}
lvalue = NULL;
} while (!hcParserConsumeToken(p, ';'));
}
return decl;
}
HC_FUNC_DECORATOR
HC_Syntax_Stmt* hcParseStmt (HC_Parser *p)
{
HC_Syntax_Stmt *root = NULL;
bool attach_attr = true;
HC_Syntax_Attr *attr = hcParseAttr(p);
HC_Syntax_Type *base_type = hcParseBasicType(p);
if (base_type) {
root = hcParseDecl(p, base_type, attr);
attach_attr = false;
} else if (hcParserConsumeToken(p, HC_Token_Kind_EOF)) {
return NULL;
} else if (hcParserConsumeToken(p, ';')) {
root = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_EMPTY);
} else if (hcParserConsumeToken(p, '{')) {
hcTyperPushTypeScope(&p->typer);
root = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_BLOCK);
HC_Syntax_Stmt *last_stmt = NULL;
while (!hcParserConsumeToken(p, '}')) {
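HC_Syntax_Stmt *child = hcParseStmt(p);
if (child == NULL) { // hcParseStmt() returns NULL at end of file; don't loop forever on an unclosed block
hcTokenError(hcParserGetToken(p), "Expected token '}'", p->err_userdata);
break;
}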
if (root->stmt_block.first_stmt == NULL) {
root->stmt_block.first_stmt = child;
}
if (last_stmt) {
last_stmt->next_sibling = child;
}
last_stmt = child;
}
hcTyperPopTypeScope(&p->typer);
} else if (hcParserConsumeToken(p, HC_Token_Kind_RETURN)) {
root = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_RETURN);
if (!hcParserCheckToken(p, ';')) { // A bare "return;" carries no expression
root->stmt_return.expr = hcParseExpr(p);
}
if (!hcParserConsumeToken(p, ';')) {
hcTokenError(hcParserGetToken(p), "Expected token ';'", p->err_userdata);
}
} else if (hcParserConsumeToken(p, HC_Token_Kind_IF)) {
root = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_IF);
HC_Syntax_Stmt *last_branch = NULL;
HC_BOOL first_branch = HC_TRUE, else_only = HC_FALSE;
while (HC_TRUE) {
HC_Syntax_Stmt *branch = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_IF);
if (first_branch) {
first_branch = HC_FALSE;
} else {
if (hcParserConsumeToken(p, HC_Token_Kind_ELSE)) {
if (hcParserConsumeToken(p, HC_Token_Kind_IF)) {
else_only = HC_FALSE;
} else {
else_only = HC_TRUE;
}
} else {
break;
}
}
if (!else_only) {
hcParserExpectToken(p, '(');
branch->stmt_if.cond = hcParseExpr(p);
hcParserExpectToken(p, ')');
}
branch->stmt_if.body = hcParseStmt(p);
if (root->stmt_if.first_branch == NULL) {
root->stmt_if.first_branch = branch;
}
if (last_branch) {
last_branch->next_sibling = branch;
}
last_branch = branch;
if (else_only) break;
}
} else if (hcParserConsumeToken(p, HC_Token_Kind_FOR)) {
root = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_FOR);
hcParserExpectToken(p, '(');
if (!hcParserConsumeToken(p, ';')) {
HC_Syntax_Type *init_base_type = hcParseBasicType(p);
if (init_base_type) {
// NOTE: hcParseDecl() already consumes the terminating ';' of the declaration.
root->stmt_for.init_decl = hcParseDecl(p, init_base_type, NULL);
} else {
root->stmt_for.init_expr = hcParseExpr(p);
hcParserExpectToken(p, ';');
}
}
if (!hcParserConsumeToken(p, ';')) {
root->stmt_for.cond = hcParseExpr(p);
hcParserExpectToken(p, ';');
}
if (!hcParserConsumeToken(p, ')')) {
root->stmt_for.updt = hcParseExpr(p);
hcParserExpectToken(p, ')');
}
root->stmt_for.body = hcParseStmt(p);
} else if (hcParserConsumeToken(p, HC_Token_Kind_WHILE)) {
root = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_WHILE);
hcParserExpectToken(p, '(');
root->stmt_while.cond = hcParseExpr(p);
hcParserExpectToken(p, ')');
root->stmt_while.body = hcParseStmt(p);
} else if (hcParserConsumeToken(p, HC_Token_Kind_TYPEDEF)) {
root = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_TYPEDEF);
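HC_Syntax_Type *tdef_base_type = hcParseBasicType(p);
if (tdef_base_type == NULL) { // Dereferenced below, so bail out early
hcTokenError(hcParserGetToken(p), "Expected a type after 'typedef'", p->err_userdata);
return NULL;
}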
HC_Syntax_Decl *last_decl = NULL;
if (!hcParserConsumeToken(p, ';')) {
do {
attach_attr = false;
HC_Syntax_Decl *decl = hcParseDeclarator(p, tdef_base_type, attr);
decl->attr = hcParseAndPrependAttribute(p, decl->attr);
decl->spec = tdef_base_type->_decl_spec;
hcTyperAddType(&p->typer, decl->name);
if (root->stmt_typedef.first_def == NULL) {
root->stmt_typedef.first_def = decl;
}
if (last_decl != NULL) {
last_decl->next = decl;
}
last_decl = decl;
} while (hcParserConsumeToken(p, ','));
hcParserExpectToken(p, ';');
}
} else {
root = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_STMT_EXPR);
root->stmt_expr.expr = hcParseExpr(p);
if (!hcParserConsumeToken(p, ';')) {
hcTokenError(hcParserGetToken(p), "Expected token ';'", p->err_userdata);
}
}
if (attach_attr) {
root->attribute = attr;
}
return root;
}
HC_FUNC_DECORATOR
HC_Syntax_Tree hcParse (HC_Parser *p)
{
HC_Syntax_Tree st = {0};
st.top_level = hcParserCreateStmt(p, HC_Syntax_Stmt_Kind_TOP_LEVEL);
HC_Syntax_Stmt *last_stmt = NULL;
while (HC_TRUE) {
HC_Syntax_Stmt *stmt = hcParseStmt(p);
if (stmt == NULL) break;
if (st.top_level->top_level.first_stmt == NULL) {
st.top_level->top_level.first_stmt = stmt;
}
if (last_stmt) {
last_stmt->next_sibling = stmt;
}
last_stmt = stmt;
}
return st;
}
HC_FUNC_DECORATOR void hcPrintStmt (HC_Syntax_Stmt *node, void *fmt_userdata, void *err_userdata, unsigned indent_level);
HC_FUNC_DECORATOR void hcPrintDecl (HC_Syntax_Decl *decl, void *fmt_userdata, void *err_userdata, unsigned indent_level);
HC_FUNC_DECORATOR
void hcPrintToken (HC_Token tok, void *fmt_userdata)
{
HC_Token_Kind kind = hcTokenKind(tok);
switch ((int)kind) {
case HC_Token_Kind_EQUALITY: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "=="); break;
case HC_Token_Kind_NOTEQUAL: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "!="); break;
case HC_Token_Kind_LESSEQ: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "<="); break;
case HC_Token_Kind_GREATEQ: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ">="); break;
case HC_Token_Kind_LOGICAL_OR: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "||"); break;
case HC_Token_Kind_LOGICAL_AND: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "&&"); break;
case HC_Token_Kind_SHIFT_LEFT: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "<<"); break;
case HC_Token_Kind_SHIFT_RIGHT: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ">>"); break;
case HC_Token_Kind_ADD_ASSIGN: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "+="); break;
case HC_Token_Kind_SUB_ASSIGN: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "-="); break;
case HC_Token_Kind_MULTIPLY_ASSIGN: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "*="); break;
case HC_Token_Kind_DIVIDE_ASSIGN: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "/="); break;
case HC_Token_Kind_MODULUS_ASSIGN: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "%%="); break;
case HC_Token_Kind_XOR_ASSIGN: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "^="); break;
case HC_Token_Kind_OR_ASSIGN: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "|="); break;
case HC_Token_Kind_AND_ASSIGN: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "&="); break;
case HC_Token_Kind_SHIFT_LEFT_ASSIGN: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "<<="); break;
case HC_Token_Kind_SHIFT_RIGHT_ASSIGN: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ">>="); break;
case HC_Token_Kind_INCREMENT: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "++"); break;
case HC_Token_Kind_DECREMENT: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "--"); break;
case HC_Token_Kind_PTR_DEREF: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "->"); break;
case HC_Token_Kind_ELLIPSIS: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "..."); break;
case HC_Token_Kind_ATTRIBUTE_NAMESPACE: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "::"); break;
case HC_Token_Kind_ATTRIBUTE_BEGIN: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "[["); break;
case HC_Token_Kind_ATTRIBUTE_END: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "]]"); break;
case HC_Token_Kind_CONCATENATE: HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "##"); break;
default: {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, " %s ", tok.str);
} break;
}
}
HC_FUNC_DECORATOR
void hcPrintAddIndentation (void *fmt_userdata, unsigned indent_level)
{
for (unsigned i = 0; i < indent_level; i++) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "\t");
}
}
HC_FUNC_DECORATOR
void hcPrintAttributeRecursive (HC_Syntax_Attr *attr, void *fmt_userdata, unsigned indent_level)
{
if (!hcTokenIsKind(attr->namespace, HC_Token_Kind_EOF)) {
hcPrintToken(attr->namespace, fmt_userdata);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "::");
}
if (!hcTokenIsKind(attr->name, HC_Token_Kind_EOF)) {
hcPrintToken(attr->name, fmt_userdata);
}
if (attr->child) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
for (HC_Syntax_Attr *ac = attr->child; ac != NULL; ac = ac->next) {
hcPrintAttributeRecursive(ac, fmt_userdata, indent_level);
if (ac->next != NULL) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, " ");
}
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ") ");
}
}
HC_FUNC_DECORATOR
void hcPrintAttribute (HC_Syntax_Attr *attr, void *fmt_userdata, unsigned indent_level)
{
if (attr == NULL) return;
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, " [[");
for (HC_Syntax_Attr *a = attr; a != NULL; a = a->next) {
hcPrintAttributeRecursive(a, fmt_userdata, indent_level);
if (a->next != NULL) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ", ");
}
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "]] ");
}
HC_FUNC_DECORATOR void hcPrintTypeTopDown (HC_Syntax_Type *type, void *fmt_userdata, void *err_userdata, unsigned indent_level);
HC_FUNC_DECORATOR void hcPrintExpr (HC_Syntax_Expr *expr, void *fmt_userdata, void *err_userdata, unsigned indent_level);
HC_FUNC_DECORATOR
HC_BOOL hcPrintTypeBottomUp (HC_Syntax_Type *type, void *fmt_userdata, void *err_userdata, unsigned indent_level)
{
if (type == NULL) return HC_FALSE;
switch (type->kind) {
case HC_Syntax_Type_Kind_BASE: {
if (type->attr) hcPrintAttribute(type->attr, fmt_userdata, indent_level);
if (type->specifier.s_signed) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "signed ");
if (type->specifier.s_unsigned) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "unsigned ");
if (type->qualifier.q_const) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "const ");
if (type->qualifier.q_restrict) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "restrict ");
if (type->qualifier.q_volatile) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "volatile ");
hcPrintToken (type->name, fmt_userdata);
return HC_FALSE;
} break;
case HC_Syntax_Type_Kind_PTR: {
if (hcPrintTypeBottomUp(type->array_ptr.deref, fmt_userdata, err_userdata, indent_level)) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
}
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "*");
} break;
case HC_Syntax_Type_Kind_ARRAY: {
if (hcPrintTypeBottomUp(type->array_ptr.deref, fmt_userdata, err_userdata, indent_level)) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
}
} break;
case HC_Syntax_Type_Kind_FUNC: {
if (hcPrintTypeBottomUp(type->function.return_type, fmt_userdata, err_userdata, indent_level)) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
}
} break;
case HC_Syntax_Type_Kind_STRUCT: {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "struct");
if (type->attr) hcPrintAttribute(type->attr, fmt_userdata, indent_level);
if (type->name.length) hcPrintToken (type->name, fmt_userdata);
if (type->struct_union.members) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, " {\n");
for (HC_Syntax_Decl *d = type->struct_union.members; d != NULL; d = d->next) {
hcPrintDecl(d, fmt_userdata, err_userdata, indent_level+1);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ";\n");
}
hcPrintAddIndentation(fmt_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "}");
}
} break;
case HC_Syntax_Type_Kind_UNION: {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "union");
if (type->attr) hcPrintAttribute(type->attr, fmt_userdata, indent_level);
if (type->name.length) hcPrintToken (type->name, fmt_userdata);
if (type->struct_union.members) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, " {\n");
for (HC_Syntax_Decl *d = type->struct_union.members; d != NULL; d = d->next) {
hcPrintDecl(d, fmt_userdata, err_userdata, indent_level+1);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ";\n");
}
hcPrintAddIndentation(fmt_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "}");
}
} break;
case HC_Syntax_Type_Kind_ENUM: {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "enum");
if (type->attr) hcPrintAttribute(type->attr, fmt_userdata, indent_level);
if (type->name.length) hcPrintToken (type->name, fmt_userdata);
if (type->enumeration.type) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, " : ");
hcPrintTypeBottomUp(type->enumeration.type, fmt_userdata, err_userdata, indent_level);
hcPrintTypeTopDown(type->enumeration.type, fmt_userdata, err_userdata, indent_level);
}
if (type->enumeration.enumerators) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, " {\n");
for (HC_Syntax_Enumerator *d = type->enumeration.enumerators; d != NULL; d = d->next) {
hcPrintAddIndentation(fmt_userdata, indent_level + 1);
hcPrintToken (d->name, fmt_userdata);
if (d->value) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, " = ");
hcPrintExpr(d->value, fmt_userdata, err_userdata, 0);
}
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ",\n");
}
hcPrintAddIndentation(fmt_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "}");
}
} break;
default: {
HC_ERROR_PRINTF("Unable to print unknown type");
HC_ERROR_EXIT(err_userdata, -1);
} break;
}
return HC_TRUE;
}
HC_FUNC_DECORATOR
void hcPrintTypeTopDown (HC_Syntax_Type *type, void *fmt_userdata, void *err_userdata, unsigned indent_level)
{
if (type == NULL) return;
switch (type->kind) {
case HC_Syntax_Type_Kind_BASE: break;
case HC_Syntax_Type_Kind_PTR: {
if (type->array_ptr.deref->kind != HC_Syntax_Type_Kind_BASE) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
hcPrintTypeTopDown(type->array_ptr.deref, fmt_userdata, err_userdata, indent_level);
} break;
case HC_Syntax_Type_Kind_ARRAY: {
if (type->array_ptr.deref->kind != HC_Syntax_Type_Kind_BASE) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "[");
hcPrintExpr(type->array_ptr.elems, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "]");
hcPrintTypeTopDown(type->array_ptr.deref, fmt_userdata, err_userdata, indent_level);
} break;
case HC_Syntax_Type_Kind_FUNC: {
if (type->function.return_type->kind != HC_Syntax_Type_Kind_BASE) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
for (HC_Syntax_Decl *d = type->function.args; d != NULL; d = d->next) {
hcPrintDecl(d, fmt_userdata, err_userdata, indent_level);
if (d->next != NULL) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ", ");
}
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
hcPrintTypeTopDown(type->function.return_type, fmt_userdata, err_userdata, indent_level);
} break;
case HC_Syntax_Type_Kind_STRUCT:
case HC_Syntax_Type_Kind_UNION:
case HC_Syntax_Type_Kind_ENUM:
break;
default: {
HC_ERROR_PRINTF("Unable to print unknown type");
HC_ERROR_EXIT(err_userdata, -1);
} break;
}
}
HC_FUNC_DECORATOR
void hcPrintDecl (HC_Syntax_Decl *decl, void *fmt_userdata, void *err_userdata, unsigned indent_level)
{
hcPrintAddIndentation(fmt_userdata, indent_level);
if (decl->spec.storage.s_auto) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "auto ");
if (decl->spec.storage.s_constexpr) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "constexpr ");
if (decl->spec.storage.s_extern) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "extern ");
if (decl->spec.storage.s_register) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "register ");
if (decl->spec.storage.s_static) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "static ");
if (decl->spec.storage.s_thread_local) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "thread_local ");
if (decl->spec.function.f_inline) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "inline ");
if (decl->spec.function.f_noreturn) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "noreturn ");
hcPrintTypeBottomUp(decl->type, fmt_userdata, err_userdata, indent_level);
if (decl->name.length) hcPrintToken(decl->name, fmt_userdata);
hcPrintAttribute(decl->attr, fmt_userdata, 0);
hcPrintTypeTopDown(decl->type, fmt_userdata, err_userdata, indent_level);
}
HC_FUNC_DECORATOR
void hcPrintExpr (HC_Syntax_Expr *expr, void *fmt_userdata, void *err_userdata, unsigned indent_level)
{
if (expr == NULL) return;
switch (expr->kind) {
case HC_Syntax_Expr_Kind_LITERAL: {
hcPrintToken(expr->literal.token, fmt_userdata);
} break;
case HC_Syntax_Expr_Kind_INFIX: {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
hcPrintExpr(expr->infix.lhs, fmt_userdata, err_userdata, indent_level);
hcPrintToken(expr->infix.op, fmt_userdata);
hcPrintExpr(expr->infix.rhs, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
} break;
case HC_Syntax_Expr_Kind_PREFIX: {
hcPrintToken(expr->prefix.op, fmt_userdata);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
hcPrintExpr(expr->prefix.rhs, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
} break;
case HC_Syntax_Expr_Kind_POSTFIX: {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
hcPrintExpr(expr->postfix.lhs, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
hcPrintToken(expr->postfix.op, fmt_userdata);
} break;
case HC_Syntax_Expr_Kind_CALL: {
hcPrintExpr(expr->call.func, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
HC_BOOL first_arg = HC_TRUE;
for (HC_Syntax_Expr *n = expr->call.first_arg; n != NULL; n = n->next_sibling) {
if (!first_arg) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, " ");
}
first_arg = HC_FALSE;
hcPrintExpr(n, fmt_userdata, err_userdata, indent_level);
if (n->next_sibling) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ",");
}
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
} break;
case HC_Syntax_Expr_Kind_INDEX: {
hcPrintExpr(expr->index.array, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "[");
hcPrintExpr(expr->index.index, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "]");
} break;
case HC_Syntax_Expr_Kind_TERNARY: {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
hcPrintExpr(expr->ternary.cond, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "?");
hcPrintExpr(expr->ternary.then, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ":");
hcPrintExpr(expr->ternary.or_else, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
} break;
case HC_Syntax_Expr_Kind_DESIG_INIT: {
if (expr->desig_init.type) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
hcPrintTypeBottomUp(expr->desig_init.type, fmt_userdata, err_userdata, indent_level);
hcPrintTypeTopDown(expr->desig_init.type, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
}
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "{\n");
for (HC_Syntax_Expr *di = expr->desig_init.first_init; di != NULL; di = di->next_sibling) {
hcPrintAddIndentation(fmt_userdata, indent_level + 1);
for (HC_Syntax_Dsig *d = di->desig_init.desig; d != NULL; d = d->next) {
if (!hcTokenIsKind(d->dotted, HC_Token_Kind_EOF)) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ".");
hcPrintToken (d->dotted, fmt_userdata);
} else if (d->indexed) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "[");
hcPrintExpr(d->indexed, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "]");
}
}
if (di->desig_init.desig) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, " = ");
}
hcPrintExpr(di->desig_init.init, fmt_userdata, err_userdata, indent_level + 1);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ",\n");
}
hcPrintAddIndentation(fmt_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "}");
} break;
case HC_Syntax_Expr_Kind_TYPE_CAST: {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
hcPrintTypeBottomUp(expr->type_cast.type, fmt_userdata, err_userdata, indent_level);
hcPrintTypeTopDown(expr->type_cast.type, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
hcPrintExpr(expr->type_cast.expr, fmt_userdata, err_userdata, indent_level);
} break;
case HC_Syntax_Expr_Kind_SIZE_OF: {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "sizeof(");
if (expr->size_of.arg_type) {
hcPrintTypeBottomUp(expr->size_of.arg_type, fmt_userdata, err_userdata, indent_level);
hcPrintTypeTopDown(expr->size_of.arg_type, fmt_userdata, err_userdata, indent_level);
} else {
hcPrintExpr(expr->size_of.arg_expr, fmt_userdata, err_userdata, indent_level);
}
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
} break;
default: {
HC_ERROR_PRINTF("Unable to print unknown syntax expr");
HC_ERROR_EXIT(err_userdata, -1);
} break;
}
}
HC_FUNC_DECORATOR
void hcPrintStmt (HC_Syntax_Stmt *stmt, void *fmt_userdata, void *err_userdata, unsigned indent_level)
{
if (stmt == NULL) return;
if (stmt->attribute) {
hcPrintAddIndentation(fmt_userdata, indent_level);
hcPrintAttribute(stmt->attribute, fmt_userdata, indent_level);
}
switch (stmt->kind) {
case HC_Syntax_Stmt_Kind_TOP_LEVEL: {
for (HC_Syntax_Stmt *n = stmt->top_level.first_stmt; n != NULL; n = n->next_sibling) {
hcPrintStmt(n, fmt_userdata, err_userdata, indent_level);
}
} break;
case HC_Syntax_Stmt_Kind_STMT_EMPTY: {
} break;
case HC_Syntax_Stmt_Kind_STMT_BLOCK: {
hcPrintAddIndentation(fmt_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "{\n");
for (HC_Syntax_Stmt *n = stmt->stmt_block.first_stmt; n != NULL; n = n->next_sibling) {
hcPrintStmt(n, fmt_userdata, err_userdata, indent_level + 1);
}
hcPrintAddIndentation(fmt_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "}\n");
} break;
case HC_Syntax_Stmt_Kind_STMT_RETURN: {
hcPrintAddIndentation(fmt_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "return ");
hcPrintExpr(stmt->stmt_return.expr, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ";\n");
} break;
case HC_Syntax_Stmt_Kind_STMT_IF: {
hcPrintAddIndentation(fmt_userdata, indent_level);
HC_BOOL first_branch = HC_TRUE;
for (HC_Syntax_Stmt *n = stmt->stmt_if.first_branch; n != NULL; n = n->next_sibling) {
if (!first_branch) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "else ");
if (n->stmt_if.cond) HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "if ");
if (n->stmt_if.cond) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "(");
hcPrintExpr(n->stmt_if.cond, fmt_userdata, err_userdata, indent_level + 1);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")");
}
hcPrintStmt(n->stmt_if.body, fmt_userdata, err_userdata, indent_level + 1);
first_branch = HC_FALSE;
}
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "\n");
} break;
case HC_Syntax_Stmt_Kind_STMT_FOR: {
hcPrintAddIndentation(fmt_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "for (");
if (stmt->stmt_for.init_expr) {
hcPrintExpr(stmt->stmt_for.init_expr, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ";");
} else if (stmt->stmt_for.init_decl) {
hcPrintStmt(stmt->stmt_for.init_decl, fmt_userdata, err_userdata, indent_level);
} else {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ";");
}
hcPrintExpr(stmt->stmt_for.cond, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ";");
hcPrintExpr(stmt->stmt_for.updt, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")\n");
hcPrintStmt(stmt->stmt_for.body, fmt_userdata, err_userdata, indent_level + 1);
} break;
case HC_Syntax_Stmt_Kind_STMT_WHILE: {
hcPrintAddIndentation(fmt_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "while (");
hcPrintExpr(stmt->stmt_while.cond, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ")\n");
hcPrintStmt(stmt->stmt_while.body, fmt_userdata, err_userdata, indent_level + 1);
} break;
case HC_Syntax_Stmt_Kind_STMT_FUNDECL: {
hcPrintDecl(stmt->stmt_fundecl.decl, fmt_userdata, err_userdata, indent_level);
if (stmt->stmt_fundecl.body) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, " ");
hcPrintStmt(stmt->stmt_fundecl.body, fmt_userdata, err_userdata, indent_level);
} else {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ";\n");
}
} break;
case HC_Syntax_Stmt_Kind_STMT_VARDECL: {
for (HC_Syntax_Stmt *l = stmt->stmt_vardecl.first_var; l != NULL; l = l->next_sibling) {
hcPrintDecl(l->stmt_vardecl.decl, fmt_userdata, err_userdata, indent_level);
if (l->stmt_vardecl.expr) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, " = ");
hcPrintExpr(l->stmt_vardecl.expr, fmt_userdata, err_userdata, indent_level);
}
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ";\n");
}
} break;
case HC_Syntax_Stmt_Kind_STMT_EXPR: {
hcPrintAddIndentation(fmt_userdata, indent_level);
hcPrintExpr(stmt->stmt_expr.expr, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ";\n");
} break;
case HC_Syntax_Stmt_Kind_STMT_TYPEDEF: {
hcPrintAddIndentation(fmt_userdata, indent_level);
for (HC_Syntax_Decl *d = stmt->stmt_typedef.first_def; d != NULL; d = d->next) {
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, "typedef\n");
hcPrintDecl(d, fmt_userdata, err_userdata, indent_level);
HC_SOURCE_OUTPUR_PRINTF(fmt_userdata, ";\n");
}
} break;
default: {
HC_ERROR_PRINTF("Unable to print syntax stmt");
HC_ERROR_EXIT(err_userdata, -1);
} break;
}
}
HC_FUNC_DECORATOR
void hcPrintSyntaxTree (HC_Syntax_Tree syntax_tree, void *fmt_userdata, void *err_userdata)
{
hcPrintStmt(syntax_tree.top_level, fmt_userdata, err_userdata, 0);
}
#endif // HYPERC_IMPLEMENTATION
//////////////////////////////////////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////////////////////////////////////
#if 0 || defined(HYPERC_TEST) /* TEST CODE */
#include "std.h" // https://gist.github.com/namandixit/22d61e7e416f7e4637730d3e5ff2a479
internal_function
SDL_PRINTF_VARARG_FUNC(2)
void SDL_strformat (SDL_strformat_Data *data, SDL_PRINTF_FORMAT_STRING char *fmt, ...)
{
SDL_IOStream *io = SDL_IOFromDynamicMem();
va_list args;
va_start(args, fmt);
if (data->str) {
SDL_IOprintf(io, "%s", data->str);
SDL_free(data->str);
}
SDL_IOvprintf(io, fmt, args);
va_end(args);
SDL_WriteU8(io, '\0');
SDL_PropertiesID sbp = SDL_GetIOProperties(io);
char *out = SDL_GetPointerProperty(sbp, SDL_PROP_IOSTREAM_DYNAMIC_MEMORY_POINTER, NULL);
if (out == NULL) {
SDL_assert_release(false && "Couldn't format string");
}
SDL_SetPointerProperty(sbp, SDL_PROP_IOSTREAM_DYNAMIC_MEMORY_POINTER, NULL); // This transfers the ownership of out to the app
SDL_CloseIO(io);
data->str = out;
}
internal_function
void test (Memory_Allocator_Interface *miface, Char *src, Sint expected_result, Bool print_out)
{
if (print_out) {
SDL_Log("Source: %s", src);
}
HC_Parser *parser = hcParserCreate(src, miface, nullptr);
HC_Syntax_Tree parsed_tree = hcParse(parser);
SDL_strformat_Data fmtdat = {0};
hcPrintSyntaxTree(parsed_tree, &fmtdat, nullptr);
Char *out = fmtdat.str;
if (print_out) {
SDL_Log("Output:\n%s", out);
}
Char *outfilename = "artifacts/hyperc.test.c";
Char *outexename = "artifacts/hyperc.test.exe";
SDL_IOStream *outfileio = SDL_IOFromFile(outfilename, "w");
SDL_IOprintf(outfileio, "%s", out);
SDL_CloseIO(outfileio);
{
Char *outargs[] = {
"clang", "--std=c23",
"-Weverything", "-Wpedantic",
"-Wno-unused-value", "-Wno-unreachable-code", "-Wno-unreachable-code-return",
"-Wno-missing-prototypes", "-Wno-pre-c++17-compat", "-Wno-c99-compat",
"-Wno-gnu-binary-literal", "-Wno-sign-conversion", "-Wno-shorten-64-to-32",
"-Wno-pre-c23-compat", "-Wno-unknown-attributes", "-Wno-unused-variable",
"-Wno-unsafe-buffer-usage", "-Wno-redundant-parens", "-Wno-declaration-after-statement",
"-Wno-shadow",
outfilename, "-o", outexename, nullptr,
};
SDL_PropertiesID props = SDL_CreateProperties();
SDL_SetPointerProperty(props, SDL_PROP_PROCESS_CREATE_ARGS_POINTER, outargs);
SDL_SetBooleanProperty(props, SDL_PROP_PROCESS_CREATE_STDERR_TO_STDOUT_BOOLEAN, true);
SDL_Process *process = SDL_CreateProcessWithProperties(props);
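if (process == NULL) {
// Process creation itself failed, so the compiler never ran
SDL_Log("TEST: %s: Could not start compiler: %s", src, SDL_GetError());
SDL_DestroyProperties(props);
SDL_RemovePath(outfilename);
SDL_free(out); // Don't leak the formatted output on the early return
return;
}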
SDL_WaitProcess(process, true, nullptr);
SDL_DestroyProcess(process);
SDL_DestroyProperties(props);
SDL_RemovePath(outfilename);
}
{
Char *runargs[] = {
outexename, nullptr,
};
SDL_PropertiesID props = SDL_CreateProperties();
SDL_SetPointerProperty(props, SDL_PROP_PROCESS_CREATE_ARGS_POINTER, runargs);
SDL_SetBooleanProperty(props, SDL_PROP_PROCESS_CREATE_STDERR_TO_STDOUT_BOOLEAN, true);
SDL_Process *process = SDL_CreateProcessWithProperties(props);
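if (process == NULL) {
// Usually means the compile step above did not produce an executable
SDL_Log("TEST: %s: Running failed: %s", src, SDL_GetError());
SDL_DestroyProperties(props);
SDL_RemovePath(outexename);
SDL_free(out); // Don't leak the formatted output on the early return
return;
}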
int exitcode = 0;
SDL_WaitProcess(process, true, &exitcode);
SDL_DestroyProcess(process);
SDL_DestroyProperties(props);
SDL_RemovePath(outexename);
if (exitcode != expected_result) {
SDL_Log("=======XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxX========\nTEST: FAILURE %d (Got %d) = %s\n=======XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxX========", expected_result, exitcode, src);
} else {
SDL_Log("TEST: success %d = %s", expected_result, src);
}
}
SDL_free(out);
}
internal_function
MEMORY_ALLOCATOR_REQUEST_FUNC(sdlMalloc)
{
unused_variable(userdata);
void *data = SDL_calloc(1, amount);
return data;
}
internal_function
MEMORY_ALLOCATOR_RESCIND_FUNC(sdlFree)
{
unused_variable(userdata);
SDL_free(ptr);
}
Sint main (Sint argc, Char *argv[])
{
unused_variable(argc);
unused_variable(argv);
Memory_Allocator_Interface miface = memICreate(nullptr, &sdlMalloc, &sdlFree);
test(&miface, "int main (void) {return 0;}", 0, false);
test(&miface, "int main (void) {return 42;}", 42, false);
test(&miface, "int main (void) {return 4+20;}", 24, false);
test(&miface, "int main (void) {return 5+20-4;}", 21, false);
test(&miface, " int main ( void ) { return 12 + 34 - 5 ; } ", 41, false);
test(&miface, "int main (void) {return 5+6*7;}", 47, false);
test(&miface, "int main (void) {return 5*(9-6);}", 15, false);
test(&miface, "int main (void) {return (3+5)/2;}", 4, false);
test(&miface, "int main (void) {return -10+20;}", 10, false);
test(&miface, "int main (void) {return - -10;}", 10, false);
test(&miface, "int main (void) {return - - +10;}", 10, false);
test(&miface, "int main (void) {return 0==1;}", 0, false);
test(&miface, "int main (void) {return 42==42;}", 1, false);
test(&miface, "int main (void) {return 0!=1;}", 1, false);
test(&miface, "int main (void) {return 42!=42;}", 0, false);
test(&miface, "int main (void) {return 0<1;}", 1, false);
test(&miface, "int main (void) {return 1<1;}", 0, false);
test(&miface, "int main (void) {return 2<1;}", 0, false);
test(&miface, "int main (void) {return 0<=1;}", 1, false);
test(&miface, "int main (void) {return 1<=1;}", 1, false);
test(&miface, "int main (void) {return 2<=1;}", 0, false);
test(&miface, "int main (void) {return 1>0;}", 1, false);
test(&miface, "int main (void) {return 1>1;}", 0, false);
test(&miface, "int main (void) {return 1>2;}", 0, false);
test(&miface, "int main (void) {return 1>=0;}", 1, false);
test(&miface, "int main (void) {return 1>=1;}", 1, false);
test(&miface, "int main (void) {return 1>=2;}", 0, false);
test(&miface, "int main (void) {int a=3; return a;}", 3, false);
test(&miface, "int main (void) {int a=3; int z=5; return a+z;}", 8, false);
test(&miface, "int main (void) {int a; int b; a=b=3; return a+b;}", 6, false);
test(&miface, "int main (void) {static int foo=3; return foo;}", 3, false);
test(&miface, "int main (void) {volatile int foo123=3; register int bar=5; return foo123+bar;}", 8, false);
test(&miface, "int main (void) {signed int foo123=3; unsigned int bar=5; return foo123+bar;}", 8, false);
test(&miface, "int main (void) {return 1; 2; 3;}", 1, false);
test(&miface, "int main (void) {1; return 2; 3;}", 2, false);
test(&miface, "int main (void) {1; 2; return 3;}", 3, false);
test(&miface, "int main (void) {0LU; 0L; return 3;}", 3, false);
test(&miface, "int main (void) {0UL; 0LL; return 3;}", 3, false);
test(&miface, "int main (void) {0LLU; 0Ull; return 3;}", 3, false);
test(&miface, "int main (void) {0l; 0ll; return 3;}", 3, false);
test(&miface, "int main (void) {0x0L; 0b0L; return 3;}", 3, false);
test(&miface, "int main (void) {2147483647; 0; return 3;}", 3, false);
test(&miface, "int main (void) {2147483648; 2; return 0xffffffff;}", -1, false);
test(&miface, "int main (void) {4294967295U; 4294967296U; return 3;}", 3, false);
test(&miface, "int main (void) {0b1111111111111111111111111111111111111111111111111111111111111111; return 3;}", 3, false);
test(&miface, "int main (void) {0x1ffffffff; return 3;}", 3, false);
test(&miface, "int main (void) {0.3F; 0.; return 3;}", 3, false);
test(&miface, "int main (void) {.0; 5.l; return 3;}", 3, false);
test(&miface, "int main (void) {2.0L; 0x.2p-3; return 3;}", 3, false);
test(&miface, "int main (void) {'a'; u8'b'; return 3;}", 3, false);
test(&miface, "int main (void) {u'c'; U'd'; return 3;}", 3, false);
test(&miface, "int main (void) {L'N'; '\\0'; return 3;}", 3, false);
test(&miface, "int main (void) {'\\n'; '\\x005f'; return 3;}", 3, false);
test(&miface, "int main (void) {\"a\"; u8\"b\"; return 3;}", 3, false);
test(&miface, "int main (void) {\"aa\"; u8\"bb\"; return 3;}", 3, false);
test(&miface, "int main (void) {u\"c\"; U\"d\"; return 3;}", 3, false);
test(&miface, "int main (void) {u\"cc\"; U\"dd\"; return 3;}", 3, false);
test(&miface, "int main (void) {L\"N\"; \"\\0\"; return 3;}", 3, false);
test(&miface, "int main (void) {L\"NN\"; \"\\0\\0\"; return 3;}", 3, false);
test(&miface, "int main (void) {\"\\n\"; \"\\x005f\"; return 3;}", 3, false);
test(&miface, "int main (void) {\"\\n\\n\"; \"\\x005f\\x005f\"; return 3;}", 3, false);
test(&miface, "int main (void) { {1; {2;} return 3;} }", 3, false);
test(&miface, "int main (void) { ;;; return 5; }", 5, false);
test(&miface, "int main (void) { ;;; 1; ;; {;;2;;;;} return 5; }", 5, false);
test(&miface, "int main (void) { if (0) return 2; return 3; }", 3, false);
test(&miface, "int main (void) { if (1-1) return 2; return 3; }", 3, false);
test(&miface, "int main (void) { if (1) return 2; return 3; }", 2, false);
test(&miface, "int main (void) { if (2-1) return 2; return 3; }", 2, false);
test(&miface, "int main (void) { if (0) { 1; 2; return 3; } else { return 4; } }", 4, false);
test(&miface, "int main (void) { if (1) { 1; 2; return 3; } else { return 4; } }", 3, false);
test(&miface, "int main (void) { int i=0; int j=0; for (i=0; i<=10; i=i+1) j=i+j; return j; }", 55, false);
test(&miface, "int main (void) { for (;;) {{{return 3;};};}; return 5; }", 3, false);
test(&miface, "int main (void) { int i=0; while(i<10) { i=i+1; } return i; }", 10, false);
test(&miface, "int main (void) { int i=0; while(i+1) { return i+3; } return i; }", 3, false);
test(&miface, "int main (void) { int x=3, y=5, z=7; return x+y-z; }", 1, false);
test(&miface, "int main (void) { int x=3; return *&x; }", 3, false);
test(&miface, "int main (void) { static int x=3, y, z=6; y = 2; return *&x+z+y; }", 11, false);
test(&miface, "int main (void) { long x=3; return *&x; }", 3, false);
test(&miface, "int main (void) { long long x=3; int y = *&x; return y; }", 3, false);
test(&miface, "int main (void) { int x=3; int *y=&x; int **z=&y; return **z; }", 3, false);
test(&miface, "int main (void) { int x=3; int *y=&x; *y=5; return x; }", 5, false);
test(&miface, "int ret3(void) { return 3; } int main(void) { return ret3(); }", 3, false);
test(&miface, "int ret5(void) { return 5; } int main(void) { return ret5(); }", 5, false);
test(&miface, "static inline int ret3(void) { return 3; } int main(void) { return ret3(); }", 3, false);
test(&miface, "int add(int x, int y) { return x+y; } int main(void) { return add(3, 5); }", 8, false);
test(&miface, "int sub(int x, int y) { return x-y; } int main(void) { return sub(5, 3); }", 2, false);
test(&miface, "int add6(int a, int b, int c, int d, int e, int f) { return a+b+c+d+e+f; } int main(void) { return add6(1,2,3,4,5,6); }", 21, false);
test(&miface, "int add6(int a, int b, int c, int d, int e, int f) { return a+b+c+d+e+f; } int main(void) { return add6(1,2,add6(3,4,5,6,7,8),9,10,11); }", 66, false);
test(&miface, "int add6(int a, int b, int c, int d, int e, int f) { return a+b+c+d+e+f; } int main(void) { return add6(1,2,add6(3,add6(4,5,6,7,8,9),10,11,12,13),14,15,16); }", 136, false);
test(&miface, "int main (void) { int x = 3; int *d = &x; int **dd = &d; return **dd; }", 3, false);
test(&miface, "[[hyperc::hello(bye((no)) go(away((please)))), hyperc::alvida(la la la la)]] int main(void) { return 3; }", 3, false);
test(&miface, "int main (void) { struct {int a; int b;} x; x.a=1; x.b=2; return x.a; }", 1, false);
test(&miface, "int main (void) { struct {int a; int b;} x; x.a=1; x.b=2; return x.b; }", 2, false);
test(&miface, "int main (void) { struct {char a; int b; char c;} x; x.a=1; x.b=2; x.c=3; return x.a; }", 1, false);
test(&miface, "int main (void) { struct {char a; int b; char c;} x; x.b=1; x.b=2; x.c=3; return x.b; }", 2, false);
test(&miface, "int main (void) { struct {char a; int b; char c;} x; x.a=1; x.b=2; x.c=3; return x.c; }", 3, false);
test(&miface, "int main (void) { struct {char a; char b;} x[3]; char *p=(char*)x; p[0]=0; return x[0].a; }", 0, false);
test(&miface, "int main (void) { struct {char a; char b;} x[3]; char *p=(char*)x; p[1]=1; return x[0].b; }", 1, false);
test(&miface, "int main (void) { struct {char a; char b;} x[3]; char *p=(char*)x; p[2]=2; return x[1].a; }", 2, false);
test(&miface, "int main (void) { struct {char a; char b;} x[3]; char *p=(char*)x; p[3]=3; return x[1].b; }", 3, false);
test(&miface, "int main (void) { struct {char a[3]; char b[5];} x; char *p=(char*)&x; x.a[0]=6; return p[0]; }", 6, false);
test(&miface, "int main (void) { struct {char a[3]; char b[5];} x; char *p=(char*)&x; x.b[0]=7; return p[3]; }", 7, false);
test(&miface, "int main (void) { struct { struct { char b; } a; } x; x.a.b=6; return x.a.b; }", 6, false);
test(&miface, "int main (void) { struct {int a;} x; return sizeof(x); }", 4, false);
test(&miface, "int main (void) { struct {int a; int b;} x; return sizeof(x); }", 8, false);
test(&miface, "int main (void) { struct {int a, b;} x; return sizeof(x); }", 8, false);
test(&miface, "int main (void) { struct {int a[3];} x; return sizeof(x); }", 12, false);
test(&miface, "int main (void) { struct {int a;} x[4]; return sizeof(x); }", 16, false);
test(&miface, "int main (void) { struct {int a[3];} x[2]; return sizeof(x); }", 24, false);
test(&miface, "int main (void) { struct {char a; char b;} x; return sizeof(x); }", 2, false);
test(&miface, "int main (void) { struct {char a; int b;} x; return sizeof(x); }", 8, false);
test(&miface, "int main (void) { struct {int a; char b;} x; return sizeof(x); }", 8, false);
test(&miface, "int main (void) { struct t {int a; int b;} x; struct t y; return sizeof(y); }", 8, false);
test(&miface, "int main (void) { struct t {int a; int b;}; struct t y; return sizeof(y); }", 8, false);
test(&miface, "int main (void) { struct t {char a[2];}; { struct t {char a[4];}; } struct t y; return sizeof(y); }", 2, false);
test(&miface, "int main (void) { struct t {int x;}; int t=1; struct t y; y.x=2; return t+y.x; }", 3, false);
test(&miface, "int main (void) { struct t {char a;} x; struct t *y = &x; x.a=3; return y->a; }", 3, false);
test(&miface, "int main (void) { struct t {char a;} x; struct t *y = &x; y->a=3; return x.a; }", 3, false);
test(&miface, "int main (void) { struct t {int a,b;}; struct t x; x.a=7; struct t y; struct t *z=&y; *z=x; return y.a; }", 7, false);
test(&miface, "int main (void) { struct t {int a,b;}; struct t x; x.a=7; struct t y, *p=&x, *q=&y; *q=*p; return y.a; }", 7, false);
test(&miface, "int main (void) { struct t {int a,b;}; struct t x; x.a=7; struct t y; struct t *z=&y; *z=x; return y.a; }", 7, false);
test(&miface, "int main (void) { struct t {int a,b;}; struct t x; x.a=7; struct t y, *p=&x, *q=&y; *q=*p; return y.a; }", 7, false);
test(&miface, "int main (void) { struct t {int a; int b;} x; struct t y; return sizeof(y); }", 8, false);
test(&miface, "int main (void) { struct t {int a; int b;}; struct t y; return sizeof(y); }", 8, false);
test(&miface, "int main (void) { struct {char a; long long b;} x; return sizeof(x); }", 16, false);
test(&miface, "int main (void) { struct {char a; short b;} x; return sizeof(x); }", 4, false);
test(&miface, "int main (void) { struct foo *bar; return sizeof(bar); }", 8, false);
test(&miface, "int main (void) { struct T *foo; struct T {int x;}; return sizeof(struct T); }", 4, false);
test(&miface, "int main (void) { struct T { struct T *next; int x; } a; struct T b; b.x=1; a.next=&b; return a.next->x; }", 1, false);
test(&miface, "int main (void) { typedef struct T T; struct T { int x; }; return sizeof(T); }", 4, false);
test(&miface, "int main (void) { typedef struct T T; struct T { int x; }; T t; t.x = 5; return t.x; }", 5, false);
test(&miface, "int main (void) { int x; { typedef int T; T t = 9; x = t; } return x; }", 9, false);
test(&miface, "int main (void) { typedef struct T { int x; } T; { typedef int T; T t = 3; } T t; t.x = 5; return t.x; }", 5, false);
test(&miface, "int main (void) { typedef struct T { struct { int x; } X; } T; T t; t.X.x = 5; return t.X.x; }", 5, false);
test(&miface, "int main (void) { int x[3] = {[2] = 4, [0] = 6, [1] = 8}; return x[1]; }", 8, false);
test(&miface, "int main (void) { typedef struct T T; struct T { int x; int y[1]; int z[1]; }; T t = {.x = 5, .y[0] = 6, .z = {7,} }; return t.x+t.y[0]+t.z[0]; }", 18, false);
test(&miface, "int main (void) { typedef struct T T; struct T { int x; int y[1]; int z[1]; }; T t; t = (T){.x = 5, .y[0] = 6, .z = {7,} }; return t.x+t.y[0]+t.z[0]; }", 18, false);
test(&miface, "int main (void) { typedef struct T { struct T2 { int x, y; } t2; struct T3 { int z; } t3; } T; T t = {.t2 = { .x = 5, .y = 6 }, .t3 = {.z = 7}}; return t.t2.x+t.t2.y+t.t3.z; }", 18, false);
test(&miface, "int main (void) { typedef union T T; union T { int x; int y[1]; }; T t; t = (T){.x = 5}; return t.y[0]; }", 5, false);
test(&miface, "int main (void) { typedef enum T { T_a = 3, T_b } T; T t; t = T_a; return T_b; }", 4, false);
test(&miface, "int main (void) { int x = 1; return x ? 42 : 69; }", 42, false);
test(&miface, "int main (void) { /* return 0; */ return 1; }", 1, false);
test(&miface, "int main (void) { // return 0;\n return 1; }", 1, false);
test(&miface, "int main (void) { // return \\\nreturn 0\n return 1; }", 1, false);
test(&miface, "#include <stdio.h>\nint main (void) { return 1; }", 1, false);
test(&miface, "#define A(x) (x+\\\nx)\n int main (void) { return 1; }", 1, false);
return 0;
}
#endif
@namandixit
Author

Copying the comment by Chris Wellons on Reddit here for future reference. Reddit shadow-banned my account (perhaps due to being too new?) and I don't want to lose this very helpful set of instructions (by the way, thanks @skeeto! I thanked you on Reddit but the comment never went through).


Neat project! I like the custom allocator interface. The HC_ERROR_* functions ought to take userdata pointers, too.

no exhaustive testing or fuzzing.

Here's an AFL++ fuzz tester:

#include <setjmp.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static jmp_buf fail;

#define HC_ERROR_EXIT(_)    longjmp(fail, 1)
#define HC_MALLOC(_, z)     malloc(z)
#define HC_SOURCE_OUTPUR_PRINTF(...)
#define HYPERC_IMPLEMENTATION
#include "hyperc.h"

__AFL_FUZZ_INIT();

int main(void)
{
   __AFL_INIT();
   char *src = 0;
   unsigned char *buf = __AFL_FUZZ_TESTCASE_BUF;
   while (__AFL_LOOP(10000)) {
       int len = __AFL_FUZZ_TESTCASE_LEN;
       src = realloc(src, len+1);
       memcpy(src, buf, len);
       src[len] = 0;
       if (!setjmp(fail)) {
           hcParse(hcParserCreate(src, 0));
       }
   }
}

It's too bad that HC_ERROR_EXIT doesn't take a userdata pointer, so that the jmp_buf wouldn't need to be a global. Usage:

$ afl-gcc-fast -g3 -fsanitize=address,undefined fuzz.c
$ mkdir i
$ echo 'int main(void){}' > i/main.c
$ afl-fuzz -i i -o o ./a.out

The fuzzer immediately finds this one:

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#define HC_MALLOC(_, z) malloc(z)
#define HC_SOURCE_OUTPUR_PRINTF(...)
#define HYPERC_IMPLEMENTATION
#include "hyperc.h"

int main(void)
{
   hcParserCreate("#", 0);
}

Then:

$ cc -g3 -fsanitize=address,undefined crash.c
$ ./a.out
ERROR: AddressSanitizer: global-buffer-overflow on address ...
READ of size 1 at ...
   #0 hcTokenizerEatWhitespace hyperc.h:755
   #1 hcTokenizerMoveForward hyperc.h:772
   #2 hcTokenizerMake hyperc.h:688
   #3 hcParserCreate hyperc.h:2122
   #4 main crash.c:11

Quick fix:

@@ -700,3 +700,3 @@ void hcTokenizercharAdvance (HC_Tokenizer *toker, uint32_t count)
{
-    for (size_t i = 0; i < count; i++) {
+    for (size_t i = 0; i < count && toker->src[toker->cursor]; i++) {
        if (toker->src[toker->cursor] == '\n') {
@@ -753,3 +753,3 @@ void hcTokenizerEatWhitespace (HC_Tokenizer *toker)
            ate_something_this_loop = HC_TRUE;
-            while (true) {
+            while (toker->src[toker->cursor]) {
                if (toker->src[toker->cursor] == '\n') {
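
The idea behind both hunks is that no loop may step past the terminating NUL. A self-contained sketch of that rule is below, assuming a cut-down tokenizer with only the `src` and `cursor` fields seen in the patch. `Toy_Tokenizer`, `toy_advance` and `toy_eat_whitespace` are made-up names, and the real functions do more (line counting, comments):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down tokenizer: only the two fields the patch touches. */
typedef struct Toy_Tokenizer {
    const char *src;     /* NUL-terminated source text */
    size_t      cursor;  /* index of the next unread character */
} Toy_Tokenizer;

/* Advance by up to `count` characters, never stepping past the terminator. */
static void toy_advance(Toy_Tokenizer *t, size_t count)
{
    for (size_t i = 0; i < count && t->src[t->cursor] != '\0'; i++) {
        t->cursor++;
    }
}

/* Skip blanks and '#' lines (treated as comments), stopping at end of input. */
static void toy_eat_whitespace(Toy_Tokenizer *t)
{
    assert(t != NULL && t->src != NULL);
    for (;;) {
        char c = t->src[t->cursor];
        if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
            toy_advance(t, 1);
        } else if (c == '#') {
            /* Consume the rest of the line; the NUL check ends the loop
             * even when the directive is the last thing in the input. */
            while (t->src[t->cursor] != '\0' && t->src[t->cursor] != '\n') {
                toy_advance(t, 1);
            }
        } else {
            break;  /* start of a real token, or end of input */
        }
    }
}
```

With the input "#" the cursor stops at index 1, on the terminator, instead of running off the end of the string literal.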

Finding more crashes takes a while, and the fuzzer often hangs on infinite loops. The next crash it found was this:

hcParserCreate("0x", 0);

Then:

$ ./a.out 
ERROR: AddressSanitizer: global-buffer-overflow on address ...
READ of size 1 at ...
   #0 hcTokenizerMoveForward hyperc.h:802
   #1 hcTokenizerMake hyperc.h:688
   #2 hcParserCreate hyperc.h:2122

Thanks for sharing!

@skeeto

skeeto commented Mar 15, 2025

I'm glad you found my write-up useful enough to save!

Your account was indeed shadowbanned. Sorry to hear that. I've noticed a flood of false positive shadowbans in the subreddits I moderate the past couple weeks. It seems the admins dialed up some sensitivity knob way too far. A little trick: Add a period to the TLD to view the logged-out version of reddit while logged in, which lets you check your own shadowban status:

https://old.reddit.com./user/NamanDixitCodes

For information on appealing it, see r/ShadowBan.
