🤓 #3 - Nerding Out
Happy end of the month, everyone! Today, we’re gonna talk about fun things that can help you write some pretty snazzy C code, when you’re tired of formatting strings and other weird stuff. It will rely a tiny bit on vendor extensions (that might not be vendor extensions, soon!) so you can see the magic.
But before we get there…
⭐ CoSy Tech Con!
Most of you signed up to hear about CoSy! Progress is, of course, still being made:
- We are still reaching out for sponsors and connecting with a lot of organizations! If you’d like to sponsor us, send an e-mail to cosy@soasis.org! We have a little two-page primer you can have, and can talk to your organization more directly about benefits!
- We have a budget for how much Live Captioning should cost us, so we now know almost everything
- The official page that will also contain hallway tracks and other fun events will be at Hopin!
- We are going to be locking in training slots for people who have shown interest in training! If you’re still interested, please send an e-mail to inquiries@soasis.org!
- The stickers and other cool things are now being worked on by artists!
All in all, things are moving ahead steadily. We hope that by March and April, we can start announcing the various trainings that will be happening and show off the new submission system!
One final note is that we’ll also possibly be increasing the conference to 3 days, but taking less time each day. This might prevent people from needing to sit at their computers for too many hours (even if we give them breaks), and make the content accessible to a larger number of folks since it doesn’t span the whole working day. And with that talked about, we are now officially going to…
COMPLETELY Nerd Out, with C
We’ve all seen it, before:
printf("%s", some_string);
printf. Friend of the debugger-less, revealing muse of ordering problems, champion of the 12 AM beleaguered student. Also, way too verbose and way too dangerous. Consider, for example, what happens when the equivalent printf("%d", some_string); happens, or even printf("%s", some_integer);. On some compilers, you get warnings when you do this! For example, with GCC:
error: format '%d' expects argument of
type 'int', but argument 2 has
type 'char *' [-Werror=format=]
printf("%d", some_string);
^----------------
Most compilers implement checks for the format string, to make sure you’re matching the type of what you specified. But, it only works on string constants that the compiler is capable of reading! This works, but it’s… kind of frustrating! Wouldn’t it be better, if we had a way to know the type up front and just act on that? C doesn’t have templates like C++, but it turns out with a little magic, we can get somewhat close to it in C…
Can we divine the type of an expression given as an argument, and create variadic function calls that way?
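To make the "only works on string constants" part concrete, here is a minimal sketch. The names pick_format and format_number are made up for illustration. Because the format string is chosen at run time, the format checker has no constant to read, so a mismatched argument here would usually go unreported under default warning settings.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper: the format string is only known at run time,
// so the compiler has no string constant to check arguments against.
const char* pick_format(int zero_padded) {
    return zero_padded ? "%05d" : "%d";
}

// Formats `value` into `out` and returns the number of characters written.
// Nothing verifies that the chosen format really expects an int.
int format_number(char* out, size_t size, int zero_padded, int value) {
    assert(out != NULL && size > 0);
    return snprintf(out, size, pick_format(zero_padded), value);
}
```

Calling format_number(buf, sizeof buf, 1, 255) fills buf with 00255. If pick_format ever handed back "%s", the code would still compile and the mistake would only show up at run time.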
🔍 C features, A Closer Look
As said earlier, C doesn’t have templates. There’s no compile-time variadics, no type traits, nothing but our bare hands and some macros. So, we have to write this…
enum arg_type {
    arg_type_int,
    arg_type_double,
    arg_type_ptr_char,
    // ... and more...
    arg_type_shrug
};
This is our way of knowing what type we got. We’ll use it in a hot second, here. Next, we need something that will wrap each argument. Again, we don’t have templates here, so we’re going to use the one and only true type erasure we need: void*:
typedef struct arg_ {
    enum arg_type type;
    void* data;
} arg;
Now, upon seeing a pointer, many programmers would reflexively assume that we’re going to use some kind of allocation or other shenanigans to produce a pointer. But we don’t have to, thanks to this sweet feature…
Compound Literals
Compound Literals is a feature that does not exist in C++, but does in ISO Standard C. This is one of the ways where C code is not only different from C++, but far more powerful. Compound literals allow you to make an
“unnamed object initialized by the initializer list. If the compound literal occurs outside the body of a function, the object has static storage duration; otherwise, it has automatic storage duration associated with the enclosing block.”
— §6.5.2.5 Compound Literals, Semantics, Paragraph 5
What this means is that if you write this:
void foo () {
    {
        int* compound_literal_arr = (int[]){1, 2, 3, 4, 5};
        /* Lots of work */
        printf("%d", *(compound_literal_arr + 2)); // prints 3
        /* Lots more work */
        // compound_literal_arr is still alive here!
    }
    // compound_literal_arr is gone here :(
}
You get an object that lasts for the duration of compound_literal_arr’s scope (“automatic storage duration associated with the enclosing block”). This is incredibly useful, because it means we can hoist arguments into a higher lifetime that allows us to take pointers to them! In particular…
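If you want to poke at that lifetime rule yourself, here is a small sketch. The function names sum_ints and compound_literal_demo are invented for this example. It shows that a compound literal is a real, modifiable object you can point at and hand to other functions, with no allocation involved.

```c
#include <stddef.h>

// Sums `count` ints starting at `values`.
int sum_ints(const int* values, size_t count) {
    int total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

int compound_literal_demo(void) {
    // An unnamed int[5] with automatic storage duration: it stays
    // alive until the enclosing block (this function body) ends.
    int* numbers = (int[]){ 1, 2, 3, 4, 5 };
    numbers[2] = 30; // it is an ordinary object, so we can write to it

    // A single value can be hoisted the same way, giving us
    // something to take the address of.
    int* extra = &(int){ 8 };

    // 1 + 2 + 30 + 4 + 5 = 42, plus 8
    return sum_ints(numbers, 5) + *extra;
}
```

The one thing not to do is return numbers or extra from the function: the objects they point at stop existing when the block ends, just like any other local.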
A function that takes anything
We can use what we just learned to create arguments whose lifetimes we know are long enough for us to do things like print them out. So…
void process_arg(arg arg0) {
    switch (arg0.type) {
    case arg_type_int:
        {
            // points at a single integer
            int* value = (int*)arg0.data;
        }
        break;
    case arg_type_double:
        {
            // points at a single double
            double* value = (double*)arg0.data;
        }
        break;
    case arg_type_ptr_char:
        {
            // data points at a char*, which points at a character string
            char* value = *(char**)arg0.data;
        }
        break;
    /* and so on, and so forth... */
    default:
        break;
    }
}
int main () {
    // create a "arg" struct, pass it into the
    // function with the right type, and with the value we want
    int value = 1;
    process_arg((arg){ arg_type_int, (int[]){ value } });
    return 0;
}
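The process_arg above only fishes the values out. As a sketch of what "doing something" with them could look like, here is a hypothetical format_arg that writes each argument into a caller-provided buffer. The function name and the output formats are choices made for this example, not something from the original design.

```c
#include <assert.h>
#include <stdio.h>

enum arg_type {
    arg_type_int,
    arg_type_double,
    arg_type_ptr_char,
    arg_type_shrug
};

typedef struct arg_ {
    enum arg_type type;
    void* data;
} arg;

// Hypothetical variant of process_arg: writes a textual form of the
// argument into `out` and returns what snprintf returns.
int format_arg(char* out, size_t size, arg arg0) {
    assert(out != NULL && size > 0);
    switch (arg0.type) {
    case arg_type_int:
        // data points at a single int
        return snprintf(out, size, "%d", *(int*)arg0.data);
    case arg_type_double:
        // data points at a single double
        return snprintf(out, size, "%.2f", *(double*)arg0.data);
    case arg_type_ptr_char:
        // data points at a char*, which points at the string
        return snprintf(out, size, "%s", *(char**)arg0.data);
    default:
        return snprintf(out, size, "?");
    }
}
```

A call looks just like the one in main above, for example format_arg(buf, sizeof buf, (arg){ arg_type_double, (double[]){ 2.5 } }), which leaves 2.50 in buf. The format string now lives next to the type tag, so the two cannot drift apart at the call site.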
This obviously works, but:
- We are still spelling out the type
- We have to wrap it in that ugly-looking Compound Literal syntax
Can we abstract away some of this? Well, we can! Enter…
Enter _Generic
_Generic is a tool most C programmers shouldn’t be using. But, occasionally, it can come in handy. It’s a kind of expression: it takes a controlling expression followed by a list of TYPE : EXPRESSION associations, and picks the association whose type matches:
void process_arg(arg arg0);
int main () {
    int value = 1;
    enum arg_type type = _Generic(value,
        int: arg_type_int,
        double: arg_type_double,
        char*: arg_type_ptr_char,
        default: arg_type_shrug
    );
    process_arg((arg){ type, (int[]){ value } });
    return 0;
}
We can move this into a macro, for clarity:
void process_arg(arg arg0);
#define GET_ARG_TYPE(value) _Generic(value, \
    int: arg_type_int, \
    double: arg_type_double, \
    char*: arg_type_ptr_char, \
    /* and so on, and so forth... */ \
    default: arg_type_shrug \
)
int main () {
    int value = 1;
    process_arg((arg){ GET_ARG_TYPE(value), (int[]){ value } });
    return 0;
}
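It is worth checking which association actually gets picked for a few everyday expressions, because C has some surprises here. The sketch below repeats the macro and adds a handful of helper functions whose names are invented for this example. It assumes a compiler that applies array-to-pointer decay to the controlling expression, as current GCC and Clang do.

```c
enum arg_type {
    arg_type_int,
    arg_type_double,
    arg_type_ptr_char,
    arg_type_shrug
};

#define GET_ARG_TYPE(value) _Generic((value), \
    int: arg_type_int,                        \
    double: arg_type_double,                  \
    char*: arg_type_ptr_char,                 \
    default: arg_type_shrug                   \
)

// In C, a character constant like '5' has type int, not char.
enum arg_type type_of_char_literal(void) { return GET_ARG_TYPE('5'); }

// A string literal is a char array that decays to char*.
enum arg_type type_of_string_literal(void) { return GET_ARG_TYPE("3"); }

// const char* is a different type from char*, so it falls to default.
enum arg_type type_of_const_string(void) {
    const char* text = "3";
    return GET_ARG_TYPE(text);
}

// float is not double: no implicit promotion happens inside _Generic.
enum arg_type type_of_float(void) { return GET_ARG_TYPE(1.5f); }
```

So '5' reports arg_type_int, "3" reports arg_type_ptr_char, and both the const char* and the float land on arg_type_shrug. Matching is on the exact type, so every type you care about needs its own line in the list.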
Nice! We still have one problem: (int[]) is still an explicit reference to the fact that we’re trying to create a compound array literal of type int. How do we make it rely solely on “value”? Well, we could try creating a _Generic that produces a void*, but the problem with _Generic is that it’s an expression. Only the selected branch of _Generic gets evaluated, but every branch still has to compile, which means if you try to make a _Generic tree of this, you will get some weird errors for different types:
#define GET_ARG_DATA(value) _Generic(value, \
    int: (void*)((int[]){ (value) }), \
    double: (void*)((double[]){ (value) }), \
    char*: (void*)((char*[]){ (value) }), /* !!! */ \
    default: (void*)NULL \
)
int main () {
    int value = 1;
    // ERRORS:
    GET_ARG_DATA(value);
    // initialization of 'char *' from 'int' makes
    // pointer from integer
    // without a cast [-Werror=int-conversion]
    return 0;
}
😓 Ouch.
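For what it’s worth, there is a common way to keep _Generic happy in plain standard C: have each association name a function instead of an expression that mentions the value. A function name compiles no matter what the argument type is, and only the selected function ends up being called. The sketch below is an aside rather than the route this newsletter takes, and every name in it (DESCRIBE, describe_int, and friends) is made up for illustration.

```c
#include <stdio.h>

// One small function per supported type. Each writes a tagged
// description into `out` and returns what snprintf returns.
int describe_int(char* out, size_t size, int v) {
    return snprintf(out, size, "int:%d", v);
}

int describe_double(char* out, size_t size, double v) {
    return snprintf(out, size, "double:%.1f", v);
}

int describe_str(char* out, size_t size, const char* v) {
    return snprintf(out, size, "str:%s", v);
}

// _Generic only picks a function here; the call happens afterwards.
// No branch contains `value`, so every branch compiles for every type.
// There is no default, so an unlisted type is a compile-time error.
#define DESCRIBE(out, size, value) _Generic((value), \
    int: describe_int,                               \
    double: describe_double,                         \
    char*: describe_str,                             \
    const char*: describe_str                        \
)((out), (size), (value))
```

With a char buf[32], DESCRIBE(buf, sizeof buf, 7) leaves int:7 in the buffer and DESCRIBE(buf, sizeof buf, "hi") leaves str:hi. The cost is one helper function per type, which is the boilerplate we are trying to get away from here.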
_Generic does not have “template-like” properties, even in a Macro: all branches must be valid and compile for all types it’s exposed to. This makes it much less good for us… until we realize we can skip the _Generic altogether, with a little extra help from a widely-implemented extension called __typeof:
// This replaces the _Generic-based GET_ARG_DATA from before
#define GET_ARG_DATA(value) \
    ((__typeof((value))[]){ (value) })
void process_arg(arg arg0);
int main () {
    int value0 = 1;
    double value1 = 2.0;
    // careful: GET_ARG_TYPE has no const char* association yet,
    // so value2 gets tagged as arg_type_shrug
    const char* value2 = "3";
    process_arg((arg){ GET_ARG_TYPE(value0), GET_ARG_DATA(value0) });
    process_arg((arg){ GET_ARG_TYPE(value1), GET_ARG_DATA(value1) });
    process_arg((arg){ GET_ARG_TYPE(value2), GET_ARG_DATA(value2) });
    return 0;
}
And, we add some more macros, just to make it less spammy:
#define GET_ARG_TYPE(value) _Generic(value, \
    int: arg_type_int, \
    double: arg_type_double, \
    char*: arg_type_ptr_char, \
    const char*: arg_type_ptr_char, \
    /* and so on, and so forth... */ \
    default: arg_type_shrug \
)
#define GET_ARG_DATA(value) \
    ((__typeof((value))[]){ (value) })
#define GET_ARG(value) \
    ((arg){ GET_ARG_TYPE(value), GET_ARG_DATA(value) })
void process_arg(arg arg0);
int main () {
    int value0 = 1;
    double value1 = 2.0;
    const char* value2 = "3";
    process_arg( GET_ARG(value0) );
    process_arg( GET_ARG(value1) );
    process_arg( GET_ARG(value2) );
    return 0;
}
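Here is a compilable sketch that puts the pieces together so you can check what GET_ARG actually produces. It assumes GCC or Clang, since __typeof is an extension. The const char* association and the arg_as_* accessors are additions made for this example.

```c
enum arg_type {
    arg_type_int,
    arg_type_double,
    arg_type_ptr_char,
    arg_type_shrug
};

typedef struct arg_ {
    enum arg_type type;
    void* data;
} arg;

#define GET_ARG_TYPE(value) _Generic((value), \
    int: arg_type_int,                        \
    double: arg_type_double,                  \
    char*: arg_type_ptr_char,                 \
    const char*: arg_type_ptr_char,           \
    default: arg_type_shrug                   \
)

// __typeof is a GCC/Clang extension: it names the type of `value`,
// so the compound literal is an array of exactly that type.
#define GET_ARG_DATA(value) \
    ((__typeof((value))[]){ (value) })

#define GET_ARG(value) \
    ((arg){ GET_ARG_TYPE(value), GET_ARG_DATA(value) })

// Hypothetical accessors: each assumes the caller checked `type` first.
int arg_as_int(arg a) { return *(int*)a.data; }
double arg_as_double(arg a) { return *(double*)a.data; }
const char* arg_as_string(arg a) { return *(const char**)a.data; }
```

One detail worth knowing: the compound literal holds a copy of the value. After arg a0 = GET_ARG(value0);, assigning a new number to value0 does not change what arg_as_int(a0) returns. That copy lives until the end of the block where GET_ARG was used, so don’t let the arg outlive that block.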
Very good! There is even more you can do to make this work out… but we’re going to leave that…
For next time. 😁
It’s a lot to put in one session, so we’re going to break it up a little bit!
We’ll pick this up next Newsletter, and talk about the preprocessor’s __VA_ARGS__, variable arguments to functions using ..., and how this concept can be extended to have a format like…
int main () {
    process_args(1, 2.0, "3");
    process_args(0x4, '5');
    process_args((char)0x36);
    return 0;
}
in C!
Until then, have a look at this C++ paper proposing to add Compound Literals to the language. There’s also a paper to make typeof work in C, and not have it be a “widely implemented extension” anymore.
See you next time!
— Shepherd’s Oasis 💙
P.S. With the end of the first month of the New Year, we here at Shepherd’s Oasis decided we’d share our New Year’s Resolution! Right now, it’s just 1600 x 900, but maybe we’ll update to 4096 x 2160 in this New Year.