Getting silly with C, part (void*)2

🪴 Anil's Garden

They won’t be able to find bugs in your code if they can’t figure out how it works.

In the previous installment of our introductory series about the C language, we outlined the basic syntax of the switch() statement (demo link):

#include <stdio.h>
 
int main() {
 
  int i = 1;
 
  switch (i)
 
         if (0) case 0:        puts("i = 0");
    else if (0) case 1 ... 10: puts("i = 1 ... 10");
    else if (0) case 11:       puts("i = 11");
    else if (0) default:       puts("i = something else");
 
  return 0;
 
}

Today, let’s continue with function declarations. You will be delighted to discover that the following code compiles cleanly with gcc -Wall, and that it calls puts() exactly once (link):

int typedef puts(char* puts; char* puts; char* puts);
 
int main() {
  puts(puts);
  puts("Welcome to my humble abode!");
}

The explanation is simple. The first line is not actually a function declaration: instead, because of the buried (and out-of-order) typedef keyword, it defines a type. More specifically, it defines a function type — a fairly useless but permitted construct in C that shouldn’t be confused with the more practical function pointer type. In the end, we create a type named puts that can be used to declare a function according to the template of int (…).
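To make the distinction concrete, here is a minimal sketch in standard C. The names text_fn, text_fn_ptr, count_chars, and apply are made up for illustration and are not part of the puzzle above.

```c
#include <string.h>

/* A function type: usable to declare functions, not to define them. */
typedef int text_fn(const char *);

/* The more familiar pointer-to-function type. */
typedef int (*text_fn_ptr)(const char *);

/* Same as writing: int count_chars(const char *); */
text_fn count_chars;

/* The definition still has to spell out the full signature. */
int count_chars(const char *s) {
  return (int)strlen(s);
}

/* A function pointer can be stored and passed around as a value. */
int apply(text_fn_ptr f, const char *s) {
  return f(s);
}
```

Calling apply(count_chars, "abc") goes through the pointer type, while the text_fn line only served as a prototype.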

The second mystery might be what’s going on with the parameters — char* puts; char* puts; char* puts. This syntax is an obscure GNU extension called a forward parameter declaration. The intended use is this:

int myfunction(int len; char data[len], int len) {
  ...
}

In essence, it’s a way to tell the compiler that a parameter of a certain type and name is forthcoming, so that we can reference it ahead of the time. You can apparently have as many semicolon-delimited forward declarations as you’d like, but in the end, we’re just creating a type for int (char*). At the typedef stage, parameter names are ignored and have no global effects, so repeating puts in there is just for show.
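As a rough sketch of the intended use in a full function definition, the snippet below guards the GNU syntax behind a compiler check, on the assumption that only GCC accepts it. The function name sum_values and the plain-pointer fallback are illustrative choices, not something from the original example.

```c
#if defined(__GNUC__) && !defined(__clang__)
/* GNU C: "int len;" announces the parameter early, so that the
   array parameter declared before it can mention len in its bound. */
int sum_values(int len; const int data[len], int len)
#else
/* Fallback for other compilers: same argument order, but the
   array parameter no longer documents its own length. */
int sum_values(const int *data, int len)
#endif
{
  int total = 0;
  for (int i = 0; i < len; i++)
    total += data[i];
  return total;
}
```

Either way, the caller passes the array first and the length second, for example sum_values(values, 3).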

Past this point, puts is a type, and we can’t redefine the symbol in the global scope; that said, the language permits symbols to be shadowed within nested blocks (link):

char* foo = "bar";
 
int main() {
  int foo = 123; /* No error */
  return foo;
}

…less intuitively, the same also applies to types (link):

typedef float foo;
 
int main() {
  int foo = 123; /* No error */
  return foo;
}

The final piece of the puzzle is the observation that parentheses can be added in variable declarations with no ill effects (link):

int main() {
  int (foo) = 123;
  return foo;
}

This brings us back to what the puts(puts) line in main() actually does: it is not a function call at all! Instead, it’s equivalent to “puts puts”: a declaration of a function named puts that follows the int (char*) type template. Critically, this newly declared puts symbol shadows the global type for the rest of the block, so the next reference to puts is a real function call.
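The same parsing rule can be seen in isolation with less confusing names. This is only a sketch: line_fn, line_length, and measure are hypothetical identifiers, and because the type and the function are named differently here, nothing gets shadowed; the point is just that a typedef name followed by a parenthesized identifier is a declaration.

```c
#include <string.h>

/* Hypothetical function type, standing in for the puts typedef. */
typedef int line_fn(const char *);

int line_length(const char *s) {
  return (int)strlen(s);
}

int measure(const char *s) {
  /* Not a call: this declares line_length with type line_fn, exactly
     like "line_fn line_length;" or "int line_length(const char *);". */
  line_fn (line_length);

  /* This one is an ordinary function call. */
  return line_length(s);
}
```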

All right, all right, too easy! Let’s move on to control flow (link):

#include <stdio.h>
 
typedef int _();
 
int main() {
  puts("Welcome to my humble program.");
  _ main asm("_");
}
 
int z() {
  puts("ANYTHING IS POSSIBLE AT ZOMBO.COM");
  return 0;
  _ z asm("main");
}

If you run this code, the only output will be an ad for zombo.com. Why? For one, we have another function typedef in there, but if we get rid of it for clarity, the code still doesn’t make much sense (link):

...
 
int main() {
  puts("Welcome to my humble program.");
  int main() asm("_");
}
 
...

The other trick is the asm("…") syntax. It’s not actually an assembly block; when the keyword appears in a function or variable declaration, it specifies an underlying “assembler name” for the C symbol. You’d normally use it like so (link):

int foobar(char*) asm("puts");
 
int main() {
  foobar("Hello world");
  return 0;
}

In our earlier example, we attach the renaming directive to a local declaration of main() that shadows the global symbol, but the result of the renaming is global! In effect, we renamed main() to _() and z() to main(). Clang complains, but GCC doesn’t mind at all.
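A tamer sketch of a block-scope asm label, assuming GCC or Clang on an ELF platform such as Linux, where C symbols carry no prefix. The names linker_strlen and text_len are invented for this example, and __asm__ is used instead of asm only so that the snippet also builds under strict -std= modes.

```c
#include <stddef.h>

size_t linker_strlen(const char *s) {
  /* Block-scope declaration: the C name text_len is visible only in
     this function, but it is bound to the linker-level symbol "strlen",
     so the call below ends up in the C library's strlen(). */
  size_t text_len(const char *) __asm__("strlen");
  return text_len(s);
}
```

The declaration lives inside one function, yet the symbol it names is resolved by the linker across the whole program, which is the property the main() and z() trick relies on.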

Let’s follow that trail for a bit longer. Check out the following code (link):

int main() {
  i("This is fine.");
  return 0;
}
 
[[gnu::unused]] void elsewhere() {
  typedef int i();
  for (i i, i asm("puts"), i;;);
}

The code prints “This is fine.” Recent versions of GCC allow the declaration in the first clause of a for() loop to be a function declaration, so we can bury the renaming there!

I don’t quite know why this for() syntax is now allowed, but it gets even more wacky if we trade a function declaration for a fully-fledged definition (link):

#include <stdio.h>
 
int main() { 
  int i = 0;
  for (void _(){} i++ < 3; ) _(), puts("this is fine.");
  return 0;
}

A two-expression for()?! Sort of! Three independent elements are still expected, but a function definition (void _() { }) is special in C in that it doesn’t end with a semicolon. In other words, for (definition condition; increment) is “correct”, while for (definition; condition; increment) would not be.

While I have you here, did you know that the C language has a BASIC compatibility mode that enables line numbering? It’s true: use if (BASIC) to see if your compiler supports the feature (link):

#include <stdio.h>
 
typedef int BASIC[];
 
int main() {
 
  if ((BASIC) {
    [20] puts("cruel"),
    [30] puts("world"),
    [10] puts("hello"),
  });
 
}

I’m sure you can figure this one out. And with this, we conclude today’s introductory lesson in C. Until next time, fellow programmers!

I write well-researched, original articles about geek culture, algorithms, and more. If you like the content, please subscribe. It’s increasingly difficult to stay in touch with readers via social media; my typical post on X is shown to less than 5% of my followers and gets a ~0.2% clickthrough rate.
