I suspect that people who don't "get" pointers (#4 in the article) actually have...

BrandonM · on Oct 19, 2011

> For instance, in C the * character is used both to declare a pointer and to dereference one and they can be stacked to dereference nested structures.

This problem can be solved by thinking of * as always dereferencing a pointer. So:

  int *a; /* "*a" is an int, thus a is a pointer to one */
  int *b, *c; /* *b and *c are both ints */

This also has the nice side effect of explaining why good C programmers write the * next to the variable name and why multiple variables declared on the same line all require a * next to the variable name.

> Then & references a variable memory location but is also used in method signatures to change the semantics of calling a function into pass-by-reference. To complicate & further, & is often seen alongside const declarations, which are a whole other thing that people have to keep in their heads.

Now it sounds like you're talking about C++ more so than C. In C, & is pretty much always the "address-of" operator (except in contexts where it's clearly the binary bitwise-AND operator, or doubled as && for logical AND).

From your post, it seems that the problem is more one of poor teaching, poor explanations (metaphors), or poorly-designed extensions (C++) as opposed to shortcomings of C.

loup-vaillant · on Oct 20, 2011

I second the remarks about poor teaching (though C type syntax does suck hard). The main problem probably comes from the fact that ordinary variables are themselves an indirection of sorts, and programming courses do not make that clear (we tend to confuse the variable and the value it holds). Excerpt from my article on assignment[1]:

[The pervasive use of the assignment statement] influenced many programming languages and programming courses. This resulted in a confusion akin to the classic confusion of the map and the territory.

Compare these two programs:

  (* Ocaml *)        │    # most imperative languages
  let x = ref 1      │    int x = 1
  and y = ref 42     │    int y = 42
  in x := !y;        │    x := y
     print_int !x    │    print(x)

In Ocaml, the assignment statement is discouraged. We can only use it on "references" (variables). By using the "ref" function, the Ocaml program makes explicit that x is a variable, which holds an integer. Likewise, the "!" operator explicitly accesses the value of a variable. The indirection is explicit.

Imperative languages don't discourage the use of the assignment statement. For the sake of brevity, they don't explicitly distinguish values and variables. Disambiguation is made from context: at the left hand side of assignment statements, "x" refers to the variable itself. Elsewhere, it refers to its value. The indirection is implicit.

Having this indirection implicit leads to many language abuses. Here, we might say "x is equal to 1, then changed to be equal to y". Taking this sentence literally would be making three mistakes:

(1) x is a variable. It can't be equal to 1, which is a value (an integer, here). A variable is not the value it contains.

(2) x and y are not equal, and will never be. They are distinct variables. They can hold the same value, though.

(3) x itself doesn't change. Ever. The value it holds is just replaced by another.

The gap between language abuse and actual misconception is small. Experts can easily tell a variable from a value, but non-specialists often can't. That's probably why C pointers are so hard. They introduce an extra level of indirection. An int* in C is roughly equivalent to an int ref ref in Ocaml (plus pointer arithmetic). If variables themselves aren't understood, no wonder pointers look like pure magic.

[1]: http://www.loup-vaillant.fr/articles/assignment

BrandonM · on Oct 20, 2011

Thanks for the informative reply and the excellent blog post!

JoeAltmaier · on Oct 20, 2011

I'm actually astonished by the hero-worship surrounding the inventor of C. It's got so many embarrassing flaws, or had at any rate; it's been patched and bandaged for decades.

Sure, say that hindsight is 20-20, but I knew C was busted long before I learned anything about objects or monads etc. Syntax is goofy, backward as you say (I thought that the 1st time I ever saw a C program decades ago); its promise as an expression-based syntax was botched and mortgaged, resulting in generations of hacks; its fundamental expression evaluation syntax was broken to begin with, and as patched it's cumbersome and baffling at times. </rant>