Specifying Types

C specifies types like this:

int integer;
int array[2];
int *pointer;
int function(int);

The clever rule C follows is that declarations and expressions look the same. So int *pointer can be read as, "from now on, *pointer is an int." Similarly, function(0) is an int, and array[0] is an int.
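As a rough sketch of the "declaration mirrors use" rule, the snippet below gives each of the four declarations a value and then uses it in an expression shaped exactly like its declaration. The initial values and the helper name check_declarations_mirror_use are assumptions made for illustration only.

```cpp
#include <cassert>

int integer = 1;
int array[2] = {2, 3};
int *pointer = &integer;
int function(int n) { return n + 1; }

// Hypothetical helper: every right-hand side has the same shape as the
// corresponding declaration above, and every one of them is an int.
void check_declarations_mirror_use() {
  int a = integer;      // declared as: int integer
  int b = array[0];     // declared as: int array[2]
  int c = *pointer;     // declared as: int *pointer
  int d = function(0);  // declared as: int function(int)
  assert(a == 1 && b == 2 && c == 1 && d == 1);
}
```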

Despite its simplicity, this rule has some unfortunate and confusing consequences. It's hard for beginners:

int* pointer1, pointer2; // pointer2 is an int, not an int*

And even experts:

int (*function_pointer)(int) = function; // function_pointer is a pointer to a function taking and
                                         // returning int, because (*function_pointer)(0) is an int
 
int (*returns_function_pointer())(int) { // returns_function_pointer is a function returning a
  return function_pointer;               // function pointer, because
}                                        // (*returns_function_pointer())(0) is an int
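One common workaround, sketched here under assumptions, is to name the function pointer type once with an alias so the remaining declarations read left to right. The names IntFunctionPointer, increment and check_alias are hypothetical and do not come from the examples above.

```cpp
#include <cassert>

int increment(int n) { return n + 1; }  // hypothetical stand-in for `function`

// Alias for "pointer to a function taking int and returning int".
using IntFunctionPointer = int (*)(int);

// Same type as int (*returns_function_pointer())(int), spelled left to right.
IntFunctionPointer returns_function_pointer() {
  return increment;
}

// Hypothetical helper exercising the alias.
void check_alias() {
  IntFunctionPointer function_pointer = returns_function_pointer();
  assert((*function_pointer)(0) == 1);  // use mirrors the original declaration
  assert(function_pointer(0) == 1);     // the explicit * is optional in a call
}
```

The alias hides the declarator syntax rather than fixing it: the definition of IntFunctionPointer still has to be read inside out.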

What's more, in C++, the rule no longer works:

int &reference; // reference to int; &reference is not an int, it's an int*

That's one reason why C++ programmers tend to write int& reference instead of int &reference. For function types, C++11 at least offers std::function, which reads left to right:

std::function<int(int)> returns_function() {
  return function;
}
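A slightly fuller sketch of the same idea follows. It assumes a hypothetical increment function in place of function, and adds a hypothetical returns_adder to show that std::function can also hold a capturing lambda, which a plain int (*)(int) cannot.

```cpp
#include <cassert>
#include <functional>

int increment(int n) { return n + 1; }  // hypothetical stand-in for `function`

std::function<int(int)> returns_function() {
  return increment;  // a plain function converts to std::function implicitly
}

std::function<int(int)> returns_adder(int amount) {
  // A capturing lambda has no conversion to int (*)(int),
  // but std::function<int(int)> can store it.
  return [amount](int n) { return n + amount; };
}

// Hypothetical helper exercising both functions.
void check_std_function() {
  assert(returns_function()(0) == 1);
  assert(returns_adder(10)(5) == 15);
}
```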

The Java/C# approach

Languages like Java and C# ditch C's syntax for a similar style that can be read left-to-right:

int integer;
int[] array;
int* pointer; // C# only
int function(int);

Neither language has C-style function pointer declarators, so that complication doesn't arise. C# has delegate types like Func<int, int>, but they're ordinary generic types rather than first-class citizens of the type syntax.

Func<int, int> func = function;
 
Func<int, int> returnsFunc() {
  return func;
}

The Haskell, Go, F#, etc. approach

Many other languages, new and old, specify the type after the name being declared.

integer  :: Int
list     :: [Int]
function :: Int -> Int

Function types in these languages look like the mathematical style, \(f:\mathbb{Z} \to \mathbb{Z}\). Even higher-order functions have simple-to-read types.

returnsFunction :: Int -> (Int -> Int)
returnsFunction _ = function
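For comparison, C++11 has a name-first form of its own, the trailing return type. The sketch below writes a rough counterpart of returnsFunction that way. The increment function and the check_trailing_return helper are assumptions made for illustration.

```cpp
#include <cassert>

int increment(int n) { return n + 1; }  // hypothetical stand-in for `function`

// Name and parameters first, result type after the arrow.
// Compare: returnsFunction :: Int -> (Int -> Int)
auto returns_function(int) -> int (*)(int) {
  return increment;
}

// Hypothetical helper: the argument to returns_function is ignored,
// just as the Haskell version ignores its argument with _.
void check_trailing_return() {
  assert(returns_function(0)(1) == 2);
}
```

The result type is still written as int (*)(int), so this only moves the C declarator to the end instead of replacing it.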

Another approach?

Despite how nice the Haskell approach is, I find the C style, with the type first, easier to read. If std::function could be omitted and function types written directly, I think C++ would have a great type syntax.

int integer;
int[] array;
int* pointer;
int& reference;
int function(int);
int(int) returns_function();

Compare these styles when representing the two functions \(\mathrm{sum}(a, b) = a + b\) and \(\mathrm{compose}(f) = f \circ f\).

  1. C/C++ style
    int sum(int a, int b) {
      return a + b;
    }
     
    // Can be declared in C, but not implemented, because C has no closures
    int (*compose(int (*f)(int)))(int);
     
    // But it can in C++
    std::function<int(int)> compose(std::function<int(int)> f) {
      return [f](int n) { return f(f(n)); };
    }
  2. Haskell style
    add :: Int -> Int -> Int
    add a b = a + b
     
    compose :: (Int -> Int) -> (Int -> Int)
    compose f = f . f
  3. Hypothetical style
    int sum(int a, int b) = a + b;
     
    int(int) compose(int(int) f) = n => f(f(n));

I think the third, hypothetical style is the clear winner, especially with some added type inference.
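Existing C++ can get part of the way toward that kind of type inference. The sketch below assumes C++14 return type deduction and lets the compiler work out the type of compose, so neither the function pointer declarator nor std::function has to be written. The increment function and the check_compose helper are hypothetical.

```cpp
#include <cassert>

int increment(int n) { return n + 1; }  // hypothetical function to compose

// C++14: the return type is deduced from the lambda, and F is deduced
// from the argument, so no function type is ever spelled out.
template <typename F>
auto compose(F f) {
  return [f](int n) { return f(f(n)); };
}

// Hypothetical helper exercising compose.
void check_compose() {
  auto add_two = compose(increment);  // increment applied twice
  assert(add_two(0) == 2);
  auto add_four = compose(add_two);   // add_two applied twice
  assert(add_four(0) == 4);
}
```

The trade-off is that the signature no longer documents anything: compose accepts any copyable callable, and its result type has no name that can be written down.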
