*(This is a revised repost of an earlier part of my Haskell tutorial.)*

Haskell is a strongly typed language. In fact, the type system in Haskell

is both stricter and more expressive than any type system I've seen for any

non-functional language. The moment we get beyond writing trivial

integer-based functions, the type system inevitably becomes visible, so we

need to take the time now to talk about it a little bit, in order to

understand how it works.

One of the most important things to recognize about Haskell's type system

is that it's based on *type inference*. What that means is that in

general, you *don't* need to provide type declarations. Based on how

you use a value, the compiler can usually figure out what type it is. The net

effect is that in many Haskell programs, you don't write *any* type

declarations, but your program is still carefully type-checked. You

*can* declare a type for any expression you want, but the only time you

*must* is when something about your code is ambiguous enough that type

inference can't properly select one type for an expression.
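As a small illustration (these function names are my own, not from the tutorial), the compiler will happily infer a type when you leave the declaration off, and you can always add an explicit one to pin the type down:

```haskell
-- No type declaration: the compiler infers the most general type,
-- which for this definition is Num a => a -> a.
double x = x * 2

-- An explicit declaration restricting the same code to Integer.
doubleInt :: Integer -> Integer
doubleInt x = x * 2

main :: IO ()
main = print (double 21 :: Integer, doubleInt 21)
```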

I'm not going to go into a lot of detail, but the basic idea behind

Haskell's type system is that there's a basic propositional logic of types.

Every type is represented as a statement in the type-logic. So as you'll see

below, an integer is written as type `Integer`. A function
from integers to integers has a type which is an implication statement:
"`Integer -> Integer`", which you can read in the logic as
"If the input type is Integer then the output type is Integer". The
way that type inference works is that values with known types become
propositions; the basic type inference process is just logical inference
over the type logic.

The type logic is designed so that when the inference process analyzes
an expression, it generates the *most general* type; that is, when

it's inferring a type for an expression, it will always pick the most general,

inclusive type that can match the expression. And that leads to one

complication for beginners: Haskell's type system is almost two different type

systems - a basic type system, and a *meta*-type system. The meta-type

system is based on something called *type classes*, which group related

families of types together. The most general types that come up as a result of

type inference are frequently based on type classes, rather than on specific

concrete types.

A type-class in code is basically a *predicate* over a type

variable. For example, the type of the parameters to the "`+`"
operator must be a numeric type. But since "`+`" can be used on
integers, floats, complex, rationals, and so on, the type of the
"`+`" operator's parameters needs to be something that includes all
of those. The way that Haskell says that is the constraint "`Num a =>`"
- that is, that the type `a` is a member of the type-class `Num`. The way
to understand that is "Some type 'a' such that 'a' is a numeric type."

The thing to remember is that essentially, a type-class is a type for

types. A *type* can be thought of as a predicate which is only true for

members of that type; a *type-class* is essentially a second-order

predicate over *types*, which is only true for types that are members

of the type-class. What type-classes do is allow you to define a *general
concept* for grouping together a family of conceptually related types.

Then you can write functions whose parameter or return types are formed using

a type-class; the type class defines a *predicate* which describes the
properties of types that could be used in the function.
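For instance, here's a tiny hypothetical type-class of my own (not a standard one) grouping types whose values can be rendered as text; only types that declare an instance satisfy the "predicate":

```haskell
-- A hypothetical type-class: a predicate over types, satisfied only
-- by types that declare an instance.
class Describable a where
  describe :: a -> String

instance Describable Bool where
  describe True  = "yes"
  describe False = "no"

instance Describable Integer where
  describe n = "the number " ++ show n

main :: IO ()
main = putStrLn (describe True ++ ", " ++ describe (42 :: Integer))
```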

So if we write a function whose parameters need to be numbers, but don't

need to be a *specific kind* of number, we would write it to use

a type-class that described the basic properties of numbers. Then you'd be

able to use *any* type that satisfied the type-predicate defined

by the type-class "Num".

If you write a function that uses numeric operations, but you don't write

a type declaration, then Haskell's type-inference system will infer types for

it based on the "`Num`" type-class: "`Num`" is the most
general type-class of numbers; the more things we actually *do* with
numbers, the more constrained the type becomes. For example, if we use the "/"
operator, the system will infer that the type of the parameter must be an
instance of the "Fractional" type-class, rather than just "Num".
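A quick sketch of how the constraint tightens (the function names here are mine):

```haskell
-- Only (+) is used, so the inferred constraint is Num.
addTwice :: Num a => a -> a -> a
addTwice x y = x + y + y

-- (/) is used, so the constraint narrows to Fractional;
-- this function can no longer be applied to an Integer.
halve :: Fractional a => a -> a
halve x = x / 2

main :: IO ()
main = print (addTwice (1 :: Integer) 2, halve (3 :: Double))
```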

With that little diversion out of the way, we can get back to talking

about how we'll use types in Haskell. Types start at the bottom with a bundle

of *primitive* atomic types which are built in to the language:

`Integer`, `Char`, `String`, `Bool`, and quite a few more. Using those types, we can

*construct* more interesting types. For now, the most important

constructed type is a *function type*. In Haskell, functions are just

values like anything else, and so they need types. The basic form of a simple

single-parameter function is "`a -> b`", where "`a`" is
the type of the parameter, and "`b`" is the type of the value
returned by the function.
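A couple of simple function types, as illustration (my own examples):

```haskell
-- A function from Integer to Bool: type "Integer -> Bool".
isZero :: Integer -> Bool
isZero n = n == 0

-- A function from String to Char: type "String -> Char".
firstChar :: String -> Char
firstChar s = head s

main :: IO ()
main = print (isZero 0, firstChar "Haskell")
```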

So now, let's go back and look at our factorial function. What's the type

of our basic "`fact`" function? According to Hugs, it's
"`Num a => a -> a`".

Definitely not what you might expect. What's happening is that the system

is looking at the expression, and picking the most general type. In fact, the

only things that are done with the parameter are comparison, subtraction, and

multiplication: so the system infers that the parameter must be a

*number*, but it can't infer anything more than that. So it says that

the type of the function parameter is a numeric type; that is, a member of the
type-class "`Num`"; and that the return type of the function is the same as
the type of the parameter. So the statement "`Num a => a -> a`" basically says
that "`fact`" is a function that takes a parameter of *some* type
represented by a *type variable* "`a`" and returns a value of the
*same* type; and it also says that the type variable "`a`" must be a
member of the meta-type "`Num`", which is a type-class which includes all of
the numeric types. So according to Haskell, as written the factorial function
is a function which takes a parameter of *any* numeric type, and returns a
value of the *same* numeric type as its parameter.
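One hedge for readers trying this today: in modern GHC, `Num` no longer has `Eq` as a superclass, so the inferred type picks up an extra constraint from the comparison with zero; the shape is otherwise the same as what Hugs reports:

```haskell
-- Hugs (Haskell 98) infers Num a => a -> a for this definition;
-- modern GHC infers (Eq a, Num a) => a -> a, because matching
-- against 0 needs equality separately.
fact :: (Eq a, Num a) => a -> a
fact 0 = 1
fact n = n * fact (n - 1)

main :: IO ()
main = print (fact (5 :: Integer), fact (5 :: Double))
```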

If we look at that type, and think about what the factorial function actually

does, there's a problem. That type isn't correct, because factorial is only

defined for integers, and if we pass it a non-integer value as a parameter, it

will *never terminate*! But Haskell can't figure that out for itself -

all it knows is that we do three things with the parameter to our function: we

compare it to zero, we subtract from it, and we multiply by it. So Haskell's

most general type for that is a general numeric type. So since we'd like to

prevent anyone from mis-calling factorial by passing it a fraction (which will

result in it never terminating), we should put in a type declaration to force

it to take the more specific type "`Integer -> Integer`" - that is,

a function from an integer to an integer. The way that we'd do that is to add

an *explicit type declaration* before the function:

> fact :: Integer -> Integer
> fact 0 = 1
> fact n = n * fact (n-1)

That does it; the compiler accepts the stronger constraint provided by our

type declaration.

So what we've seen so far is that a function type for a single parameter

function is created from two other types, joined by the "`->`"
type-constructor. With type-classes mixed in, that can be *prefixed* by

type-class constraints, which specify the *type-classes* of any type

variables in the function type.

Before moving on to multiple parameter functions, it's useful to introduce a

little bit of syntax. Suppose we have a function like the following:

poly x y = x*x + 2*x*y + y*y

That definition is actually a shorthand. Haskell is a lambda calculus based

language, so semantically, functions are really just lambda expressions: that

definition is really just a binding from a name to a lambda expression.

In lambda calculus, we'd write a definition like that as:

poly ≡ λ x y . x*x + 2*x*y + y*y

Haskell's syntax is very close to that. The definition in Haskell syntax

using a lambda expression would be:

poly = (\x y -> x*x + 2*x*y + y*y)

The λ in the lambda expression is replaced by a backslash, which is the

character on most keyboards that most resembles lambda; the "." becomes an

arrow.
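So in real Haskell code, the lambda version looks like this (equivalent to the equational definition above):

```haskell
-- poly defined as a name bound to a lambda expression.
poly :: Num a => a -> a -> a
poly = \x y -> x*x + 2*x*y + y*y

main :: IO ()
main = print (poly 1 2)  -- 1 + 4 + 4 = 9
```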

Now, with that out of the way, let's get back to multi-parameter functions.

Suppose we take the poly function, and see what Hugs says about the type:

poly x y = x*x + 2*x*y + y*y

Main> :type poly
poly :: Num a => a -> a -> a

This answer is very surprising to most people: it's a *two*

parameter function. So intuitively, there should be *some* grouping
operator for the two parameters, to make the type say "a function that takes
two a's, and returns an a"; something like "`(a,a) -> a`".

But functions in Haskell are automatically *curried*. Currying is a

term from mathematical logic; it's based on the idea that if a function is a

value, then you don't *ever* need to be able to take more than one parameter.

Instead of a two parameter function, you can write a one-parameter function

that returns another one-parameter function. Since that sounds really

confusing, let's take a moment and look again at our "poly" example:

poly x y = x*x + 2*x*y + y*y

Now, suppose we knew that "x" was going to be three. Then we could write a

special one-parameter function:

poly3 y = 3*3 + 2*3*y + y*y

But we *don't* know "`x`". But what we *can* do
is write a function that takes a *parameter* "`x`", and
returns a function where all of the references to `x` are filled
in, and given a y value, will return the value of the polynomial for x and
y:

polyn x = (\y -> x*x + 2*x*y + y*y)

If we call "`polyn 3`", the result is exactly what we wrote for
"`poly3`". If we call "`polyn a b`", it's semantically
*exactly* the same thing as "`poly a b`". (That doesn't mean
that the compiler actually *implements* multi-parameter functions by
generating single-parameter functions; it generates multi-parameter functions
the way you'd expect. But everything *behaves* as if it did.) So what's

the type of `polyn`? Well, it's a function that takes a parameter
of type `a`, and returns a *function* of type "`a -> a`". So the type of
`polyn` is "`Num a => a -> (a -> a)`"; and since the precedence and
associativity rules are set up to make currying convenient, the parens in
that aren't necessary - that's the same as "`Num a => a -> a -> a`". Since
"`poly`" and "`polyn`" are supposed to be semantically equivalent, that
means that "`poly`"'s type must also be "`Num a => a -> a -> a`".
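Partial application makes this concrete: applying poly to just one argument hands back a one-argument function, exactly like the hand-written poly3:

```haskell
poly :: Num a => a -> a -> a
poly x y = x*x + 2*x*y + y*y

-- Applying poly to one argument yields a function of type
-- Num a => a -> a, equivalent to writing poly3 by hand.
poly3 :: Num a => a -> a
poly3 = poly 3

main :: IO ()
main = print (poly3 4, poly 3 4)  -- both are 9 + 24 + 16 = 49
```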

Haskell accepts Unicode characters for certain symbols like lambdas, so you can actually write

poly = λ x y → x*x + 2*x*y + y*y

which is even closer to the classic lambda syntax.

Very fascinating post! This is definitely a language I'm going to have to spend some time playing with.

Near the end of your article, I was about to complain with something along the lines of "Well, won't currying cause the type definitions of functions which accept many parameters to rapidly become unreadable?". Then, I remembered that I'm currently working with C++ STL and have to stare at 200 characters of gobbledygook every time I have a syntax problem involving a list of strings. Compared to that, having to count on my fingers to figure out "a => a -> a -> a -> a -> a -> a" is downright elegant. 🙂

I'm not sure I'd say that types "are" propositions and programs "are" proofs. It's more accurate to say that types are the objects and proofs are the morphisms in a cartesian closed category Hask, and that to any CCC we can associate an "internal logic" to discuss its semantics.

To say that a program "is" a proof immediately raises the questions to a casual reader "what does it prove?" and "what does it mean to 'prove' a type?"

@John Armstrong - i think the "Types are Propositions, Programs are Proofs" statement comes from Intuitionistic Type Theory. If a Type is interpreted as a Proposition, then a Program of that Type can be interpreted as a Proof of that Proposition. This doesn't mean you're proving a Type as such.

It doesn't seem to be relevant to the article though; i guess it's just a catchy title.

John, please see the Curry-Howard isomorphism. However, do also note that --- even in a type system as comparatively expressive as Haskell's --- this correspondence is useful for proving metatheorems about the type system but the proofs that correspond to programs are typically not particularly interesting at the object level. For example, you can't prove that a list-reverse function actually reverses its argument by its type without an even more expressive type system. (To see what such a type system would look like, see the Coq tool from INRIA.)

A more appropriate type for fact would be:

fact :: Integral a => a -> a

wb: but see "theorems for free!" for example which demonstrates how parametric types can give rise to quite useful proofs at the object level. And related papers about free theorems in the presence of typeclass constraints, etc. which yield some even more useful (if less elegant) results. also see "faking it" and other work on how types can be made significantly more expressive vis a vis proofs and invariants, and okasaki and others' work on data structures and recursive types which shows how even plain h98 types can be made to encode fairly interesting properties. "i am not a number, i am a free variable" is pretty cool in this regard too.

One point that has been missed is that Haskell, unlike say Coq, has unbounded recursion. This means that all propositions have proofs --- for example,

pf :: a
pf = let x = x in x

is a proof of any 'a'.

Great post, it's so good to see someone explaining clearly these concepts. Thanks

@John Armstrong

The proposition associated with a type is the intuitionistic statement 'There is an object of type T'. The program proves it by generating an example. Note that there is no other way of proving such an intuitionistic statement; the statement 'The assumption that there are no objects of type T leads to a contradiction' is not intuitionistically equivalent to the first statement.

Most programs that programmers write are rather trivial as proofs (they usually don't aim to generate objects of a type when they don't know whether or not the type is inhabited). However it tends to be more useful when run in the other direction. (If I prove a proposition intuitionistically, I get an algorithm that generates witnesses to the proposition 'for free')

Note that under my interpretation Simon@8 is wrong. If the program never terminates, then it never produces an object of its result type - so what type has been proven to be inhabited?

I guess it wasn't as clear as I thought it was from the text.

What I find fascinating about the idea of types as logical statements, and inference as proof, is that the process of performing type inference on a language like Haskell is the process of performing logical proofs in the type logic. By finding the fundamental mapping between types and logical statements, Hindley-Milner type inference is really logical inference in a particular intuitionistic logic.

When you compare that to the kinds of type inference that you do when you're compiling languages like C++, the logical structure of the type system in Haskell is amazingly beautiful. It makes sense in a deep way, because the type system is a logic, and type inference is a proof in that logic.

Haskell is becoming interesting and I am glad of having chosen it to improve my functional programming...

So then what would be the type signature for something like map, which ought to take some type of set and a function that accepts members of the set, and return a set whose members are of the type produced by the function?

@13:

A set-map function would have a signature like:

setmap :: Set a -> (a -> b) -> Set b

But the exact type would depend on the set implementation. For example, if the set was represented as a binary search tree, you'd need to put a type-class constraint on a to ensure that it supported the comparison operations needed by the set, like:

setmap :: (Ord a, Ord b) => Set a -> (a -> b) -> Set b
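For comparison, the real Data.Set in the containers library (assumed available here; it ships with GHC) provides a map with the function argument first, and its Ord constraint on the result element type is exactly this kind of type-class constraint:

```haskell
import qualified Data.Set as Set

-- Data.Set.map :: Ord b => (a -> b) -> Set a -> Set b
main :: IO ()
main = do
  let s = Set.fromList [3, 1, 2 :: Int]
  print (Set.toList (Set.map (* 10) s))  -- ascending: [10,20,30]
```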


I don't know if it's entirely accurate to say that type *inference* is inference in the logic corresponding to the type system, although it may be related. What inference does is walk through sub-terms, and figure out what typing rules correspond to each sub expression. So, in the case where it sees:

f x

it knows that f must have a type of the form 'a -> b', and x must have type 'a', and the combination has type 'b'. The inference algorithm uses this by collecting a bunch of equations between the variables like 'a' and 'b', and then uses a unification algorithm to figure out the most general type for everything.

For a more logical-inference type application, it might be good to look at djinn. It uses the fact that the completeness (I think that's the right one) theorem for the logic corresponding to (a somewhat restricted subset of) the Haskell type system can be proved constructively, which translates to an algorithm for deciding whether a given proposition/type is inhabited, which if it is, also yields a term with the relevant type. So djinn takes a type and writes code with that type, or tells you that it's impossible to do so (and it even gives you the correct implementation of callCC for the continuation monad, which I remember finding impressive when I first saw it).