Since my post on datatypes for my π-calculus language, I've gotten a bunch of
questions from people who (I guess) picked up on the series after the original post where
I said that the idea of the series was to see if I could create a programming language
based on it. The questions are all variations on "Why design another programming language? Do you really think anyone will ever use it?"
I'll answer the second question first. Of course I realize that the chances that anyone
but me, and maybe a couple of readers, will ever use it are close to zero. But that's not the point.
Which brings us back to the first question. And for that, there's a silly (but true) answer, and
a serious (but also true) answer.
The silly answer is: because creating a programming language is interesting and fun!
Even if it doesn't end up getting used, it's still an interesting thing which I'll enjoy
doing, and which I think a lot of readers will enjoy reading about and participating in.
The serious reason is that there's a problem that needs solving, and I think that the
right way to solve it involves a different kind of programming language. Modern software
almost inevitably runs in an environment with concurrency and communication.
Networks and multi-core processors are a basic part of our computing environment,
and we need to program for them. But our current languages absolutely suck for
that kind of programming.
Why do I think that a language based on π-calculus will be better? Well, that goes
back to something about my philosophy on programming languages. One of the things that I
believe is a very big issue in software engineering is the fact that most programming
languages force programmers to work out detailed knowledge about their systems in their
minds, then write their programs omitting most of that detailed knowledge, and
then make the compiler/development tools attempt to recompute the information that they
knew in the first place.
For example, when I was in grad school, one of the very hot research areas was compiling scientific code for massively parallel execution. The main focus was numerical array code in Fortran: looking at Fortran code, analyzing it
to determine the data dependencies, and then using that dependency information to
figure out how to generate the fastest possible parallel code that respected those dependencies.
The crazy thing about this was that the people who wrote scientific code for those compilers usually knew what the dependencies were. They generally knew how they wanted the code to execute. And so they were spending tons of time learning how the compiler was going to analyze their code, so that they could write it in a way that would lead the compiler to generate the code they wanted. They were starting with the knowledge of how they wanted the code to be parallelized. But they couldn't write that down in the program explicitly. They had to figure out how to
encode it implicitly to get the results they wanted, because the language didn't let them express their knowledge.
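To make that concrete, here's a toy sketch (in Python rather than Fortran, purely for illustration): the programmer knows that every iteration of the loop is independent, but in most languages that knowledge is never written down anywhere in the sequential version, so a parallelizing compiler has to rediscover it by dependency analysis.

```python
from concurrent.futures import ThreadPoolExecutor

def scale(x):
    # Each call touches only its own input: there are no
    # cross-iteration data dependencies. The programmer knows
    # this, but the sequential loop below never states it.
    return 2 * x

data = list(range(10))

# Sequential version: the dependence-free structure is implicit.
sequential = [scale(x) for x in data]

# Parallel version: the knowledge "these iterations are
# independent" is finally expressed -- but only by switching to a
# different API, not by annotating the original loop.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(scale, data))

assert sequential == parallel
```

The point isn't the API itself; it's that the independence of the iterations is a fact the programmer had from the start, and the language only lets it leak out indirectly.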
I see the same kind of thing happening all over the place, but in particular, I see it
becoming a huge issue in multi-threaded programming. Like I said, with the increasing use of multi-core processors and distributed systems, we're all writing code that involves concurrency, threading, and communication. But the programming languages that we use are terrible at it. Look at Java or C++: you see languages that are first and foremost sequential languages. Almost everything about them is built around the good
old fashioned single-threaded von Neumann model of computing. Then threading is added on
top, primarily through libraries. So almost all of the code is written in, at best, a threading/concurrency-unaware style.
What that means is that when you go to write multi-threaded code, you're really screwed. Sure, you know how you want to set up the threading. But you can't really tell that to the compiler. So all of those functions in the libraries you want to use: can you call them? Are they compatible with your threading strategy? Who knows? Even if the author of a library wrote it in a thread-aware fashion, there's no particularly good way for him to make that explicit in the code. And so the multi-threaded system has bugs: deadlocks, improper synchronization, race conditions, etc.
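As a small illustration (a hypothetical Python class, not from any real library): whether this counter is safe to call from multiple threads is decided entirely inside its implementation, and nothing in its interface communicates that decision to callers.

```python
import threading

class Counter:
    # The author made this class thread-safe by using a lock, but
    # nothing in the signature of increment() tells a caller that.
    # The thread-safety contract lives only in the author's head
    # (or, at best, in a docstring).
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self._value += 1

    @property
    def value(self):
        return self._value

def worker(counter, n):
    for _ in range(n):
        counter.increment()

counter = Counter()
threads = [threading.Thread(target=worker, args=(counter, 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Correct only because of the lock; remove it, and the count can
# silently come up short under contention.
assert counter.value == 4000
```

A caller who wants to know whether `increment` is safe to use concurrently has to read the source or the documentation; the type system and the compiler are no help at all.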
I'm not saying that a π-calculus based language is going to miraculously get rid of
deadlocks, race conditions, etc. But I do believe that the problems of
multi-threaded programming aren't as difficult as many people think they are. The problem
is that the way we program and the languages we use are particularly ill-suited to
solving problems that involve concurrency and threading, because (1) they let us write most
of our code without thinking about how it will behave in a multi-threaded environment, and
(2) even when we do think about how the code should behave in a threaded environment, the
languages don't let us express enough of our knowledge about its concurrency behavior.
So on a deep level, the point of building this language is to see if a π-calculus based approach to describing concurrency, wired deep into the language semantics, can make
concurrent/multi-threaded programming easier.
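To give a flavor of what "wired deep into the language semantics" might mean, here's a rough Python sketch (hypothetical names, and only a loose analogy, since queues here are a library feature rather than part of the language): in the π-calculus, processes interact only by sending and receiving on channels, and channels themselves can be passed around inside messages.

```python
import threading
import queue

def server(requests: queue.Queue) -> None:
    # Receive a (value, reply_channel) pair, and answer on the
    # channel that arrived inside the message. Passing channels
    # around as data is the hallmark of the pi-calculus
    # ("channel mobility").
    value, reply = requests.get()
    reply.put(value * value)

requests = queue.Queue()
t = threading.Thread(target=server, args=(requests,))
t.start()

reply = queue.Queue()      # a fresh, private reply channel
requests.put((7, reply))   # send the channel itself in the message
result = reply.get()       # synchronize on the reply
t.join()

assert result == 49
```

In a language built on the π-calculus, this kind of channel communication wouldn't be a library bolted onto a sequential core; it would be the basic mechanism the semantics are defined in terms of, which is exactly the knowledge-expression gap the post is complaining about.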