Ultimate Spaghetti Coding with Linguine

Todays dose of programming pathology is a monstrosity called *Linguine*. Linguine is a language that (obviously) carries spaghetti code to an extreme. You can see the [language specification][linguine-spec], or download an [implementation with examples][linguine-impl].
It's really a remarkably simple language. There are only 10 commands, including input and output:
1. "`x=y`": assign the value of y to the variable x.
2. "`x+y`": assign the value of x+y to the variable x.
3. "`x-y`": assign the value of x-y to the variable x.
4. "`x|y`": assign the value of x NAND y to the variable x.
5. "`x>y`": assign the value of shifting x by y bits to x. If y is positive, then it shifts y bits right; if y is negative, then it shifts -y bits left.
6. "`x?`": Input one character and store it in the variable x.
7. "`x\$`": Output the content of register x as a character.
8. "`x#`": Output the content of register x as an integer.
9. "`x<y:loc`": if x < y then jump to the program line labeled "loc".
10. "`x~y:loc`": if x = y then jump to the program line labeled "loc".
Command operands on either integer literals, or on the contents of an unlimited set of numbered registers each of which contains an integer. Valid operands are either an integer (which identifies a register number if it appears on the left, or a literal value on the right); or an operand preceeded by "*", which performs an indirection on the value it precedes. So for example, "*3" is the contents of register three. "**3" is the contents of the register whose number is stored in register 3, etc. Any of the operands may be any of these forms, so for example, "1~*3:**4" is a valid command.
A linguine statement consists of three parts: a line label, a list of commands, and a branch target, written "label[commands]target". When executed, the line evaluates each command in the list in sequence. If it reaches the end of the list without having branched, then it will branch to the statement labeled by the branch target.
The lines of the program may appear in any order. The first statement executed will be the statement whose label is smallest. Branching to 0 ends program execution.
All quite simple, right?
1[0=72,0\$,0+29,0\$,0+7,0\$,0\$,0+3,0\$,1=32,1\$,0-24,0\$,0+24,0\$,0+3,0\$,0-6,0\$,0-8,0\$,1+1,1\$,1-23,1\$]0
That's perfectly clear, isn't it? Actually, it's basically the same as the hello world program we saw in brainfuck. We'll piece it apart.
* `0=72,0\$`: 72 is the ascii code for "H". So store that in register 0. Then output register 0.
* `0+29,0\$`: 101 is the ascii code for "e". So add 29 to the contents of register 0, which will make the new value of register 0 be 101. Then output that.
* etc.
How about something a tad more interesting? Let's look at generating the
Fibonacci sequence:
1[0=32,2=1,1#,0\$,2#]2
2[1+*2,3=*1,1=*2,2=*3,0\$,2#]2
* This starts with line 1:
* "`0=32`: Store the ascii code for a whitespace in register 0.
* "`2=1`": store the value 0 in register 2.
* "`1#`": output the value of register 1. (Since registers contain 0 when the program starts, that prints a zero.)
* "`0\$`": output the contents of register 0 as a character, which prints a whitespace.
* "`2#`": print the contents of register 2 (1).
* Finally, branch to statement 2.
* Statement 2: registers 1 and 2 contain the previous two fibonacci numbers. So this adds them together to get the next one, and then rotates the values of the registers so that the newest fibonacci number is in register 2, and the one before that is in register one.
* "`1+*2`": add the value of register 2 to register 1.
* "`3=*1`": store the value of register 1 in register 3.
* "`1=*2`": store the value of register 2 in register 1.
* "`2=*3`": store the value of register 3 in register 2.
* "`0\$,2#`": output a space, and then output the value of register 2.
* Finally, the branch instruction repeats statement 2, to keep generating fibonacci numbers.
Here's a nice little multiply program, which demonstrates how to write subroutines. Note that recursion doesn't work here, because we need to use fixed register numbers. (You can set up recursion using additional levels of indirection, it's just messy as heck.)
'Multiply: -2 *= *-3
'Programmed by Jeffry Johnston, 2005
'-1=return jump, -4=temp, -5=temp
100[-4=*-3,-5=*-2,-2=0,-4<0:101]102
101[-4~0:*-1,-2-*-5,-4+1]101 '-3 is negative
102[-4~0:*-1,-2+*-5,-4-1]102 '-3 is positive
This takes the address of the instruction to return to in register -1; it takes the left operand of multiple (which is also the target of the result) in register -2, and the right operand in register -3. Registers -4 and -5 are used for temporary storage.
* Line 100: set up, and determine whether the right operand is positive or negative.
* "`-4=*-3,-5=*-2`": Copy the contents of register -3 to register -4, and the contents of register -2 to register -5.
* "`-2=0`": Store 0 in register -2.
* "`-4<0:101`": If the contents of register -4 is less than 0, then branch to statement 101.
* If you reach the end of this line, then the contents of register -4 is non-negative, and you branch to statement 102.
* Line 101: do multiplication by repeated subtraction
* "`-4~0:*-1`": If the contents of register -4 is zero, then we're done, so return by branching to the line labeled by the contents of register -1.
* "`-2-*-5`": Subtract the contents of register -5 from the contents of register -2 (the result register).
* "`-4-1`": decrement (or increment, depending on your point of view) the contents of register four, to show that you've done one subtraction.
* Repeat the line.
* Line 102: basically exactly the same as 101, except that the does a repeated add rather than a subtract.
An example of calling this would be:
1[-1=2,-2=36,-3=13]200
2[-2#]0
This stores 36 and 13 as the parameters to the multiple subroutine, and then invokes it by branching to it at line 200. The "return address" is set to statement 2 by storing 2 in register -1. When control returns to statement 2, the result of the multiplication is printed out.
One last example, which you can puzzle out on your own. A brainfuck interpreter in 21 lines of Linguine:
'BF interpreter in Linguine
'Programmed by Jeffry Johnston, 2005
1[*-2?,*-2~33:2,-2+1]1
2[*-2=-1,-2+1]3
3[-3=**-1,-3~-1:0,-3~43:4,-3~44:6,-3~45:7,-3~46:9,-3~60:10,-3~62:11,-3~91:12,-3~93:14]15
4[*-2+1,*-2~256:5]15 '+
5[*-2=0]15
6[*-2?,*-2~-1:5]15 ',
7[*-2-1,*-2~-1:8]15 '-
8[*-2=255]15
9[*-2\$]15 '.
10[-2-1]15 '
12[*-2~0:13]15 '[
13[-3=1]16
14[-3=-1]16 ']
15[-1+1]3
16[-4=1]17
17[-1+*-3,*-1~91:18,*-1~93:19]20
18[-4+*-3]20
19[-4-*-3]20
20[-4~0:21]17
21[-3~1:15]3
[linguine-impl]: http://kidsquid.com/files/linguine/linguine.tar.gz

• Flaky says:

While these obscure and twisted languages are no doubt amusing, if one has certain masochistic tendencies, how about telling us about more practical languages? What sort of ideas have you been exploring as part of your research? Personally I've fairly recently come to contanct with ML (Standard and OCaml) and I've practically fallen in love with it. What kinds of languages/features do you think/hope will be used in the future? Parallel programming, static vs. dynamic typing, statically checked invariants (e.g. dependently typed ML), logic programming in mixed paradigm languages (Oz), etc. ?

• Flaky says:

Oh, and if you haven't stumbled upon it yet, you'll love Twelf: http://www.cs.cmu.edu/~twelf/

• While I have to agree that at some level linguine isn't all that terribly interesting, being very similar to certain limited assembly languages, other pathological programming languages are nice examples of certain kinds of perverse minimalism that do in fact get you to think very carefully about the ways computer languages get written. (e.g. Unlambda, Thue, Whenever, or even Brainfuck, the first time you see it)
In many ways linguine reminds me of the assembly language on the PIC family of microcontrollers, except that linguine is significantly more expressive. Programming in assembly is sometimes enjoyable, and figuring out how to translate a given algorithm into the restrictions of a limited machine can be enlightening, but I wouldn't expect much enlightenment directly from exposure to an assembly language.
If our host is taking suggestions, may I humbly recommend owl? I also know our host is aware of LazyK, as he promised a post on it at some point...

• Mark C. Chu-Carroll says:

Flaky:
Writing about real, useful programming languages is too close to work for me. That's what I do during the day. This blog is what I do for fun.
Besides which, my views on real programming languages are too depressing.
I'm a huge fan of OCaml and Haskell. I think that they're much closer to what we *should* be programming in than anything we're typically using today.
The problem is that what programming languages are actually used isn't driven by anything technical. As a programming language, Java and C++ are both steaming piles of crap, but they're the most widely used languages today.
Programming languages that have real technical advantages *don't* get used, because they're unfamiliar to the majority of programmers. No company is willing to adopt a programming language that their people don't know, and that no one else is using. Doesn't matter what the technical advantages are.
I've seen several examples of people taking new languages, and doing absolutely astounding demonstrations of how much more productive they can be with the new language compared against C++, or Java, or Perl.
What do you think happens when you show someone running a business a way to use a language that allows their people to write programs that are 1/10th the length of their old language, that run 40% faster, and that have fewer bugs?
The answer is, they say "But my developers don't know that, so they won't be able to do it."
I've seen this happen over and over, starting all the way back when I was still in college. A friend worked for AT&T, and did an experiment with smalltalk. Beat the living crap out of the C they were using at the time. Didn't matter.
In grad school, I was working in parallel programming. There was a group at, I think, Los Alamos that designed a language for developing massively parallel array-based programs. The language was called Sisal. Sisal programs are functional, easier to parallelize, less buggy, easier to read, and dramatically faster than even the very best Fortran code. Guess how many people in the world are using Sisal today for parallel programming?
I used to know a guy who worked for a small company, who rewrote the Apache Tomcat stuff in OCaml. I mean, the whole thing: the entire web server, all of the modules, WebDAV service, an OCaml web-app container that provided the same capabilities as JavaBeans, only without all of the hideous gorp. Rebuilt his companies app on top of it. The end result was 4 times faster than their Java app running on tomcat. The server was less than 1/10th the length of tomcat; the companies app was literally 1/20th the size of the Javabeans app. What do you think happened? He got fired. Because the company only had four programmers, and the other three didn't know Ocaml.
The fact of the matter is, the programming language we use haven't dramatically changed since the late 1960s. It's hard to make a case for why C++ or Java or Perl or PHP is really that much better than late-60s/early 70s languages like Simula-67, Snobol-4, or Algol. Our tools have gotten a lot nicer - although even there, it's hard to make much of a case that they're dramatically better than Smalltalk-80.
Hasn't changed much in that long, and I don't expect to see them change any more than that any time soon.

• Middle Manager Extraordinaire says:

What do you think happened? He got fired. Because the company only had four programmers, and the other three didn't know Ocaml.
Of course he did. What a colossal waste of the company's time and money, and what a way to not be a team player!

• Mark C. Chu-Carroll says:

mme:
That attitude is exactly why I believe there will be no major progress in programming languages.
The fact that there are languages that are demonstrably better than the ones we use today is absolutely irrelevant.
The fact that we can *show* that we can build better systems - better by objective measures like length of code, bugs/line, developer hours/task - that has no impact on real software development. And it won't in the foreseeable future either. Because the moneymen who get to make the decisions won't allow it to. They're more worried about not "wasting time" on continuing education for their developers than they are about making sure that their developers are actually able to do their job as effectively as possible; they're more worried about whether people meet their definition of "team player" than whether those people are the most skilled and effective people at doing their job.

• ArtK says:

Mark -- you promised you wouldn't share your depressing views of the popular languages and then you did. Just what I really needed to read as I'm slogging my way through a melange of C++, Java and C#. What's worse is that most of it was written by somebody who doesn't think at all the way that I do.
Thanks, though for the glimpses of the pathological languages, they're quite fun. I agree with Daniel that linguini seems very similar to the assembly language on some limited processors.

• Morgan says:

Mark,
I think (hope!) that, given his chosen alias, Middle Manager Extraordinaire was engaging in a bit of satire there...
Of course, the position he (surely) satirised is prevalent enough and damaging enough as to be rather unfunny, really...
Have you read any of Paul Graham's essays on the topics of language design and software development, or are you familiar with the current wave of 'Web 2.0' startups? A lot of these are using 'better' (if still not cutting-edge) languages than Java and C++ - Ruby and Python are among the most popular - and turning the ability to develop more quickly, more reliably, and with fewer people into features and customers. The limiting factor for most of these groups in their language choice seems to be library support, which is a different, more thorny problem than language development or quality. So it seems to me that there is hope on the horizon for adoption of superior languages - but large or 'traditional' companies are the last place you'll see change.

"So it seems to me that there is hope on the horizon for adoption of superior languages - but large or 'traditional' companies are the last place you'll see change."
It has always been thusly, and over most or all technologies. Businesses makes money, not superlative technology - until they are forced to.
Yes, research and startups may try new technology. Startups often kroaks, often because a competitor used known technology and was first.
What are we gonna do? It's built into the system. (Hint on mode of attack... but let's not hold our breaths. I note that even or perhaps especially quality companies tend to be rather conservative.)

On the positive side I note that old technology do die, perhaps akin to spoken languages. So there is progress... glacial.

• Mark C. Chu-Carroll says:

Morgan:
I realize that middle manager was probably being sarcastic; but what he said is all too commonly what we hear in the real world.
WRT the acceptance of languages like Python and Ruby; you'll have to pardon me if I don't jump for joy. Python and Ruby are both very nearly underpowered Lisps different syntax. *Scheme* was developed in 1978, and it's a young'un in the Lisp family. Still nothing about either that represents a new idea in programming languages more recent than the 1970.

• Janne says:

Mark, Tim Bray made a very good point of Pythom and Ruby on one hand, and Lisp-lanugages on the other just a few days ago (http://www.tbray.org/ongoing/When/200x/2006/08/22/Lt-Lt#P-3):
"[...]but Python and Ruby seem to have oozed further into the mainstream than Lisp ever has.
"Why?" is an interesting question, and I think the biggest single piece of the answer is inheritance from Perl's culture of ruthless practicality. In other words, not afraid to use syntax to make the point; things are generally lists, except when they're not; attention to typographic values; building hashes and regexes into the language; not requiring you to ever say "lambda". Any more?
If Lisp's audience had been harried sysadmins rather than AI researchers, it'd rule the world by now."
And he's right. Being elegant is not what matters in the "real world" no matter how much we'd wish it were so.
And mme above may have been sarcastic, but there's a point there too. An excellent programmer can be immensely more productive in Lisp or OCaml. But most programmers are not excellent, so most companies can't hire excellent programmers even if they wanted to - and chances are they don't want to, since being an excellent programmer would imply being a _specialist_ and thus not knowing much about the actual business of the company. If you need someone to adapt and keep up your accounting system you probably gain a lot more from, say, an accountant or economist that also happens to program than you'd do from a programmer with little clue about company economics and law.
For the vast majority of companies, going with whatever gives them the greatest pool of reasonably competent talent is the name of the game. Under no circumstance do you want to be dependent on a particular person for a non-core business requirement like your IT system. Being able to yank any IT guy off the unemployment line or a bread and butter consultancy to support your stuff at short notice beats saving a few weeks of development time every time. Depending on the field, that probably means C++, Java, C# or Visual Basic, with the dynamic languages (Perl, Python and Ruby) as somewhat oddball outside options. Anything that will stump a mediocre application writer with a minor in business management is, righty, out of the question.

• Jianying says:

Practicality of usage is not always the determiner of success, The rise of XML and XML Schema is example of that. Their notation is far from economical but their technical practicality outbalanced usage practicality. (I know these are not programming language per se but they illustrate the point, I think)
Secondly I think ultimately what we'll have is a separation between syntax and symantics. This allows the language to be pure but with the syntactic sugars that makes it "practical". With compiled languages, there is no reason that "libraries" and "language core" should be any different to the user.
A example in the small is ruby on rails, where thru clever libraries database access routines becomes seamlessly integrated into ruby. The user is none-the-wiser that this integration is not a core element.
And the days of clinging to established languages is I think near its end. The proliferation of scripting languages, XML based document formats, and standard interchanges. Will be a boon to language exploration.

• Flaky says:

"Besides which, my views on real programming languages are too depressing. ..."
Damn, and here I was all optimistic that in a decade or so we'd be programming in new languages with all the best features plugged into one, with proof generating, verifying compilers. The way I see it, a lot of the crapware that the industry produces is directly caused by picking the wrong tools for the job. It's idiotic that we have to deal with problems such as buffer overflows, when efficient algorithms for array bounds checking have been available for quite a while.

• You know, I started writing a long comment in response to this thread, but it doesn't really add anything, so I'll just say: ME TOO
That being said, another programmer I know recently forwarded me an article on why Cobol is still around (from the Sep. 1997 Communications of the ACM), and in it the author makes some very good points that businesses usually have very different needs than one might think that they do, and that these needs aren't by and large being addressed by people who are doing research work in language design. (or even, the paper make a point of this, being addressed by language vendors)
I'll go a step further and say that the two environments of academic programming language use and business programming language use are two completely different ecosystems, and no matter how highly evolved a language may be in one ecosystem, it doesn't necessarily stand a chance in the other. (There are in all likelihood several different partially overlapping programming language ecosystems. I'm just picking two extreme characterizations for illustrative purposes)
This also means that even if an academic were to sit down and consider the needs of a shop still using COBOL, and design the perfect replacement, she might still be completely unable to get the business world to take her language seriously without some pre-existing buzz about the language. Now, although a language can generate buzz on its own merits in the academic world, a business-focused language likely wouldn't do that, because the new features would likely be uninteresting in an academic sense. So the new business language would die on the vine - too boring to grow big in the academic world and too small and unknown to be accepted in the boring world.

• Michael Chermside says:

I have been deeply steeped in the Python community for a number of years now (a regular reader and occasional commentator on the python-dev list). One of the things that has heartened me significantly is that the architects of this language are NOT afraid of learning from "academic" languages. They are *proud* of the times when they can mimic a good idea from Haskell (like list comprehensions); they try to look to things like ML when discussing how typing might (optionally) be made more explicit.
I doubt this is unique to Python. In fact, I think that many of the "newer" languages share this openness to learn something from the lesser-known but well-regarded languages. I hope that this trend will continue. If the next "super-popular" language (think of C, C++, or Java) is one that does not share the traditional C syntax (as those do), perhaps it will be built by people more willing to learn.
I can hope, anyhow... right?

• Scientopia Blogs