I'm currently away on a family vacation, and as soon as vacation is
over, I'm off on a business trip for a week. And along the way, I've got some
deadlines for my book. So to fill in, I'm recycling some old posts. I decided
that it's been entirely too long since there was any pathological programming
'round these parts, so I'm going to repost some of my favorites.
Today, we're going to take a look at a brilliant language called Befunge.
Befunge is the work of an evil genius named Chris Pressey.
Normal programming languages are based on a basically one-dimensional
syntax; the program is a string, a sequence of characters, and it's processed
by reading that string in a straight-ahead fashion. But that's not Befunge!
It's got a two-dimensional syntax. Befunge is something like a two-dimensional
turing machine: the program is written on a two dimensionaltorus called the playfield. Each
instruction in Befunge is a single character, and where it's located on the
torus is crucial - control is going to move either up/down or left/right
on the torus. All of the control flow is expressed by just changing the direction that
the head moves over the torus.
(In case you're not familiar with a torus, it's what you get
if you take a very flexible sheet of paper, and roll it so that you connect
the top edge to the bottom, and then roll that tube so that you connect the
left edge to the right. You get a donut shape where moving up from what used
to be the top of the page puts you on the bottom of the page; moving left from
the left edge of the page puts you on the right.)
The basics of computation in Befunge are pretty straightforward. It's a
stack based language. Operations take their parameters from the stack, and
leave their results on the stack. Nothing too complicated there. There are
arithmetic operators, IO operators, control flow operators, all operating on
the values on the stack.
The arithmetic operators are the usual kinds of things: There are
operators for addition (+), subtraction (-), division (/), multiplication (*),
modulo (%), and logical negation (!). Digit characters are treated as
operators that push the numeric value of the digit onto the stack. (So "99"
will create a stack with two nines.) Comparisons are done using "`", which
pops the top two values from the stack and compares them. So if the top of the
stack was "x", and the value beneath it was "y", then "`" would leave a "0" on
the stack if x≤y, and 1 if x > y.
For IO, there are four operators. "&" reads an integer from standard input
and pushes it onto the stack. "~" reads a single character from standard
input, and leaves it on the stack. "." pops a value off the stack and writes
it to standard out as an integer. "," pops a value off the stack and writes it
to standard out as a character.
Of course, we can't have a stack-based language without some stack
operators: ":" makes a duplicate of the top value on the stack; "$" discards
the top value on the stack; "" swaps the top two values of the stack.
So far, nothing has looked particularly pathological - in fact, nothing
even looks particularly strange, other that the fact that it's pretty
illegible because of the single-character operators. But now, we get to
control flow, and that is where the insanity/brilliance of Befunge
In Befunge, there's a read-head that moves over the program. Each step, it
executes the instruction under the head. But instead of just moving left or
right, it can move left, right, up, or down. ">" is an instruction that tells
the head to start moving to the right; "<" tells the head to start moving
left; "^" means start moving up, and "v" means to start moving down. So, for
Is a program that runs an infinite loop: the head will just cycle over
those four characters. An even more interesting infinite loop (taken from the
befunge documentation) is:
>v>v >^ ^ <
Conditionals work by picking the direction that the head will move: "_" pops the stack, and if the value is zero, then it makes the head move right (">"); if it's non-zero, it makes the head move left ("<"). Similarly, "|" pops a value, and makes the head move up if the value was non-zero, or down if it was zero. To make things confusing, "#" means "skip the next instruction." (Actually, it's important for when a vertical and horizontal control flow cross.) And finally,
"@" is the exit command; it makes the head stop, and the program halt.
There's also a little hack for strings. A double-quote character (")
starts a string; when one is encountered, the head keeps moving in the same
direction, but instead of executing the characters as instuctions, it just
pushes the character values onto the stack. When the second quote is found, it
goes back to executing instructions.
Finally, just in case the basic two dimensional flow control isn't
pathological enough, there are two instructions for modifying cells on the
playfield! "g" pops an X and Y value off the stack, and pushes the character
at (X,Y) onto the stack; "p" pops an X, Y, and a character off the stack, and
writes the character onto location (X,Y) of the playfield. (The coordinates
are relative to the cell where the program started.)
So, let's look at a couple of befunge programs. As usual, we start with
the good old "hello world".
v >v"Hello world!"0< ,: ^_25*,@
We start at the top left, head moving right. It moves until it hits the "v"
character, which makes it go down; then "<" makes it go left. 0 pushes a zero
onto the stack. Then we've got a quote - so it starts pushing characters onto
the stack until the next quote. When we get to the next quote, if we wrote the
stack so that the top comes first, it would look like : ('H' 'e' 'l' 'l' 'o' '
' 'w' 'o' 'r' 'l' 'd' '!' 0 ).
Then we hit a "v" which makes the head go down. This is the beginning of a
loop; the leftmost two characters of rows 2, 3, and 4 are a while loop! The
head goes down to ":" which duplicates the top of the stack; then it hits "_",
which turns left if the value on top of the stack is not zero; then the head
turns up, outputs a character, turns right, and we're back at the start of the
loop. So the loop will output each character until we get the the "0" we
pushed before the string; then at the "_", we turn right. 2 and 5 are pushed
on the stack and multiplied, leaving a 10, which is the linefeed character
(basically "n" for you C weenies). It outputs the linefeed, and then exits.
How about a truly pathological example? Here's a self-reproducing program
in 12 bytes.
Stepping through that:
- Dup the value on top of the stack.
That'll produce a "0" if the stack is empty. (So first pass, stack=)
- "0" Push a zero on the stack. (First pass, stack=[0,0])
- "g": Fetch the value at (x,y) on the stack; that's (0,0) initially.
(First pass, stack = [':'])
- ",": Output it. So we printed the character at (0,0) (First pass, stack
- ":" dup the stack top again. (First pass, stack = )
- "93+". Get the number 12 onto the stack. (First pass, stack = [12,0])
- Compare what was on top of the stack to twelve. Leave a 0 there if it
was, or a 1 if it wasn't. (First pass, stack = ).
- "#" skip over the next character.
- "_" go right if the top of stack is zero; left if it's one. So if the
value copied by the second ":" was greater than 12, then go left (in which
case you hit "@" and halt); otherwise, keep going right.
- "1+": add one to the top of the stack (First pass, stack = ()). Then
keep going right until you hit the right edge, and the you jump back to the
left edge, so you're at the first ":" again.
- Now the whole thing repeats, except that there's a one on the stack. So
the "g" will fetch (1,0); the "12" will be compared to 1, and the "1+" on the
end will leave "2" on the stack. So now it fetches and outputs (2,0). And so
on, until it reaches the "(_)" after outputting (12,0), when it halts.
One more, which I'll let you figure out for yourself. Here's a program
that prompts you for a number, and computes its factorial:
v >v"Please enter a number (1-16) : "0< ,: >$*99g1-:99p#v_.25*,@ ^_&:1-99p>:1-:!|10 < ^ <
So is Befunge Turing complete? The answer is less obvious than many other programming
languages. In fact, it was a point of debate in the esolang community for a while.
The ultimate answer is that if the stack can contain arbitrarily large
numbers, then it's Turing complete. Otherwise, you're stuck in pushdown-automata
land. You've got unbounded storage on the stack, but you can only access
the top two elements - so you've got no random-access storage. If you've
got arbitrarily large numbers, you can put together a scheme for random-access
storage using data encoded into those big numbers. Want proof?
<v+**44_v v88+*<v:!-".":!-",":!-">":!-"<":!-"-":!-"+":!-"!":~<+8 0> :7`!^ >4**-v*>"["-!"]"-!#v_ #v_ #v_ #v_ #v_ #v_ #v_ #v_!#^_ v $< #* >1+$>1+$>1+$>1+$>1+$>1+$>1+$> ^ >0:44*>/44*%:7`#v_:00p8+44**>+:1`#v_$:888**/888>**%:884**`!#v_ v ^ *44:_@#:< #^88$< ^ **44< ^ 888/**888:< 0 $00g1-:0`#v_1-:0`#v_1-:0`#^_1-:0`#v_1-:0`#v_1-:0`#v_1-0`#v_v > v_v >$1+v -1$< $> v+**488$< v~\(< v,:$< >:884**`# # < >884**% >> > > >888* * *+^ +***888 $:0:44*/>44*%:7`#v_44*>*+:1`v ^ p80+**562<0_^#!: < : v ^ ^ /*44:< ^*44-8_$:44*>/44*%:7`v ^2_^# < $ `#v_8>+1`#v_v ^*44: _\)$1:7>6+`!#v_77+`#v_1-:0 -1< ^8**44< + ^ 7:$<1 <+1< ^ 0p80*84< $:0:44*>/44*%:7`#v_44*>*+:1`v > ^ *44: < ^*44-8_$:44*>/44*%:7`v #v_1-:#v_$/8>44*/:0`#v_$>` #v_v ^*44: _44**+:1:4>4*%:5`!#v_6`! < -1< ^ +8**44< ^+8**44< + ^ 4:/*44$< 0+1 ^ p80*84<
That mess is a Brainfuck interpreter written in Befunge! And since
Brainfuck is Turing complete, so is Befunge with bignums. Unfortunately, the
site from which I downloaded that was a geocities page that I can't seem to access
anymore. (If you know who the author is, please let me know so that I can provide