Two-Dimensional Pathology: Befunge

Sep 08 2009 Published by under pathological programming

I'm currently away on a family vacation, and as soon as vacation is
over, I'm off on a business trip for a week. And along the way, I've got some
deadlines for my book. So to fill in, I'm recycling some old posts. I decided
that it's been entirely too long since there was any pathological programming
'round these parts, so I'm going to repost some of my favorites.

Today, we're going to take a look at a brilliant language called Befunge.
Befunge is the work of an evil genius named Chris Pressey.

Normal programming languages are based on a basically one-dimensional
syntax; the program is a string, a sequence of characters, and it's processed
by reading that string in a straight-ahead fashion. But that's not Befunge!
It's got a two-dimensional syntax. Befunge is something like a two-dimensional
turing machine: the program is written on a two dimensionaltorus called the playfield. Each
instruction in Befunge is a single character, and where it's located on the
torus is crucial - control is going to move either up/down or left/right
on the torus. All of the control flow is expressed by just changing the direction that
the head moves over the torus.

(In case you're not familiar with a torus, it's what you get
if you take a very flexible sheet of paper, and roll it so that you connect
the top edge to the bottom, and then roll that tube so that you connect the
left edge to the right. You get a donut shape where moving up from what used
to be the top of the page puts you on the bottom of the page; moving left from
the left edge of the page puts you on the right.)

The basics of computation in Befunge are pretty straightforward. It's a
stack based language. Operations take their parameters from the stack, and
leave their results on the stack. Nothing too complicated there. There are
arithmetic operators, IO operators, control flow operators, all operating on
the values on the stack.

The arithmetic operators are the usual kinds of things: There are
operators for addition (+), subtraction (-), division (/), multiplication (*),
modulo (%), and logical negation (!). Digit characters are treated as
operators that push the numeric value of the digit onto the stack. (So "99"
will create a stack with two nines.) Comparisons are done using "`", which
pops the top two values from the stack and compares them. So if the top of the
stack was "x", and the value beneath it was "y", then "`" would leave a "0" on
the stack if x≤y, and 1 if x > y.

For IO, there are four operators. "&" reads an integer from standard input
and pushes it onto the stack. "~" reads a single character from standard
input, and leaves it on the stack. "." pops a value off the stack and writes
it to standard out as an integer. "," pops a value off the stack and writes it
to standard out as a character.

Of course, we can't have a stack-based language without some stack
operators: ":" makes a duplicate of the top value on the stack; "$" discards
the top value on the stack; "" swaps the top two values of the stack.

So far, nothing has looked particularly pathological - in fact, nothing
even looks particularly strange, other that the fact that it's pretty
illegible because of the single-character operators. But now, we get to
control flow, and that is where the insanity/brilliance of Befunge
reveals itself.

In Befunge, there's a read-head that moves over the program. Each step, it
executes the instruction under the head. But instead of just moving left or
right, it can move left, right, up, or down. ">" is an instruction that tells
the head to start moving to the right; "<" tells the head to start moving
left; "^" means start moving up, and "v" means to start moving down. So, for
example:

>v
^<

Is a program that runs an infinite loop: the head will just cycle over
those four characters. An even more interesting infinite loop (taken from the
befunge documentation) is:

>v>v
>^
^  <

Conditionals work by picking the direction that the head will move: "_" pops the stack, and if the value is zero, then it makes the head move right (">"); if it's non-zero, it makes the head move left ("<"). Similarly, "|" pops a value, and makes the head move up if the value was non-zero, or down if it was zero. To make things confusing, "#" means "skip the next instruction." (Actually, it's important for when a vertical and horizontal control flow cross.) And finally,
"@" is the exit command; it makes the head stop, and the program halt.

There's also a little hack for strings. A double-quote character (")
starts a string; when one is encountered, the head keeps moving in the same
direction, but instead of executing the characters as instuctions, it just
pushes the character values onto the stack. When the second quote is found, it
goes back to executing instructions.

Finally, just in case the basic two dimensional flow control isn't
pathological enough, there are two instructions for modifying cells on the
playfield! "g" pops an X and Y value off the stack, and pushes the character
at (X,Y) onto the stack; "p" pops an X, Y, and a character off the stack, and
writes the character onto location (X,Y) of the playfield. (The coordinates
are relative to the cell where the program started.)

So, let's look at a couple of befunge programs. As usual, we start with
the good old "hello world".

v
>v"Hello world!"0<
,:
^_25*,@

We start at the top left, head moving right. It moves until it hits the "v"
character, which makes it go down; then "<" makes it go left. 0 pushes a zero
onto the stack. Then we've got a quote - so it starts pushing characters onto
the stack until the next quote. When we get to the next quote, if we wrote the
stack so that the top comes first, it would look like : ('H' 'e' 'l' 'l' 'o' '
' 'w' 'o' 'r' 'l' 'd' '!' 0 ).

Then we hit a "v" which makes the head go down. This is the beginning of a
loop; the leftmost two characters of rows 2, 3, and 4 are a while loop! The
head goes down to ":" which duplicates the top of the stack; then it hits "_",
which turns left if the value on top of the stack is not zero; then the head
turns up, outputs a character, turns right, and we're back at the start of the
loop. So the loop will output each character until we get the the "0" we
pushed before the string; then at the "_", we turn right. 2 and 5 are pushed
on the stack and multiplied, leaving a 10, which is the linefeed character
(basically "n" for you C weenies). It outputs the linefeed, and then exits.

How about a truly pathological example? Here's a self-reproducing program
in 12 bytes.

 :0g,:93+`#@_1+ 

Stepping through that:

  1. Dup the value on top of the stack.
    That'll produce a "0" if the stack is empty. (So first pass, stack=[0])
  2. "0" Push a zero on the stack. (First pass, stack=[0,0])
  3. "g": Fetch the value at (x,y) on the stack; that's (0,0) initially.
    (First pass, stack = [':'])
  4. ",": Output it. So we printed the character at (0,0) (First pass, stack
    = [])
  5. ":" dup the stack top again. (First pass, stack = [0])
  6. "93+". Get the number 12 onto the stack. (First pass, stack = [12,0])
  7. Compare what was on top of the stack to twelve. Leave a 0 there if it
    was, or a 1 if it wasn't. (First pass, stack = [0]).
  8. "#" skip over the next character.
  9. "_" go right if the top of stack is zero; left if it's one. So if the
    value copied by the second ":" was greater than 12, then go left (in which
    case you hit "@" and halt); otherwise, keep going right.
  10. "1+": add one to the top of the stack (First pass, stack = ([1])). Then
    keep going right until you hit the right edge, and the you jump back to the
    left edge, so you're at the first ":" again.
  11. Now the whole thing repeats, except that there's a one on the stack. So
    the "g" will fetch (1,0); the "12" will be compared to 1, and the "1+" on the
    end will leave "2" on the stack. So now it fetches and outputs (2,0). And so
    on, until it reaches the "(_)" after outputting (12,0), when it halts.

One more, which I'll let you figure out for yourself. Here's a program
that prompts you for a number, and computes its factorial:

v
>v"Please enter a number (1-16) : "0<
,:             >$*99g1-:99p#v_.25*,@
^_&:1-99p>:1-:!|10          <
^     <

So is Befunge Turing complete? The answer is less obvious than many other programming
languages. In fact, it was a point of debate in the esolang community for a while.
The ultimate answer is that if the stack can contain arbitrarily large
numbers, then it's Turing complete. Otherwise, you're stuck in pushdown-automata
land. You've got unbounded storage on the stack, but you can only access
the top two elements - so you've got no random-access storage. If you've
got arbitrarily large numbers, you can put together a scheme for random-access
storage using data encoded into those big numbers. Want proof?

<v+**44_v        v88+*<v:!-".":!-",":!-">":!-"<":!-"-":!-"+":!-"!":~<+8
0> :7`!^         >4**-v*>"["-!"]"-!#v_  #v_  #v_  #v_  #v_  #v_  #v_  #v_!#^_
v     $<             #*             >1+$>1+$>1+$>1+$>1+$>1+$>1+$>   ^
>0:44*>/44*%:7`#v_:00p8+44**>+:1`#v_$:888**/888>**%:884**`!#v_        v
      ^ *44:_@#:<    #^88$< ^ **44<                ^ 888/**888:<         0
$00g1-:0`#v_1-:0`#v_1-:0`#^_1-:0`#v_1-:0`#v_1-:0`#v_1-0`#v_v            >
v_v        >$1+v -1$< $>      v+**488$<    v~\(<    v,:$<            >:884**`#
# <            >884**%       >>            >        >     >888* * *+^ +***888
$:0:44*/>44*%:7`#v_44*>*+:1`v        ^  p80+**562<0_^#!:  < :            v
^           ^ /*44:<     ^*44-8_$:44*>/44*%:7`v ^2_^#      <            $
`#v_8>+1`#v_v                            ^*44:    _\)$1:7>6+`!#v_77+`#v_1-:0
-1<  ^8**44< +                                              ^ 7:$<1  <+1<
^     0p80*84<
$:0:44*>/44*%:7`#v_44*>*+:1`v                                            >
^ *44:    <     ^*44-8_$:44*>/44*%:7`v
#v_1-:#v_$/8>44*/:0`#v_$>`  #v_v       ^*44:    _44**+:1:4>4*%:5`!#v_6`!
<   -1<    ^  +8**44<  ^+8**44< +                             ^ 4:/*44$< 0+1
^                         p80*84<

That mess is a Brainfuck interpreter written in Befunge! And since
Brainfuck is Turing complete, so is Befunge with bignums. Unfortunately, the
site from which I downloaded that was a geocities page that I can't seem to access
anymore. (If you know who the author is, please let me know so that I can provide
attribution.)

10 responses so far

  • Scott says:

    You forgot the purpose of the language. I will quote wikipedia: "an attempt to devise a language which is as hard to compile as possible".

  • tilk says:

    Several years ago I've written a just-in-time compiler for a subset of Befunge without the playfield modification operations. I didn't have any good idea for an efficient implementation of them... But well, I was in high school then. Maybe I should try again 🙂
    http://www.tilk.eu/files/llfunge.tar.gz

  • mtve says:

    This Begunge-in-Befunge is new for me. I've seen different one from Wim Rijnder, and I have mine - http://frox25.no-ip.org/~mtve/code/eso/bef/bef_bef/
    Also "torus" was not quite well specified in original description, so many interpreters have bugs when PC crosses the border. Usually it can be avoided and robust Befunge program should not use this feature.

  • Freak says:

    If the following changes are made, does Befunge remain Turing complete?
    1) Field modification opcodes are changed to be relative to current location of pointer (or new opcodes are added).
    2) Stack elements are limited precision.
    3) The field is extended whenever a field modification opcode writes outside the current board. The field can be expanded without limit.

  • mtve says:

    Oh, i misread, it's Brainfuck-in-Befunge. The source has typos anyway, it looks like it should be "_" at the first position of line 9, not a space.

  • Avi Steiner says:

    7. Compare what was on top of the stack to twelve. Leave a 0 there if it was, or a 1 if it wasn't. (First pass, stack = [0]).

    Shouldn't the stack be [1] after the first pass? The definition you gave of the "`" operation is

    So if the top of the stack was "x", and the value beneath it was "y", then "`" would leave a "0" on the stack if x≤y, and 1 if x > y.

    Substituting 12 for x and 0 for y, viz step 6, gives 1, not 0. Am I missing something?

  • Matt says:

    To #4:
    On a case by case basis:
    1) With finite sized numbers on a finite sized grid there is no way to build a Turing complete language. You must create a language that can hold infinite information as a necessary condition.
    2) I believe that as long as you can hold every integer value (or at least a mapping to an integer value) it should remain Turing complete. If the language is not capable of that then you cannot hold infinitely big numbers anyway, so the base conditions are not met.
    3) This is an interesting case... I think this should be Turing complete since you could grow your torus to be infinitely large on either of it's axis. I think a proof of concept program is needed here...

  • Freak says:

    It wasn't meant to be case by case; I meant make all three changes at once.

  • Brian X says:

    I worked with Chris for a while on my own (rather less) esoteric language project, var'aq (which was kind of a slapped-together cross between PostScript and Lisp with Klingon keywords); specifically he wrote the interpreter logic while I did most of the semantics. He's a good programmer, although he kind of fell off the face of the earth for a while. (I think he may have wanted his name taken off var'aq, but he never explicitly asked me to, so he remains credited as a coauthor.)

  • LightningRose says:

    If anyone is not familiar with a torus, it looks like a fracking donut.

Leave a Reply