The Most Pathological Machine I've Ever Seen: Tag

Feb 23 2007 Published by under pathological programming

What we have here is a truly warped language.

Back in the very early days of what eventually became computer science, many of the people working in the field invented all sorts of automatons/computing formalisms. The one that I've always found the most confounding is the Tag machine invented by Emil Post.

The tag machine is simple to the point of triviality. The machine is a queue of characters (with one character designated as "Halt"), and a set of rules. Each rule has a different character that selects the rule, and a string of characters. Each step, the machine looks at the first character of the queue, and selects the rule that that is associated with that character. If the character is "Halt", the machine just stops, and whatever is left on the queue is the result of the computation. Otherwise, it appends the selected rule's string of characters to the end of the queue, and then removed a fixed number of characters from the front of the queue. The machines are called "n-Tag" machines where "n" is number of character dropped each step.

That's it. Look at the front of the queue, use it to pick a set of characters to append, and then remove and discard the first N characters from the queue..

For any N≥2, a Post N-tag machine is Turing equivalent.

Like I said above - I've always found the Post Tag machine to be thoroughly confounding. I can grasp why it's Turing equivalent, but for the life of me, I've never been able to figure out how to actually implement anything interesting on one. So, I figured, there are tons of esoteric/pathological language fans out there who love nothing more than the challenge of writing interesting programs for bizarre languages; I'll write a post-tag language, and throw it to the wolves, and see what happens!

So, todays pathological language is my own nasty little monstrosity, called (for obvious reasons), "Tag". It's implemented in (what else?) Haskell. The source code for it is here. To compile, just grab the Glasgow haskell compiler, GHC, and run "ghc -package parsec -o runtag tag.hs". Then to run the interpreter, put your code into a file, and run "runtag programfile initialqueue".

The Tag language is a variable post-tag machine enhanced with output. (I originally also included input, but I couldn't figure out any way to do anything useful with and input other than the initial queue contents.) So each rule specifies what character will trigger it, how many characters to drop off of the queue, and what string to append to the queue. Also, it can output the characters that it removes from the queue (omitting the character that selected the rule).

The syntax is very simple: it's a list of rules. Each rule has the format:

name ":" triggerchar numbertodrop "("characters to append")"

For example, "addThreeAsOnB:B2(AAA)" is a rule named "addThreAsOnB". It will run
when the first character on the queue is "B", and it will drop the first two characters from the queue, and append "AAA".

For output, if the number to drop is followed by "!", then it will print out
the dropped characters except for the one that triggered the command.

As usual, we'll start by looking at the hello world program. (Which is fortunate, since it's pretty much the only remotely useful program that I've figured out how to write so far!).

fillQueue:a2(pHpeplplpop pWpoprplpd#)
printChar:p2!(#)

The program expects the input queue to contain two characters, the first of which is "a". It looks at the first character, which is an "a"; finds the "a" rule (which is named "fillQueue"); that rule drops two characters, and then appends that ugly string. The string is the characters of "Hello World" each preceded by "p". The "p" triggers the rule named "printChar" which drops two characters; since it's got a "!" after it, it prints the second of the two characters being removed from the queue. So it prints the character after the "p"; and it appends another halt to the queue. It does that all the way until it reaches a "#", which is the halt character, and then stops. So it's hello world.

Now, if you're like me, at this point you're saying something like "Well, OK. I can see how hello world works. But how do you expect me to believe that this is really Turing complete? Where's the control flow?".

There are two tricks for control flow: the easy one, and the hard one. We'll start with the easy one: just use the character on the head of the string to select one of two possible strings to append; the appended string is both data and code - so you're selecting a flow-case by selecting a string to append.

For example - let's print out "A" if the first character on the queue is a "1", and "B" if it's a "0".

ifZero:02(pB)
ifOne:1b(pA)

The harder one shows why the machine needs to be a 2-tag or larger. You can interleave two instruction strings. So what you do is look at your condition (the value of the leading character of the queue), and append either an even or an odd number of characters, followed by the interleaved program string. If you inserted an even number of characters, then the even-position characters will be the ones that trigger commands, and the odd-position characters will be discarded; otherwise, the odd-position characters will trigger commands, and the even-position characters will be discarded.

ifA:A2(sss)
ifB:B2(ss)
ifS:s2(cd)
ifC:c2(pA#)
ifD:d2(pB#)
print:p2!( )

What this does is: if the first character is an "A", it inserts 3 "s"s; if it's "B", it inserts two "s"s. On an "s", it appends "cd". "c" causes it to print "A" and then halt; "d" causes it to print "B" and halt.

So suppose we started with "AA" on the queue. It would pick the rule "ifA:", drop the two As, and the queue would be "sss". Then it would pick rule "s", which drops two, and appends "cd" - so the queue would be "scd". It would run "s" again, which would append another "cd", leaving "dscd". "d" appends pB#, so the queue becomes "cdpB#"; c appends "pA#", leaving "pB#pA#". And then "B" gets printed, and the program halts.

Let's look at the other cases: "BB" on the queue. Rule B runs, leaving "ss" on the queue. "s" appends "cd", leaving "cd" on the queue. "c" appends "pA#" onto the queue, leaving the queue as "pA#". So it prints "A", and halts.

So, we've got the ability to do conditionals, and we've got an arbitrarily large storage space in the queue. The only thing left that we need to be Turing equivalent is some kind of looping or recursion. And that's easy: a rule triggered by "a" that appends a string with the character "a" in a position where it could trigger a rule is recursion. So this is Turing complete. But damned if I can figure out how to do much of anything actually useful with it.

No responses yet

  • Linus S. says:

    Looking forward to comments on this one, cute machine.

  • Oliver says:

    I think there's a slight problem with your "harder" control flow example, unless I'm understanding things wrong (which seems likely). When the queue is "scd", it runs "s", which removes 2 characters and appends "cd", which should produce "dcd", not "dscd". From "dcd", you get "dpB#", then "B#pB#", then "pB#ss", so it does actually print "B", but not quite the way you said.

  • jag says:

    Shouldn't ifOne read ifOne:12(pA)?

  • achgoon says:

    On Firefox 2.0, the greater or equal sign isn't being rendered. Cheers!

  • Michael Chermside says:

    Minor nit: you write "addThreeAsOnB:B2(AAA)" is a rule named "addThreAsOnB" but the latter is missing an "e" on the word Three.
    As for the language... I love it! I'm swamped this week, but I have to flag this for later so I can come back and try to make this language work.

Leave a Reply