Busy Beaver TM simulation

My Computing Theory lecturer, James Harland gave a lecture on Busy Beaver Turing Machines. I have been running some experiments to try to simulate these machines.

A BB is a Turing Machine which prints out a lot of 1's (not necessarily consecutively) and then halts. Most research seems to have been carried out by Marxen.

The difficulty in simulating these machines is that they run for a long time and use up a lot of tape. For example, the current record-holding 6-state beaver prints over 10865 1's and shifts state over 3 * 101730 times.

The first problem is representing the tape in memory -- no-one has 10856 gigabytes of memory. The second problem is in running time -- no-one wants to wait 10847 millenia (* assuming optimal performance on a 100-GHz machine).

Current progress

So far my simulator can run Marxen's 1997 6-state candidate:

States 6
1's 95,524,079
Shifts 8,690,333,381,690,951

It runs in less than a second on my 900MHz iBook.

You can download the current C source here. You will also require the Gnu MP number library and GCC 3.x (it uses std=c99). Note that this source is currently buggy, and shift counting is disabled except in the naive machine -- see below.

tm-0.4.tar.gz (10 kB)

After extracting and 'make'ing, to run the 95-million machine:

./tm --proof -k 2 --dump-tape -c 1 bb6_marxen_2.tm

Log

July 17
Been a while since I posted updates here, so here's the short of it. The proof machine is working correctly -- that is, it will look for patterns and generate inductive proofs with an arbitrary number of variables. The last week has seen me fix various subtle bugs in the proof machine, and also add one macrotoken of tail context to the macro machine. This works for the most part, but causes the bb6-marxen-3 machine to halt early and unpredictably. All other machines work, however, so it is proving difficult to locate this bug. Currently there is no way to disable tail context either.

Some memory bugs in the pattern detection and the macromachine were found with valgrind and are fixed now.

July 9
Finally got the proofs working (after a holiday!). The performance of bb6-marxen-2 (>95 million one's) is greatly improved, now running in under a second, compared with 10 minutes for the macro machine. However, the next candidate, bb6-marxen-3 still won't run. This seems to be because the base macro machine is not getting into a recurring state. This is not due to an incorrect k-value. I think it can be resolved by increasing the amount of tape the naive machine is allowed to work on when generating macro-arcs. This is probably a generalisation of Marxen's "tail context". I will have a go at it tomorrow.
June 28
Created a container (a b-tree, actually) for storing past configurations. A new machine (the "proof" TM) looks for similar configurations in the history and tries to generate proofs for moving from one parametised configuration to another. This is working great, and tomorrow I will get it to apply the proofs to the machine which should make a huge impact on performance.
June 27
Again, no major changes to performace yet; refactored a great deal of code to separate out the tape and macrotoken generation from the TM emulator. The naive TM also runs on the macro tape now because this is cleaner, even if less efficient.
June 26
Changed the tape representation so that index positions within repeated blocks cannot be stored. This means when moving through or changing tokens in a block they must be repeatedly split and rejoined. This is simpler in some ways and trickier in others, and has no real impact on performace. It will, however, make it possible to track configurations easily, so I can make a start on automatic proof generation.

I also fixed the tape dump, so that after the machine halts the exact tape configuration, including the position of the head, is shown correctly.

Any ideas/comments to s2107062@student.rmit.edu.au