I recently read the book The Annotated Turing by Charles Petzold. It was really enlightening. Though all computer science graduates study concepts like Turing machines, Turing Completeness etc. as part of their courses in colleges, questions such as how and why did Turing come up with these concepts is not taught usually (spoiler alert: Turing did not come up with either terminology that I just mentioned). Charles does a good job of making Turing’s classical 1937 paper more accessible without dumbing it down too much. He also gives a lot of historical context as well as background knowledge that Turing assumes his readers to have.
The book is aptly sub-titled A Guided Tour through Alan Turing’s Historic Paper on Computability and the Turing Machine. In it, Turing’s paper is shown as is, but is supplemented with the author’s own annotations in between. In this post, I will try to present a very concise overview of the key takeaways from the book.
Das Entscheidungs problem
Turing’s actual 1937 paper was titled On Computable Numbers, With An Application To The Entscheidungsproblem. To understand the paper, we need to understand what that problem is. Entshcheidung means decision or decidability. In 1900, during his address at the Second International Congress of Mathematicians, one of the prominent mathematicians at the time, David Hilbert, presented 23 questions as challenges for mathematicians to solve in the next century. The 10th problem in the list was this:
10. Determination of the Solvability of a Diophantine Equation
Diophantus was a Greek mathematician who is estimated to have lived around 250 CE and wrote a book called Arithmetica. Diophantine equation has come to mean an algebraic equation in which only whole number solutions are allowed.
Given such a problem, Hilbert was interested in a process with finite number of steps to determine whether the problem has a solution or not i.e. is the problem solvable? This problem came to be known as the Entscheidungsproblem or decision problem. Note, Hilbert is not interested in a general method to solve Diophantine equations here.
Turing proves that no such general decision process exists.
Gödel’s Incompleteness Theorem
As Turing himself remarks in the paper, his results are superficially similar to the famous Incompleteness Theorem by Kurt Gödel from 1931. However, he clarifies the difference between both results.
Gödel essentially shows that any consistent formal system is necessarily incomplete. We can find propositions P such that neither P nor ~P (negation of P) is provable within that system.
Till that time, mathematicians were attempting to create flawless systems which were both complete and consistent. Gödel’s results proved that such attempts were in vain.
Even with these results, a general decision process as described by Hilbert could conceivably exist. Given an unprovable proposition, it would identify it as unprovable. It would similarly identify provable propositions also correctly. Such a decision process is what Turing shows as cannot exist.
Turing’s approach leverages Predicate Calculus extensively. I will not try to explain the details of Turing’s proof here (for that, you need to read the book), but I will try to summarize the key ideas that he builds up and uses in his proof.
Computing Machine (M): is a machine which
- is supplied with a tape running through it, divided into sections (squares) capable of bearing a symbol each.
- whose motion at each stage is completely defined by its configuration.
- and can perform one of the following operations at any moment:
- read one symbol
- print one of two kinds of symbols onto a square:
- figures (either 0 or 1)
- other symbols
- change the square being scanned (by shifting left or right)
Here configuration is what tells the machine what action to perform when it encounters a symbol. Henceforth, I’ll refer to Computing Machines as just machines.
Circular and Circle free machines: If a machine never writes more than a finite number of symbols of the first kind (0 or 1), it is called circular. Otherwise, it is circle-free. This notion becomes one of the most crucial aspects of Turing’s proof later.
Notice that the terminology here is a bit counter-intuitive. In Turing’s formulation, a machine which does not stops is considered good (circle-free) whereas one which stops after finite steps is bad (circular).
Computable Sequences: A sequence is said to be computable if it can be computed by a circle-free machine. Turing shows that his machines are capable of computing all sequences that we generally consider as computable (such as pi).
Description Number (D.N): Turing shows how all machines can be represented using a standard form such as:
q1 S0 S1 R q2; q2 S0 S0 R q3; q3 S0 S2 R q4; q4 S0 S0 R q1;
Here the first section (separated by a ; ) => q1 S0 S1 R q2;
can be interpreted as => From q1 state, if the machine reads symbol S0, print the symbol S1, shift right and transition to state q2.
He modifies (serializes) the form to a notation that can be more easily parsed by another machine. This notation is called the Standard Description (S.D) of the machine. S.D for the above machine for eg:
DADDCRDAA; DAADDRDAAA; DAAADDCCRDAAAA; DAAAADDRDA
He further says each such S.D can be represented by a number (by replacing A with 1, C with 2 etc.). This is the Description Number (D.N) for that machine. Note that S.D and D.N can be used inter changeably as you can always convert from to the other.
Universal Machine (U): This is one of the most brilliant insights of Alan Turing. Once he shows that any machine can be represented as a number, he creates a machine which can take the Description Number of any other machine M as input and prints the same output as M. This can be considered as the first ‘general-purpose’ computer ever conceived.
In the next section, I will describe how Turing uses this universal machine to prove that the decision problem cannot be solved. However, it is worth appreciating that we remember Turing in the present more for his conception of Computing Machines (later known as Turing Machines), the Universal Machine and its limitations rather than for proving the Decision Problem. This is one of the cases in history where the journey contributed more than the destination (end goal).
The high level approach that Turing takes to solve the decision problem can be outlined as follows:
Each of these steps might seem like a logical jump from the previous one, but they do follow from the previous ones consistently. However, there is a lot of tedious details to the representation of these statements in predicate calculus and hence in the proofs.
I will try to present an brief explanation of the first of these four steps. However, by definition, this will not be rigorous or complete.
No general process exists to determine whether a machine is circle free
Turing proves this by reductio ad absurdum or proof by contradiction. First, he assumes that there exists a machine D which given the S.D of any machine M, tests this S.D and if M is circular, outputs “u” (unsatisfactory) and if it is circle-free, outputs “s” (satisfactory) in a finite number of steps.
If M is circle-free, we can give the S.D of M to the Universal Machine (U) to compute the output of M. Turing combines D and U to create a machine H which works as follows:
Here R(N) keeps track of the number of valid circle-free machines encountered so far.
If you follow along the flow-chart, you’ll see that the Nth digit of H‘s output is the Nth digit in the output of a Nth valid circle-free machine.
For e.g., if you count up from 1:
1 2 3 4 ... 313,325,317 -> M1 - Prints 0s to the right ... 3,133,225,317 -> M2 - Prints 1s to the right ...
The first valid D.N that you hit upon is 313,325,317. Let’s call this machine M1. Its output is an unending sequence (as it is circle-free) of 0s. So the first digit in the output of H will be 0 (the first digit of M1).
Similarly, the second valid D.N that you will find will be 3,133,225,317, for M2. Its output will be an unending sequence of 1s. Hence the second digit in the output of H will be 1 (the second digit of M2).
Here, Turing points out that the machine H has to be circle-free by our construction. It is composed on D and U. D is circle-free by our assumption. U only computes finite digits of outputs for circle-free machines. Hence H, the combination of both has to be circle-free.
Now the problem arises when H reaches its own D.N, let’s call it K. Now, how will D classify K? It cannot be circular (or unsatisfactory, u) as pointed out above. But if it is circle-free, U has to print the R(K)th digit of H.
How many digits would have H printed by the time it reaches K? R(K – 1).
Now, it order to compute its R(K)th (i.e. R(K – 1) + 1) digit, it needs to evaluate itself using U, compute R(K) digits and then print its R(K)th digit. When U evaluates H and reaches the R(K)th digit, it needs to evaluate H again. This leads to an infinite recursion!
By contradiction, Turing proves that there cannot be a general process by which we can determine whether a given S.D is that of a circle-free machine.
This is the crux of Turing’s proof. A variation of this problem is what has come to be known as the Halting Problem (defined by Martin Davis in 1958).
From here, Turing proves the 2nd step by reducing it to the first. Then he proves the 3rd step by reducing it to the 2nd and so on. His machines do not have a notion of algebra or even numbers to begin with. Like I said earlier, there is extensive use of first-order predicate calculus to formulate a number system and to actually get to the decision problem from here.
Maybe that can be a subject for a second post later.
As mentioned before, this terminology was coined later and was never used by Turing himself. In Computability Theory, a system is said to be Turing Complete if it can be used to simulate any Turing machine. It implies that the system is theoretically capable of expressing all tasks accomplishable by computers. Most general-purpose programming languages are Turing Complete.
Turing and Lambda Calculus
This book brought a lot of clarity to the relationship between Lambda Calculus and Turing. Before that, for some reason, I had always assumed Lambda Calculus to be more recent than Turing.
In reality, Alonzo Church, creator of Lambda Calculus beat Turing by a very close margin in solving the decision problem. Church’s paper A Note on the Enscheidungsproblem which also proved that the decision problem was unsolvable was received by The Journal of Symbolic Logic on April 15, 1936, six weeks before Turing’s submission to the London Mathematical Society on May 28, 1936!
Turing then had to publish a 3-page appendix to his paper (in August 1936) proving that his notion of computability and Church’s notion of effective calculability are both effectively equal. Alonzo Church had got his Ph.D in mathematics from Princeton and had taught there as a professor from 1929 to 1967. In the fall of 1936, Turing went to Princeton to get his Ph.D under Alonzo Church. Among his students, Church had other famous mathematicians such as Stephen Kleene and Martin Davis who expanded on both Turing’s and Church’s works later.
Lambda Calculus has had a direct influence on functional programming languages such as Lisp, Haskell, Scheme, Clojure etc.