Turing trees

Sunday, June 15^th, 2025

The clearest way into the Universe is through a forest wilderness.
—John Muir

I sometimes think about models of computation as an infinite network of infinite binary trees. I call the trees Turing trees and the network the Turing web. These objects offer an elegant graphical perspective on various concepts in the theory of computable functions. In this note, I explore this perspective. (I assume readers are already familiar with Turing machines.)

Contents:

From Turing machines to Turing trees
Turing sub-trees and universality
From Turing trees to the Turing web
Conclusion

Related work: Markus Müller studied a subgraph of the Turing web in “Stationary algorithmic probability”. This is what prompted me to think about the whole Turing web and the trees that comprise it. Turing trees are similar to (program) codes in (algorithmic) information theory, however, there the emphasis seems to be almost exclusively on prefix-free codes, whereas I allow labels on internal nodes of the tree. Conversations with Edmund Lau helped me clarify this perspective, especially on the Turing web.

Thanks to Daniel Murfet, Edmund Lau, Billy Snikkers, and Gemini 2.5 Pro for helping me to improve this note.

From Turing machines to Turing trees

A Turing machine is an abstract model of a computer. A Turing machine’s input/output behaviour can be encoded as a computable partial function on binary strings. A Turing tree is an infinite, partially-labelled, complete binary tree that encodes a computable partial function on binary strings. This section unpacks these concepts and describes the encoding.

Turing machines and computable partial functions. There are many slightly different definitions of Turing machines. In this note, for convenience, I’ll consider a Turing machine with input alphabet $\Sigma = \{{\tt0}, {\tt1}\}$ and a single, bidirectional working tape.

To run a Turing machine $M$ on input $x \in \Sigma^\ast$ , we initialise the input tape with $x$ starting from under the tape head and surrounded by blank symbols. Then, we let the machine run its course according to its transition dynamics. If the machine halts, the output $M(x)$ is the binary contents of the tape at that time (otherwise, $M(x)$ is undefined). Thus, we have $M : \Sigma^\ast \rightharpoonup \Sigma^\ast$ (where $\rightharpoonup$ denotes a partial function).

An example Turing machine and its corresponding partial function.

Infinite, partially-labelled, complete binary trees. Note that in a partially-labelled tree, each node has an optional finite binary string label $y \in \Sigma^\ast$ . Some nodes may not have any label (note that this is different from the node having the empty string $\varepsilon\in \Sigma^\ast$ as a label).

Moreover, in an infinite, complete binary tree, each node is uniquely identified by a finite binary string address $x \in \Sigma^\ast$ , describing the path for reaching the node from the root (descending left for every $\tt0$ and right for every $\tt1$ in $x$ ).

Left: Each node in a binary tree has a binary string address. Right: Each node in a partially-labelled tree may optionally have a binary string label. — Left: Each node in a binary tree has a binary string *address.* Right: Each node in a partially-labelled tree may optionally have a binary string *label.*

Turing trees. Given a Turing machine $M$ , we can construct a Turing tree as follows. For each address $x \in \Sigma^\ast$ , run $M$ on input $x$ . If $M$ halts, take the output $M(x)$ and use this as the label for the node with address $x$ . If $M$ fails to halt, ascribe no label to the node with address $x$ .¹ Here is a detailed example of a Turing machine, its partial function, and its Turing tree.

From left to right: a Turing machine, the corresponding partial function, and the corresponding Turing tree

Notes. In general, the mapping from Turing machines to partial functions discards mechanistic information about how the Turing machine implements the behaviour. This means that multiple Turing machines give rise to each Turing tree.

Moreover, not all partial functions on binary strings are implementable by a Turing machine.² Accordingly, there are some infinite, partially-labelled, complete binary trees that are not Turing trees. Call these trees uncomputable trees.

Turing sub-trees and universality

Every node of an infinite tree is the root of its own infinite sub-tree comprising all of the descendants of that node. For Turing trees, it turns out that each of these sub-trees is also a Turing tree. Conversely, for the Turing trees corresponding to universal Turing machines, all Turing trees arise as sub-trees.³

All sub-trees of Turing trees are Turing trees. Formally, given a Turing tree $T$ , each address $x \in \Sigma^\ast$ corresponds to an infinite, partially-labelled, complete binary sub-tree denoted $T_x$ . For $T_x$ , the node with address $x' \in \Sigma^\ast$ has the same label (or lack thereof) as the node with address $xx'$ in the original Turing tree $T$ .

Each node of a Turing tree is the root of its own infinite, partially-labelled, complete binary sub-tree. — Each node of a Turing tree is the root of its own infinite, partially-labelled, complete binary *sub-tree.*

I claim that all sub-trees of a Turing tree are themselves Turing trees. To prove this, we can show how a Turing machine encoded by the original Turing tree can be converted into a Turing machine for an arbitrary sub-tree. I sketch this construction as follows.

Consider a Turing tree $T$ . Let $M$ be its corresponding Turing machine. Consider an address $x \in \Sigma^\ast$ . We want to show that the sub-tree $T_x$ is a Turing tree. If $x = \varepsilon$ , $M$ is the corresponding Turing machine. Otherwise, construct a new Turing machine $M_x$ as follows.

The state machine of $M_x$ begins by immediately moving the tape head one space to the left, to the blank space before the start of its input.
The state machine then transitions through a hard-coded sequence of states that prepend $x$ to the input on the tape. First writing the final symbol of $x$ and then moving left, then the penultimate symbol, and so on until writing the first symbol of $x$ .
Finally, the machine positions the tape head over the first symbol of $x$ and transitions into a copy of the state machine of $M$ (starting in the initial state).

The effect of this construction is that, given an input $x' \in \Sigma^\ast$ , $M_x$ prepends $x$ to the input before running $M$ on $xx'$ . Therefore, the machine $M_x$ corresponds to the Turing tree $T_x$ .

Given a node with address , we can always construct a Turing machine that prepends the address to any input before running the original tree’s Turing machine. This new machine shows that the sub-tree is a Turing tree. — Given a node with address $x = x_1 x_2 \cdots x_{n-1} x_n \in \Sigma^\ast$ , we can always construct a Turing machine that prepends the address to any input before running the original tree’s Turing machine. This new machine shows that the sub-tree is a Turing tree.

All Turing trees are sub-trees of a universal Turing tree. Let me define a universal Turing machine as a Turing machine $U$ with the following property. For every Turing machine $M$ , there exists a string $y \in \Sigma^\ast$ such that for every string $x \in \Sigma^\ast$ , the outcome of running $M$ on the input $x$ is the same as the outcome of running $U$ on the input $yx$ (either both machines fail to halt, or they both halt with the same output).⁴

The connection with Turing sub-trees is immediate. The universality property ensures exactly that, for each $M$ , the Turing tree of $M$ must be a sub-tree of the Turing tree of $U$ . That is, the Turing tree of a universal Turing machine contains all Turing trees as sub-trees. Call these Turing trees universal Turing trees.

A universal Turing tree has sub-trees corresponding to each Turing tree.

From Turing trees to the Turing web

So, we have Turing trees, and we have descended them to find their sub-trees. To complete the picture, imagine overlaying all of the trees such that their isomorphic sub-trees align.⁵ This creates the Turing web. In this section, I derive this object and discuss some of its basic properties.

Defining the Turing web. More formally, construct the Turing web as follows. For each Turing tree, there is one node with two outgoing edges. Given a Turing tree, $T$ , consider the two depth-1 sub-trees, $T_{\tt0}$ and $T_{\tt1}$ . There is a directed edge from the node for $T$ to each of the nodes for $T_{\tt0}$ and $T_{\tt1}$ . The resulting infinite graph is the Turing web.

Note some basic properties of the Turing web. First, by construction, all nodes have out-degree two. Second, the in-degree of each node is infinite, because for each Turing tree we can construct an infinity of Turing trees that contain it as, say, the immediate left sub-tree, with a distinct root node and/or right sub-tree.

A small part of the Turing web, with nodes corresponding to a Turing tree and its first few sub-trees.

Rolling up trees and unrolling the web. An alternative construction of the Turing web is as follows.

Start with an arbitrary universal Turing tree (which contains all Turing trees as sub-trees).
Then, orient the edges away from the root, so that the tree becomes a directed graph.
Now, define an equivalence relation on the nodes of this infinite tree, where two nodes are equivalent if they are the roots of isomorphic sub-trees.
Finally, take the quotient of the graph by this equivalence relation.

The effect is to merge all nodes with the same sub-tree, creating an infinite graph with the same set of nodes as the Turing web. The edges are the same as the Turing web, since all isomorphic sub-trees start with edges to the same pair of left and right sub-sub-trees.

This alternative construction also clarifies the meaning of paths in the Turing web. Paths in the Turing web correspond to paths in some (any) Turing tree. To say one node is reachable from another is to say that the Turing machine corresponding to the source node can be programmed to behave like the Turing machine corresponding to the destination. From each node, ‘unrolling’ (or tree unfolding) the Turing web into a tree of paths recovers the Turing tree represented by that node in the Turing web.

Another small part of the Turing web, with the first few layers of several Turing trees emphasised.

This casts the Turing web as a canonical, topological representation of the forest of Turing trees, and, hence, the space of all computable partial functions, in which paths correspond to programming one behavioural equivalence class of Turing machines to simulate the input/output behaviour of another.

The universal strongly connected component. Markus Müller studied a similar graph in the paper “Stationary algorithmic probability”. In particular, he constructed the sub-graph of the Turing web restricted to nodes that correspond to universal Turing machines, interpreting this network as a Markov chain in an attempt to define a “natural” universal Turing machine by finding the stationary distribution (unfortunately, the stationary distribution does not exist).

Highlighting this subset of nodes corresponding to universal Turing machines offers a big-picture view of the structure of the Turing web. This set of nodes forms a strongly connected component of the entire network. Each of these nodes has a path to all of the others in the component, because each universal Turing tree must contain all other universal Turing trees as sub-trees, by definition. The set is maximal with this property, because, while the trees have paths to every outside Turing tree as well, any outside Turing tree cannot have a path into the component, or it would be a universal tree itself.

It follows that this strongly connected component is a source in the directed acyclic graph of connected components, and all other connected components are reachable from this one. This leads to the following abstract picture of the bulk shape of the Turing web.

A conceptual illustration of the high-level structure of the Turing web, with a source strongly-connected component containing nodes corresponding to all universal Turing trees, with paths to various other connected components comprising nodes corresponding to non-universal Turing trees.

Conclusion

The Turing web captures the network of emulation relationships between all possible Turing machine input/output behaviours. A particular Turing machine’s descendants in this network unroll to form its Turing tree. For simple, non-universal Turing machines, the Turing tree offers a narrow field of view covering the small slice of the space of computable functions that the machine can be programmed to implement. For universal Turing machines, the Turing tree offers a panoramic view over the entire universe of possible input/output behaviours, a vantage point on computation unique to that universal computer.

What is revealed by this perspective on computation? That remains to be seen. For now, I’m just carrying it around with me, waiting for a chance to try it on a problem and see if it sheds any light.

This construction is not itself computable, because computing the label ascribed to a given node of the Turing tree requires knowing whether the Turing machine will halt on a given input. However, this is not a barrier to the idea being theoretically useful. The relationship between an accepting/rejecting Turing machine and its formal language is analogous.↩︎
The set of such functions is uncountably infinite, whereas the set of Turing machines is countably infinite. A specific example of an uncomputable partial function is one for which the input encodes an arbitrary Turing machine and the output is $\tt1$ if and only if the Turing machine always halts.↩︎
We also have a different kind of converse result. Suppose we have an uncomputable tree. Then it’s not necessarily the case that all sub-trees of this tree are uncomputable, but such a tree must have at least one uncomputable sub-tree at each depth.

If at a given layer $d$ of a tree there all sub-trees encode computable functions, then a Turing machine could be built to scan the first $d$ symbols from the input before handing over control to a Turing machine for the appropriate sub-tree. This would make the whole function computable.↩︎
I don’t require that the set of strings $y$ used to ‘program’ the universal Turing machine into behaving like each $M$ is prefix-free. Doing so would allows $U$ to tell unambiguously when the program $y$ stops and the input for $M$ begins, making for a slightly tighter correspondence between algorithmic information theory and classical information theory. A universal Turing machine with such a property meets my definition, but my definition is slightly simpler and slightly broader in a way that makes it easier to think about the Turing web.↩︎
I consider two infinite, partially-labelled, complete binary trees to be isomorphic if they have the same labels (or lack thereof) at each address.↩︎