Foundations of Online Structure Theory II: The Operator Approach

We introduce a framework for online structure theory. Our approach generalises notions arising independently in several areas of computability theory and complexity theory. We suggest a unifying approach using operators where we allow the input to be a countable object of arbitrary complexity. We give a new framework which (i) ties online algorithms with computable analysis, (ii) shows how to use modifications of notions from computable analysis, such as Weihrauch reducibility, to analyse finite but uniform combinatorics, (iii) shows how to finitize reverse mathematics to suggest a fine structure of finite analogs of infinite combinatorial problems, and (iv) shows how similar ideas can be amalgamated from areas such as EX-learning, computable analysis, distributed computing and the like. One of the key ideas is that online algorithms can be viewed as a sub-area of computable analysis. Conversely, we also get an enrichment of computable analysis from classical online algorithms.

1. Introduction
1.1. Our Goal. Imagine you are tasked with putting objects of differing sizes into bins of a fixed size. Your goal is to minimize the number of bins you need. This is the famous Bin Packing problem which we know is NP-complete (see Karp [61]). But imagine that we change the rules and you are only given the objects one at a time and you must choose which bin to put the object into before being given the next object. You are in an online situation and this is the Online Bin Packing problem. The "first fit" method is well-known to give a 2-approximation algorithm for this problem (definitions given in detail below). Alternatively imagine you are a scheduler, and your goal is to schedule requests within a computer for memory allocation amongst users. Again you are in an online situation, but here you might want to change the order of allocation depending on priorities of the requests. Or from algorithmic randomness, you have a (computable) KC-set of requests of the form $(2^{-n_i}, \sigma_i)$ with $\sum_{i=1}^{\infty} 2^{-n_i} \le 1$, and need to build a prefix-free Turing machine $M$ with strings $\tau_i$ such that $|\tau_i| = n_i$ and $M(\tau_i) = \sigma_i$. Then the proof from e.g. Downey and Hirschfeldt [33], Theorem 3.6.1 is online in the sense that for each request at step $i$ we generate the string $\tau_i$.
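To make the Online Bin Packing example concrete, here is a minimal sketch of the "first fit" rule. It assumes integer item sizes and a common integer bin capacity; the function name `first_fit` and the list-based bookkeeping are illustrative choices and are not taken from the paper.

```python
from typing import List


def first_fit(sizes: List[int], capacity: int) -> List[int]:
    """Place each item online; return the bin index chosen for each item.

    Items are handled strictly in arrival order: the choice for item i
    depends only on items 0..i and is never revised afterwards.
    """
    free: List[int] = []        # remaining room in each open bin
    placement: List[int] = []   # placement[i] = index of the bin holding item i
    for size in sizes:
        if not 0 < size <= capacity:
            raise ValueError("each item must fit into an empty bin")
        for b, room in enumerate(free):
            if size <= room:            # first open bin with enough room
                free[b] -= size
                placement.append(b)
                break
        else:                           # no open bin fits: open a new one
            free.append(capacity - size)
            placement.append(len(free) - 1)
    return placement
```

Because each decision is fixed before the next item arrives, running the sketch on a prefix of the input yields exactly the corresponding prefix of the output. This is the uniformity property that the operator approach isolates.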
Thus an online algorithm is one which acts on an input which is given piece by piece in a serial fashion. In the case where the input is finite, Karp [62] suggested this as a sequence of "requests" $r_1, r_2, \ldots$ with the algorithm $f$ specifying an action $f(r_1), f(r_1 r_2), \ldots$. The natural model for this would be a database where a request would be an update. Note that this is quite distinct from the offline version where (in the finite case) the whole input is known in advance. It is important to realize that in practical online algorithms arising in computer science, the action needs to be specified before the next request is given. Occasionally this is varied with a lookahead or delay where typically we might get $k$ further bits of input, so that $r_1, \ldots, r_{n+k}$ determines the next action.¹
A brief thought on this will reveal that there are potentially hundreds of situations where we are dealing with combinatorial algorithms for tasks where we only have partial evolving information about the input data, or perhaps the data is so large that we cannot see it in total. This is the reason that there are so many algorithms for online tasks. Classical examples include insertion sort, perceptron, paging, job shop scheduling, ski rental, navigation with only local understanding, etc.; see [4]. We will discuss other possible approaches and the relevant literature in due course. At this point we only note that books in this area, such as Albers [4], all tend to be taxonomies of algorithms. Our goal is to give a theoretical basis for the theory of online algorithms and structures which relies on a uniform operator approach from computable analysis and the classical notion of primitive recursion.
1.2. The Punctual Model. In [8] a related project was started aiming at providing a model-theoretical foundation to this theory. In that paper we focussed upon the intuition that online decisions in practice are made without delay. That is, we need to pack the object into some bin immediately, before the next one is presented to us (in the Bin Packing example). This led to a theory of online structures and algorithms we generally referred to as punctual structure theory.
What is the most general reasonable form of "punctual"? In [8] we gave several pages of analysis as to why we chose to interpret punctuality as primitive recursive. That is, we chose primitive recursive as a unifying abstraction of the notion of lack of delay.
To keep this paper self-contained we will repeat the arguments of [8] here, so the reader familiar with [8] might choose to move on to Section 2.
in 2009 was this result improved by Bosek and Krawczyk who demonstrated that it can be done with $k^{14 \log k}$ many chains. In the case of finite structures most work comes from comparing offline vs online performance. In this area, the typical setting is to build some kind of function which is measured relative to some size, and the goal of online algorithm design is to improve what is called the Competitive Performance Ratio of online divided by offline. For example, first fit gives a competitive ratio of 2 for the classical Bin Packing problem (see Garey and Johnson [44]).
1.5. Online vs. Turing computable. The notion of an "online" algorithm in the results mentioned above is rather specific. One may complain that, rather than saying that we must make a decision before the next vertex shows up, it is fine to wait for a bit more of a graph to be shown to us. But how much more exactly? Maybe we can wait for 17 more vertices to show up before we make a decision. Perhaps, at stage $s$ we could ask for $\log(s)$ more vertices, etc. It is not hard to see that various answers to this question will lead to a proper hierarchy (rather, a zoo) of "online" computability notions. It is natural to ask: What is the most general notion of an online algorithm? So far there is no general theory for understanding the online content of mathematics; there are only algorithms or proofs that no algorithm exists. Note that the lack of theory for online mathematics stands in stark contrast with the infinite off-line case described by computable structure theory [5,39]. However, as we noted above, computable structure theory relies on the most general notion of a computable process that we know today: a Turing computable process. Turing computability provides us with many tools, such as the universal Turing machine and the Recursion Theorem, that are useful in proving theorems about algorithms. However, Turing computability in its full generality is not an adequate model in the online situation, because Turing computable algorithms can use an unbounded search. For instance, recall the example in which we had to online colour a tree. A Turing computable algorithm would just wait until a node gets connected to the root of the tree via a path and then will make a decision. There is no a priori bound on how long it may take for the path to be revealed, but a Turing computable algorithm does not care. More importantly, Turing computability fails to capture the "impatient" nature of an online algorithm which has to make a decision "now".
1.6. Our goal, revisited. Recall that our goal is to give a general abstract foundation for online algorithms. As we will soon see, our approach is based on one natural interpretation of "online" involving primitive recursive structures.
In [8], using some of the techniques and intuition coming from the above-mentioned (Turing) computable structure theory [5,39], we developed a theory contrasting and comparing classical computable structure theory with an online "punctual" framework. In [8], we discussed the following models.
1.7. The models. We will concentrate on infinite structures. Still to do is to develop an appropriate model theory for online finite structures as asked for by Downey and McCartin [36]. In Section 9 of [8], we foreshadowed the developments of the present paper which works some way towards addressing finite online model theory.
In its most general formulation, an online algorithm would act on a structure $A$ given in stages $f(1), f(2), \ldots$, where $f$ is a computable function representing timestamps. At stage $f(n)$ we would enumerate $n$ into the partial structure $A_{f(n)}$ and give complete information about how $n$ relates to $\{0, \ldots, n-1\}$. Now the question is: What kinds of structures and time functions should be allowed? Different choices will result in different theories. Our goal is to give a general setting that also reflects the common online structures encountered. We examine some approaches from the literature:
1.7.1. Automatic structures. Khoussainov and Nerode [65] initiated a systematic study into automatically presentable algebraic structures; but these seem quite rare. For example, the additive group of the rationals is not automatic [92]. The approach via finite automata is highly sensitive to how we define what we mean by automatic. For example, treating a function as a relation yields quite a different kind of automatic presentation. See [38] for an alternate approach to automatic groups. Although the theory of automatic structures is a beautiful subject, a finite automaton is definitely not a general enough model for an online algorithm.
Cenzer and Remmel, Grigorieff, Alaev, and others [25,47,1,2] studied polynomial time presentable structures. We omit the formal definitions, but we note that they are sensitive to how exactly we code the domain. In many common algebraic classes we can show that all Turing computable structures have polynomial-time computable copies. One attractive result is that every computably presentable linear ordering has a copy in linear time and logarithmic space [47]. Similar results hold for broad subclasses of Boolean algebras [23], some commutative groups [24,22], and some other structures [23].
1.7.3. Fully primitive recursive structures. As was noted in [60], many known proofs from polynomial time structure theory (e.g., [23,24,22,47]) are focused on making the operations and relations on the structure primitive recursive, and then observing that the presentation that we obtain is in fact polynomial-time.
The restricted Church-Turing thesis for primitive recursive functions says that a function is primitive recursive iff it can be described by an algorithm that uses only bounded loops. For example, we need to eliminate all instances of WHILE . . . DO, REPEAT . . . UNTIL, and GOTO in a PASCAL-like language.
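The bounded-loop reading of primitive recursion can be illustrated with a short sketch. It assumes that Python's `for ... in range(n)`, whose iteration count is fixed on entry, stands in for the bounded loop of a PASCAL-like language; all function names are hypothetical and chosen only for this example.

```python
from typing import Callable


def add(x: int, y: int) -> int:
    """x + y by y applications of successor; the loop bound is fixed on entry."""
    result = x
    for _ in range(y):
        result += 1
    return result


def mul(x: int, y: int) -> int:
    """x * y by y bounded repetitions of add."""
    result = 0
    for _ in range(y):
        result = add(result, x)
    return result


def power(x: int, y: int) -> int:
    """x ** y by y bounded repetitions of mul."""
    result = 1
    for _ in range(y):
        result = mul(result, x)
    return result


def least_zero_bounded(g: Callable[[int], int], bound: int) -> int:
    """Bounded search: least n < bound with g(n) == 0, or bound if there is none."""
    for n in range(bound):
        if g(n) == 0:
            return n
    return bound


def least_zero_unbounded(g: Callable[[int], int]) -> int:
    """Unbounded search: a WHILE loop with no a priori bound.

    This is the construct excluded by the thesis; it need not halt.
    """
    n = 0
    while g(n) != 0:
        n += 1
    return n
```

The first four functions nest only bounded loops, so their running time is bounded by a function built from the inputs in the same way. The last one is Turing computable when `g` is, but no such bound need exist.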
It is not difficult to construct an example of a structure which is primitive recursive but does not have a polynomial-time presentation; see the introduction of [23] for one such example. Nonetheless, primitive recursion plays a rather important intermediate role in transforming (Turing) computable structures into polynomial-time structures. Furthermore, to illustrate that a structure has no polynomial time copy, it is sometimes easiest to argue that it does not even have a copy with primitive recursive operations, see e.g. [24]. In [8] the intuition above led us to systematically investigate those structures that admit a presentation with primitive recursive operations, as defined below. Kalimullin, Melnikov, and Ng [60] proposed that an "online" structure must minimally satisfy:
Definition 1.1 [60]. A countable structure is fully primitive recursive (fpr) if its domain is N and the operations and predicates of the structure are (uniformly) primitive recursive.
The main intuition is that we need to define more of the structure "without delay". Here "delay" really means an instance of a truly unbounded search. We informally call fpr structures punctually computable. We could also agree that all finite structures are punctual by allowing initial segments of N to serve as their domains.²
Remark 1.2. The word "fully" in "fully primitive recursive" emphasises that the domain must be the whole of N and not merely a primitive recursive subset of N; these are provably non-equivalent assumptions. If the domain could be merely a primitive recursive subset of N then we can delay elements from appearing in the structure; this way one can easily show that each Turing computable graph has a primitive recursive copy ([8]). We decided that structures in which elements can be delayed are not really online.
Our goal is to give a most general setting that also reflects the common online structures encountered. From a logician's point of view, where do computable structures come from? One of the fundamental results of computable structure theory is that: A decidable theory has a decidable model.³ The proof of this elementary fact is to observe that the Henkin construction is effective, in that if the theory is decidable then the constructed model is decidable as a model. Many standard computable structures come from decidable theories.
Most natural decidable theories are elementary decidable in that the decision procedures are relatively low level. We have to go out of our way to have natural decidable theories whose decision procedures are not primitive recursive. In [8] we observed that a theory with a primitive recursive decision procedure has a model which is decidable in a primitive recursive sense.
1.7.4. The upshot. Thus in [8], we chose fully primitive recursive structures as our central model. Primitive recursiveness gives a useful unifying abstraction to computational processes for structures with computationally bounded presentations. In such investigations we only care that there is some bound. Furthermore, these models arise quite naturally through standard decision procedures.
In [8], we also noted that many results we stated in terms of primitive recursion can likely be pushed to polynomial time structures. Furthermore, some of our counterexamples can in fact be stated in terms of any class with sufficiently nice closure properties; e.g., for a class of total computable functions having a uniformly computable enumeration and closed under composition and primitive recursion. However, this does not mean that our choice of primitive recursive algorithms as a central model is fairly arbitrary. The above-mentioned generalisation to a class of total functions can be viewed as a version of the subrecursive relativisation of primitive recursion. The study of relativised versions of our results is interesting in its own right, but it is not really beyond the primitive recursive model. Kalimullin, Melnikov and Montalbán (in progress) have recently announced a number of unexpected results connecting relativised primitive recursive presentations with syntax in the spirit of Ash and Knight [5]. Also, as we see in the present paper, an expert in computable structure theory would know that relativisation is tightly connected with uniformity. Generalisations to polynomial time classes seem to require significant effort in some instances. Alaev [1,2] has recently initiated a research program focused on extending these ideas to polynomial time algebra. Dealing with polynomial time algorithms requires specific techniques and counting combinatorics; this is something we do not have to worry about in our more "relaxed" model. In contrast with, e.g., automatic algorithms or polynomial-time algorithms, there is a highly convenient and clear version of the Church-Turing thesis for primitive recursive functions (see above).
We will use the thesis throughout the article without explicit reference. It will allow us to simplify our proofs and proof sketches. Irrelevant counting combinatorics is stripped from such proofs, thus emphasising the effects related to the existence of a bound in principle (rather than specifying the bound). These effects are far more significant than it may seem at first glance. Models such as automata-based structure theory [65,53,54] are highly sensitive to presentations of the structures. For example, treating the algorithms as generated by transducers yields a completely different theory to that obtained by treating functions as relations, as can be seen by comparing the approach of Khoussainov and Nerode [65] with that of Epstein et al. [38]. Also it would seem that although we can incorporate automatic processes in our theories, they are really not general enough for online algorithms in general. Similarly, polynomial time structures such as those of Cenzer and Remmel, Grigorieff, Alaev, and others [1,2,3,25,47] are rather presentation dependent. Finally, primitive recursion has a nice Church-Turing thesis, in that it models computable processes without unbounded loops.
² Although the definition above is not restricted to finite languages, we will never consider infinite languages in the paper.
³ Recall that a complete, first-order theory in a computable language is decidable if the collection of all

The Uniform/Operator Model
Whilst the [8] model is a natural model, as we observed in Section 9 of that paper, there are aspects of online combinatorics which are not covered by the model. Imagine we need to build a colouring of a graph $G$ which is given online. Thus, in the very simplest case, we would be given the graph $G = \lim_s G_s$, where $G_s$ has $s$ vertices. When the vertex $s$ is introduced, we are also given at the same time precisely which vertices amongst $\{1, \ldots, s-1\}$ have an edge with $s$ (and this cannot change later). (This is the "request set" in Karp's paper.) Our task is to colour $s$ so that no two vertices which are connected have the same colour, before the opponent presents us with $G_{s+1}$. Although in practice the task will be finite, since we have no idea how large the graph is, we can construe this as an infinite process. Imagine this online colouring of a finite graph as an infinite process where we need to colour the whole of an infinite graph $G$ given to us as incremental induced subgraphs. We can think of each possible version of $G$ as being a path through an infinite tree of possibilities. Each node $\sigma$ of length $s$ of the tree will represent some graph $G_\sigma$ with $s$ vertices, and if $\sigma \prec \sigma'$ then $G_\sigma$ is the subgraph of $G_{\sigma'}$ induced by vertices $\{1, \ldots, s\}$. Note that there are only primitively recursively many non-isomorphic graphs with $s$ vertices.⁴
Then this view of an online algorithm differs from that given in [8] for the following core reason: Although $G$ can be viewed as a path on an infinite primitive recursive tree of possibilities, there is no a priori reason that we should only consider a primitive recursive graph $G$. There are continuum many such paths and the online graph colouring problem can be considered for an infinite countable graph of any complexity. The reader will quickly realise that the key point about online algorithms is one of continuity or uniformity. If we have a colouring of $G_\sigma$ and we add a new vertex $s+1$, the next $G$ will be one of the possible extensions $G_\tau$ of $G_\sigma$ with the vertex $s+1$ added. For each such $G_\tau$ the colouring $\chi_{G_\tau}$ must be compatible with the colouring $\chi_{G_\sigma}$ on $G_\sigma$.
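The compatibility requirement can be seen in a small sketch of a greedy online colouring. It assumes the input is a list whose entry for vertex s is the set of earlier vertices adjacent to s; the name `online_colouring` and this encoding are illustrative assumptions, not the paper's formal representation.

```python
from typing import List, Set


def online_colouring(arrivals: List[Set[int]]) -> List[int]:
    """Colour vertices 1, 2, ... in arrival order with the least admissible colour.

    arrivals[s - 1] is the set of earlier vertices (numbers in 1..s-1)
    adjacent to vertex s; it is revealed together with s and never changes.
    """
    colours: List[int] = []                  # colours[v - 1] = colour of vertex v
    for s, earlier in enumerate(arrivals, start=1):
        if any(not 1 <= v < s for v in earlier):
            raise ValueError("neighbours must be previously revealed vertices")
        used = {colours[v - 1] for v in earlier}
        # Bounded search: among len(used) + 1 candidates at least one is unused.
        colour = next(c for c in range(len(used) + 1) if c not in used)
        colours.append(colour)
    return colours
```

The colouring produced for a prefix of the arrival sequence is an initial segment of the colouring produced for any extension, which is the compatibility of the colourings of $G_\sigma$ and $G_\tau$ described above. The input itself may be any sequence at all; nothing requires it to be primitive recursive.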
Computable Analysis. The conclusion is that whilst online algorithms appear to be combinatorial algorithms on finite objects and possibly infinite ones, in fact they should be formulated as a branch of computable analysis. One of the goals of the present paper is to give such a formulation. We need to specify what kinds of spaces are of relevance and what kinds of operators correspond to online algorithms. We believe that this view will allow a discourse between the discrete and the continuous which could prove fruitful. Similar relationships between the continuous and the discrete have yielded powerful results such as in the Furstenberg view of Szemerédi's Theorem. Our unifying abstraction also means that computable analysis is shown to be important in finite combinatorics (see also Avigad [6]). We will also see that our abstraction means that we can relate the proof theory of finite combinatorics with classical proof theory, obtaining refinements of results for Reverse Mathematics, and we can relate the theory of incremental computation with ideas from computable analysis such as Weihrauch reducibility. Thus although this is not the most technically difficult paper, we see it as a conceptual advance showing that many ideas can be combined into a single unifying abstraction.
Immediate actions. Since we want the action to be immediate, following the abstraction of [8] and for the reasons above, we will also demand that the action works primitive recursively, or perhaps even running in polynomial time. Thus, for the example above, the most general online colouring algorithm must satisfy the following two features:
- $\chi_{G_\tau}$ must agree with $\chi_{G_\sigma}$ for $\sigma \prec \tau$.
- The map $\tau \mapsto \chi_{G_\tau}$ must be (minimally) primitive recursive.
These will all soon be made precise and general using representations (i.e., naming systems) for online problems. In fact, there are at least three possible interpretations of the second clause above. For example, should we allow lookahead or delay in the computation of $f$? In Lemma 3.5 we will prove that all three potential definitions are equivalent up to a primitive recursive change of notation, and therefore our definition is robust.

2.1. Why primitive recursion? Should we instead require the solution to be polynomial time? As we have already mentioned above, this notion would be too notationally dependent to be unambiguous. Also, it is well-known that in practice some useful algorithms are (provably) not polynomial-time; yet they seem to perform well enough on most inputs. It is therefore not even clear if polynomial-time is the right abstraction for efficiency in the online situation. So we want our function to be in a nice complexity class but we are not yet sure what exactly this class should be. It makes sense to develop as much structural "punctual" theory as possible and then see how much of it is preserved when we restrict ourselves to some narrow complexity class. And if something fails, we will have a better idea of what goes wrong in the worst possible scenarios; e.g., we will compare the positive Theorem 7.3 and the analogous negative result in polynomial-time analysis [70]. Perhaps it will help to define generic-case online algorithms in the future. As with classical complexity theory, there is usually a natural representation for a problem we are interested in. The reader will note that in our definitions below, the actual representation does affect what we will regard as online. Nonetheless, one of the main advantages of our rather general primitive recursive approach is that we still can prove a number of notation-independent results; e.g., the "robustness" Lemma 3.5 and the above-mentioned Theorem 7.3. Such results focus more on the effects related to online-ness and less on the pathologies related to a specific choice of representation. Analogous results usually fail if we restrict ourselves to, say, polynomial-time algorithms because passing from one representation to another can be computationally too hard.
On the other hand, we will also prove several results (e.g., Theorem 4.6) which show that sometimes all pathologies come from presentation because the only notation-independent online solutions are the trivial ones. Such results of the second kind will typically hold for polynomial-time or exponential (etc.) algorithms too, and via essentially the same proof. Primitive recursion serves here as a unifying abstraction rather than an idealisation.
We will also see that, modulo subrecursive relativisation, the earlier approach to online algorithms by Kierstead, Trotter et al. [67,73,68] and Borodin and El-Yaniv [12] can be viewed as a special case of our framework. According to this earlier approach, the map $\tau \mapsto \chi_{G_\tau}$ just needs to be total and does not even have to be computable. So we see that primitive recursion is not that general when compared to some other definitions in the literature.
Should we perhaps use (a use-restricted form of) general Turing computability in place of primitive recursion? It is more general, and some analogue of the above-mentioned robustness lemma (Lemma 3.5) will still hold. The key difference here is that it would hold for a completely different reason. A recursion theorist will be well aware of how much unbounded search is abused in many such proofs. For example, we can use compactness of the representation space and wait for it to be covered by open sets. There will perhaps be no bound on how long we will have to wait, but from the point of view of Turing computability it will not make any difference.
However, primitive recursion seems just general enough for many structural results to hold, but often via a different, more subtle argument which takes into account punctuality of our procedure. A fine example of such a theorem will be given in Section 7 where we prove an online version of a well-known theorem of Weierstrass. It is very easy to show using a compactness argument that the theorem holds (Turing) computably. But it requires some thought and a completely different argument to see why it holds punctually.
As mentioned above, we realise that this material also has a connection with computable and feasible analysis, and also with the complexity theory for operators in analysis along the lines of Kawamura and Cook [63], Mehlhorn [74], Ko and Friedman [43], and others. We will also note connections with reverse mathematics, computational learning theory, and even algorithmic randomness. We will also see that, in this setting, the finiteness of the objects being given is not an essential restriction. In the online case, finite objects are only revealed one bit at a time, and for all intents and purposes, we may as well treat all inputs as arbitrarily large finite structures. We will prove that, under the uniform operator framework, working with arbitrarily large finite structures and working with infinite structures are indeed the same. This allows, for example, for a formal approach in which one can study finite combinatorics in reverse mathematics. As a final remark, we mention that we see this work as an extension of [8] in the following way. [8] considered online computation of primitive recursive structures, with primitive recursive functions. This is akin to the Turing-Markov [93] view of computable analysis as effective processes on the countable field of computable reals. The present paper is akin to the Grzegorczyk-Kleene [46] view, called type II computability, which treats computable analysis as effective operators acting on the continuum of all reals. It is also akin to the bifurcation between computable structure theory and uniform computable structure theory.

The main definition
In the following sections, we will work up to the main definition. Remember that we want to simultaneously generalize online algorithms on finite and infinite structures, and in a general setting where the domains might be any kind of structure. Because of this we will need to tour through representations (computational ways of naming infinite objects), and carefully argue why choices, such as that of primitive recursion, are made.
3.1. Representation spaces. It could be argued that for relational structures we could consider (isomorphism types of) any structure $A$ with universe N, and we could consider $A_n$ to be the induced substructure of $A$ with universe $\{1, \ldots, n\}$. Naturally, we need to assume that this has meaning, and that such substructures exist in all finite cardinalities. Also, if we choose to add function symbols we would need to only allow a small extension of the structure based on $\{1, \ldots, n\}$. For simplicity, we will stick to relational structures and use the following terminology. A class $\mathcal{C}$ of relational structures is called inductive if $A \in \mathcal{C}$ implies $A$ has a filtration $A = \bigcup_s A_s$ where each $A_n$ is finite, has universe $\{1, \ldots, n\}$, and for all $n' > n$ the substructure induced by $\{1, \ldots, n\}$ in $A_{n'}$ is $A_n$. More generally, for a fixed (Turing) computable function $g$, we say that $\mathcal{C}$ is $g$-inductive if it has a $g$-filtration, meaning that each $A_n$ has universe $\{1, \ldots, g(n)\}$. Here we will sometimes write $O(h(n))$-inductive for the case where $g$ is $O(h)$. Our language will typically be finite and relational, and $g$ will typically be primitive recursive.⁵
We refer to the substructure of $A$ based on $\{1, \ldots, n\}$ as the substructure of height $h(n) = n$. In the example discussed above, the height $n$ structures are the graphs with $n$ vertices. Another example is considered by Khoussainov [64] with a height function in his work on random infinite structures. Natural online structures tend to have natural height functions.
By abusing notation, we will let $\mathcal{C}^{<\omega}$ denote the class of finite substructures of $\mathcal{C}$. There is also the natural induced topology. For example, in the graph case this would be compact and have the totally disconnected topology with basic open sets being the extensions of graphs of height $n$.
⁵ Richard Shore observed that the punctual case focusses attention upon functions and functional languages, whereas the operator approach seems to tie itself to relational ones. We need some care if the language has function symbols, see [58].
3.2. Representations. A representation (a naming system) of an inductive class $\mathcal{C}$ of structures is a (Turing) computable surjective function $\delta : \omega^{<\omega} \to \mathcal{C}^{<\omega}$, which acts faithfully in the sense that $\delta(\sigma) = C_n$ for $|\sigma| = n$ and $h(C_n) = n$, and if $\sigma \preceq \tau$ then $\delta(\sigma)$ is an induced substructure of $\delta(\tau)$. Most examples of representations in the literature are witnessed by a primitive recursive $\delta$. We thus will assume that $\delta$ is primitive recursive throughout. We can also extend this in the natural way to $g$-filtrations. Such a $\delta$ induces a map $\overline{\delta}$ from $\omega^{\omega} \to \mathcal{C}$, namely $\lim\{\delta(\sigma) \mid \sigma \prec x\}$. We will call $x \in \omega^{\omega}$ a name or a representation for $C \in \mathcal{C}$ if $\overline{\delta}(x) = C$. Note that it is possible for a structure $C$ to have a number of different names.
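As an illustration only, the following sketch gives one possible naming system for finite graphs and checks the faithfulness condition on an example. The bitmask coding, the type alias `Graph`, and the helper names `delta` and `induced` are assumptions made for this example and do not come from the paper.

```python
from typing import FrozenSet, List, Tuple

# A graph of height n: its height together with its edges (u, v), u < v <= n.
Graph = Tuple[int, FrozenSet[Tuple[int, int]]]


def delta(sigma: List[int]) -> Graph:
    """Decode a finite string of naturals into a graph of height len(sigma).

    Hypothetical coding: bit (u - 1) of sigma[v - 1], taken mod 2**(v - 1),
    says whether the earlier vertex u is adjacent to vertex v.
    """
    edges = set()
    for v, code in enumerate(sigma, start=1):
        code %= 2 ** (v - 1)        # drop bits that name no earlier vertex
        for u in range(1, v):
            if (code >> (u - 1)) & 1:
                edges.add((u, v))
    return len(sigma), frozenset(edges)


def induced(graph: Graph, n: int) -> Graph:
    """Substructure of `graph` induced by the vertices {1, ..., n}."""
    _, edges = graph
    return n, frozenset((u, v) for (u, v) in edges if v <= n)
```

With this coding, a prefix of a name always decodes to the induced subgraph of the matching height, and the reduction mod $2^{v-1}$ means that a single graph has many names, as noted above. Discarding the redundant names would give an injective variant.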
For the time being, we will regard $\delta$ as being injective. When it is possible, we will replace $\omega^{<\omega}$ with $2^{<\omega}$. We will consider functions $f : \mathcal{C}_1 \to \mathcal{C}_2$ represented by functions $F$ acting on representations $\delta_i : Q_i \to \mathcal{C}_i$; we of course require that $F$ commutes with $f$ and the $\delta_i$. We emphasise that the function $F$ is acting on strings which are finite objects. These represent, e.g., graphs. The continuity of the action induces a map $\overline{F}$ which is the completion of the finite maps.
3.3. Online problems. Although our objects of study are not strings, we implicitly identify them with their representations, in accordance with the previous subsection. In particular, if the representation space is compact then our objects can be identified with strings over a finite alphabet.
Intuitively, to solve a problem we need to find a function f which, on input i, chooses an admissible solution from the finite set s(i) of "correct" solutions. Note that the multi-valued function s does not have to be computable in general. For instance, for a colouring problem I will be codes for finite graphs and S for finite coloured graphs. Then s(σ) will correspond to the collection of all admissible colourings, e.g., such that adjacent vertices are distinctly coloured. These colourings will form the space of admissible solutions.
Most natural problems from finite structures will obey the following convention, which we will consider in this section. Only in Section 7 will we consider more general cases.
Convention 3.2. Unless explicitly mentioned, $I$ and $S$ are compact with a primitive recursive modulus of compactness; i.e., each is primitively recursively branching when viewed as a tree of strings. Thus, there is a natural primitive recursive way to transform $I$ into $2^{\omega}$ (typically not height preserving).
In general, in Definition 3.1 we may also require $\overline{f}$ to satisfy some global property which cannot always be captured by $s$ from Definition 3.1. For example, in Section 4 a solution must be an isomorphism between two presentations of the same infinite graph. In general, even if at every stage $f(\sigma)$ may be extendable to some isomorphism, the map associated with $\overline{f}$ may fail to be surjective in the limit. Also, in another example in Section 4 we will require our solution to work only if the input is a presentation of some fixed infinite graph, which is also a property of $\overline{f}$ rather than of any finite approximation to it. In particular, in this case admissibility of $\overline{f}$ cannot be captured by $s$ in Definition 3.1; at least not in general.
Convention 3.3. We will refer to such properties of f as global and will not incorporate them into Definition 3.1.
Condition (O1) says that the output of f is an admissible solution. In (O2) we ask that each increment of the input yields an increment in the output, in the sense that f(σ) must be a solution to σ.
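As an illustration of (O1) and (O2), the following minimal sketch treats an enumeration of a graph as a list of rows, where row n lists the earlier vertices adjacent to vertex n, and uses first-fit as a candidate strict solution f. The encoding and the names `first_fit` and `is_admissible` are assumptions made for this example only, and (O2) is read here as monotonicity of f along prefixes.

```python
from typing import List

Instance = List[List[int]]   # row n = earlier vertices adjacent to vertex n (assumed encoding)

def first_fit(sigma: Instance) -> List[int]:
    """Colour vertices in order of arrival with the least colour unused by earlier neighbours."""
    colours: List[int] = []
    for earlier_nbrs in sigma:
        taken = {colours[j] for j in earlier_nbrs}
        c = 0
        while c in taken:
            c += 1
        colours.append(c)
    return colours

def is_admissible(sigma: Instance, tau: List[int]) -> bool:
    """tau lies in s(sigma): same length, and adjacent vertices receive distinct colours."""
    return len(tau) == len(sigma) and all(
        tau[n] != tau[j] for n, row in enumerate(sigma) for j in row
    )

# The path a-b-c-d enumerated in the order a, d, b, c (vertices renamed 0, 1, 2, 3).
sigma = [[], [], [0], [2, 1]]
outputs = [first_fit(sigma[:n]) for n in range(len(sigma) + 1)]

# (O1): every output is an admissible solution of the prefix it was computed from.
o1_holds = all(is_admissible(sigma[:n], out) for n, out in enumerate(outputs))
# (O2): extending the input only extends the output.
o2_holds = all(outputs[n + 1][:n] == outputs[n] for n in range(len(sigma)))
```

Both checks succeed on this instance because first-fit never revisits a colour it has already assigned.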
The reader should note that (O3) is somewhat ambiguous as stated because it may be interpreted in at least two different ways, namely f could be primitive recursive either as a function or as a functional: (O3) : the computation of f (σ) is based solely on σ, or (O3) : the computation of f (σ) may ask for an extension τ of σ before it halts.
Indeed, for online computations it would be natural to demand that we have a primitive recursive timestamp function g, and that to compute f(σ) we look at some σ′ of length g(|σ|) extending σ. In practical computations lookahead will typically be g(|σ|) = |σ| + k for some constant k. On the other hand, for a recursion theorist it would be more natural to consider Turing functionals acting on the representation spaces and demand that they are primitive recursive. By that we mean adding the characteristic function for the infinite string in the completion of the problem (Section 3.4) to the primitive recursive scheme of f; to be clarified in Subsection 3.6. These two general definitions of lookahead (via timestamp and via oracle) are not equivalent when, say, I ≅ ω^{<ω}. Thus, we have three natural versions of (O3) which are furthermore provably not equivalent in general.
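The timestamp reading of lookahead can be sketched as follows. The infinite input is modelled as a Python function `alpha`, the delay is the hypothetical constant `K = 1`, and the toy task (marking positions where the input decreases) is chosen only because it visibly needs one symbol of lookahead; none of these names come from the text.

```python
from typing import Callable, List

Oracle = Callable[[int], int]   # alpha(i) = i-th symbol of the infinite input (assumed model)
K = 1                           # hypothetical constant delay; the code below is written for K = 1

def g(n: int) -> int:
    """Timestamp function: how much input may be read to produce n output symbols."""
    return n + K

def descents(sigma_ext: List[int]) -> List[int]:
    """Bit i is 1 iff symbol i is larger than symbol i + 1, so one extra symbol is needed."""
    return [1 if sigma_ext[i] > sigma_ext[i + 1] else 0 for i in range(len(sigma_ext) - 1)]

def f(alpha: Oracle, n: int) -> List[int]:
    """Output for the prefix of length n, computed from the extension of length g(n)."""
    sigma_ext = [alpha(i) for i in range(g(n))]
    return descents(sigma_ext)

queried: List[int] = []

def alpha(i: int) -> int:
    queried.append(i)           # record which positions the computation inspects
    return (i * i) % 5

out3, out4 = f(alpha, 3), f(alpha, 4)
```

The list `queried` never contains a position at or beyond g(n), which is the sense in which g bounds the use.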
Luckily, in the next subsection we will prove that, under Convention 3.2, these three versions of the main definition are equivalent up to a primitive recursive change of notation, and therefore Definition 3.4 is robust.
3.6. The robustness lemma. As we mentioned above, there are two natural ways of interpreting what it means for f in (O3) to be primitive recursive with a lookahead. We give more details.
In the first definition, we require that it is a Turing functional that possesses a primitive recursive time-function t which, on every input σ outputs the number of steps which f takes to compute f (σ). In particular, t(σ) bounds the use of the operator, that is, the length of τ extending σ which may be used in the computation of f (σ). The length of the output f (σ) will also be bounded by t(σ). The seemingly more general definition of a primitive recursive functional says that, for each infinite path x through the space of inputs, f is primitive recursive relative to x = lim s {σ | σ ≺ x}. The latter can be formally defined by adding the characteristic function for x to the primitive recursive schema, and hence would potentially entail that f (σ) could be arbitrarily long for various extensions of σ.
Proof. In a different terminology the proof will appear in [58]. A similar formal argument can be found in the appendix of [8].
Suppose the Turing functional f possesses a primitive recursive time function t. Using t as a universal bound on all the searches which may occur in a computation with any oracle x extending σ, we can transform the general recursive scheme (augmented with the characteristic function χ x for x) into a primitive recursive scheme augmented with χ x . This implication holds in general, i.e., without any extra assumption on I. Now, assuming I is primitively recursively branching, suppose f is a primitive recursive functional with functional oracle g in the most general relativised sense.
For the base of induction consider the following cases: where o, s and I^n_m are the standard elementary basic functions (e.g., [83]). The first three cases are evident since they do not refer to g, while in the case when Φ^g = g take t = b, where b is the primitive recursive branching of I. Take t(x) = Σ_{i≤x} b(i) to make t monotonically increasing in its input.
The inductive step splits into two different cases depending on whether the last iteration is composition or an instance of primitive recursion.
Define a primitive recursive time bound for Ψ by the rule which can be rewritten into a primitive recursive schema using the standard techniques. Now suppose that Ψ is defined using an instance of primitive recursion, more specifically , where Φ and Θ are primitive recursive operators which have corresponding primitive recursive time functions t 0 and t 1 . Define t by the rule t(x, y + 1) = t 1 (x, y, t(x, y)) + t(x, y); assuming that t is monotonically increasing in its input this gives the desired upper bound.
For the last part of the lemma, recall that the tree I is primitive recursively branching. Thus, we can inductively form a new tree I′ whose level n nodes are in a (primitive recursive) 1-1 correspondence with the nodes at level t(n) = max{t(σ) : |σ| = n} in I. Then the algorithm f on I works on I′ as a strict algorithm.
Of course, if f is a primitive recursive functional without lookahead then f can be viewed as simply a primitive recursive function mapping finite strings to finite strings. More formally, we have:

3.7. Generalisations and refinements of the main definition.
It is important to understand that primitive recursion smooths over many difficulties related to notation. In particular, the robustness lemma from the previous section will typically fail for polynomial-time operators. Thus, different interpretations of (O3) in Definition 3.4 will potentially lead to different refinements of the main definition to narrower complexity classes. On the other hand, different versions of relativisation (such as general Turing and sub-recursive) will lead to potentially non-equivalent generalisations of the main definition. We will not develop these topics in too much detail, but some notions and notation introduced in this subsection will be important in the later sections.
3.7.1. Strict solutions, obT operators, and totality. We may want to stick with a given notation (I, S, s) because changing it may be either inconvenient or computationally too hard. If the space I is not primitively recursively branching or not even compact, then Lemma 3.5 no longer holds. Thus, in this case the most general version of Definition 3.4 becomes ambiguous. In contrast, Fact 3.6 does not rely on compactness of I, let alone its primitive recursive branching, and therefore the stronger strict version of Definition 3.4 still makes sense even for non-compact I. Thus, the situation described in Fact 3.6 deserves special attention.
Definition 3.7. In the simpler situation that no lookahead is allowed in Definition 3.4 we will call f a strict punctual solution.
By Fact 3.6, this situation can be considered an analog of a classical ibT-reduction (to be defined), but acting on compact spaces with primitive recursive branchings with the branches of level n being the structures of height n, instead of 2 ω . Classically, ibT refers to an oracle procedure Γ B = A with the use γ(x) = x for all x, and here we are identifying sets with their characteristic functions as usual [33]. ibT functionals and the induced reduction have been studied quite intensively [28,7,33,35,88] and even used in (classical) differential geometry [28,78].
Definition 3.8. For a fixed filtration representing I we will call such a procedure Γ induced by a strict online solution an obT (online bounded Turing) reduction. Of course, the classical ibT reduction is usually viewed as working on 2 ω . If our space does not have primitive recursive branching then we no longer can transform it effectively into a copy of 2 ω . But as mentioned earlier, we see this aspect as a feature of the model, and not a flaw. One should expect online-ness to be generally representation dependent, at least to some extent.
Although there are notions of a polynomial-time functional in the literature [70], Definition 3.7 is much more convenient if we want to define what it means for a punctual solution to be polynomial time.
We can also use Definition 3.7 to give an explicit connection of our definition with the above-mentioned approach in [67,12] which relies on total (not necessarily computable) functions. As has been observed in [58], a total function can be viewed as a function primitive recursive relative to some oracle. More formally, we have: For an online problem, the following are equivalent: (1) The problem has a total strict solution f (in the sense of [67,12]); (2) The problem has a strict solution primitive recursive relative to some oracle (in the sense of [58]).
In computability theory many arguments tend to be uniform enough to be relativizable to any oracle. The relativisation phenomenon partially explains why the seemingly crude approach via totality [67,12] often captures some features of online computation.
3.8. General computable and efficient solutions. Many of our results are valid for computable online solutions, and in fact for any total solutions. Certainly, a result showing that no computable solution is possible is very strong. For instance, the proof from Gasarch [45] that online colouring of forests requires Ω(log n) many colours shows that no computable f can do better. We arrive at the following generalisation of the main definition. One potential extra feature which is captured by this generalisation is that it also covers partial solutions and, potentially, partial representations. We will not make it overly formal and leave this to the reader. For example, in the case that the representations are partial, f would perhaps only need to work well on a valid input (cf. Convention 3.3). In the case when ω^ω is representing Cauchy sequences, imagine we are seeking an online way to compute some continuous function. Then we would only need to produce a solution for those sequences which actually correspond to convergent sequences. We could then require this solution to be in some sense punctual when restricted to valid inputs.
We could on the other hand make the definition more efficient, for example: Definition 3.11. A polynomial-time solution to (a representation of) an online problem (I, S, s) is a punctual strict solution which is furthermore polynomial-time.
The definition above is of course heavily notation-dependent. For the look-ahead case the situation becomes even more complex. Although there are definitions of a polynomial-time functional in the literature [63,70] they tend to be unconvincingly technical. Also, recall that the robustness lemma fails for polynomial-time simply because changing notation tends to be exponential time. Therefore the definition will also depend on which version of the main punctual definition we choose to make polynomial-time; recall there were three such versions.
3.9. Multiple solutions. Notice that in actual practice, we might also need a further generalisation of the above. Sometimes we might compute a (bounded) collection of solutions at least one of which is correct at any stage and at height n. This occurs, for example, when using automata to compute minimization problems for graphs of bounded pathwidth (or k-interval graphs, see Section 5.2) given the path decomposition. We will be computing a table of f(k) many solutions at each level n. For example, for finding a maximal clique one would have a collection of 2^k many possible solutions. However, it appears that a suitable choice of the space of outputs S can cover this seemingly more general case too.

4. Oracle computation and uniformity
4.1. Graph oracles do not help. The main goal of this subsection is to show that a graph-oracle cannot significantly help in computing a function online. For that, we consider online functionals and online oracle computations.
The space 2^ω can be replaced with a primitively recursively branching totally disconnected space. Identifying f with its representation F, we can unambiguously write this as f(α↾u(n)) = f(α)↾n, and (in view of Lemma 3.5) this should cause no problems in the case of primitively recursively branching spaces of strings. We may also allow more than one input in f.

Notation 4.2. It is natural to write f^{α↾u(i)}(i) instead of f(α↾u(i)) and view α as an oracle.
The output of f^{α↾u(i)}(i) can also be interpreted as a natural number, when necessary.
There are obvious refinements of this. For example, it is natural to restrict ourselves to functionals f whose running time is a polynomial in the length of α. Also, having in mind some particularly nice primitive recursive function u, f is u-online computable if f(α↾u(n)) = f(α)↾n. An obvious case is when u(n) = n + k, which would be online with delay k. An illustration of this notion can be seen in Section 7, where we look at online real-valued functions. We note that addition of reals is online computable with delay 2, meaning that computing the sum of x and y to within 2^{−n} needs x and y with precision 2^{−(n+2)}. Similar delay considerations come from other procedures in polynomial time analysis such as integration (see [70]). When u(n) = n the notions can be restated in terms of strict (ibT primitive recursive) functionals, while online with delay k corresponds to Lipschitz reducibility. Computable Lipschitz reducibility comes from algorithmic randomness ([33], Chapter 9), where it is shown that if f is online computable Lipschitz acting on 2^ω, then it preserves the Kolmogorov complexity of all sequences in the sense that for all n, K(α↾n) ≥^+ K(f(α)↾n); that is, K(α↾n) ≥ K(f(α)↾n) − O(1). We will consider online functionals acting on algebraic or combinatorial structures, e.g., α could be viewed as a description of a finite segment of an infinite structure of some fixed finite relational signature, such as a graph. The extensions of α↾n are the finitely many possible relational structures on n + 1 elements extending the structure described by α↾n.
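The delay-2 claim for addition can be checked with a small sketch. It assumes one particular representation, which is not fixed by the text: a name of x is a function sending n to a multiple of 2^{−n} within 2^{−n} of x, and the output at precision n must again be a multiple of 2^{−n}. Under that assumption the two input errors contribute at most 2 · 2^{−(n+2)} and the final rounding at most 2^{−(n+1)}, for a total of at most 2^{−n}. The names `dyadic_name` and `add_with_delay_2` are hypothetical.

```python
import math
from fractions import Fraction
from typing import Callable

Name = Callable[[int], Fraction]   # n -> multiple of 2^-n within 2^-n of the real (assumed)

def dyadic_name(x: Fraction) -> Name:
    """A name of x obtained by truncating x to n binary digits."""
    def name(n: int) -> Fraction:
        return Fraction(math.floor(x * 2**n), 2**n)
    return name

def add_with_delay_2(x_name: Name, y_name: Name, n: int) -> Fraction:
    """Approximate x + y to within 2^-n using the inputs at precision n + 2 only."""
    s = x_name(n + 2) + y_name(n + 2)        # input error at most 2 * 2^-(n+2) = 2^-(n+1)
    return Fraction(round(s * 2**n), 2**n)   # rounding to the 2^-n grid adds at most 2^-(n+1)

x, y = Fraction(1, 3), Fraction(2, 7)
x_name, y_name = dyadic_name(x), dyadic_name(y)
errors = [abs(add_with_delay_2(x_name, y_name, n) - (x + y)) for n in range(12)]
```

Exact rational arithmetic is used so that the error bound can be compared without floating-point noise.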
The intuition is that f^{α↾u(i)}(i) is expected to compute correctly only if α is an initial segment of a graph G. This is a global property; see Convention 3.3. In other words, h is allowed to use any presentation of some fixed G as its (online) oracle.

Example 4.5. To see how much extra computational power algebraic oracles can give, consider the following example. Let X be an arbitrary subset of N, and define A(X) to be an algebraic structure in the language of one unary function s, one unary predicate p, and one constant o, and which has the following isomorphism type. When restricted to s and o, it is just N with s(x) = x + 1 and o interpreted as 0. Now define p(x) ⇐⇒ x ∈ X. Given any presentation α of A(X), we can decide X. So, in particular, computation from an isomorphism type is potentially as powerful as just the usual oracle computation.
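The decoding in the example can be sketched directly. A presentation is modelled here as a triple (o, s, p) of a constant and two Python functions on N, which is an assumption of this sketch rather than the string encoding used in the text; the scrambling bijection `pi` and the choice of X as the multiples of 3 are purely illustrative.

```python
from typing import Callable, Tuple

Presentation = Tuple[int, Callable[[int], int], Callable[[int], bool]]   # (o, s, p)

def decide(presentation: Presentation, n: int) -> bool:
    """Decide whether n is in X using only the presentation: walk n successor steps from o."""
    o, s, p = presentation
    a = o
    for _ in range(n):
        a = s(a)
    return p(a)

def pi(n: int) -> int:
    """A bijection of N swapping 2k and 2k + 1; it is its own inverse."""
    return n ^ 1

def scrambled_presentation(in_X: Callable[[int], bool]) -> Presentation:
    """A copy of A(X) on domain N in which the element pi(n) plays the role of n."""
    o = pi(0)
    s = lambda a: pi(pi(a) + 1)
    p = lambda a: in_X(pi(a))
    return o, s, p

in_X = lambda n: n % 3 == 0          # stand-in for an arbitrary X
alpha = scrambled_presentation(in_X)
decoded = [decide(alpha, n) for n in range(20)]
```

The procedure never inspects how the presentation was built, only its constant, successor and predicate, so it works uniformly for every copy of A(X).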
In view of the example above, the reader will likely find the theorem below unexpected. Its proof is however not difficult; it can be viewed as a variation of an argument in Kalimullin, Melnikov, and Montalbán [58].
Proof. By Ramsey's theorem, G either has an infinite clique or an infinite anti-clique; without loss of generality, suppose it is a clique. Since g(i) = f^{α↾u(i)}(i), where α is any representation of G, we can assume that the first u(i) bits of α describe a clique. Since the space of all presentations of G is primitively recursively branching, the use u is primitive recursive (see Lemma 3.5). Thus, the oracle can be completely suppressed and the trivial description of an infinite clique can be incorporated into a new procedure f_0 which does not use any oracle. On input i the procedure produces a string of length u(i) which describes a finite clique, and then refers to this finite string (viewed as a partial function) whenever it needs to use the characteristic function of the oracle. This procedure is easily seen to be primitive recursive (as a function).
Informally, the result says that, from the perspective of online computation, graphs cannot code any non-trivial information into their isomorphism type; i.e., up to a change of their presentation. Both the theorem above and the main result in [34] imply that graphs are not universal for punctual computability, a notion which we will not formally define here (see [8]). See Kalimullin, Melnikov, and Montalbán [58] for a generalisation of Theorem 4.6 to structures in an arbitrary finite relational language.
Footnote 6: The point is that care is needed with which representations are allowed. Polynomial time functionals for (0, 1) typically use the so-called signed digit representation, but even for R there is some problem with the notion of the size of the input as discussed in, for instance, [63]. However, for any reasonable representation of graphs of size n this becomes relatively straightforward using, e.g., the standard matrix representation as in [44].

4.2. Interactions with punctual structure theory. In [8] we described the foundations of online structure theory. The main objects in this theory are infinite algebraic structures in which operations and relations are primitive recursive. As we argued in [8], there are natural strong connections between this new theory and the theory of polynomial-time algebraic structures (see also Alaev [1] and Alaev and Selivanov [3]), with applications to automatic structures [9]. Earlier we argued that this kind of punctual structure theory is akin to Turing-Markov computable analysis, in that the objects are given effectively. In this paper structures themselves do not have to be primitive recursive. However, the frameworks are closely related via, e.g., Theorem 4.9 below.
A presentation of a countably infinite algebraic structure in a finite language is an isomorphic copy of the structure upon the domain N. For simplicity, we may assume that the structures in this section are all relational. In this case it becomes consistent with our framework; in particular, the space of all presentations I of a fixed structure in a finite relational language is primitively recursively branching.
Each such presentation α ∈ [I] can be viewed as an isomorphic copy of the structure upon the domain of N. Some of these presentations will be computable in the sense that the relations on α will be computable predicates over N. It is well-known that a structure may have non-computably isomorphic computable presentations. When we restrict ourselves to primitive recursive presentations and primitive recursive isomorphisms the situation becomes even more complex because the inverse of a primitive recursive function does not have to be primitive recursive. See [8] for a detailed exposition of the theory of punctually categorical structures.
The following notion is not restricted to primitive recursive presentations. A more general version of the definition below was first discussed briefly in [60] and then also mentioned in [59]. An even more general model-theoretic version of the definition can be found in [58].
Definition 4.8. A structure G is strongly online categorical if there is an online strict operator f which, on input α and β, arbitrary representations of G, outputs an isomorphism from α onto β.
In other words, there exists a primitive recursive functional f^{α;β} with both uses being the identity function, such that the associated function h(i) = f^{α↾i;β↾i} (whose output is interpreted as a natural number) induces an isomorphism from α onto β; recall the latter two are isomorphic copies of G upon the domain N. Equivalently, we could replace the functional by a primitive recursive function of three inputs σ, τ, i where |σ| = |τ| = i and finite strings are identified with their indices (under some fixed natural enumeration).
The theorem below can be viewed as a variation of another result of Kalimullin, Melnikov, and Montalbán [58] on punctual categoricity, but in our strongly online case the proof will be significantly simpler. Recall that a structure G is homogeneous if for any tuple x̄ in G and any pair of elements y, z ∈ G, we have that y is automorphic to z over x̄.
Proof. Each homogeneous structure is trivially strongly online categorical. Now suppose G is strongly online categorical. Suppose the structure is not homogeneous, and let x̄ be shortest (of length n) such that for some z, y we have that z is not in the same automorphism orbit as y over x̄. Construct α and β as follows. First, copy x̄ into both and calculate the online isomorphism f from α↾n to β↾n. If we identify α↾n and β↾n with x̄, then f induces a permutation of β↾n; by the choice of n any permutation of x̄ can be extended to an automorphism of the whole structure. Adjoin z to α and find a y′ which plays the role of y over β↾n under any automorphism extending the permutation β↾n ↔ f(α↾n). Then necessarily f(z) = f(y′), because f has already shown its computation on the first n bits. However, by the choice of z and y′, f cannot be extended to an isomorphism no matter how we extend the presentations further.
Note that we used only totality of the strict functional in the proof. In the case when the language has functional symbols the theorem no longer holds. Of course, the notion of strongly online and of a presentation will have to be adjusted. But regardless, strong homogeneity will no longer capture the property (whatever it may be exactly). . According to any reasonable definition of (strong) online categoricity for functional structures, this structure has to be (strongly) online categorical. However, it is not homogeneous.
We leave open: Problem 4.11. Is it possible to find a reasonable algebraic description of (strongly) online categorical algebraic structures in an arbitrary finite language?
We suspect that such a description exists, and that the solution will likely boil down to setting the definitions right. If we replace strict with primitive recursive operators in Definition 4.8 we will obtain the more general notion of (uniform) online categoricity. With quite a bit of effort Theorem 4.9 can be extended [58] to this more general notion, and even beyond.

5. Weihrauch reduction and online algorithms
Weihrauch reduction is one of the central notions in computable analysis. It was named by Brattka and Gherardi [14]. Weihrauch reducibility can be viewed as a natural generalisation of computable Wadge reducibility [95]. Henceforth we will use f ≤_W g to denote Weihrauch reducibility. The relation f ≤_W g has the following intuition. We have some problem we wish to solve by computing an instance f(x) of some function f. To do this we produce another instance x′ and solve g(x′) for g, and then convert g(x′) back to f(x). In more detail, for functions f and g on ω^ω-represented spaces X and Y, f ≤_W g is defined to mean that there are computable A and B on ω^ω such that for any p_x, and any representation G of g, A(p_x, G(B(p_x))) realizes f (i.e., is a name for f(x)). (This is defined here for single-valued functions, but does have a multi-valued version we won't need.) This should be thought of as follows for the archetypal case of a computable metric space. We take a Cauchy sequence converging to x, use B to convert this into one converging to B(x), and hence obtain one converging to g(B(x)); finally, from the sequence converging to x together with this one, we obtain one converging to A(x, g(B(x))). The definition has a number of natural variations; some of these will be discussed below.

5.1. Weihrauch reduction and incremental computation.
In this subsection we establish a formal connection between computable analysis and computer science. More specifically, we show that a version of Weihrauch reduction borrowed from computable analysis [97] is equivalent to incremental reduction between online problems suggested in Miltersen et al. [76]. We first state Weihrauch reductions in the online setting. Suppose P, Q are online problems.
Definition 5.1. We say that P is strongly Weihrauch reducible to Q, written P ≤_sW Q, if there exist Turing functionals Φ and Ψ such that, whenever σ ∈ I_P is an instance of P, Φ^σ = τ ∈ I_Q is an instance of Q, and whenever ρ ∈ s(Φ^σ) is a solution to Φ^σ then θ = Ψ^ρ ∈ s(σ) is a solution to σ.
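The shape of a strong reduction, a forward map on instances followed by a backward map on solutions that does not see the original instance, can be sketched on two toy problems. Both problems, the maps `phi` and `psi`, and the solver `solve_Q` are hypothetical stand-ins invented for this illustration: P asks, for each prefix of a bit string, whether a 1 has occurred; Q asks the same about the letter b in a string over {a, b}.

```python
from itertools import product
from typing import List

def phi(sigma: str) -> str:
    """Forward functional: turn an instance of P into an instance of Q of the same length."""
    return "".join("b" if c == "1" else "a" for c in sigma)

def psi(rho: List[int]) -> List[int]:
    """Backward functional: here solutions of Q are already solutions of P."""
    return rho

def solve_Q(tau: str) -> List[int]:
    """Some online solver for Q: bit n says whether 'b' occurs among the first n + 1 letters."""
    seen, out = False, []
    for ch in tau:
        seen = seen or ch == "b"
        out.append(1 if seen else 0)
    return out

def solve_P_via_Q(sigma: str) -> List[int]:
    """Solve P by composing the reduction with a solver for Q."""
    return psi(solve_Q(phi(sigma)))

def admissible_P(sigma: str, theta: List[int]) -> bool:
    """theta is the unique admissible solution of sigma for P."""
    return theta == [1 if "1" in sigma[: n + 1] else 0 for n in range(len(sigma))]

instances = ["".join(bits) for k in range(5) for bits in product("01", repeat=k)]
all_correct = all(admissible_P(s, solve_P_via_Q(s)) for s in instances)
```

Both maps are length preserving and process their input symbol by symbol, which is the strictness the surrounding discussion is concerned with.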
Here the reduction is strong in the sense that there is a provably more general definition of (plain) Weihrauch reduction which will be given in due course. Note that, according to the definition above, all functionals involved are strict, but this condition can be relaxed giving a less tight reduction.
Notation 5.2. We write P ≤^C_sW Q if both strict functionals (in our sense) Φ and Ψ in the definitions above belong to a complexity class C having sufficiently strong closure properties (e.g., polynomial-time, polylogspace, primitive recursive, etc.).
Remark 5.3. The reader might wonder why we will restrict ourselves to strict-type reductions, or slight variations, for the online setting. The reason is the following. Suppose that we have two (represented) online problems I_1 and I_2. In an online way we want to use I_2 to solve I_1. Now suppose that we have some online algorithm for I_2. We could take a σ of length n representing an instance G_n of height n of I_1, and convert it into an instance σ′ of I_2, and use it to produce a solution s(σ′) of I_2, which could be converted back into a solution s(σ) = A(s(σ′)) of I_1. The key issue we will investigate is how tight the relationships between the sizes of the representations are. Ideally |σ′| = |σ|.
and s is merely a predicate on I. This is the same as to say that any solution simply decides whether a predicate holds on a string or not. We say that σ ∈ I is a positive instance of I if s(σ) = 1. Miltersen et al. [76] analysed complexity classes for online algorithms, and in a slightly more general situation than our monotone one where, for example, the objects only get bigger. Miltersen et al. [76] investigate online algorithms in which input data may change with time. For example, in a graph a vertex or an edge can disappear. Their reduction takes into account the potential changes of the input.
Definition 5.4. Let C be a complexity class. A decision problem P is C-incrementally reducible to another decision problem R, denoted P ≤^C_incr R, if the following two conditions hold:
(1) There is a transformation T : I_P → I_R in C which maps instances of P to instances of R such that s_P(σ) = s_R(T(σ)) (i.e., σ is a positive instance iff its image is a positive instance).
(2) There is a transformation Q in C which, given σ ∈ I_P and the incremental change δ to σ, where δ changes σ to σ′ of the same length, constructs the incremental change δ′ to T(σ) (where δ′ changes T(σ) to T(σ′)).
Remark 5.5. We will here only consider C to be the class of polynomial time computable functions, and hence use ≤^P_incr accordingly. Miltersen et al. [76] also considered, e.g., C to be Logspace. In [76] the authors specify the exact time bounds for all computations involved. This is the reason why they need the seemingly redundant part (2) of the definition above. Also, they look at auxiliary data structures generated for each instance and at the changes induced to the structure. However, from the perspective of general (e.g.) polynomial time computation this extra information is not necessary since these auxiliary bounds are evidently polynomial time.
The proposition below shows that P ≤^P_incr Q is a variation of Weihrauch reduction from computable analysis which was independently rediscovered by computer scientists. We note that complexity restricted versions of Weihrauch reducibility were first introduced and studied by Kawamura and Cook in [63]. Recall that strong Weihrauch reduction is witnessed by a pair of functionals Φ and Ψ.
Fact 5.6. Suppose P and Q are online decision problems. Then P ≤^P_incr Q iff P ≤^P_sW Q with Ψ = Id_{0,1}.
Proof. Suppose P ≤^P_incr Q. Then the transformation T from the definition of incremental reduction can be used as Φ in the definition of ≤^P_sW. Since σ is a positive instance iff T(σ) is, Ψ = Id_{0,1}.
Conversely, suppose P ≤^P_sW Q via (Φ, Id_{0,1}), where Φ is a polynomial-time functional from the space of inputs I_P of P to the space of inputs I_Q of Q. Then the first part of the definition of incremental reduction follows from the assumption that Φ is a functional in C. By the continuity of Φ and the fact that we used Id as the second functional, it suffices to deduce a polynomial time bound on the changes in the inputs of Φ(σ) based on the changes in σ. But this bound is just a big-O of the bound given by Φ.
Following [76], we can impose specific bounds on the number of steps required for, say, calculating δ′ based on δ. The expectation is that it should be easier to make the change than to simply recompute T(σ′) "from scratch". All these specialised bounds can also be expressed in terms of strong Weihrauch reduction; we omit details. As an application of Fact 5.6 and various results in [76], we can obtain a number of polynomial time and polylogtime Weihrauch reductions in the study of online algorithms.

5.2. Weihrauch reduction and online graph colouring. Before we discuss the role of Weihrauch reduction in the online colouring problem we give a brief overview of the latter.

5.2.1. Online graph colouring. Many problems can be re-cast as colouring problems, for example Bin Packing. Indeed, colouring can be thought of as avoiding configurations. In basic graph colouring, we are simply avoiding an edge connecting vertices of the same colour, but we could instead avoid, for example, triangles or any finite set of configurations in some kind of constraint satisfaction problem. However, as this is an introductory paper we will stick to basic graph colouring. There is a large literature on this area such as Kierstead [67]. Graph colouring is quite a flexible tool, and many algorithmic meta-theorems such as for monadic second order logic (like Courcelle's Theorem (see [32,50])) can be viewed as colouring with constraints. We believe that this material has great online potential.
We will mention some of this in this subsection. As an illustrative example, we will online colour finite or infinite trees and forests. So the objects of interest are forests being enumerated one vertex at a time. Along with the set of vertices the enumeration will also need to include the adjacency relation amongst the vertices already enumerated. In other words, at the (n + 1)-th step the enumeration will provide us an index for the (n + 1)-th vertex v n as well as a finite binary string encoding whether v n v j ∈ E(G) for each j < n. To represent the space of all enumerations of a finite graph with n vertices, we can use a finite branching tree of height n; and so to represent the space of enumerations of all (finite or infinite) graphs we can use a representation δ with domain a compact subset T ⊂ ω ω .
The result below is an easy folklore result, restated in our notation, which essentially follows from Bean [10]. It works for any (not necessarily primitive recursive) total online procedure.
Proposition 5.7. For every online algorithm A there is a σ ∈ T of length 2^{t−1} such that the respective graph δ(σ) cannot be coloured by A in fewer than t colours.
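One standard adversary argument behind bounds of this kind can be sketched as follows; it is offered as an illustration under stated assumptions and is not claimed to be the proof intended in the text. The adversary recursively builds trees on which 1, 2, ..., k − 1 colours appear, picks from each a vertex whose colour has not been picked yet, and joins the picked vertices to one new vertex, which is then forced to take a k-th colour. The class `Arena`, the function `force` and the sample algorithm `first_fit` are hypothetical names.

```python
from typing import Callable, List

Algorithm = Callable[[List[int], List[int]], int]   # (colours so far, earlier neighbours) -> colour

class Arena:
    """Presents a forest one vertex at a time to an online colouring algorithm."""
    def __init__(self, algorithm: Algorithm) -> None:
        self.algorithm = algorithm
        self.colour: List[int] = []

    def add_vertex(self, earlier_nbrs: List[int]) -> int:
        c = self.algorithm(self.colour, earlier_nbrs)
        assert all(c != self.colour[u] for u in earlier_nbrs), "colouring must be proper"
        self.colour.append(c)
        return len(self.colour) - 1

def force(arena: Arena, k: int) -> List[int]:
    """Build a new tree on 2**(k-1) vertices on which at least k colours appear."""
    if k == 1:
        return [arena.add_vertex([])]
    picked: List[int] = []
    picked_colours = set()
    vertices: List[int] = []
    for i in range(1, k):
        tree = force(arena, i)   # at least i colours appear, only i - 1 are picked so far
        v = next(u for u in tree if arena.colour[u] not in picked_colours)
        picked.append(v)
        picked_colours.add(arena.colour[v])
        vertices += tree
    root = arena.add_vertex(picked)   # adjacent to k - 1 distinctly coloured vertices
    return vertices + [root]

def first_fit(colours: List[int], earlier_nbrs: List[int]) -> int:
    taken = {colours[u] for u in earlier_nbrs}
    c = 0
    while c in taken:
        c += 1
    return c

arena = Arena(first_fit)
tree = force(arena, 5)
colours_used = {arena.colour[v] for v in tree}
```

The adversary is adaptive: which vertex it picks from each subtree depends on the colours the algorithm has already committed to, so the same code can be run against any algorithm with the `Algorithm` signature.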
We will write χ_A(G_σ) for the number of colours used to colour the graph G_σ when processed by the online algorithm A. The above is nearly optimal, in that we have the following:
Theorem 5.8 (Lovász, Saks and Trotter [73]). There exists an online algorithm A such that for every 2-colourable graph G, if G has n vertices then χ_A(G) ≤ 1 + 2 log n.
This brings us to the notion of a performance ratio. Most algorithms taught in a standard combinatorics class are offline. This means that given a finite structure H as input, the offline algorithm is allowed to read the whole of H before performing its calculations and giving the output. This is in contrast to an online algorithm, which must produce the next bit of the output after scanning the next bit of the (encoding of) the input H. This clearly puts the online algorithm at a disadvantageous position, for an input H can hide critical information in its global structure which an offline algorithm (but not an online one) can see before beginning to write the output. This motivates the definition below where we compare how much an online algorithm is disadvantaged compared to an offline one.
Consider the situation of an inductive problem in a class C, and suppose we have an optimisation problem. Then associated with $\sigma \in I$ and $\tau \in S$ will be a cost function $c(\sigma, \tau)$, measuring the cost of solution $\tau$ for problem instance $\sigma$, where the cost is a certain metric used to judge how good a solution to a problem is. The performance ratio of an algorithm $f$ is then the ratio of $c(\sigma, f(\sigma))$ to $c(\sigma, o(\sigma))$, where $o(\sigma)$ is an optimal solution for $\sigma$, meaning the offline solution.
We illustrate this with colourings. In this case, the problem would be a graph, the solution would be a colouring of the graph, and the cost of a solution would be the number of colours used by the colouring; the fewer colours used by a solution, the better it is. Thus the cost of an optimal solution to a given problem $G$ is the offline chromatic number of $G$. The offline chromatic number of a graph $G$ will be denoted by $\chi_{\mathrm{off}}(G)$, and for forests (with at least one edge) we have $\chi_{\mathrm{off}}(G) = 2$, as it is well-known that trees and forests can be offline coloured in two colours.
Definition 5.9 (Sleator and Tarjan [85]). The performance ratio of an algorithm $A$ in a represented space is defined to be

$$r(\sigma) = \frac{\chi_A(G_\sigma)}{\chi_{\mathrm{off}}(G_\sigma)}.$$

Here we are stating the definition for graph colouring, but the definition applies to any online optimisation problem, as above. In the case of colouring forests, we see that the approximation ratio is $O(\log |\sigma|)$. In the infinite case, the relevant approximation ratio is the growth rate of $r(\sigma)$ along all paths in the tree $T$ representing the problem.
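To make the colouring case concrete, the following is a minimal sketch of first fit on an enumerated forest, together with a standard adversarial enumeration on $2^{t-1}$ vertices that forces first fit to use $t$ colours. It only illustrates the lower bound for this one algorithm, not for arbitrary online algorithms. The input format (a list of earlier neighbours per vertex, rather than a binary adjacency string) and the names `first_fit`, `forcing_tree` and `performance_ratio` are assumptions made for illustration.

```python
from typing import List


def first_fit(rows: List[List[int]]) -> List[int]:
    """Colour vertices online; rows[n] lists the earlier neighbours of vertex n."""
    colours: List[int] = []
    for earlier_neighbours in rows:
        used = {colours[j] for j in earlier_neighbours}
        c = 1
        while c in used:  # smallest colour not used by an earlier neighbour
            c += 1
        colours.append(c)
    return colours


def forcing_tree(t: int) -> List[List[int]]:
    """Enumerate a tree on 2**(t-1) vertices on which first fit uses t colours."""
    rows: List[List[int]] = []

    def build(k: int) -> int:
        # Build disjoint trees whose roots receive colours 1..k-1,
        # then attach a new root to all of those roots.
        roots = [build(j) for j in range(1, k)]
        rows.append(roots)
        return len(rows) - 1

    build(t)
    return rows


def performance_ratio(rows: List[List[int]], chi_off: int = 2) -> float:
    """Colours used online divided by the offline chromatic number (assumed 2)."""
    return max(first_fit(rows)) / chi_off
```

For instance, `forcing_tree(5)` has 16 vertices and first fit uses 5 colours on it, so the ratio against the offline value 2 is 2.5, in line with the logarithmic growth discussed above.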
A graph $G$ is called $d$-inductive (or $d$-degenerate) if the vertices of $G$ can be ordered as $\{v_1, \dots, v_n\}$ so that for every $i \le n$, $|\{j > i \mid v_i v_j \in E\}| \le d$. For example, by Euler's formula, all planar graphs are 5-inductive. For those who are familiar with graph theory, $d$-inductive graphs also include all graphs of treewidth $d$, an extremely important class in algorithmic graph theory (see Downey and Fellows [32], for example). Again note that $d$-inductive graphs have a compact representation.
Theorem 5.10 (Irani [56,57]). Let $\sigma$ represent a $d$-inductive graph of height $n$. Then first fit will use at most $O(d \log n)$ many colours to colour $G_\sigma$. Moreover, for any online algorithm $A$, there is a $d$-inductive $G_\sigma$ such that $\chi_A(G_\sigma)$ is $\Omega(d \log n)$.
Sometimes, this growth rate reaches a limit, as in problems with constant approximation ratios.
The classical example is Bin Packing, which can be viewed as a graph colouring problem. We can think of bins as colours and of the objects as having sizes, the constraint being that the total size of the objects of a specific colour cannot exceed the bin size. That is, Bin Packing takes as input sizes $a_i \in \mathbb{N}$ and a parameter $V$ (representing each bin size), and assigns colours $c(a_i)$ subject to $\sum_{c(a_i) = c} a_i \le V$ for each colour $c$. Here we seek to minimize the number of colours (i.e., the number of bins used).
Notice that Bin Packing is another example of colouring with constraints.
Theorem 5.11 (see [44]). First fit gives a performance ratio of 2 for online Bin Packing.
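As an illustration of the first fit rule for online Bin Packing, here is a minimal sketch. Items are processed strictly in the order given, and each is placed in the lowest-numbered bin that still has room. The function names `first_fit_bins` and `size_lower_bound` are hypothetical, and the lower bound shown is only the trivial one obtained from the total size, not the true offline optimum.

```python
from typing import List


def first_fit_bins(sizes: List[int], capacity: int) -> List[int]:
    """Return, for each item in arrival order, the index of the bin it is put in."""
    loads: List[int] = []       # current total size in each open bin
    assignment: List[int] = []  # bin index ("colour") chosen for each item
    for a in sizes:
        if a > capacity:
            raise ValueError("item larger than the bin size")
        for b, load in enumerate(loads):
            if load + a <= capacity:  # first bin with enough room
                loads[b] += a
                assignment.append(b)
                break
        else:  # no open bin fits: open a new one
            loads.append(a)
            assignment.append(len(loads) - 1)
    return assignment


def size_lower_bound(sizes: List[int], capacity: int) -> int:
    """Trivial lower bound on the number of bins: ceil(total size / capacity)."""
    return -(-sum(sizes) // capacity)
```

On the sequence `[3, 3, 3, 3, 7, 7, 7, 7]` with bin size 10, first fit opens 5 bins, whereas pairing each 3 with a 7 offline needs only 4, so the online cost stays within the factor 2 stated above.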

5.2.2. Online reduction. In this subsection we define a new version of Weihrauch reduction, and we also give a non-trivial example of such a reduction between two distinct online problems. Let $X$ and $Y$ be spaces represented by names in $2^\omega$ (for convenience). Again we think of $f$ and $g$ as being solutions for minimisation problems corresponding to $X$ and $Y$, respectively. Thus, for example, we are thinking of $X$ and $Y$ as inductive structures with filtrations $\{X_n \mid n \in \mathbb{N}\}$ and $\{Y_n \mid n \in \mathbb{N}\}$ respectively. Then the strings of length $n$ represent the structures of height $n$, and $f(\sigma)$ will represent a solution to the problem represented by $\sigma$. Each such solution has an associated cost, denoted $c(\cdot)$, which in the case of graph colouring is the number of colours used so far. We will write $f_{\mathrm{off}}$ and $g_{\mathrm{off}}$ for the offline solutions. That is, $f_{\mathrm{off}}(\sigma)$ would be the solution to the minimisation problem $X_n$ of height $n$ with $\delta(\sigma) = X_n$, and similarly for $g_{\mathrm{off}}$. We state the definition below for single-valued functions, but again there is an analogous multivalued version, where the solution produced for $g$ should be within the correct ratio. The idea of the following is that on input $\alpha \upharpoonright n$, we want to compute (a representation of) $f(\alpha \upharpoonright n)$ (see footnote 8). To do this we will apply the algorithm $B$ to generate (a representation of) an input for $g$, and then use the algorithm $A$ to translate the result back to give $f(\alpha \upharpoonright n)$. Again we emphasise that this is all working with representations, and should be read this way.
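Definition 5.12. Let $f, g$ be functions on $2^\omega$. Then $f$ is called ratio preserving online reducible to $g$, written $f \le^r_O g$, if there are (type II) online computable functions $A$ and $B$ and a constant $d$ such that for all $n$, $f(\alpha \upharpoonright n) = A(\alpha \upharpoonright n, g(B(\alpha \upharpoonright n)))$, and the ratio of $c(f(\alpha \upharpoonright n))$ to $c(f_{\mathrm{off}}(\alpha \upharpoonright n))$ is at most $d$ times the ratio of $c(g(B(\alpha \upharpoonright n)))$ to $c(g_{\mathrm{off}}(B(\alpha \upharpoonright n)))$.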
The fact below isolates the most important feature of the reduction.
To give a non-trivial example of an online reduction we need several definitions. In classical colouring, Kierstead investigated online colouring of interval graphs. A graph $G = (V, E)$ is called a $k$-interval graph if each vertex $v$ of $G$ can be represented by a closed subinterval $I_v$ of $[0, 1]$ such that $vw \in E$ if and only if $I_v \cap I_w \ne \emptyset$, and such that the largest number of intersecting intervals (the cutwidth) is at most $k$. These are exactly the graphs which have pathwidth $\le k$, a graph metric coming from the Robertson-Seymour minors project (see [82,32]).
Definition 5.14. Let $\mathrm{ColInt}_k$ denote the online problem of colouring a $k$-interval graph. (We leave the precise representation of the problem to the reader.)

The other online problem concerns covering an interval partial ordering by chains. A partial ordering $(P, \le)$ is called an interval ordering if $P$ is isomorphic to $(I, \le)$ where $I$ is a set of intervals of the real line and $x \le y$ iff the right endpoint of $x$ is left of the left endpoint of $y$. Interval orderings can be characterised as follows: by a classical theorem of Fishburn, a partial ordering is an interval ordering if and only if it has no subordering isomorphic to $2 + 2$. The width of an interval ordering $(P, \le)$ is defined naturally to be the minimum over all presentations of the maximum number of intervals covering some point of $[0, 1]$. Given an interval ordering $(P, \le)$ of width $k$, our goal is to cover it with as few chains as possible; the chains do not have to be disjoint. Recall that a chain in a partial ordering is a $\le$-linearly ordered subset. A collection of chains $\{C_1, \dots, C_q\}$ covers $(P, \le)$ if each element of $P$ lies in one of the chains. An antichain is a collection of pairwise $\le$-incomparable elements.

Footnote 8. We want to avoid explicit representations, but of course we should have $F$ representing $f$ with $F$ acting on $2^\omega$, and for any $\alpha \in 2^\omega$, $\lim_n F(\alpha \upharpoonright n)$ realizes (represents) $f(\alpha)$.
Definition 5.16. Let $\mathrm{ChInt}_k$ denote the online problem of covering an interval ordering $(P, \le)$ of width $k$ by chains (which are not necessarily disjoint). We leave the precise representation of the problem to the reader.
The theorem below gives a non-trivial example of an online ratio-preserving reduction between online problems. The proof of the theorem below is essentially an analysis of the clever argument given in Kierstead and Trotter [68].
Theorem 5.17. For any positive $k \in \mathbb{N}$ there is an online solution $g$ to $\mathrm{ChInt}_k$ with a constant performance ratio which can be transformed into an online solution $f$ to $\mathrm{ColInt}_k$ with the property $f \le^r_O g$ via a constant $d = 1$.

Corollary 5.18 (Kierstead and Trotter [68]). There is an online algorithm to colour $k$-interval graphs with a constant competitive ratio.
Proof of Corollary. Kierstead and Trotter [68] showed that every online interval ordering of width $k$ can be online covered by $3k - 2$ many chains. Since $\mathrm{ColInt}_k \le^r_O \mathrm{ChInt}_k$, witnessed via a reduction with constant $d = 1$, it remains to apply Fact 5.13.
Proof of Theorem 5.17. The basic idea is quite simple. Take our online k interval graph, turn it into an online interval ordering of width k, and then consider that chain covering as a colouring. However, to see that this idea works, we need to argue that there is an online solution g to the interval chain covering problem which uses only the information about comparability of various elements, and not their ordering.
We first prove the following. Suppose that $(P, \le)$ is an online interval ordering of width $k$. Then $P$ can be online covered by $3k - 2$ many chains. We need the following lemma, whose proof is fairly straightforward. For a poset $P$ and subsets $S, T$, we define $S \le T$ iff for each $x \in S$ there is some $y \in T$ with $x \le y$. (Similarly $S|T$ etc.)

Lemma 5.19. If $P$ is an interval order and $S, T \subset P$ are maximal antichains then either $S \le T$ or $T \le S$.
The algorithm for chain covering uses induction on $k$. We consider the vertices as $1, 2, \dots$ with $p$ added at step $p$. If $k = 1$ then $P$ is a chain, and there is nothing to prove. Suppose the result holds for $k$, and consider $k + 1$. We define $B$ inductively by

$$p \in B \iff \mathrm{width}(B_p \cup \{p\}) \le k.$$

Here $B_p$ denotes the amount of $B$ constructed by step $p$ of the online algorithm. Then $B$ is a maximal subordering of $P$ of width $k$. By the inductive hypothesis the algorithm will have covered $B$ by $3k - 2$ chains. Let $A = P - B$. Now it will suffice to show that $A$ can be covered by 3 chains, and then these will be covered by the greedy algorithm.
To see this it is enough to show that every element of $A$ is incomparable with at most two other elements of $A$. Then the greedy algorithm will cover $A$ by 3 chains as it sees the elements not in $B$.
Lemma 5.20. The width of $A$ is at most 2.

Proof. To see this, consider 3 elements $q, r, s \in A$. Then there are antichains $Q, R, S$ in $B$ of size $k$ with $q|Q$, $r|R$ and $s|S$. Moreover these can be taken as maximal antichains. Applying Lemma 5.19, we might as well suppose $Q \le R \le S$. Suppose that $r|q$ and $r|s$. Then we prove that $q < s$. Since $q|r$ and $\mathrm{width}(P) \le k + 1$, there is some $r' \in R$ with $q$ and $r'$ comparable. Since $q|Q$, $r' \notin Q$. Since the width of $B$ is $\le k$, there is some $q' \in Q$ with $q'$ and $r'$ comparable. Since $Q \le R$, there is some $r_0 \in R$ with $q' \le r_0$. Since $R$ is an antichain, $q' \le r'$. Since $q|q'$, $q \le r'$. Similarly, there exists $r'' \in R$ with $r'' \le s$. Since $P$ does not have any subordering isomorphic to $2 + 2$, we can choose $r' = r''$, and hence $q < s$.

Now we suppose that $r, q, s, t$ are distinct elements of $A$ with $q|\{r, s, t\}$. Then without loss of generality $r < s < t$, since the width of $A$ is at most 2. Since $s \in A$ there is an antichain $S \subset B$ of length $k$ with $s|S$. Since $s|q$ and $\mathrm{width}(P) \le k + 1$, $q$ is comparable with some element $s' \in S$. If $s' < q$, then $s'|r$ and hence the suborder $\{s', q, r, s\}$ is isomorphic to $2 + 2$. Similarly, $q < s'$ implies $s'|t$, and then the subordering $\{q, s', s, t\}$ is isomorphic to $2 + 2$. Thus there cannot be 4 elements $r, q, s, t$ of $A$ with $q|\{r, s, t\}$. Hence $A$ can be covered by 3 chains.
It is easily seen that the procedure above uses only comparability of intervals. Thus, the theorem follows.
Problem 5.21. Investigate online reduction between online algorithms in the literature.
We also expect that the online reduction may lead to new online algorithms based on the already existing ones.
Also graphs with constrained decompositions such as those of bounded treewidth, pathwidth, clique-width, etc have been extensively studied in the literature, and particularly combine well with algorithmic meta-theorems (see e.g. Downey-Fellows [32], Flum and Grohe [42], Grohe [50] for a sample).
One example is given by the $k$-interval graphs met above, which are those of pathwidth $\le k$. A graph $G$ of pathwidth $k$ has a path decomposition, which is a collection of sets of vertices $V_1, \dots, V_n$, all of size $\le k + 1$, such that for all vertices $v \in V(G)$ there is at least one $i$ with $v \in V_i$; if $xy \in E(G)$, then for some $i$, $\{x, y\} \subseteq V_i$; and finally, if $x \in V_i$ and $x \in V_j$ (with $i < j$) then for all $q \in [i, j]$, $x \in V_q$. The last property is called the interpolation property, and says that pathwidth is a kind of measure of how far you are from being either a grid or a clique. Now given such a path decomposition, and some optimisation problem we want to solve (such as finding the largest clique), if the property is definable in monadic second order logic (even with counting), then we can solve the problem by dynamic programming (actually using special automata) beginning at $V_1$ and finishing at $V_n$ by the methods of Courcelle [32,42,50].

Problem 5.22. Investigate the extent to which this dynamic programming is online. Presumably, it will be online for properties defined by monadic second order counting logic with counting modulo some kind of delay.
Moreover, as we have seen above for the special case of colouring, we get a constant ratio approximation algorithm for a graph of pathwidth $k$, no matter how we are given the online presentation. The difference is that if we are given a path decomposition as the presentation, then $k + 1$ colours will suffice. But perhaps the methods for colouring are more general. The point is that graphs of bounded pathwidth have very constrained structure.

Problem 5.23. Investigate the approximability of monadic second order definable properties on graphs of bounded pathwidth, but given as arbitrary online presentations.
The same can be asked for graphs of bounded treewidth which has the same definition as pathwidth, but the structure of the decomposition is a tree and not a path. These also have dynamic programming algorithms, but are always leaf to root, whereas even given a tree decomposition as an online root to leaf structure, presumably some kind of algorithm will work, but it will no longer be automatic. This seems a great area to pursue.
Also related seems the idea of online parameterised problems [36,37], where we want an online solution to a problem with a fixed parameter. For example, $k$-Vertex Cover asks for a collection of at most $k$ vertices such that each edge of a graph includes at least one of them, and this is solvable in polynomial time for a fixed $k$. It is also online polynomial time for a fixed $k$ (so long as we are allowed $2^k$ many possible solutions) by the following simple method: build a search tree of height $k$ (with at most $2^k$ leaves) by taking an edge, branching on its two endpoints, deleting the covered edges, and repeating. This process is also online. In [36,37] Downey and McCartin showed that the online view brings to the fore other parameters, such as what they call persistence, which characterises the extent to which a path decomposition does not resemble a fuzzy ball. The point is that online algorithms point at new parameters of a problem which deserve attention, in the same way that parameterised complexity showed that parameters allow a more fine grained understanding of the computational complexity of a combinatorial problem.
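The branching method for $k$-Vertex Cover can be sketched as follows. This is the standard bounded search tree, written offline for simplicity: it picks the first remaining edge, branches on which endpoint joins the cover, and recurses with budget $k - 1$. The representation of edges as pairs of integers and the name `vertex_cover` are assumptions for illustration; an online variant would instead maintain all open branches as edges arrive.

```python
from typing import List, Optional, Set, Tuple

Edge = Tuple[int, int]


def vertex_cover(edges: List[Edge], k: int) -> Optional[Set[int]]:
    """Return a vertex cover with at most k vertices, or None if none exists."""
    if not edges:
        return set()  # nothing left to cover
    if k == 0:
        return None   # edges remain but the budget is exhausted
    u, v = edges[0]
    for pick in (u, v):  # one of the two endpoints must be in any cover
        remaining = [e for e in edges if pick not in e]
        sub = vertex_cover(remaining, k - 1)
        if sub is not None:
            return sub | {pick}
    return None
```

The recursion depth is at most $k$ and each call branches two ways, so at most $2^k$ leaves are explored, each with a polynomial amount of work.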
6. $\Delta^0_2$ processes, finite reverse mathematics, and Weihrauch reduction

Imagine we are in a situation where the data we are dealing with is so large that we cannot see it all. At each stage $s$ our goal is to build a solution $f$ to some problem. But there might be no hope of giving a fixed solution at each stage $n$, and like a triage nurse making an ordering for patients to obtain medical attention, we would update our solution as more information becomes available. So for each $n \le s$ we would be computing $f(n, s)$ from the finite information $\sigma$ with $|\sigma| = n$. For simplicity we state the next definition for combinatorial problems with totally disconnected representations, and take $2^\omega$ as the representing example.

Definition 6.1. A limiting online algorithm on $2^\omega$ is a computable function $A$ such that for each $s$, $A(\alpha \upharpoonright s)$ computes a string $\{f_A(n, s) \mid n \le s\}$ such that $\lim_s f_A(n, s)$ exists for each $n$.
As usual we would have $A(\alpha \upharpoonright g(s))$ for the $g$-online version.
We can then compare combinatorial problems by how fast their limits converge. This gives a fine grained measure of the complexity of combinatorial problems. For example, consider the "theorem" that every finite binary tree of height $n$ has a path of length $n$. Then we can consider the existence of a uniform function $A$ which takes a given binary tree of height $n$ to a path. This is an online limit problem where the underlying space $X$ is that with nodes generated by the collection of binary trees of height $n$ at level $n$. The completion of this will represent paths through infinite binary trees.

Remark 6.3. We could argue that the Reverse Mathematics principle $\mathrm{WKL}_0$, which states that every infinite binary tree has a path, is equivalent to the statement that there is a limiting online algorithm for finding paths which works on $X$. We call this limiting online paths.
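One simple way to picture a limiting online algorithm for paths is the leftmost-node strategy sketched below. At stage $s$ it is shown the nodes of height $s$ of the tree and guesses the lexicographically least one; the value $f(n, s)$ is bit $n$ of that guess. Along an infinite tree the guesses can only move to the right, so each bit changes finitely often. The input format (one set of binary strings per level) and the names `leftmost_guess`, `run` and `mind_changes` are assumptions for illustration, not the representation used in the text.

```python
from typing import List, Set


def leftmost_guess(level: Set[str]) -> str:
    """Stage-s guess: the lexicographically least node among those of height s."""
    return min(level)


def run(levels: List[Set[str]]) -> List[str]:
    """Guesses produced stage by stage; levels[s-1] holds the nodes of height s."""
    return [leftmost_guess(level) for level in levels]


def mind_changes(guesses: List[str], n: int) -> int:
    """Number of stages at which the guessed value of bit n changes."""
    bits = [g[n] for g in guesses if len(g) > n]
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)
```

For example, on the levels `{"0","1"}`, `{"00","10","11"}`, `{"100","110"}`, `{"1000","1100"}` the branch through `00` dies at height 3, so the guess for bit 0 changes once and then stabilises.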
A binary tree $T$ of height $n$ is called separating if for each $j \le n - 1$, for any node $\sigma$ on $T$ of height $j$, and $i \in \{0, 1\}$, if $\sigma * i$ does not have an extension in $T$ of height $n$, then for all $\tau$ of length $j$, neither does $\tau * i$. Let $X_S$ be the totally disconnected space representing the collection of all separating finite trees. The following is an online interpretation and refinement of the classical fact that Weak König's Lemma is equivalent to Weak König's Lemma for separating classes.

Proposition 6.4. There is a $(2^{n+1} - 2)$-limiting online reduction which finds limiting online paths in $X$ from those in $X_S$.
Proof. We remind the reader of how this proof works. Suppose we have a tree $T_s$ of height $s$. In an online fashion, we will generate a tree $H$ of height $2^{s+1}$. This is done inductively. At step 1, we can think of the nodes labeled 0 and 1 in $T$ as being represented by 0 and 1 in $H$. At step 2, in $T$ it is possible for us to have 00, 01, 10, 11 and these are represented by 4 levels in $H$, with level 2 representing 00, level 3 representing 01, level 4 representing 10, and level 5 representing 11. Now we continue inductively. This makes level $n$ of $T$ correspond to trees of height $2 + 4 + \cdots + 2^n = 2^{n+1} - 2$.
As the construction proceeds, if some $\sigma$ fails to have an extension at length $s$ in $T_s$, there will be some shortest $\sigma' \preceq \sigma$ which fails to have a length $s$ extension in $T_s$. Then in $H_s$, we do not extend to length $s$ (from length $s - 1$) any of the paths corresponding to $\nu * j$ with $j$ representing $\sigma'$ in $H_{s-1}$.
Consider any limiting online algorithm for finding a path $\alpha$ corresponding to $H$ in $X_0$. This naturally, and in an online way, allows us from level $2^{s+1}$ to generate an online path in $T_s$, and is clearly a limiting online reduction.

Problem 6.5. Figure out the smallest $g$ in place of $2^{s+1}$ in the reduction above, which would give a precise measure of how tight the reverse mathematics relationship is.
There seems a whole research programme available here. For example, we could be given an online bipartite graph $B_\sigma$ for $\sigma \prec \alpha$. We either have to build a complete matching or demonstrate that Hall's condition fails. One representation of this problem will involve a compact space where the nodes are bipartite graphs of height $2n$, say, and where the paths all represent graphs which obey Hall's condition. The online operator will act on this compact tree of representations for graphs $B_\sigma$. Now as the process goes along, we might have to update the solution at hand. That is, the online process has $B_\sigma \to M_\sigma$.

One intriguing example is that of finding a basis in a vector space. In the case that the vector space is over the rationals, presumably this will correlate to some principle like $\mathrm{ACA}_0$. But consider a finite field such as $GF(2)$. We know that $\mathrm{RCA}_0$ proves that we can find a basis for a vector space over this field. But it is not hard to construct an online vector space over $GF(2)$ for which there is no online algorithm to do this, unless we have a computable delay. Comparing the online complexity of such problems with such computable delay would seem to give significant insight into the fine structure of reverse mathematics. In this particular case, we also note that the existence of a polynomial time algorithm for finding a basis of a polynomial time vector space was proven to be equivalent to $P = NP$, suggesting intriguing connections with complexity theory. There is some relevant work by Hirst [52], who proved that finding the basis of a vector space has the Weihrauch complexity of $\lim$, i.e., is on the second level of the Borel hierarchy.
We remark that there are many processes that have been investigated and fall under the model we have introduced. One such example is algorithmic learning theory, such as EX-learning (Gold [48]). Here one is presented with values $a_0, a_1, \dots$ of a function $f(0), f(1), \dots$, and from some point onwards we need to print out an index $e$ with $\varphi_e = f$. This is clearly an example of an online algorithm, and fits into this section as a limiting algorithm. There are interesting connections between these ideas and reverse mathematics; see, e.g., [18,19,20,55]. For online learning in computer science, see [87].
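The limiting flavour of EX-learning can be illustrated by identification by enumeration, sketched below. After each new value the learner outputs the least index of a hypothesis consistent with all data seen so far; if the target is in the list, the guesses eventually stabilise on a correct index. The finite list `HYPOTHESES` is a hypothetical stand-in for a uniformly computable class of total functions, and `learn_by_enumeration` is an illustrative name, not notation from the text.

```python
from typing import Callable, List, Sequence

# Hypothetical hypothesis class: index e stands for the function HYPOTHESES[e].
HYPOTHESES: List[Callable[[int], int]] = [
    lambda n: 0,
    lambda n: n,
    lambda n: n * n,
    lambda n: 2 ** n,
]


def learn_by_enumeration(
    hypotheses: Sequence[Callable[[int], int]], data: Sequence[int]
) -> int:
    """Least index e whose hypothesis agrees with data[0], ..., data[len(data)-1]."""
    for e, h in enumerate(hypotheses):
        if all(h(n) == value for n, value in enumerate(data)):
            return e
    raise ValueError("no hypothesis is consistent with the data")


def guesses(hypotheses: Sequence[Callable[[int], int]], values: Sequence[int]) -> List[int]:
    """The learner's guess after seeing each initial segment of the values."""
    return [learn_by_enumeration(hypotheses, values[:s]) for s in range(len(values) + 1)]
```

When fed the values of $n \mapsto n^2$, the learner first guesses index 0, then index 1, and settles on index 2 once the value at 2 has been seen.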
Another area which could be incorporated would be asynchronous computing. Here we have a series of agents $A_1, \dots, A_k$ communicating through asynchronous channels, and attempting to compute a set of functions $f_1, \dots, f_k$, where there might be e.g. some kind of crash failure, meaning that one of the agents dies and stops sending signals. For example, the Consensus problem asks for all the $f_i$'s which have not crashed to give the same value. A run could be represented in a space of possible communications and failures. There are a number of reductions which have been produced in this area, showing that Consensus is a certain kind of minimal failure, and other problems can be solved if Consensus can (Chandra and Toueg [26]). It would be interesting to see if these results can be placed in the hierarchy of online limiting reductions, since they appear to look like online limiting reductions.
Finally, one exciting possibility would be to include randomization in this setting. Randomized online algorithms are quite common in practice (see e.g. Albers [4]). For this we could use the theory of algorithmic randomness (see [33,72,79]). For example, an online algorithm with randomized advice (i.e. representing a coin toss at each stage) could be modelled (using $2^\omega$ as a representative space) by considering online algorithms from $2^\omega \times 2^\omega \to S$, with the first copy of $2^\omega$ representing the problem, the second representing "advice" strings, and $S$ the solution space. The online algorithm could take $(\sigma, \tau) \mapsto s_n$, and would run on extensions of $\tau$ provided that $[\tau]$ avoids some algorithmic randomness test, such as a Martin-Löf test. Using oracles we could also tie this to the theory of algorithmic randomness using the "fireworks" method of Shen (see Bienvenu and Patey [11]). A similar approach has been implemented for offline algorithms in [15], and for several interesting results more closely related to our online setting see [16]. In the online situation, these ideas remain to be further explored.

7. Real functions.
So far all objects of study have been discrete and spaces compact. However, there is a perfectly reasonable extension of these ideas to continuous objects such as the space of continuous functions on the unit interval. There has been a lot of work on complexity theory of real functions; see, e.g., Ko [70]. In terms of applications, a natural object of study would be online analysis: analytic processes which run quickly and only use local knowledge of the precision of the inputs. As we observe below with natural representations, addition of reals $x, y$ with precision $2^{-n}$ only needs $x$ and $y$ to within $2^{-(n+2)}$. Integration and other standard processes admit a similar commentary, but we leave this to a later paper. Also there are other online processes on non-compact spaces, such as EX-learning, or the KC theorem discussed earlier. We also defer discussion of such topics for later papers, and here stick to analysis. The main goal of this section is to demonstrate the role of primitive recursion as a useful abstraction. The content of this section is not technically hard, but one can easily imagine a much deeper general framework that could emerge from these basic ideas.
Recall that a Cauchy sequence $(r_i)_{i \in \mathbb{N}}$ of rationals is fast if $|r_i - r_{i+1}| < 2^{-i}$, for every $i$. These are the names which represent the space. A function $f : [0, 1] \to \mathbb{R}$ is computable if there is a Turing functional $\Phi$ such that, for each $x \in [0, 1]$ and for every fast Cauchy sequence $\chi$ converging to $x$, the functional $\Phi$ enumerates a fast Cauchy sequence for $f(x)$ using $\chi$ as an oracle; that is, $(\Phi^\chi(n))_{n \in \mathbb{N}}$ is a fast Cauchy sequence for $f(x)$. In particular, using the terminology, we would be generating a representation of the function via names of Cauchy sequences in such a way that it is representation independent. This in particular means that, on input $(r_i)_{i \in \mathbb{N}}$, the use of $\Phi^{(r_i)_{i \in \mathbb{N}}}(j)$ corresponds to $\delta$ when $\varepsilon = 2^{-j+1}$ in the standard $\varepsilon$-$\delta$ definition of a continuous function.
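The remark that addition only needs its inputs to within $2^{-(n+2)}$ can be checked on fast Cauchy names with exact rational arithmetic. In the sketch below a name is modelled as a function $i \mapsto r_i$; the helper `dyadic_name`, which truncates a rational to $i + 1$ binary places, and the names `add` and `is_fast` are assumptions for illustration. The $n$-th term of the sum is $x_{n+2} + y_{n+2}$, so consecutive terms differ by less than $2 \cdot 2^{-(n+2)} < 2^{-n}$.

```python
import math
from fractions import Fraction
from typing import Callable

Name = Callable[[int], Fraction]  # a name maps i to the rational r_i


def dyadic_name(x: Fraction) -> Name:
    """A fast Cauchy name for x: truncate x to i + 1 binary places."""
    def r(i: int) -> Fraction:
        scale = 2 ** (i + 1)
        return Fraction(math.floor(x * scale), scale)
    return r


def add(x_name: Name, y_name: Name) -> Name:
    """Name for x + y; term n reads only terms n + 2 of the inputs (delay 2)."""
    return lambda n: x_name(n + 2) + y_name(n + 2)


def is_fast(name: Name, upto: int) -> bool:
    """Check |r_i - r_{i+1}| < 2^{-i} for all i < upto."""
    return all(abs(name(i) - name(i + 1)) < Fraction(1, 2 ** i) for i in range(upto))
```

Here the delay is the constant 2, independent of the inputs, which is the sense in which addition is online.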
It is well-known that the Weierstrass approximation theorem is effectivisable in the sense of Turing computability [80]. This means that a function $f : [0, 1] \to \mathbb{R}$ is computable iff there is a computable sequence of polynomials $(p_i)_{i \in \mathbb{N}}$ with rational coefficients with the property $\sup_{x \in [0,1]} |f(x) - p_i(x)| < 2^{-i}$, for every $i$.
We have seen that the most general definition of being online for combinatorial structures involves being $g$-online for some primitive recursive function $g$. That is, there is a translation between using $g(n)$ many bits of $\alpha$ to compute $n$ bits of $f(\alpha)$. We have also seen that for most natural online situations, we can translate this to a wider tree where $\alpha \upharpoonright n$ represents $\alpha \upharpoonright g(n)$, so we can use strict (ibT primitive recursive) procedures. It is not completely clear if this is natural in the setting of analysis, since we might wish to stick to standard representations of the spaces, like $2^\omega$ and $\omega^\omega$, as above.
We first consider the most general setting where we allow $g$-online for a primitive recursive $g$, so using $g(n)$ bits to decide the output for length $n$. We will call this punctually computable. In this case, there are two natural definitions of what it would mean for such an $f$ to be "online" computable in the most general sense of primitive recursion. The first notion is the most straightforward sub-recursive version of the standard definition.

Definition 7.1. A function $f : [0, 1] \to \mathbb{R}$ is punctually computable if there is a primitive recursive functional $\Phi$ such that, for each $x \in [0, 1]$ and for every fast Cauchy sequence $\chi$ converging to $x$, the functional $\Phi$ enumerates a fast Cauchy sequence for $f(x)$ using $\chi$ as an oracle.
By restricting ourselves to dyadic rationals, we can assume that fast Cauchy sequences come from a compact totally disconnected space of the names of dyadic rationals in [0, 1]. Thus, Lemma 3.5 can be applied to ensure that there is no ambiguity in the notion of a primitive recursive functional in this case. In particular, the definition has a natural polynomial-time version which we omit (see [70]); the same applies to any natural complexity class which may be of interest.
The second version filters through the theorem of Weierstrass. It views f as a primitive recursive point in the metric space (C[0, 1], sup) rather than as a functional. Clearly, there is a natural polynomial-time modification of the definition above which we omit.
Every uniformly punctually computable $f$ is punctually computable. Are these two definitions equivalent? It is not completely evident why the Weierstrass approximation theorem should hold primitively recursively. Indeed, in the standard Turing computable proof we would wait for a cover of $[0, 1]$ by $\delta_i$-balls $B_i$ such that $f(B_i)$ has diameter $< \varepsilon$, for every $i$. It seems that even when $f$ is punctual this search could be unbounded.
Nonetheless, the theorem below shows that these definitions are equivalent. This result is not really new. With some effort its proof can be extracted from [70], but the book is mainly focused on polynomial time and exponential versions of the definitions above. There is much combinatorics specific to complexity theory which significantly obscures the idea behind the proof. Primitive recursion strips away complex counting combinatorics, thus clarifying the idea.

Proof sketch. The idea here is similar to that in the proof of Lemma 3.5. Fix $n$ and consider the functional $\Psi^x_n = \Phi^x(n)$ which uniformly primitively recursively outputs the first few bits of $f(x)$ up to error $2^{-n}$, for any input $x$. Since $\Psi_n$ is given by a primitive recursive scheme (with parameter $n$), we can work by induction on the complexity of the scheme and emulate all its possible computations at once, as in Lemma 3.5. Since the space of dyadic presentations of rationals is primitively recursively compact, this will lead to a primitively recursively branching tree of possible computations whose height is determined by the syntactical complexity of the primitive recursive scheme. By the choice of $\Psi_n$, one of these computations must work for an arbitrary $x \in [0, 1]$. Thus, we have primitively recursively calculated an open cover of $[0, 1]$ by basic open intervals $J_1, \dots, J_k$, such that whenever $x, y \in J_i$ we have $|f(x) - f(y)| < 2^{-n+1}$. If $z_i$ is the center of $J_i$, then define (the graph of) a piecewise linear function $h_n$ by connecting the points $(z_i, \Psi^{z_i}_n)$ and $(z_{i+1}, \Psi^{z_{i+1}}_n)$, $i = 1, \dots, k - 1$. Note that the values $\Psi^{z_i}_n$ have already been calculated. Since the intervals are overlapping, this piecewise linear function $h_n$ approximates $f$ with precision $2^{-n+2}$. We can primitively recursively smoothen $h_n$ by replacing it with a polynomial $p_n$ such that $\sup_{x \in [0,1]} |p_n(x) - f(x)| < 2^{-n+3}$.
See Chapter 8 of [70] for a detailed analysis of the polynomial-time versions of the Weierstrass approximation theorem. Recall that in the proof sketch above we generated the tree of possible computations. For a polynomial-time operator this tree may be exponentially large at worst. This difficulty cannot be circumvented, and the polynomial-time analogue of the theorem above fails, as explained in great detail in [70].
We see that punctual analysis fits somewhere in-between computable analysis and polynomial-time analysis, and there is likely much depth in the subject. Such a theory could provide us with a stronger technical link between computable and feasible analysis. Some basic foundations of elementary primitive recursive analysis were established in the 1950s and the 1960s; we cite [89,90,27] and the book [49]. Nonetheless, it seems there has been no recent dedicated study of primitive recursive continuous functions. Primitive recursive presentations of analytic separable spaces (such as, say, the Urysohn space) have not been systematically studied either.

Now in the case that we want to look at the strictly online model, we are stuck with using, for instance, the bit representation of a real $x$, and would be working, for example, with $2^\omega$. Then to compute $f(x)$ with precision $2^{-n}$ we would need $x \upharpoonright n$. We might ask for delay $k$, and so might use precision $2^{-(n+k)}$. In this case, we see that, for example, addition is online (on $2^\omega \times 2^\omega$) with delay 2, and if $f$ is a given online computable function which is bounded then $\int_0^x f(x)\,dx$ would also be online computable with delay 2. We remark that this model would seem to be one emulating classical numerical analysis. We cite [96] and Chapter 7 of [97] for some closely related results in computable analysis, and see [86,91,77] for results on online arithmetic in computer science.