When Can We Answer Queries Using Result-Bounded Data Interfaces?

We consider answering queries on data available through access methods that provide lookup access to the tuples matching a given binding. Such interfaces are common on the Web; further, they often have bounds on how many results they can return, e.g., because of pagination or rate limits. We thus study result-bounded methods, which may return only a limited number of tuples. We study how to decide if a query is answerable using result-bounded methods, i.e., how to compute a plan that returns all answers to the query using the methods, assuming that the underlying data satisfies some integrity constraints. We first show how to reduce answerability to a query containment problem with constraints. Second, we show "schema simplification" theorems describing when and how result-bounded services can be used. Finally, we use these theorems to give decidability and complexity results about answerability for common constraint classes.


Introduction
This paper studies how to query web services which expose programmatic interfaces to data. We can model a web service as a function call (denoted here as an access method) which given a set of arguments for some attributes of a relation returns all the matching tuples for the relation.
Example 1.1. Our running example is a web datasource that exposes university employee information. Information on professors is available via a Profinfo access method, requiring as input a professor's id, and returning the last name and salary of the professor. The datasource also exposes a method Udirectory that requires no input, and returns the id, address, and phone numbers of all university employees.
Our goal is to answer queries using the services provided by the datasource. In the setting of Example 1.1, the user queries are posed on the relations Profinfo and Udirectory corresponding directly to the data returned by the two services. To answer these queries, we will rely on the semantics of the data: in this example, we rely on the obvious referential constraint from Profinfo to Udirectory on the attribute id, namely, every id in Profinfo is in Udirectory.
Consider the query Q 1 asking for the names of professors earning salary at least 10000. We can implement this via a plan that first accesses Udirectory to get the set of all ids, and then accesses Profinfo with each id to obtain the salary, filtering the results to return only the names with salary at least 10000. This plan reformulates the query over the interfaces: it uses the interface and it is equivalent to the given query for all instances satisfying the referential constraint.
Prior work has already studied how to handle both integrity constraints and interface restrictions specified via access methods (e.g. [18,8]). But web services often additionally impose result bounds on what they return. For example, the Udirectory method in Example 1.1 may be bounded, returning at most 100 entries. If this is the case, then the plan proposed above is not complete: the access to Udirectory may be missing some result tuples, so our query result would be incomplete. Hence, it is challenging to reformulate queries against result-bounded services, because their nondeterminism makes it difficult to understand whether the plan will always return the correct answer.
An obvious question is whether result-bounded services can ever be of use to answer any queries in a complete way, i.e., without missing any results. However, it is not difficult to find cases where result-bounded methods can still be useful:
Example 1.2. Consider the schema from Example 1.1 and query Q 2 asking if there is some university employee. We can clearly answer Q 2 with a plan that calls the Udirectory access method and returns true if the output is non-empty. It is not a problem that the method may omit some result tuples, because we only want to know if the result is nonempty.
This gives a first intuition: result-bounded methods are useful to check for the existence of matching tuples.
Result-bounded methods can also be useful to answer queries when there are constraints involving keys, or functional dependencies:
Example 1.3. Consider a variant of the schema in Example 1.1, where we have a Udirectory 2 service taking an id as input and returning the addresses and phone numbers in tuples with this id. Also assume that each employee id has exactly one address, but possibly many phone numbers. Last, assume that Udirectory 2 is result-bounded and returns at most one answer whenever it is given an id.
We may still use the Udirectory 2 service to answer queries. Consider the query Q 3 asking for the address of the employee with id 12345: we can answer Q 3 by first calling Udirectory 2 with 12345 and projecting onto the address field. Due to the functional dependency of address on id, we know this will contain the employee's address, even though only one of the phone numbers will be returned.
This gives a second intuition: result-bounded methods are useful when there is a functional dependency that guarantees that a projection of the output will have complete results.
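The two intuitions can be checked on toy data. The sketch below is only illustrative: the tuples and the helper name `valid_outputs` are hypothetical, and a result bound of k is modeled by enumerating every set of min(k, n) matching tuples, where n is the number of matches.

```python
from itertools import combinations

def valid_outputs(matching, k):
    """Every output a method with result bound k may return: all subsets
    of the matching tuples of size min(k, number of matches)."""
    matching = sorted(matching)
    return [set(c) for c in combinations(matching, min(k, len(matching)))]

# Hypothetical Udirectory2 tuples (id, address, phone) for employee 12345.
matching = {
    (12345, "1 Main St", "555-0100"),
    (12345, "1 Main St", "555-0101"),
}

outputs = valid_outputs(matching, k=1)

# First intuition (existence check): every valid output is non-empty.
existence_answers = {len(out) > 0 for out in outputs}

# Second intuition: the FD id -> address makes the projection on address
# identical for every valid output.
address_answers = {frozenset(addr for _, addr, _ in out) for out in outputs}

# The projection on phone is not determined: valid outputs disagree.
phone_answers = {frozenset(phone for _, _, phone in out) for out in outputs}
```

Here `existence_answers` and `address_answers` each contain a single value, whereas `phone_answers` contains two, which is why only the first two results can be relied upon.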
In this paper, we formally study the situations in which result-bounded methods can be useful, and prove decidability results for answering queries in the presence of such methods. Formally, we study the answerability problem: given a schema with result-bounded access methods and with integrity constraints, and given a query, we wish to determine if we can answer the query using the methods. The latter means that we can execute a query plan on the methods which will correctly return all results, for every possible underlying database and for any way to resolve the nondeterminism of result-bounded accesses. We show decidability and complexity results for this problem for many constraint classes (see Table 1 for a summary).
The first step of our study is to reduce the answerability problem to a problem of query containment with constraints. Such a reduction is well-known in the context of reformulation of queries over views [26], and in answering queries with access methods without bounds [7], but had not been studied yet in the context of nondeterministic access methods such as result-bounded ones. We show that this reduction technique can still be applied in the presence of result bounds. However, the resulting query containment problem involves complex cardinality constraints, so it does not immediately lead to decidability results.
Our second step is thus to show schema simplification results: we establish that some of the result bounds can be ignored for the answerability problem. These results capture and generalize the ways in which result-bounded services were useful in the examples above. For instance, we show that when the constraints are given by inclusion dependencies (i.e., referential constraints), then result-bounded methods are only useful as an existence check, as in Example 1.2. We also show that when the constraints are functional dependencies, result-bounded methods are only useful to access the functionally determined part of the method output, as in Example 1.3. Our results also capture more involved ways to use result-bounded services for other constraint languages. For instance, under guarded dependencies, we show that the values of the bounds make no difference for answerability of queries: we can equivalently assume that result-bounded methods return at most one tuple.
These schema simplification results yield decidability of the answerability problem for many constraint classes, but they do not always give tight complexity bounds. Hence, the third and final step in our analysis is to study more closely the simplified containment problems that arise from answerability for some common constraint languages. For this we bootstrap the result of Johnson and Klug [22], which shows that query containment with inclusion dependencies is in PSPACE in general and in NP when the "width" (number of exported variables) is bounded. Following the approach introduced in [21], we identify a class of dependencies that can be linearized, in the sense that containment with these constraints reduces to containment with linear TGDs, a mild generalization of inclusion dependencies. We show that the constraints emerging from some of our answerability problems fall into this linearizable class. In the process, we provide new complexity bounds on query containment with TGDs, and new bounds on the complexity of answerability with access methods even in the absence of result bounds.
We summarize our main contributions as follows:
• We introduce the notion of result-bounded methods, a natural tool to reason about limited accesses on Web data, and the corresponding problem of determining whether a query is answerable via such methods, in the presence of constraints.
• We give a reduction (Theorem 3.2) of the query answerability problem to query containment with constraints, a widely studied problem in knowledge representation and database theory.
• We show (Theorems 4.2, 4.5, 4.6, 4.7) that for common classes of constraints, we can significantly limit the possible ways in which result-bounded methods can be used to answer queries. This gives insight into how result-bounded methods and constraints interact, and is also important for our study of the corresponding containment problem. Our results rely on the technique of blowing up models, i.e., we "pump" models to increase the number of outputs of an access, without violating constraints or changing query answers.
• We analyze the containment problems associated to answerability with access methods and bounded-width IDs, using the limitative results above, results of Johnson and Klug [22], and a refinement of the linearization technique of [21] that reduces query containment with guarded TGDs to query containment with linear TGDs. This gives finer bounds on the complexity of answerability for a number of constraint classes (Theorems 6.6 and 6.11), both with and without result bounds.
Related work. Our paper relates to a line of work about finding plans to answer queries using access methods. The initial line considered finding equivalent "executable rewritings", i.e., queries where the atoms are ordered in a way compatible with the access patterns. This was studied first without integrity constraints [25,24], and then with disjunctive TGD constraints [18]. Later, [8,7] formulated the problem of finding a plan that answers the query over the access patterns, distinguishing two notions of plans with access methods: one with arbitrary relational operators in middleware and another without the difference operator. These works [8,7] study the problem of getting plans of both types in the presence of integrity constraints, extending the method of [18], which reduces the problem of whether a query can be rewritten to an executable query to query containment with constraints. Further, [8,7] relate the reduction to a semantic notion of determinacy, originating from the work of Nash, Segoufin, and Vianu [26] in the context of views. Section 3 will further extend the reduction to containment and its connection with determinacy notions to the presence of result bounds, relying heavily on the techniques of [18,26,8,7]. The complexity of the answerability problem has also been considered in the setting with cardinality constraints [20]. Surprisingly, we will show (Section 4) that it is not necessary to reason with cardinality constraints to decide answerability with result bounds. None of these prior works has considered the impact of result bounds on services. Unlike cardinality constraints, result bounds do not restrict the source instances, but rather make the access API non-deterministic. Non-determinism in query languages has been studied in other contexts [3,2]. However, to the best of our knowledge, the ability of a non-deterministic data API to implement deterministic queries has not been studied in the past.
The techniques we use to get finer complexity bounds for the answerability problem (Section 6) relate to the Datalog ± research program of getting bounds for query answering with restricted classes of constraints [10,23,13]. As in [10], we deal with guarded rules. Refining a technique from [21], we isolate classes that can be reduced to well-behaved classes of Linear TGDs, where more specialized bounds [22] can be applied.
Paper structure. Section 2 presents our model of accesses and plans over services, and gives the definition of a query being answerable by a plan. Section 3 describes the reduction of answerability to query containment which we use in our later results. Section 4 presents our schema simplification results, while Section 5 applies them to show decidability of the answerability problem. Section 6 introduces our linearization technique, and applies it to provide tighter complexity bounds on answerability. The paper closes in Section 7 with conclusions and discussion of future work. The complete versions of most proofs are deferred to the appendix.

Preliminaries
We consider queries over data that satisfies integrity constraints. The data is over a relational signature σ, consisting of a set of relations with an associated arity, which is a positive integer. Any number between 1 and the arity of R is a position of R. An instance of a relation of arity n is a set of n-tuples, while an instance I of σ consists of instances for each relation in it. We equivalently see instances as sets of facts: a fact R(a 1 . . . a n ) of a relation R indicates that (a 1 . . . a n ) is in the instance of R. The active domain of an instance I, denoted Adom(I), is the union of all values occurring in facts of I. An expression R(x 1 . . . x n ), with R a relation of arity n and x 1 . . . x n variables or constants, is a relational atom.
A conjunctive query (CQ) is an expression of the form ∃x 1 . . . x k (A 1 ∧ · · · ∧ A m ), where the A i are relational atoms. We focus for simplicity on Boolean CQs, i.e., CQs with no free variables. Such a query Q is true in an instance I exactly when there is a homomorphism of Q to I: a mapping h from variables of Q to Adom(I) such that for every atom R(x 1 . . . x n ) in Q, the atom R(h(x 1 ) . . . h(x n )) is a fact of I. We consider various kinds of integrity constraints to express restrictions on instances. All of these are subsets of first-order logic (FO), with the active-domain semantics. We drop outermost universal quantifications for brevity in writing these constraints. We consider only queries and constraints that do not contain constants. The main integrity constraints that we will consider are dependencies, focusing on tuple-generating dependencies (TGDs), and functional dependencies (FDs).
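To make the homomorphism semantics of Boolean CQs concrete, here is a minimal brute-force sketch. The encoding is an assumption made for illustration: a fact or atom is a pair (relation name, tuple of arguments), and the function name `holds` is hypothetical. It simply tries every mapping from the query variables to the active domain.

```python
from itertools import product

def holds(query_atoms, instance):
    """Return True iff the Boolean CQ given by `query_atoms` has a
    homomorphism into `instance`.

    query_atoms: list of (relation, tuple_of_variables)
    instance:    set of (relation, tuple_of_values)
    """
    variables = sorted({v for _, args in query_atoms for v in args})
    adom = list({a for _, args in instance for a in args})
    # Enumerate every candidate mapping h from variables to Adom(I).
    for values in product(adom, repeat=len(variables)):
        h = dict(zip(variables, values))
        if all((rel, tuple(h[v] for v in args)) in instance
               for rel, args in query_atoms):
            return True
    return False

# Hypothetical instance with facts R(a, b) and S(b, c).
instance = {("R", ("a", "b")), ("S", ("b", "c"))}
path_query = [("R", ("x", "y")), ("S", ("y", "z"))]   # exists x y z: R(x,y), S(y,z)
loop_query = [("R", ("x", "x"))]                      # exists x: R(x,x)
```

The enumeration is exponential in the number of variables, so this is only meant to spell out the definition, not to be an efficient evaluator.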
A tuple-generating dependency (TGD) is an FO sentence of the form: φ( x) → ∃ y ψ( x, y) where φ and ψ are conjunctions of relational atoms: φ is the body of the TGD while ψ is the head. The exported variables are the variables of x which occur in the head. A full TGD is one with no existential quantifiers in the head. A guarded TGD (GTGD) is a TGD where φ is of the form A( x) ∧ φ ′ ( x) where A is a relational atom containing all free variables of φ ′ . A linear TGD is a GTGD where φ is a single atom. An inclusion dependency (ID) is a linear TGD where both φ and ψ consist of a single atom with no repeated variables: it intuitively expresses that tuples in one relation refer to tuples in another relation. The width of an inclusion dependency is the number of exported variables, and an ID is unary (written UID) if it has width 1: that is, there is exactly one exported variable. For example, R(x, y) → ∃z w S(z, y, w) is a UID.
A functional dependency (FD) is an FO sentence of the form: (R(x 1 . . . x n ) ∧ R(x ′ 1 . . . x ′ n ) ∧ ⋀ i∈D x i = x ′ i ) → x j = x ′ j , where D ⊆ {1, . . . , n} is called the determinant, j ∈ {1, . . . , n} is called the determined position, and the x 1 . . . x n and x ′ 1 . . . x ′ n are all distinct variables. Intuitively, such an FD asserts that whenever two R-facts match on the positions of D, then they must match on position j as well: the FD is written D → j for brevity.
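As an illustration of the FD semantics, the following sketch checks whether a set of tuples of one relation satisfies D → j. The function name `satisfies_fd` and the sample tuples are hypothetical; positions are 1-based, as in the text.

```python
def satisfies_fd(tuples, determinant, j):
    """Check the FD `determinant -> j` on the tuples of one relation.

    determinant: set of 1-based positions (the set D)
    j:           1-based determined position
    """
    seen = {}
    for t in tuples:
        key = tuple(t[i - 1] for i in sorted(determinant))
        # Two tuples agreeing on D must agree on position j.
        if seen.setdefault(key, t[j - 1]) != t[j - 1]:
            return False
    return True

# Hypothetical Udirectory2 tuples (id, address, phone), as in Example 1.3:
# one address per id, but several phone numbers.
udirectory2 = {
    (12345, "1 Main St", "555-0100"),
    (12345, "1 Main St", "555-0101"),
}
address_fd_holds = satisfies_fd(udirectory2, {1}, 2)   # id -> address
phone_fd_holds = satisfies_fd(udirectory2, {1}, 3)     # id -> phone
```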
Chase proofs. We will show that deciding whether a plan exists can be reduced to deciding query containment under constraints, i.e., checking whether a CQ Q ′ follows from another CQ Q and some constraints Σ. For Boolean CQs, this means that any instance that satisfies Q and Σ also satisfies Q ′ . We denote this as Q ⊆ Σ Q ′ .
In the case where Σ consists of dependencies, query containment under constraints can be resolved by searching for a chase proof [19]. Such a proof starts with an instance called the canonical database of Q and denoted CanonDB(Q): it consists of facts for each atom of Q, and its elements are variables of Q. The proof then proceeds by firing dependencies, as we explain next.
A homomorphism τ from the body of a dependency δ into an instance I is called a trigger for a dependency δ. We say that τ is an active trigger if τ cannot be extended to a homomorphism from the head of δ to I. In other words, an active trigger τ witnesses the fact that δ does not hold in I. We can solve this by firing the dependency δ on the trigger τ , which we also call performing a chase step, in the following way. If δ is a TGD, the result of the chase step on τ for δ in I is the superinstance I ′ of I obtained by adding new facts corresponding to an extension of τ to the head of δ, using fresh elements to instantiate the existentially quantified variables of the head: we call these elements nulls. If δ is an FD with x i = x j in the head, then a chase step yields I ′ which is the result of identifying τ (x i ) and τ (x j ) in I. A chase sequence is a sequence of chase steps, and it is a chase proof of Q ⊆ Σ Q ′ if it produces an instance where Q ′ holds.
It can be shown [19] that whenever Q ⊆ Σ Q ′ there is a chase proof that witnesses this. If all chase sequences are finite we say that the chase with Σ on Q terminates. In this case, the chase gives us a decision procedure for the problem of query containment under constraints.
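The chase procedure described above can be sketched for TGDs as follows. This is a simplified illustration under stated assumptions, not the algorithm of [19]: it handles only TGDs (no FDs), fires one active trigger per step, and gives up after a hypothetical step budget `max_steps` since the chase need not terminate. Atoms and facts are encoded as (relation name, tuple of arguments), and all function names are invented for the example.

```python
from itertools import count

def homomorphisms(atoms, facts, partial=None):
    """Yield every extension of `partial` that maps all `atoms` into `facts`."""
    partial = dict(partial or {})
    if not atoms:
        yield partial
        return
    (rel, args), rest = atoms[0], atoms[1:]
    for fact_rel, fact_args in facts:
        if fact_rel != rel or len(fact_args) != len(args):
            continue
        h = dict(partial)
        if all(h.setdefault(v, a) == a for v, a in zip(args, fact_args)):
            yield from homomorphisms(rest, facts, h)

def chase_proves(q, q_prime, tgds, max_steps=100):
    """True if a chase proof of Q contained in Q' is found, False if the chase
    terminates without one, None if the step budget runs out."""
    facts = set(q)          # canonical database: the variables of Q are the elements
    fresh = count()
    for _ in range(max_steps):
        snapshot = list(facts)
        if next(homomorphisms(q_prime, snapshot), None) is not None:
            return True
        step = None
        for body, head in tgds:
            for trigger in homomorphisms(body, snapshot):
                # Active trigger: it cannot be extended to the head.
                if next(homomorphisms(head, snapshot, trigger), None) is None:
                    step = (head, trigger)
                    break
            if step:
                break
        if step is None:
            return False
        head, h = step[0], dict(step[1])
        for _, args in head:
            for v in args:
                if v not in h:
                    h[v] = f"_null{next(fresh)}"   # fresh null for an existential variable
        facts |= {(rel, tuple(h[v] for v in args)) for rel, args in head}
    return None

# The referential constraint of Example 1.1, written as an ID:
# Profinfo(i, n, s) -> exists a p: Udirectory(i, a, p)
ref = ([("Profinfo", ("i", "n", "s"))], [("Udirectory", ("i", "a", "p"))])
some_prof = [("Profinfo", ("x", "y", "z"))]
some_employee = [("Udirectory", ("u", "v", "w"))]
```

With these inputs, `chase_proves(some_prof, some_employee, [ref])` finds a proof after one chase step, while the converse containment fails because no dependency is applicable.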
There are well-known reductions between query containment with TGDs and the problem of certain answers under TGDs [19,10]. We will not need the definition of certain answers in this paper, but we will implicitly use this equivalence to transfer upper and lower bounds that are stated in the literature in terms of certain answers (e.g. from [10,4]) to the query containment problem.
Query and access model. A service schema (or schema) Sch consists of (i) a relational signature; (ii) integrity constraints (given as FO sentences); and (iii) a collection of access methods (or simply methods). Each access method mt is associated with a relation R and a subset of positions of R called the input positions of mt. The other positions are called output positions of mt.
We optionally associate to each access method a result bound, a positive integer. Any method with such an annotation is a result-bounded method. We will also consider a weaker kind of annotation on a method, a result lower bound, also given by an integer. Informally, a result lower bound of k indicates a method that always returns at least k matching tuples when such exist, and a result bound of k further indicates that the method returns at most k matching tuples.
An access on an instance I consists of a method mt on some relation R and a binding AccBind for I, i.e., a mapping of the input positions of mt to values in I. The matching tuples of the access are the tuples for relation R in I that match AccBind on the input positions of R, and an output of the access is a subset P of the matching tuples. If there is no result bound or result lower bound on the method, then there is a unique output, which must by definition contain all matching tuples. If there is a result bound k on the method, then a valid output is any set J of matching tuples such that:
• J has size at most k
• for any j ≤ k, if I has ≥ j matching tuples, then J has size ≥ j
If there is a result lower bound of k on the method, then only the second item above is imposed on outputs.
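The definition of valid outputs can be restated as a small checker. This is a sketch with hypothetical names (`is_valid_output`, `lower_bound_only`); it relies on the observation that the second condition is equivalent to requiring at least min(k, number of matching tuples) results.

```python
def is_valid_output(matching, output, bound=None, lower_bound_only=False):
    """Check whether `output` is a valid output for an access whose
    matching tuples are `matching`.

    bound=None                 : no bound, the output must be all matching tuples
    bound=k                    : result bound k (upper and lower conditions)
    bound=k, lower_bound_only  : result lower bound k (second condition only)
    """
    matching, output = set(matching), set(output)
    if not output <= matching:
        return False
    if bound is None:
        return output == matching
    if not lower_bound_only and len(output) > bound:
        return False
    # "for any j <= k, if there are >= j matches then |J| >= j"
    return len(output) >= min(bound, len(matching))
```

For instance, with three matching tuples and a result bound of 2, any two of them form a valid output, while returning one tuple or all three does not; with a result lower bound of 2, returning all three is also valid.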
We give specific names to two kinds of methods. First, a method is free if it has no input positions. Second, a method is Boolean when all positions are input positions. Note that accessing a Boolean method with a binding AccBind just checks if AccBind is in the relation associated to the method (and bounds have no effect).
Plans. Prior works on access methods have studied several notions of "execution plan compatible with the methods". We review here RA-plans, following the syntax of [8,9,7]. An RA-plan is a program given as a sequence of access commands and middleware query commands:
• Query middleware commands are of the form T := E, with E a relational algebra expression.
• Access commands are written T ⇐ OutMap mt ⇐ InMap E, and are specified by a target table T , a method mt on some relation R, a relational algebra expression E, an input mapping InMap mapping output attributes of E to input positions of mt, and an output mapping OutMap mapping positions of R to attributes of T . We often drop the mappings for brevity, writing an access command as T ⇐ mt ⇐ E.
The output table T of the plan is indicated by writing Return T at the end, or Return E with an expression E, which stands for T := E and Return T . For example, the plan mentioned in Example 1.2 would be written: T 1 ⇐ Udirectory ⇐ ∅; Return π ∅ T 1 ;
RA-plans are so named because they allow arbitrary RA expressions in middleware and access commands. A monotone plan is a plan where the expressions E do not use the relational difference operator (but may use inequality joins); equivalently they make use of safe first-order logic expressions that do not use negation.
The meaning of a plan without result-bounded access methods is straightforward. To define its meaning with result bounds, we call an access selection a function mapping each access on I to a set of facts forming a valid output for the access, as defined above. Having chosen an access selection ς, we can associate to each instance I, each plan PL, and each table T set in PL an output by evaluating its commands in order. For an access command T ⇐ OutMap mt ⇐ InMap E, the output for T is obtained by evaluating E to get a collection of tuples and then performing an access with mt using each tuple, putting the union of the corresponding outputs selected by ς into T . For a middleware query command, we evaluate it as usual for relational algebra. The set of possible outputs of a plan with output table T is: given I we take the union, over all access selections ς, of the output on T for I and ς.
Our semantics for plans with result bounds assumes that performing the same access twice must return the same results (as they are given by the access selection); in other words, the access selection is fixed once and for all, and it cannot change throughout the execution of the plan. However, all our results still hold without this assumption: we present the alternative semantics in Appendix A.1 and prove that it yields the same notion of answerability.
Answerability. Let Sch be a schema consisting of access restrictions and integrity constraints, and let Q be a query over the vocabulary of Sch. A plan PL answers Q under schema Sch if the following holds: for all instances I (finite or infinite) satisfying the constraints, for every choice of access selection ς, the output of PL evaluated under ς on I is equal to the query output Q(I). In other words, evaluating the plan on any instance returns the correct output for Q, no matter which access selection is used to return tuples. Of course, a plan can have a single possible output (and answer a query) even if some intermediate subplan has multiple possible outputs.
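The following sketch illustrates the definition on the plans from the introduction. It is only a sanity check on one hypothetical instance, whereas answerability quantifies over all instances satisfying the constraints; the data, the bound of 1 on Udirectory, and all function names are assumptions for the example. Since each plan performs a single result-bounded access, enumerating access selections amounts to enumerating the valid outputs of that access.

```python
from itertools import combinations

# Hypothetical instance: Udirectory(id, address, phone), Profinfo(id, name, salary).
UDIRECTORY = {(1, "addr1", "p1"), (2, "addr2", "p2")}
PROFINFO = {(1, "Smith", 20000), (2, "Jones", 15000)}

def valid_outputs(matching, bound):
    """All valid outputs of one access (bound=None means no result bound)."""
    matching = sorted(matching)
    if bound is None:
        return [set(matching)]
    return [set(c) for c in combinations(matching, min(bound, len(matching)))]

def q1_plan(udir_output):
    """Names of professors with salary >= 10000, via ids from Udirectory."""
    ids = {t[0] for t in udir_output}
    profs = {t for t in PROFINFO if t[0] in ids}   # Profinfo has no result bound
    return {name for _, name, salary in profs if salary >= 10000}

def q2_plan(udir_output):
    """Is there some university employee?"""
    return len(udir_output) > 0

def answers_on_instance(plan, bound, expected):
    """True iff the plan output equals `expected` for every access selection."""
    return all(plan(out) == expected for out in valid_outputs(UDIRECTORY, bound))

q1_true = {"Smith", "Jones"}
q1_ok_unbounded = answers_on_instance(q1_plan, None, q1_true)
q1_ok_bounded = answers_on_instance(q1_plan, 1, q1_true)
q2_ok_bounded = answers_on_instance(q2_plan, 1, True)
```

On this instance the plan for Q 1 is correct without a result bound but not with a bound of 1, whereas the existence check for Q 2 is unaffected by the bound.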
We say that Q is monotone answerable when there is a monotone plan that answers it. In the body of the paper we only study monotone answerability, but most results extend to answerability with RA-plans: see Appendix H.

Reducing to Query Containment
We show how to reduce the monotone answerability problem to the problem of query containment under constraints, extending the approach of [18,8,7] to result bounds, and extending the connection between answerability and determinacy notions of [26,7]. The query containment problem corresponding to monotone answerability will capture the idea that if an instance I 1 satisfies Q and another instance I 2 has more accessible data than I 1 , then I 2 should satisfy Q as well. We first explain the notion of accessible data below. We then use accessible data to define a property, called access monotonic-determinacy. We show that this property is equivalent to the existence of a plan, and that it can be expressed as a query containment problem.
Access monotonic-determinacy. Given a schema with result-bounded methods and an instance I, an accessible part of I is any subinstance obtained by applying access methods until we reach a fixpoint. Formally, we define an accessible part by choosing an access selection ς and inductively defining sets of facts AccPart i (ς, I) and sets of values accessible i (ς, I) by:
AccPart 0 (ς, I) := ∅ and accessible 0 (ς, I) := ∅
AccPart i+1 (ς, I) := ⋃ ς(mt, AccBind), where the union ranges over all methods mt and all bindings AccBind with values in accessible i (ς, I)
accessible i+1 (ς, I) := Adom(AccPart i+1 (ς, I))
Above we abuse notation by considering ς(mt, AccBind) as a set of facts, rather than a set of tuples. These equations define by mutual induction the set of values that we can retrieve by iterating accesses (accessible) and the set of facts that we can retrieve using those values (AccPart).
We let AccPart(ς, I) := ⋃ i AccPart i (ς, I). By monotonicity of the process, a fixpoint occurs at a finite iteration if I is finite or at the union of all finite iterations if I is infinite. As result-bounded methods are non-deterministic, there can be many accessible parts, depending on ς. In particular, after one iteration, the possible values of AccPart 1 are the possible results that can be obtained by performing free accesses.
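For a finite instance, the inductive definition of the accessible part can be computed as a fixpoint. The sketch below is an illustration under assumptions: methods are encoded as (method name, relation, 0-based input positions), the access selection is passed as a function `select`, and the method names, bounds, and data are hypothetical.

```python
from itertools import product

def accessible_part(instance, methods, select):
    """Compute AccPart for one access selection on a finite instance.

    instance: dict mapping relation name -> set of tuples
    methods:  list of (method_name, relation, input_positions)
    select:   function (method_name, binding, matching) -> valid output
    """
    acc_part, accessible = set(), set()
    while True:
        new_part = set()
        for name, rel, inputs in methods:
            # Bindings use only values that are already accessible.
            for binding in product(list(accessible), repeat=len(inputs)):
                matching = {t for t in instance[rel]
                            if all(t[p] == b for p, b in zip(inputs, binding))}
                new_part |= {(rel, t) for t in select(name, binding, matching)}
        if new_part == acc_part:
            return acc_part
        acc_part = new_part
        accessible = {v for _, t in acc_part for v in t}

# Hypothetical schema: free Udirectory access with result bound 1,
# Profinfo access by id without result bound.
INSTANCE = {
    "Udirectory": {(1, "addr1", "p1"), (2, "addr2", "p2")},
    "Profinfo": {(1, "Smith", 20000), (2, "Jones", 15000)},
}
METHODS = [("mt_Udirectory", "Udirectory", ()), ("mt_Profinfo", "Profinfo", (0,))]
BOUNDS = {"mt_Udirectory": 1, "mt_Profinfo": None}

def smallest_first(name, binding, matching):
    """One particular access selection: keep the k smallest matching tuples."""
    k = BOUNDS[name]
    chosen = sorted(matching)
    return set(chosen if k is None else chosen[:k])

part = accessible_part(INSTANCE, METHODS, smallest_first)
```

With this selection, only employee 1 is ever returned by the bounded Udirectory access, so the accessible part contains the Udirectory and Profinfo facts for id 1 and nothing about id 2. A different selection would yield a different accessible part.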
We now define what it means for an instance I 2 to have "a larger accessible part" than instance I 1 . Informally, this means that there is a valid way to choose results in I 1 that can be extended to a valid choice in I 2 . Formally, for a schema Sch, a subinstance I Accessed of I 1 is access-valid for I 1 (on Sch) if, for any access performed with a method of Sch with values of I Accessed , there is a set of matching tuples in I Accessed which is a valid result for the access in I 1 . Note that an accessible part is always access-valid, but the converse is not necessarily true as an access-valid subinstance may contain tuples that are not actually accessible.
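The notion of access-validity can be checked directly on finite instances. In this sketch (hypothetical names and data), each method is encoded as (relation, 0-based input positions, result bound or None). It uses the fact that a suitable set of matching tuples exists in I Accessed exactly when I Accessed contains all matching tuples (no bound), or at least min(k, number of matches in I 1 ) of them (bound k).

```python
from itertools import product

def is_access_valid(i_accessed, i1, methods):
    """Check that the subinstance `i_accessed` is access-valid for `i1`.

    i_accessed, i1: dicts mapping relation name -> set of tuples
    methods:        list of (relation, input_positions, bound_or_None)
    """
    # Assumption of the definition: i_accessed is a subinstance of i1.
    if any(not ts <= i1.get(rel, set()) for rel, ts in i_accessed.items()):
        return False
    values = {v for ts in i_accessed.values() for t in ts for v in t}
    for rel, inputs, bound in methods:
        for binding in product(list(values), repeat=len(inputs)):
            def matches(t):
                return all(t[p] == b for p, b in zip(inputs, binding))
            m1 = {t for t in i1[rel] if matches(t)}
            m_acc = {t for t in i_accessed.get(rel, set()) if matches(t)}
            if bound is None:
                if m_acc != m1:
                    return False
            elif len(m_acc) < min(bound, len(m1)):
                return False
    return True

I1 = {
    "Udirectory": {(1, "addr1", "p1"), (2, "addr2", "p2")},
    "Profinfo": {(1, "Smith", 20000), (2, "Jones", 15000)},
}
METHODS = [("Udirectory", (), 1), ("Profinfo", (0,), None)]

good = {"Udirectory": {(1, "addr1", "p1")}, "Profinfo": {(1, "Smith", 20000)}}
missing_prof = {"Udirectory": {(1, "addr1", "p1")}, "Profinfo": set()}
empty = {"Udirectory": set(), "Profinfo": set()}
```

Here `good` is access-valid, `missing_prof` is not (the unbounded Profinfo access on id 1 must return its matching tuple), and `empty` is not either (the free Udirectory access must return at least one tuple).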
It turns out that, if I 1 and I 2 have a common subinstance I Accessed which is access-valid for I 1 , then I 1 has an accessible part contained in an accessible part of I 2 , and conversely:
Proposition 3.1. The following are equivalent: (i) I 1 and I 2 have a common subinstance I Accessed that is access-valid for I 1 . (ii) There are A 1 ⊆ A 2 such that A 1 is an accessible part for I 1 and A 2 is an accessible part for I 2 .
Given a schema Sch with constraints Σ and result-bounded methods, we say that a query Q is access monotonically-determined (AMonDet, for short) if for any two instances I 1 , I 2 satisfying Σ, if I 1 and I 2 have a common subinstance I Accessed that is access-valid for I 1 , then Q(I 1 ) ⊆ Q(I 2 ).
The following result justifies the definition:
Theorem 3.2. A CQ Q is monotone answerable with respect to a schema Sch if and only if Q is AMonDet over Sch.
Without result bounds, this equivalence of monotone answerability and access monotonic-determinacy is proven in [8,7], using a variant of Craig's interpolation theorem. Theorem 3.2 shows that the equivalence extends to schemas with result bounds (see Appendix B.1).
Elimination of result upper bounds. The characterization of monotone answerability in terms of AMonDet allows us to prove a key simplification in the analysis of monotone answerability. Recall that a result bound declares both an upper bound on the number of returned results, and a lower bound on them (for all j ≤ k, if there are j matches, then j must be returned). We can show that the upper bound makes no difference for monotone answerability. Formally, for a schema Sch with integrity constraints and access methods, some of which may be result-bounded, let Relax(Sch) have the same vocabulary, constraints, and access methods as in Sch, but for an access method mt in Sch with result bound of k, mt has instead a result lower bound of k in Relax(Sch), i.e., does not impose the upper bound. We can then show: Proposition 3.3. Let Sch be a schema with arbitrary constraints and access methods which may be result-bounded. A CQ Q is monotone answerable in Sch if and only if it is monotone answerable in Relax(Sch).
Proof. The proof will illustrate how we can reason about monotone answerability using AMonDet (thanks to Theorem 3.2). Consider arbitrary instances I 1 and I 2 that satisfy the constraints. We show that any common subinstance I Accessed of I 1 and I 2 is access-valid for I 1 in Sch iff it is access-valid for I 1 in Relax(Sch): this implies the claimed equivalence.
In the forward direction, if I Accessed is access-valid for I 1 in Sch, then clearly it is access-valid for I 1 in Relax(Sch), as any result of an access in I Accessed which is valid for I 1 in Sch is also valid in Relax(Sch).
In the backward direction, if I Accessed is access-valid for I 1 in Relax(Sch), it means that for any access in I Accessed and bound k, we can obtain a response in I Accessed with at least k tuples, which is a valid result for the access in I 1 . If the number of tuples returned is ≤ k, we see immediately that the same response is a valid result for the corresponding access in Sch. If it is greater than k, clearly any choice of k tuples among the response gives a valid result for the corresponding access in Sch. This establishes the backward direction and concludes the proof.
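The truncation argument used in the backward direction can be sanity-checked exhaustively on small cases. The sketch below uses hypothetical helper names; it verifies that any output valid for a result lower bound of k becomes valid for a result bound of k once it is cut down to at most k tuples.

```python
from itertools import combinations

def valid_for_lower_bound(matching, output, k):
    """Validity under a result lower bound of k."""
    return output <= matching and len(output) >= min(k, len(matching))

def valid_for_result_bound(matching, output, k):
    """Validity under a result bound of k (lower and upper conditions)."""
    return valid_for_lower_bound(matching, output, k) and len(output) <= k

def truncate(output, k):
    """Keep an arbitrary choice of at most k tuples (here: the k smallest)."""
    return set(sorted(output)[:k])

def check_backward_direction(matching, k):
    """Every lower-bound-valid output stays valid after truncation to k tuples."""
    for size in range(len(matching) + 1):
        for subset in combinations(sorted(matching), size):
            out = set(subset)
            if valid_for_lower_bound(matching, out, k):
                if not valid_for_result_bound(matching, truncate(out, k), k):
                    return False
    return True

all_cases_ok = all(
    check_backward_direction(set(range(n)), k)
    for n in range(5)
    for k in range(1, 4)
)
```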
Thanks to this, in our study of monotone answerability in the rest of the paper, we only consider result lower bounds.
Reducing to query containment. Now that we have reduced our monotone answerability problem to AMonDet, and eliminated result upper bounds, we explain how to restate AMonDet as a query containment problem. For each relation R, we introduce two copies R Accessed and R ′ with the same arity. For any formula φ, set of formulas Σ, or CQ Q over the original schema, we let φ ′ or Σ ′ or Q ′ be formed by replacing any relation R with R ′ . We let accessible be a new unary predicate. The AMonDet containment for Q and Sch is the CQ containment Q ⊆ Γ Q ′ where the constraints Γ include the original constraints Σ, the constraints Σ ′ on the R ′ relations, as well as the following accessibility axioms: where x denotes the input positions of mt in R.
The AMonDet containment simply formalizes the definition of AMonDet. The idea is that R and R ′ represent the interpretations of the relation symbol R in I 1 and I 2 ; R Accessed represents the interpretation of R in I Accessed ; and accessible represents the active domain of I Accessed . Γ includes Σ and Σ ′ , corresponding to the assumption that I 1 and I 2 both satisfy Σ. The first two accessibility axioms enforce that the selection is access-valid for I 1 : for non-result-bounded methods, accesses to a method mt on a relation R return all the results, while for result-bounded methods it respects the lower bounds. The last accessibility axiom enforces that I Accessed is a common subinstance of I 1 and I 2 and that accessible includes the active domain of I Accessed . Hence, from the definitions, we have:
Proposition 3.4. Q is monotone answerable with respect to a schema Sch iff the AMonDet containment for Q and Sch holds.
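For reference, one way to write down three axioms matching the description above is sketched below. This is an assumed form reconstructed from the prose (two axioms enforcing access-validity, one enforcing that I Accessed is a common subinstance), not a verbatim statement; the counting quantifier ∃^{≥j} and the variable names are notational choices made for the sketch.

```latex
% Assumed form of the accessibility axioms, reconstructed from the prose.
% \vec{x}: input positions of mt in R; \vec{y}, \vec{z}: output positions.
\begin{align*}
&\text{(1) for each method } mt \text{ on } R \text{ that is not result-bounded:}\\
&\quad \Big(\bigwedge_i \mathrm{accessible}(x_i)\Big) \wedge R(\vec{x}, \vec{y})
   \rightarrow R_{\mathrm{Accessed}}(\vec{x}, \vec{y})\\
&\text{(2) for each method } mt \text{ on } R \text{ with result lower bound } k,
   \text{ and each } j \le k:\\
&\quad \Big(\bigwedge_i \mathrm{accessible}(x_i)\Big) \wedge
   \exists^{\ge j} \vec{y}\; R(\vec{x}, \vec{y})
   \rightarrow \exists^{\ge j} \vec{z}\; R_{\mathrm{Accessed}}(\vec{x}, \vec{z})\\
&\text{(3) for each relation } R:\\
&\quad R_{\mathrm{Accessed}}(\vec{w}) \rightarrow R(\vec{w}) \wedge R'(\vec{w})
   \wedge \bigwedge_i \mathrm{accessible}(w_i)
\end{align*}
```

Under this reading, axiom (2) with j = 1 has a single body atom R(x, y) containing all exported variables, so it is a guarded TGD, whereas larger values of j give genuine cardinality constraints.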
Note that, in the case without result bounds, the accessibility axioms above reduce to the following (used in [8,7]):
Arity-two schemas. The above results already imply that monotone answerability is decidable for some constraint languages. An example is the guarded two-variable logic with counting quantifiers, GC 2 . This is a logic over relations with arity at most two, which allows assertions such as "for any x, there are at least 7 y's such that R(x, y)". The only thing the reader needs to know about GC 2 is that query containment under GC 2 constraints is decidable [27], and that for an arity-two schema the additional axioms in Γ above are in GC 2 . Thus the reduction to containment immediately gives:
Theorem 3.5. We can decide if a CQ Q is monotone answerable with respect to a schema Sch when relations have arity at most 2 and constraints are expressible in GC 2 .
This decidability result applies only to arity-two schemas. For higher arity, the additional axioms in the second bullet item of Γ above are problematic for decidability. However, when each lower bound is 1, these additional axioms have the property that there is a single atom in the body that contains every exported variable. That is, they are guarded TGDs, which is significant because query containment is decidable for such constraints [12]. Thus the reduction to query containment is particularly helpful when we can argue that only result lower bounds of 1 need be considered. Identifying such cases will be one of our goals in the next section.

Simplifying result-bounded schemas
The results in Section 3 allow us to reduce the monotone answerability problem to a query containment problem. However, for result bounds greater than 1, the containment problem involves cardinality constraints, and thus we cannot apply standard results on query containment with constraints to get decidability "out of the box", except in very special cases, such as the arity-two schemas and constraints of Theorem 3.5. In this section we show how to simplify schemas, following the examples in the introduction. Specifically, for several constraint languages, we show that if we can find a plan for a query on a result-bounded schema, then we can find a plan in a simplification of the schema, called an approximation, with simpler result bounds or none at all. These approximation results give insight into the use of result bounds, and also help us obtain a more tractable query containment problem.
The introduction anticipated two kinds of approximations: existence checks in Example 1.2 (using result-bounded methods to check for the existence of tuples); and FD approximation in Example 1.3 (using them to retrieve functionally determined information). However, with arbitrary FO constraints, we cannot hope to capture the possible uses of result-bounded methods with such simple intuitions: Example 4.1. Consider a schema that has a binary relation P having a free access method mt P with result bound 5, along with a unary relation U having a free access method mt U with no result bound. The constraints Σ say that P has exactly 7 tuples, and if one of these tuples has its first position in U , then four of these tuples must have first position in U .
Consider the query Q asking if P contains a tuple with first position in U . This can be answered by a plan that accesses P and then checks if the first position of some returned tuple is in U . If Q holds, then Σ guarantees that at least four of the seven tuples of P have first position in U , so at most three do not; hence, no matter which five tuples the access on P returns, the output will contain at least one tuple with first position in U .
Note that, unlike in Example 1.2, we needed to actually obtain the output of the result-bounded method, not just know that it returned something. Further, the exact cardinalities in the result bounds were important. For example, if the result bound for mt P were 1, then all we could do in this schema is access mt U , returning all of U , and access mt P , returning a single tuple. If the first position of that tuple is not in U , we have no information on whether or not Q is true. It is easy to show that we cannot answer the query in this case.
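The counting argument of Example 4.1 can be checked by brute force. The following is a minimal sketch, not part of the original development: the function name `some_output_misses_u` is hypothetical, tuples of P are abstracted to a Boolean flag recording whether their first position is in U, and we assume that a method with result bound k returns exactly k tuples when more than k tuples match.

```python
from itertools import combinations

def some_output_misses_u(total, in_u, bound):
    """Return True if some valid access output contains no tuple in U.

    P has `total` tuples, of which `in_u` have their first position in U.
    Assumption: the access returns exactly min(bound, total) tuples.
    """
    tuples = [True] * in_u + [False] * (total - in_u)
    k = min(bound, total)
    return any(not any(out) for out in combinations(tuples, k))

# Bound 5: whenever Q holds (at least 4 of the 7 tuples are in U),
# every possible output contains a witness for Q.
bound5_ok = all(not some_output_misses_u(7, n, 5) for n in range(4, 8))

# Bound 1: the single returned tuple may miss U even though Q holds.
bound1_can_miss = some_output_misses_u(7, 4, 1)
```

With bound 5 the plan can never be misled, whereas with bound 1 an unlucky choice of output hides the witness, which matches the discussion above.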
Fortunately, such complicated situations cannot arise with many common constraint classes. In the rest of this section, we substantiate this claim by presenting our different kinds of approximation and proving schema simplification results.
Existence check approximation. We formalize the first way in which result-bounded methods can be useful: checking the existence of tuples, as in Example 1.2. To this end, we replace the result-bounded methods by Boolean methods that only check for existence.
Given a schema Sch with result-bounded methods, its existence-check approximation Sch ′ is formed as follows: • We extend the relations of Sch: for each result-bounded method mt on relation R with input positions j 1 . . . j m , we add a relation CheckView mt of arity m.
• We extend the constraints by adding, for each result-bounded method mt, a constraint (expressible as two IDs): CheckView mt ( x) ↔ ∃ y R( x, y), where x denotes the input positions of mt in R.
• The access methods of Sch ′ are the methods of Sch that have no result bounds, along with a Boolean method on each CheckView mt , called the existence-check method associated with mt.
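The construction above is purely syntactic, and can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the classes `Method` and `Schema` and the function `existence_check_approximation` are hypothetical names, constraints are kept as opaque strings, the two IDs are written according to our reading of the definition, and a Boolean method is taken to be one where every position is an input position.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Method:
    name: str
    relation: str
    input_positions: tuple              # 1-based positions that must be bound
    result_bound: Optional[int] = None  # None means "no result bound"

@dataclass
class Schema:
    arities: dict      # relation name -> arity
    constraints: list  # opaque constraint descriptions (plain strings here)
    methods: list      # list of Method

def existence_check_approximation(sch):
    """Build the existence-check approximation of `sch` (illustrative sketch)."""
    arities = dict(sch.arities)
    constraints = list(sch.constraints)
    # Methods without result bounds are kept unchanged.
    methods = [m for m in sch.methods if m.result_bound is None]
    for m in sch.methods:
        if m.result_bound is None:
            continue
        view = f"CheckView_{m.name}"
        arity = len(m.input_positions)
        arities[view] = arity
        # The two IDs relating the view to the input positions of R.
        constraints.append(f"{view}(x) -> exists y. {m.relation}(x, y)")
        constraints.append(f"{m.relation}(x, y) -> {view}(x)")
        # Boolean existence-check method: all positions of the view are inputs.
        methods.append(Method(f"check_{m.name}", view, tuple(range(1, arity + 1))))
    return Schema(arities, constraints, methods)
```

For instance, on the schema of Example 4.1 (a free method on P with bound 5 and an unbounded free method on U), the sketch keeps the method on U and replaces the method on P by a Boolean method on a nullary relation.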
In Example 1.2, query Q 2 had a plan using the existence-check approximation of the schema. Clearly, every plan that uses the existence-check approximation Sch ′ of a schema Sch can be converted into a plan using Sch, by replacing the accesses on the Boolean method of CheckView mt by non-deterministic accesses with mt (and only checking whether the result of these accesses is empty). We want to understand whether the converse is true, i.e., for which schemas it is the case that any plan on the original schema Sch yields a plan on the existence-check approximation Sch ′ . We say that the schema is existence-check approximable when this is the case for all queries Q: this intuitively means that "result-bounded methods are only useful for existence checks".
Showing existence-check approximability. We present an approximability result of this kind for constraints that consist of inclusion dependencies: Theorem 4.2. Let Sch be a schema whose constraints are IDs, and let Q be a CQ that is monotone answerable in Sch. Then Q is monotone answerable in the existence-check approximation of Sch.
Note that, in the existence-check approximation of such a schema, there are no result lower-bounds, and the only constraints are IDs. Thus the entailment for AMonDet only involves guarded TGDs, which implies that monotone answerability for a schema with IDs is decidable even in the presence of result bounds. We will show a more general decidability result in the next section (Theorem 5.1).
To prove Theorem 4.2, we show that if Q is not AMonDet in Sch ′ , then it cannot be AMonDet in Sch. This suffices to prove the contrapositive of the result, because AMonDet is equivalent to monotone answerability (Theorem 3.2).
We prove all our approximation results with a general method of blowing up models. We assume that AMonDet does not hold in the approximation Sch ′ , and consider a counterexample to AMonDet for Sch ′ : two instances I 1 , I 2 both satisfying the schema constraints, such that I 1 satisfies Q while I 2 satisfies ¬Q, and I 1 and I 2 have a common subinstance I Accessed which is access-valid for I 1 . We enlarge them, by adding additional facts, to a counterexample to AMonDet for the original schema. Formally, the method is as follows: Lemma 4.3. Let Sch and Sch ′ be schemas and Q a CQ that is not AMonDet in Sch ′ . Suppose that for some counterexample I 1 , I 2 to AMonDet for Q in Sch ′ we can construct instances I + 1 ⊇ I 1 and I + 2 ⊇ I 2 that satisfy the constraints of Sch and have a common subinstance I Accessed that is access-valid for I + 1 on Sch, and such that I + 2 has a homomorphism to I 2 . Then Q is not AMonDet in Sch.
Let us sketch how the blowing-up process of the lemma is used to prove the existence-check approximation result in Theorem 4.2. Suppose that we have a counterexample (I 1 , I 2 ) to AMonDet for Q in the approximation Sch ′ . That is, I 1 and I 2 satisfy the constraints Σ ′ of Sch ′ , I 1 satisfies Q and I 2 violates Q, and I 1 and I 2 have a common subinstance I Accessed that is access-valid for I 1 . We will show how to "blow up" I 1 and I 2 to I + 1 and I + 2 which have a common subinstance I + Accessed that is access-valid for I + 1 in the original schema Sch. We must ensure that each access in I + Accessed to a result-bounded method returns either no tuples or more tuples than the bound.
Intuitively, we form I + Accessed in two steps. First, we "obliviously chase" I Accessed using the constraints generated by the existence-check views; specifically, we use the IDs going from each relation CheckView mt to the relation R of the method mt, to create infinitely many witnesses for each access. Saying that the chase is oblivious means that we create witnesses of the head even if such witnesses already exist, i.e., even for non-active triggers. Let I * Accessed be the result of this process. In a second step, we solve the constraint violations that may have been added by creating these new facts. We do so by chasing I * Accessed on the newly generated elements with all ID constraints of Σ. This yields I + Accessed . We form I + 1 by unioning I + Accessed with I 1 , and similarly form I + 2 as the union of I + Accessed and I 2 . As the constraints are IDs, we can argue that I + 1 and I + 2 satisfy Σ, because I 1 , I 2 , and I + Accessed do. We can also construct homomorphisms of I + 1 back to I 1 and I + 2 back to I 2 , guaranteeing that no new CQs are satisfied in the process. Finally, we use I + Accessed as the common access-valid subinstance. This concludes the proof of Theorem 4.2.
Choice approximability. It is then natural to ask whether we can show existence-check approximability for more general constraints than IDs. However, as we now show, it quickly fails for more expressive constraints: Consider a schema with constraints T (y) ∧ S(x) → T (x) and T (y) → ∃x S(x). We have a free access method mt S on S with result bound 1 and a Boolean access method mt T on T . Consider the query Q = ∃y T (y). We can show that Q is monotone answerable, using a simple plan: we access S and return true if the result is in T . On the other hand, consider the existence-check approximation of the schema. It has an existence-check method on S, but we can only test if S is non-empty, giving no indication whether Q holds. So Q is not answerable in the approximation.
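The correctness of the plan in this example can be confirmed exhaustively on a small domain. The sketch below is only an illustration: the three-element domain, the set-based encoding of the unary relations S and T, and all function names are assumptions made for the check.

```python
from itertools import combinations

DOMAIN = (0, 1, 2)

def subsets(xs):
    """All subsets of xs, as frozensets."""
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def satisfies_constraints(S, T):
    # T(y) and S(x) -> T(x): if T is non-empty, every element of S is in T.
    # T(y) -> exists x S(x): if T is non-empty, S is non-empty.
    if not T:
        return True
    return S <= T and bool(S)

def plan_outputs(S, T):
    """Possible plan results, one for each valid choice made by mt_S (bound 1)."""
    if not S:
        return {False}          # the access on S returns nothing
    return {x in T for x in S}  # the access returns one arbitrary element of S

def plan_answers_query():
    """Check that, on every valid instance, every run of the plan returns Q."""
    for S in subsets(DOMAIN):
        for T in subsets(DOMAIN):
            if satisfies_constraints(S, T):
                q = bool(T)  # Q = exists y T(y)
                if plan_outputs(S, T) != {q}:
                    return False
    return True
```

The check succeeds because the first constraint forces every possible output of the access on S to lie in T whenever T is non-empty.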
We next show that, even though we cannot remove cardinality bounds on access methods entirely, for a broad class of constraints the actual value of the bounds is not important.
Given a schema Sch with result-bounded methods, its choice approximation is defined by keeping the relations and constraints as in Sch, but changing every result-bounded method to have bound 1. That is, every result-bounded method returns ∅ if there are no matching tuples for the access, and otherwise selects and returns one matching tuple. A schema is choice approximable if any CQ that has a monotone plan over a schema has one over its choice approximation: this means that the value of the result bounds never matters.
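To make the semantics of the choice approximation concrete, here is a small sketch. The names `valid_outputs` and `choice_approximation` are hypothetical, and the access semantics is an assumption based on our reading: a method with bound k returns all matching tuples when there are at most k, and otherwise some subset of size k.

```python
from itertools import combinations

def valid_outputs(matching, bound=None):
    """Enumerate the outputs a method may return for a set of matching tuples.

    bound=None models a method without result bound (the full set is returned).
    """
    matching = frozenset(matching)
    if bound is None or len(matching) <= bound:
        return [matching]
    return [frozenset(c) for c in combinations(sorted(matching), bound)]

def choice_approximation(bounds):
    """Map each result-bounded method to bound 1; leave the others unchanged.

    `bounds` maps a method name to its bound, or to None if it has none.
    """
    return {mt: (None if b is None else 1) for mt, b in bounds.items()}
```

With bound 1, an access returns the empty set when nothing matches, and otherwise one of the singleton subsets of the matching tuples.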
Showing choice approximability. Example 4.1 shows that choice approximability does not generally hold in schemas with complex first-order constraints including equality. In contrast, we show it holds for constraints without equality: Theorem 4.5. Let Sch be a schema whose constraints are in equality-free first-order logic (e.g., TGDs), and let Q be a CQ that is monotone answerable in Sch. Then Q is monotone answerable in the choice approximation of Sch.
The result is shown using a simpler variant of the "blow-up" method of Theorem 4.2. We enlarge counterexample models to AMonDet in the approximation by cloning the output tuples of each result-bounded access, including all facts that hold about these output tuples.
We will see in the next section (Theorem 5.1) that choice approximation immediately gives decidability of monotone answerability for schemas whose constraints fall in common classes of dependencies.
Choice approximability with UIDs and FDs. The previous results only apply to equality-free FO logic, which cannot express FDs. However, for FDs with UIDs, we can extend our choice approximation result of Theorem 4.5, and show: Theorem 4.6. Let Sch be a schema whose constraints are UIDs and arbitrary FDs, and Q be a CQ that is monotone answerable in Sch. Then Q is monotone answerable in the choice approximation of Sch.
To prove Theorem 4.6, we use a strengthening of the enlargement process of Lemma 4.3 which constructs I + 1 and I + 2 from I 1 and I 2 in successive steps, to fix accesses one after the other. The construction that performs the blow-up is more complex (see Appendix C.3): it involves copying access outputs and chasing with UIDs in such a way as to avoid violating the FDs. Theorem 4.6 thus reduces monotone answerability of a CQ Q to a query containment problem with two copies of the UIDs and FDs, and a certain set of guarded TGDs (GTGDs). To show the decidability of this problem, we can prove that the resulting dependencies can be made "separable" [15] allowing us to reduce the analysis to the UIDs and GTGDs alone. From this we obtain decidability of monotone answerability. A stronger result (Theorem 6.11) is proven in Section 6.
FD approximability. When our constraints include FDs, we can hope for another kind of simplification, generalizing the idea of Example 1.3 in the introduction: the nondeterminism in result-bounded methods may be resolved on a projection of the outputs, due to the presence of a functional dependency.
Given a set of constraints Σ, a relation R mentioned in Σ, and a subset P of the positions of R, we write DetBy(R, P ) for the set of positions determined by P , i.e., the set of positions i such that Σ implies the FD P → i. Note in particular that P ⊆ DetBy(R, P ). If mt is an access method on R, we let DetBy(mt) denote DetBy(R, P ) where P is the set of input positions of mt. Given a schema Sch with result-bounded methods, its FD approximation Sch ′ is formed as follows: • We extend the relations of Sch: for each result-bounded method mt on relation R, we add a relation R mt of arity |DetBy(mt)|.
• We extend the constraints by adding, for each result-bounded method mt, the ID constraints: R( x, y, z) → R mt ( x, y) and R mt ( x, y) → ∃ z R( x, y, z) where x denotes the input positions of mt, and y denotes the other positions of DetBy(mt).
• The access methods of Sch ′ are the methods of Sch that have no result bounds, plus the following: for each result-bounded method mt on relation R with input positions j 1 . . . j m , a method mt ′ on R mt whose input positions are the corresponding m positions of R mt . From the FDs on R and the constraints relating R to R mt , we see that any access to mt ′ is guaranteed to return at most one result.
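When Σ consists only of FDs, the set DetBy(R, P ) used in this construction can be computed by the standard attribute-closure fixpoint. The sketch below makes that assumption explicitly (with other constraints present, implication of FDs may require more work); the function name `det_by` and the encoding of an FD as a pair (determining positions, determined position) are hypothetical.

```python
def det_by(fds, positions):
    """Closure of `positions` under the FDs of a single relation.

    `fds` is a list of (lhs_positions, rhs_position) pairs.
    Assumption: the constraints on the relation are exactly these FDs.
    """
    closure = set(positions)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if rhs not in closure and set(lhs) <= closure:
                closure.add(rhs)
                changed = True
    return closure

# Hypothetical relation R of arity 4 with FDs 1 -> 2 and 2 -> 3,
# and a result-bounded method mt whose only input position is 1.
fds_R = [((1,), 2), ((2,), 3)]
arity_of_R_mt = len(det_by(fds_R, {1}))
```

In this hypothetical case DetBy(mt) contains positions 1, 2 and 3, so the relation R mt of the FD approximation has arity 3.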
A schema is FD approximable if every CQ having a monotone plan over the schema has one in its FD approximation. As for existence-check, if a schema is FD approximable, we can decide monotone answerability by reducing to the same problem in a schema with FDs instead of result bounds. Note that FD approximability implies in particular choice approximability, as all accesses on the FD approximation can also be performed in the choice approximation.
We use a variant of our "blowing-up process" to show FD approximability for schemas with only FD constraints: Theorem 4.7. Let Sch be a schema whose constraints are FDs, and let Q be a CQ monotone answerable in Sch. Then Q is monotone answerable in the FD approximation of Sch.

Decidability of monotone answerability
Thus far we have seen a general way to reduce monotone answerability problems with result bounds to query containment problems (Section 3). We have also seen schema simplification results for many constraint languages, which give us insight into how result-bounded methods can be used (Section 4). For some classes of constraints, the reduction to containment and simplification results combine to give decidability results, along with tight complexity bounds.
Decidability using choice approximation. Theorem 4.5 on choice approximation can be used to obtain decidability results for schemas whose constraints are equality-free and where query containment is decidable. An example is the class of frontier-guarded TGDs: these are TGDs whose body contains a single atom including all exported variables (in particular, they capture guarded TGDs). Combining our previous reductions yields a tight bound to decide monotone answerability: Theorem 5.1. We can decide whether a CQ Q has a monotone plan with respect to a schema with result bounds whose constraints are frontier-guarded TGDs. Further, the problem is 2EXPTIME-complete.
Proof. By Theorem 4.5 we can assume that all result bounds are one, and by Proposition 3.3 we can replace the schema with the relaxed version containing only result lower bounds. Note that a result lower bound of 1 can be expressed as an ID. Theorem 3.2 implies that we have reduced the analysis of AMonDet to a query containment problem with additional frontier-guarded TGDs, and this is decidable in 2EXPTIME (see, e.g., [6]). 2EXPTIME hardness follows from a reduction from query containment with frontier-guarded TGDs (see, e.g. Prop. 3.16 in [7]), already in the absence of result bounds.
Decidability using existence-check approximation. Theorem 4.2 implies decidability for schemas whose constraints consist of IDs, but we can show a better complexity bound: Theorem 5.2. We can decide whether a CQ has a monotone plan with respect to a schema Sch with result bounds whose constraints are IDs. The problem is EXPTIME-complete.
Proof. Moving to the existence check approximation only introduces additional IDs, so Theorem 4.2 reduces monotone answerability with a plan (or equivalently, AMonDet) to the same problem for a schema with IDs and without result bounds. A 2EXPTIME bound thus follows as in the case of GTGDs. But a more refined analysis of the chase in [5] showed the monotone answerability problem with IDs but without result bounds is in EXPTIME. Hardness follows from the lower bound on monotone answerability without result bounds [5].
Decidability for FDs. We now consider schemas where the constraints consist only of FDs. We start with an analysis of monotone answerability in the case without result bounds: Proposition 5.3. We can decide whether a CQ Q has a monotone plan with respect to a schema without result bounds whose constraints are FDs. The problem is NP-complete.
Proof. By Theorem 3.2 the problem reduces to the query containment problem for AMonDet. This can be decided by performing the chase process on CanonDB(Q) using axioms which consist of only FDs and full inclusion dependencies from a relation R to the corresponding relation R ′ . The chase terminates in polynomially many rounds, from which the desired upper bound easily follows. The lower bound follows from the lower bound for determining whether a query has a plan without constraints [24].
We now return to the situation with result bounds. We know that schemas with FDs are FD approximable (Theorem 4.7). From this we get a reduction to query containment with no result bounds, but introducing new axioms. The resulting constraints are still FDs and IDs, as in the case without result bounds. We can show that the additional axioms involving R mt and R do not harm chase termination, allowing us to deduce that AMonDet is not only decidable, but is no harder than CQ evaluation: Theorem 5.4. We can decide whether a CQ Q has a monotone plan with respect to a schema with result bounds whose constraints are FDs. The problem is NP-complete.

Improved complexity bounds for monotone answerability
In this section we focus on the case where we have IDs of bounded width, and the case where we have UIDs and FDs. Decidability follows easily from the approximation results in Section 4, coupled in the case of UIDs and FDs with separability results, showing that after some adjustments to the UIDs and auxiliary axioms, the FDs can be eliminated from consideration. These arguments suffice to give a reduction to query containment with GTGDs, which gives a 2EXPTIME bound. The main goal of this section is to explain how to get a tighter bound. We will do so by observing that the constraints resulting from the reduction are nearly linear in these cases. Specifically, recall from Section 3 that the constraints produced consist of two copies of the original dependencies and the accessibility axioms. For constraints with choice approximation (e.g., IDs, or UIDs and FDs, by Section 4), these are GTGDs where all atoms are unary except the guard. We show that these constraints can be linearized, reducing to a query containment problem with linear constraints. This technique was introduced in [21], focusing on GTGDs with bounded arity signature and bounded number of relations. Since we do not make this assumption, we need a refined linearization procedure. We first introduce the idea in the context of bounded-width IDs and access methods, using it to give an NP bound for the monotone answerability problem in this case. Second, we present a more general linearization result and apply it to give an EXPTIME bound in the case of UIDs and FDs.
These results are new even in the absence of result bounds; further, the main techniques of this section are independent of the reductions given previously.

Linearization for answerability with bounded width IDs
We introduce the idea of our variant of linearization by giving it in the context of monotone answerability with bounded width IDs, where we use it to show an NP-bound.
Semi-width. Johnson and Klug [22] showed that query containment under bounded-width IDs is in NP. We will need a simple extension of their result for linear TGDs that are "almost" of low width.
The basic position graph of a set of TGDs Σ is the directed graph whose nodes are the positions of relations in Σ, with an edge from R[i] to S[j] if and only if there is a dependency δ ∈ Σ with exported variable x occurring in position i of R in the body of δ and position j of S in the head of δ. We say that a collection of TGDs Σ has semi-width bounded by w if it can be decomposed into Σ 1 ∪ Σ 2 where Σ 1 has width bounded by w and the position graph of Σ 2 is acyclic.
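The acyclicity condition on Σ 2 can be tested mechanically. The following sketch builds the basic position graph and checks it for cycles by depth-first search. The encoding is an assumption made for illustration: each linear TGD is given as a triple (body relation, head relation, list of pairs (i, j)), where a pair (i, j) records that an exported variable occurs at position i of the body atom and position j of the head atom; the function names are hypothetical.

```python
def position_graph(tgds):
    """Basic position graph: nodes are (relation, position) pairs."""
    graph = {}
    for body_rel, head_rel, exported in tgds:
        for i, j in exported:
            graph.setdefault((body_rel, i), set()).add((head_rel, j))
            graph.setdefault((head_rel, j), set())
    return graph

def is_acyclic(graph):
    """Depth-first search with three colours; a grey successor means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for succ in graph[node]:
            if colour[succ] == GREY:
                return False
            if colour[succ] == WHITE and not visit(succ):
                return False
        colour[node] = BLACK
        return True

    return all(colour[n] != WHITE or visit(n) for n in graph)
```

For example, the single ID R(x, y) → ∃z S(x, z) yields an acyclic graph, whereas adding S(x, y) → ∃z R(x, z) creates the cycle R[1] → S[1] → R[1]. The recursive search is adequate for small constraint sets; an iterative variant would be preferable for very large ones.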
An easy modification of Johnson and Klug's analysis gives: Proposition 6.1. For fixed w, there is an NP algorithm for containment under linear TGDs of semi-width at most w.
Linearization. If we start with constraints consisting of IDs, the reduction of Proposition 3.4, like the earlier reductions without result bounds of [18,8,7], produces a query containment problem that includes constraints that are not IDs. The problem is the accessibility axioms, which are full GTGDs, and in addition have multiple atoms in the head. We will explain how to linearize these constraints, reducing to the problem with only IDs. In doing this, we will be able to control the semi-width of the resulting constraints, which will allow us to use Proposition 6.1. We will consider truncated accessibility axioms, which are rules of the form: where R is a relation and P is a subset of the positions of R. Some of these rules are directly given by the schema: those where P is the input positions of some method mt on R. We call them the original truncated accessibility axioms. However, there are other truncated accessibility axioms that are not part of the schema, but are implied by the original truncated accessibility axioms and by the constraints Σ. We call them the derived truncated accessibility axioms.
The breadth of a truncated accessibility axiom is the size of P . Note that the number of possible truncated accessibility axioms of breadth b is at most r · a^{b+1} , where r is the number of relations in the signature and a is the maximal arity of a relation. Our goal is to compute the derived truncated accessibility axioms of a given breadth. If our constraints are IDs of width w, we can use a simple "saturation" algorithm: Proposition 6.2. For any fixed w ∈ N, there is a polynomial time algorithm that takes as input a set of IDs of width w and a set of truncated accessibility axioms, and computes all of the derived truncated accessibility axioms of breadth at most w.
We now state a further proof normalization result: the derivation of facts in a chase proof with truncated accessibility axioms can be simulated by applying derived axioms of small breadth in a "greedy fashion". We assume that Σ consists of IDs of width w and truncated accessibility axioms.
A short-cut chase proof on an initial instance I 0 with Σ uses two alternating kinds of steps: • ID steps, where we fire an ID on a trigger τ to generate a fact F : we put F in a new node n which is a child of the node n ′ containing the fact of τ ; and we copy in n all facts of the form accessible(c) that held in n ′ about any element c that was exported when firing τ .
• Breadth-bounded saturation steps, where we consider a newly created node n and apply all derived truncated accessibility axioms of breadth at most w on that node until we reach a fixpoint and there are no more violations of these axioms on n.
We continue this process until a fixpoint is reached. The atoms in the proof are thus associated with a tree structure: it is a tree of nodes that correspond to the application of IDs, and each node also contains accessibility facts that occur in the node where they were generated and in the descendants of those nodes that contain facts to which the elements are exported. The name "short-cut" intuitively indicates that we short-cut certain derivations that could have been performed by moving up and down in the chase tree: instead, we apply a derived truncated accessibility axiom. Lemma 6.3. For any set Σ of IDs of width w, given a set of facts I 0 and a chase proof using Σ that produces I, letting I Lin 0 be the closure of I 0 under the original and derived truncated accessibility axioms, there is I ′ produced by a short-cut chase proof from I Lin 0 with Σ that has a homomorphism from I to I ′ .
We are now ready to present our linearization technique. We show that for fixed w ∈ N, the truncated accessibility axioms for IDs of width at most w can be simulated by IDs of semi-width w.
Consider a schema Sch with IDs Σ of width w and truncated accessibility axioms. For each relation R of arity n, each subset P of the positions of R of size at most w, and also for P = {1 . . . n}, we form a relation R P of arity n, which intuitively denotes an R-fact where the elements in the positions of P satisfy accessible. We let the constraints Σ Lin be as follows: • (Apply ID) Consider an ID δ ∈ Σ R( u) → ∃ z S( z, u) exporting variables from positions P R in R to positions P S in S, and a subset P ⊆ P R . Then we have a new rule where P ′ is the subset of P S corresponding to positions of P in P R .
• (Breadth-bounded Saturation) Suppose we have either a derived truncated accessibility axiom of breadth at most w of the form where P ′ is any subset of P ∪ {j} of size at most w. Further, for an original truncated accessibility axiom, we also have the rule Given a CQ Q, we let I Lin 0 be formed by adding atoms to CanonDB(Q) as follows: • Apply all of the original axioms and derived truncated accessibility axioms of breadth w to CanonDB(Q) to obtain I ′ .
• For each relation R of the signature σ (except accessible), for each fact R(a 1 , . . . , a n ) of I ′ , letting P be the set of the i ∈ {1, . . . , n} such that accessible(a i ) holds in I ′ , for every P ′ ⊆ P of size at most w, add to I Lin 0 the fact R P ′ (a 1 , . . . , a n ). Further, if P = {1 . . . n}, add the fact R {1...n} (a 1 , . . . , a n ).
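The second step of this construction can be sketched directly. The code below assumes that the first step has already been carried out, so that the facts of I ′ and the set of accessible elements are given as inputs; the name `linearize_instance` and the encoding of a fact R P (a 1 , . . . , a n ) as a triple (R, P, arguments) are hypothetical.

```python
from itertools import combinations

def linearize_instance(facts, accessible, w):
    """Annotate each fact with the small subsets of its accessible positions.

    facts: set of (relation, tuple_of_elements), excluding `accessible` facts.
    accessible: set of elements c such that accessible(c) holds in I'.
    Returns a set of (relation, frozenset_of_positions, tuple_of_elements).
    """
    out = set()
    for rel, args in facts:
        n = len(args)
        # P: the (1-based) positions holding accessible elements.
        P = [i for i in range(1, n + 1) if args[i - 1] in accessible]
        # Every subset of P of size at most w, including the empty set.
        for size in range(min(w, len(P)) + 1):
            for sub in combinations(P, size):
                out.add((rel, frozenset(sub), args))
        # The full set of positions is added when all of them are accessible.
        if len(P) == n:
            out.add((rel, frozenset(P), args))
    return out
```

For a ternary fact with two accessible positions and w = 1, the sketch produces three annotated facts, for the empty set and the two singletons.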
It follows from Lemma 6.3 that for a schema Sch consisting of IDs Σ of width w and truncated accessibility axioms, the IDs above can simulate the chase with Σ and the truncated accessibility axioms. Formally, given a set of facts I over Σ Lin , let UnLin(I) be formed by taking every fact R P ( c) and replacing it by {R( c)} ∪ ⋃ i∈P {accessible(c i )}. We can then show: Theorem 6.4. For any schema that consists of IDs Σ of width w and truncated accessibility axioms, and for any set of facts I derivable from CanonDB(Q) by chasing with Σ and the truncated accessibility axioms, there is a set of facts I ′ that can be derived from I Lin 0 using Σ Lin such that UnLin(I ′ ) is a homomorphic image of I.
Proof. We know that it suffices to consider short-cut proofs, and we can simulate these proofs by a derivation using Σ Lin in the obvious way.
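The operation UnLin used in the statement of Theorem 6.4 is a simple translation back to the original signature, and can be sketched as follows. The function name `unlin` and the encoding of a fact R P ( c) as a triple (R, P, c) with 1-based positions are assumptions made for illustration.

```python
def unlin(lin_facts):
    """Replace each R_P(c) by R(c) together with accessible(c_i) for i in P.

    lin_facts: iterable of (relation, set_of_positions, tuple_of_elements).
    Returns a set of (relation, tuple_of_elements) facts.
    """
    out = set()
    for rel, P, args in lin_facts:
        out.add((rel, args))
        for i in P:
            out.add(("accessible", (args[i - 1],)))
    return out
```

For instance, the annotated fact R {1} (a, b) is translated to the two facts R(a, b) and accessible(a).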
Reduction of answerability with bounded width to query containment with bounded semi-width. We are now ready to give our first application of the machinery: Theorem 6.5. Given a CQ Q over a schema that has bounded-width IDs and access methods without result bounds, we can decide in NP if Q is monotone answerable.
Proof. By Proposition 6.1, it suffices to show that we can reduce in polynomial time to query containment with bounded semi-width IDs. In the absence of result bounds, the AMonDet query containment problem Q ⊆ Γ Q ′ has the following simpler form [8,7]: Γ contains Σ, Σ ′ and Splitting up the heads of rules, we can replace the above by: The former rules represent the propagation of accessibility facts, while the latter represent the transfer from unprimed to primed facts. Now, we will handle the action of Σ and of (Truncated Accessibility) with the linearization algorithm given before, and then split the axioms into bounded-width and acyclic portions to apply Proposition 6.1. Formally, let Γ Bounded consist of the (Apply ID) rules obtained from linearizing Σ and the truncated accessibility axioms using Theorem 6.4, along with Σ ′ . Let Γ Acyclic consist of the (Breadth-bounded Saturation) axioms of the linearization and of: for any relation R of arity n having an access method. A simple application of Theorem 6.4, given in the appendix, shows that AMonDet is equivalent to the containment Q Lin ⊆ Γ Bounded ∪Γ Acyclic Q ′ The semi-width of Γ Bounded ∪ Γ Acyclic is then w, since Γ Bounded consists of width w IDs and Γ Acyclic of acyclic IDs.
Extending to bounded-width IDs and result bounds. A variant of the argument above establishes NP membership in the presence of result bounds. Theorem 6.6. Given a CQ Q over a schema that has bounded-width IDs and access methods (possibly with result bounds), we can decide in NP if Q is monotone answerable.

More general linearization for answerability with UIDs and FDs
We now state a more general linearization result, reducing query containment with IDs and full GTGDs to query containment with linear TGDs. IDs and full GTGDs can simulate arbitrary GTGDs, for which containment is 2EXPTIME-complete, and EXPTIME-complete for constant signature arity [10]. However, we will be able to show an EXPTIME bound without this assumption, by just bounding the arity of the signature used in side atoms (i.e., non-guard atoms): this can handle, e.g., accessibility facts in truncated accessibility axioms. We will also show an NP bound under more assumptions. Our linearization technique resembles that of [21], but their results only apply when bounding the arity and number of relations in the whole signature, which we do not assume.
Generalized linearization. We consider constraints that consist of IDs and full GTGDs on a specific side signature with a restriction on the arity of head relations: Definition 6.7. Let γ be a full GTGD. The head arity of γ is the number of variables used in the head of γ. Given a sub-signature σ ′ ⊆ σ, we say that γ has side signature σ ′ if there is a choice of guard atom in the body of γ such that all other atoms are relations of σ ′ .
The result below uses the notion of semi-width, defined in Section 6.1. For a set of constraints Σ, we write |Σ| for their size (e.g. in a string representation), and extend the notions of head arity and side signature in the expected way.
Theorem 6.8. For any a ′ ∈ N, there are polynomials P 1 , P 2 such that the following is true. Given: • A signature σ of arity a; • A subsignature σ ′ ⊆ σ with n ′ relations and arity ≤ a ′ ; • A CQ Q on σ; • A set Σ of non-full IDs of width w and full GTGDs with side signature σ ′ and head arity h; We can compute the following: • A set Σ ′ of linear TGDs of semi-width ≤ w and arity ≤ a, in time P 1 (|Σ| , 2 P 2 (w,h,n ′ ) ), independently from Q; The constraints Σ ′ and the CQ Q Lin ensure that for any CQ Note that we assume that IDs are non-full, i.e., they must create at least one null. Of course, full IDs can be seen as full GTGDs with empty side signature, so they are also covered by this result, but they may make the head arity increase if included in the full GTGDs.
Theorem 6.8 is proven in Appendix F.4 by generalizing the linearization argument of Theorem 6.4. We compute derived axioms of a limited breadth (generalizing the notion of the previous section), and then use them in a short-cut chase, which avoids passing facts up and down. We then show how to simulate the short-cut chase by linear TGDs. The main difference with [21] is that we exploit the width and side signature arity bounds to compute only a portion of the derived axioms, without bounding the overall signature.
We can use Theorem 6.8 by fixing the head arity h, the width w, and the entire side signature σ ′ , to deduce: Corollary 6.9. There is an NP algorithm for query containment under bounded-width IDs and full GTGDs of bounded head arity on a fixed side signature.
While the side signature σ ′ is constant, the arity of σ is not constant above; however, relations in σ \ σ ′ can only be used in the bounded-width IDs and as guards in the full GTGDs.
Proof. Apply the reduction of Theorem 6.8, which computes in PTIME an equivalent set of linear TGDs of constant semi-width and a rewriting of the left-hand-side query. Then, conclude by Proposition 6.1.
The result also implies an EXPTIME bound for query containment with a more general language of IDs and GTGDs. Corollary 6.10. There is an EXPTIME algorithm for query containment under IDs and GTGDs on a bounded arity side signature.
Proof. One can simulate GTGDs by IDs and full GTGDs, via additional relations. Thus we can assume the GTGDs are full. Now, apply the reduction of Theorem 6.8, which computes in EXPTIME an equivalent set of linear TGDs and computes a rewriting of the left-hand-side query. Now, consider each one of the exponentially many possible first-order rewritings of the right-hand-side query under these linear TGDs (see [14,12]), and for each of them, see whether it holds in the closure.
Complexity with UIDs and FDs. We now use the previous corollary to derive complexity results for monotone answerability with result-bounded access methods, for constraints that include both UIDs and FDs. Theorem 6.11. For a schema with access methods (possibly with result bounds), where the constraints involve only UIDs and FDs, monotone answerability is in EXPTIME.
Compared to Theorem 6.6, this result restricts to UIDs rather than IDs, and has a higher complexity, but it allows FD constraints. To the best of our knowledge, this result is new even without result bounds. We now prove Theorem 6.11: Proof. By Theorem 3.2, a CQ Q has a monotone plan if and only if it is AMonDet. By applying choice approximability for UIDs and FDs (Theorem 4.6), we can assume that all result bounds are 1. Thus we have reduced to the query containment problem for AMonDet for such a schema. Recall that this asks if Q is contained in its copy Q ′ on the primed relations. When all result bounds are 1, the constraints Γ in the entailment problem are Σ, its copy Σ ′ , and: Our first step will be to argue that we can pre-process these constraints so that, once the FDs have been applied to the canonical database of Q (i.e., once Q has been minimized under the FDs), we can drop the FDs in Σ and Σ ′ without impacting entailment. That is, we adapt the separability technique of [15].
Hence, observe that the itemized constraints above can be rewritten as follows, by inlining to eliminate R Accessed : Let us modify the second set of axioms so that in going from R to R ′ they preserve not only the input positions of mt, but also the positions of R that are determined by input positions of mt. The use of these "expanded result-bounded constraints" does not impact the soundness of the chase, since a chase step with these constraints could have been mimicked by a step with an original constraint followed by FD applications. We show in Appendix F.3, using a simple induction on proof length, that after this rewriting the application of constraints will never cause any FD violation on R or R ′ .
Let Q * be the result of applying the FDs to Q, and let Γ Sep denote the revised constraints, without the FDs. We have shown that monotone answerability is equivalent to the containment of Q * in Q ′ under the constraints Γ Sep . Note that since the constraints in Γ Sep are all GTGDs, we can already infer decidability in 2EXPTIME using [10]. However, we can apply the bound in Corollary 6.10 to get an EXPTIME bound, since the side signature is fixed (consisting only of accessible). This completes the proof.

Summary and Conclusion
We formalized the problem of obtaining complete results to queries by accessing services that may return only a bounded subset of the data, in the presence of integrity constraints. We showed how to reduce this to a standard reasoning problem: query containment with constraints. We have further shown that, for many classes of constraints, we can limit the ways in which a query can be answered using result-bounded plans, thus simplifying the corresponding entailment. By coupling this schema simplification with an analysis of the chase, we have derived complexity bounds for monotone answerability with several classes of constraints. Table 1 summarizes which kind of simplification we have shown for each constraint class, as well as our decidability and complexity results.
We have studied answerability over all instances, finite and infinite; but most of our complexity results deal with languages that are "finitely controllable" (e.g., this is the case for IDs and frontier-guarded TGDs); thus these results hold if one restricts to finite instances. We have also restricted to monotone plans (i.e., without relational difference) throughout the paper. As explained in Appendix H, the reduction to query containment still applies to RA-plans (those that can use negation). Our expressiveness results (e.g., existence-check approximation) also extend easily to answerability with such plans, but lead to a more involved query containment problem. Hence, we do not know how to show decidability of the answerability problem for UIDs and FDs with such plans. We also leave open the question of whether choice approximation holds for general FDs and IDs (not UIDs).
We think that the blowing-up technique used in schema simplification, which exploits the inability of well-behaved classes to detect pumping of result outputs, could be used in a wider context. We will also be studying further applications for the linearization technique introduced in this work.

A.1. Alternative Semantics for Result-Bounded Plans
In the body of the paper we defined a semantics for plans using access selections, which assumed that multiple accesses with a result-bounded method always return the same result. We also claimed that all our results held without this assumption. We now show the alternative semantics where this assumption does not hold, and show that indeed the choice of semantics makes no difference. We will call idempotent semantics in this appendix the one that we use in the main body of the paper, and non-idempotent semantics the one that we now define. Intuitively, the idempotent semantics, as used in the body, assumes that the access selection function is chosen for the entire plan, so that all calls with the same input to the same access method return the same output. The non-idempotent semantics makes no such assumption, and can choose a different access selection for each access. In both cases, the semantics is a function taking an instance I for the input schema and the input tables of the plan, returning as output a set of possible outputs for each output table of the plan.
Formally, given a schema Sch and instance I, an access selection is a function mapping each access on I to a valid output for the access, as defined above. Given an access selection ς, we can associate to each instance I and each plan PL an output by induction on the number of commands. The general scheme for both semantics is the same: for an access command T ⇐_{OutMap} mt ⇐_{InMap} E the output is obtained by evaluating E to get a collection of tuples and then performing an access with mt using each tuple, putting the union of the corresponding outputs selected by ς into T . The semantics of middleware query commands is the usual semantics for relational algebra. The semantics of concatenation of plans is via composition.
The difference between the two semantics is: for the idempotent semantics, given I we take the union over all access selections ς of the output of the entire plan for I and ς; for the non-idempotent semantics, we calculate the possible outputs of each individual access command as the union of the outputs for all ς, we calculate the output of a query middleware command as usual, and then we calculate the possible outputs for a plan via composition.
Example A.1. Consider a schema with a free access method mt with result bound 5 on relation R. Let PL be the plan that accesses mt twice and then determines whether the intersection of the results is non-empty: As T 1 and T 2 are identical under the idempotent semantics, PL just tests if R is nonempty. Under the non-idempotent semantics, PL is non-deterministic, since it can return empty or non-empty when R contains at least 10 tuples.
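The gap between the two semantics in Example A.1 can be checked by brute force on small instances. The following is a minimal sketch, not part of the original development: it assumes that a valid output of an input-free access with result bound k on R is any subset of R of size min(k, |R|), and the function names are hypothetical. It enumerates the possible answers of the plan "access mt twice and test whether the intersection is non-empty" under each semantics.

```python
from itertools import combinations


def valid_outputs(matching, bound):
    """All valid outputs of one access (assumed reading of a result bound):
    every subset of the matching tuples of size min(bound, |matching|)."""
    size = min(bound, len(matching))
    return [frozenset(c) for c in combinations(sorted(matching), size)]


def plan_outputs_idempotent(relation, bound):
    """One access selection for the whole plan: both accesses return the same set."""
    return {bool(out & out) for out in valid_outputs(relation, bound)}


def plan_outputs_non_idempotent(relation, bound):
    """Each access independently picks a valid output."""
    outs = valid_outputs(relation, bound)
    return {bool(t1 & t2) for t1 in outs for t2 in outs}


ten_tuples = set(range(10))
print(plan_outputs_idempotent(ten_tuples, 5))      # a single possible answer
print(plan_outputs_non_idempotent(ten_tuples, 5))  # two possible answers
```

With 9 tuples instead of 10, two outputs of size 5 must intersect, so the plan is deterministic again under the non-idempotent semantics.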
Note that, in both semantics, when we use multiple access methods on the same relation, there is no requirement that an access selection be "consistent": if an instance I includes a fact R(a, b) and we have result-bounded access methods mt 1 on the first position of R and mt 2 on the second position of R, then an access to mt 1 on a might return (a, b) even if an access to mt 2 on b does not return (a, b). This captures the typical situation where distinct access methods use unrelated criteria to determine which tuples to return.
It is clear that if a query has a plan that answers it under the non-idempotent semantics, then the same plan works under the idempotent semantics. Conversely, Example A.1 shows that a given plan may answer a query under the idempotent semantics, while it does not answer any query under the non-idempotent semantics. However, if a query Q has some plan that answers it under the idempotent semantics, we can show that it also has one under the non-idempotent semantics: Proposition A.2. For any CQ Q over schema Sch, there is a monotone plan that answers Q under the idempotent semantics with respect to Sch iff there is a monotone plan that answers Q under the non-idempotent semantics. Likewise, there is an RA-plan that answers Q under the idempotent semantics with respect to Sch iff there is an RA-plan that answers Q under the non-idempotent semantics.
We first give the argument for RA-plans (i.e., non-monotone plans, which allow arbitrary relational algebra expressions). If there is a plan PL that answers Q under the non-idempotent semantics, then clearly PL also answers Q under the idempotent semantics, because there are fewer possible outputs.
In the other direction, suppose PL answers Q under the idempotent semantics. Let cached(PL) be the function that executes PL, but whenever it encounters an access mt on a binding AccBind that has already been performed in a previous command, it uses the values output by the prior command rather than making a new access, i.e., it uses "cached values". Executing cached(PL) under the non-idempotent semantics gives exactly the same outputs as executing PL under the idempotent semantics, because cached(PL) never performs the same access twice. Further we can implement cached(PL) as an RA-plan PL ′ : for each access command T ⇐ mt ⇐ E in PL, we pre-process it in PL ′ by removing from the output of E any tuples previously accessed in mt, using a middleware query command with the relational difference operator. We then perform an access to mt with the remaining tuples, cache the result for further accesses, and post-process the output with a middleware query command to add back the output tuples cached from previous accesses. Thus PL ′ answers Q under the non-idempotent semantics as required.
Let us now give the argument for monotone plans (i.e., USPJ-plans), which are the plans used throughout the body of the paper. Of course the forward direction is proven in the same way, so we focus on the backward direction. Contrary to plans that can use negation, we can no longer avoid making accesses that were previously performed, because we can no longer remove input tuples that we do not wish to query. However, we can still cache the result of each access, and union it back when performing further accesses.
Let PL be a plan that answers Q under the idempotent semantics. We use Proposition 3.3 about the elimination of result upper bounds (in the idempotent semantics), which we show later in the text (using other results about the equivalence between AMonDet and plan existence, again established in the idempotent semantics), to assume without loss of generality that PL answers Q over Relax(Sch), the relaxation of Sch where all result bounds are replaced with result lower bounds only.
We define the plan PL ′ from PL, where access commands are modified in the following way: whenever we perform an access for a method mt in an access command i, we cache the input of access command i in a special intermediate table Inp mt,i and its output in another table Out mt,i , and then we add to the output of access command i the result of unioning, over all previously performed accesses with mt for j < i, the intersection of Inp mt,i with Inp mt,j joined with Out mt,j . Informally, whenever we perform an access with a set of input tuples, we add to its output the previous outputs of the accesses with the same tuples on the same methods earlier in the plan. This can be implemented using USPJ operators. For each table defined on the left-hand side of an access or middleware command in PL, we define its corresponding table as the table in PL ′ where the same result is defined: for middleware commands, the correspondence is obvious because they are not changed from PL to PL ′ ; for access commands, the corresponding table is the one where we have performed the postprocessing to incorporate the previous tuple results.
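The caching step used to build PL ′ can be illustrated on a single method. This is only a sketch under stated assumptions: the class and helper names are hypothetical, the cached tables Inp mt,j and Out mt,j are modelled as one dictionary per access command (binding to raw output), and the "rotating" service is an invented stand-in for a non-idempotent source. Note that, as in the monotone construction, no access is skipped: earlier outputs are only unioned back in.

```python
class CachingAccess:
    """Monotone wrapper around one access method: every access is still
    performed, but outputs of earlier accesses on the same binding are
    unioned into the result (no relational difference is needed)."""

    def __init__(self, access):
        self.access = access   # access(binding) -> set of tuples; may vary per call
        self.history = []      # one {binding: output} dict per access command

    def run_command(self, bindings):
        fresh = {b: frozenset(self.access(b)) for b in bindings}
        result = set()
        for binding, output in fresh.items():
            result |= output
            for earlier in self.history:
                # join of the shared input with the cached output
                result |= earlier.get(binding, frozenset())
        self.history.append(fresh)
        return result


def make_rotating_access(relation, input_position, bound):
    """Hypothetical non-idempotent service: successive calls on the same
    binding return different valid outputs of size min(bound, #matches)."""
    calls = {}

    def access(binding):
        matches = sorted(t for t in relation if t[input_position] == binding)
        start = calls.get(binding, 0)
        calls[binding] = start + 1
        size = min(bound, len(matches))
        return {matches[(start + i) % len(matches)] for i in range(size)}

    return access


R = {(1, "a"), (1, "b"), (1, "c")}
plan = CachingAccess(make_rotating_access(R, input_position=0, bound=1))
first = plan.run_command({1})
second = plan.run_command({1})
print(first, second)  # the second result contains the first
```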
We now make the following claim: Claim A.3. Every possible output of PL ′ in the non-idempotent semantics is a subset of a possible output of PL in the idempotent semantics, and is a superset of a possible output of PL in the idempotent semantics.
This suffices to establish that PL ′ answers the query Q in the non-idempotent semantics, because, as PL answers Q in the idempotent semantics, its only possible result on an instance I in the idempotent semantics is Q(I), so Claim A.3 implies that the only possible output of PL ′ on I is also Q(I), so PL ′ answers Q under the non-idempotent semantics, concluding the proof. So it suffices to prove Claim A.3. We now do so: Proof. Letting O be a result of PL ′ under the non-idempotent semantics on an instance I, and letting ς 1 , . . . , ς n be the choice of access selections used for each access command of PL ′ to obtain O, we show that O is a superset of one possible output of PL in the idempotent semantics and a subset of another.
To show the first inclusion, let us first consider the access selection ς − on I defined in the following way: for each access binding AccBind on a method mt, letting ς i be the access selection for the first access command of PL where an access on AccBind is performed on mt, we define ς − (mt, AccBind) := ς i (mt, AccBind); if the access is never performed, define ς − (mt, AccBind) according to one of the ς i (chosen arbitrarily). We see that ς − is a valid access selection for I, because each ς i is a valid access selection for I, and for each access ς − returns the result of one of the ς i , which is valid. Now, by induction on the length of the plan, it is clear that for every table in the execution of PL on I with ς − , its contents are a subset of that of the corresponding table in the execution of PL ′ on I with ς 1 , . . . , ς n . Indeed, the base case is trivial, the induction case on middleware commands is by monotonicity of the USPJ operators, and the induction case on access commands is simply because we perform an access with a subset of bindings, and for each binding AccBind, if this is the first time we perform the access for this method on AccBind, we obtain the same result in PL as in PL ′ , and if this is not the first time, in PL we obtain the result of the first time, and in PL ′ we still obtain it because we retrieve it from the cached copy. The conclusion of the induction is that the output of PL on I under ς − is a subset of the output O of PL ′ on I under ς 1 , . . . , ς n .
Let us now show the second inclusion by considering the access selection ς + on I defined in the following way: for each access binding AccBind and method mt, we define ς + (mt, AccBind) := ⋃_{1≤i≤n} ς i (mt, AccBind). That is, ς + returns all results that are returned in the execution of PL ′ on I in the non-idempotent semantics with ς 1 , . . . , ς n . This is a valid selection function, because for each access and binding it returns a superset of a valid output, so we are still obeying the result lower bounds (and this is where we use the elimination of the result upper bounds). Now, by induction on the length of the plan, analogously to the case above, we see that for every table in the execution of PL on I with ς + , its contents are a superset of that of the corresponding table in the execution of PL ′ on I with ς 1 , . . . , ς n : the induction case is because each access on a binding in PL ′ cannot return more than the results of this access in all the ς i , and this is the result obtained with ς + . So we have shown that O is a subset of a possible output of PL, and that it is a superset of a possible output of PL, concluding the proof of the claim.
We first prove the "easy direction": Proposition B.1. If Q has a (monotone) plan PL that answers it w.r.t. Sch, then Q is AMonDet over Sch.
Proof. Suppose I 1 , I 2 satisfy the constraints of Sch and I Accessed is a common subinstance that is access-valid for I 1 . Let ς 1 be an access selection for I 1 that always selects elements in I Accessed and ς 2 an access selection for I 2 that extends ς 1 . We argue that for each temporary table of PL, its value when evaluated with ς 1 , I 1 , is contained in its value when evaluated with ς 2 , I 2 . We prove this by induction on PL. As the plan is monotone, the property is preserved by query middleware commands, so inductively it suffices to look at an access command T ⇐ mt ⇐ E with mt an access method on some relation. Let E 1 be the value of E when evaluated in I 1 , ς 1 and E 2 be the value when evaluated in I 2 , ς 2 . Then by the monotonicity of the negation-free query E and the induction hypothesis, E 1 ⊆ E 2 . Given a tuple t in E 1 , let O 1 t be the set of "matching tuples" (tuples for the relation R extending t) in I 1 selected by ς 1 . Similarly let O 2 t be the set selected by ς 2 in I 2 . By assumption on ς 1 and ς 2 , we have O 1 t ⊆ O 2 t , and thus the value of T under I 1 , ς 1 is contained in its value under I 2 , ς 2 , which completes the induction. Now assume that PL answers Q. Then it must give the same result for any access selection on I 1 , and similarly for I 2 , which completes the argument.
For the other direction, we first make use of the corresponding result in the case without result bounds: Theorem B.2. [7,8] For any CQ Q and schema Sch (with no result bounds) whose constraints Σ are expressible in active-domain first-order logic, the following are equivalent: 1. Q has a monotone plan that answers it over Sch 2. Q is AMonDet over Sch.
Thus for schemas without result-bounded methods, existence of a monotone plan is the same as AMonDet, and both can be expressed as a query containment problem. [8] further shows that a monotone plan can be extracted from any proof of the query containment for AMonDet. This reduction to query containment is what we will now extend to the setting with result-bounded methods in the main text.
We adapt the above result to the setting of result-bounded methods with a simple construction that allows us to rewrite away the result-bounded methods (expressing them in the constraints):
Elimination of result-bounded methods. Given a schema Sch with constraints and access methods, possibly result-bounded, we consider an auxiliary schema ElimRB(Sch) where for every method mt with result bound k on relation R we have a new relation R mt whose arity agrees with that of R. Informally, R mt stores only up to k result tuples for each input. The constraints include all the constraints of Sch (on the original relation names). In addition, we have for every method mt with input positions j 1 . . . j m and result bound k, the following axioms:
• A soundness of selection axiom stating that R mt is a subset of R.
• An axiom stating that for any binding of the input positions, R mt has at most k distinct matching tuples.
• For each 1 ≤ i ≤ k, a result lower bound axiom stating that, for any c j 1 . . . c jm such that R contains at least i tuples c extending c j 1 . . . c jm , R mt contains at least i such tuples.
In this schema we have the same access methods, except that any mt with a result bound over R is removed, and in its place we add an access method with no result bound over R mt .
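The three families of axioms added in ElimRB(Sch) can be checked directly on a finite instance. The sketch below is illustrative only: the function name and the encoding of relations as sets of Python tuples with 0-based input positions are assumptions, not notation from the paper. It uses the observation that the lower bound axioms for all 1 ≤ i ≤ k together say that R mt keeps at least min(k, number of matches) tuples per binding.

```python
from collections import defaultdict


def satisfies_elimrb_axioms(R, R_mt, input_positions, k):
    """Check, on finite relations, the axioms relating R and R_mt for a
    method with the given input positions and result bound k."""

    def group_by_binding(relation):
        groups = defaultdict(set)
        for t in relation:
            groups[tuple(t[j] for j in input_positions)].add(t)
        return groups

    # Soundness of selection: R_mt is a subset of R.
    if not set(R_mt) <= set(R):
        return False

    matches_in_R = group_by_binding(R)
    kept_in_R_mt = group_by_binding(R_mt)
    for binding, matches in matches_in_R.items():
        kept = kept_in_R_mt.get(binding, set())
        # Upper bound: at most k distinct matching tuples in R_mt.
        if len(kept) > k:
            return False
        # Lower bound axioms for i = 1..k, taken together.
        if len(kept) < min(k, len(matches)):
            return False
    return True


R = {(1, "a"), (1, "b"), (1, "c"), (2, "d")}
print(satisfies_elimrb_axioms(R, {(1, "a"), (1, "b"), (2, "d")}, (0,), 2))
```

Any R mt passing this check corresponds to one valid choice of outputs for the result-bounded method, which is the intuition behind replacing mt by an unbounded method on R mt .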
Given a query Q over Sch, we can consider it as a query over ElimRB(Sch) instances by simply ignoring the additional relations. If Q is given by a logical formula over Sch relations, the corresponding query is syntactically identical. We will use Q to denote both queries.
We claim that in considering Q over ElimRB(Sch) rather than Sch, we do not change monotone answerability.
Proposition B.3. For any query Q over Sch, there is a monotone plan that answers Q over Sch iff there is a monotone plan that answers Q over ElimRB(Sch).
Proposition B.3 thus shows that we can eliminate result bounds, at the cost of including new constraints.
Proof. Suppose a monotone plan PL over Sch answers Q. Let PL ′ be formed from PL by replacing method calls with method mt on relation R with access to R mt . We claim that PL ′ answers Q over ElimRB(Sch). Given an instance I ′ for ElimRB(Sch), we drop relations R mt to get an instance I for Sch. Using the relations R mt we get a selection function ς for each method of Sch, and we can show that PL evaluated with ς over I gives the same output as PL ′ over I ′ . Since the former evaluates to Q(I), so must the latter.
Conversely, suppose monotone plan PL ′ answers Q over ElimRB(Sch). Construct PL from PL ′ by replacing accesses to R mt with accesses to R. We claim that PL answers Q over Sch. Consider an instance I for Sch, and a particular access selection ς. Consider all the accesses made by PL when evaluated using ς on I, along with the responses to these accesses. Note that for every access via result-bounded method mt on R, the number of responses is at most the result bound of mt, and if it is strictly less, then the number of responses is equal to the number of matching tuples in R. We can generate an instance I ′ of ElimRB(Sch) by interpreting R mt as follows: for each tuple t in I(R), project it on the input positions j 1 . . . j m of mt. If the projection occurs in the list of access inputs above, take any such access and include all of the corresponding outputs in R mt . Otherwise select up to k tuples in I(R) that match the projection arbitrarily, if any such exist. I ′ satisfies the constraints of ElimRB(Sch). Since PL ′ answers Q, PL ′ when evaluated on I ′ gives the same result as Q does on I ′ . We can show that the accesses made by PL ′ on I ′ are the same as those made by PL on I relative to ς, and give the same results. Thus PL evaluated according to ς on I gives the same result as PL ′ does on I ′ . Since we are applying the idempotent semantics, which unions over all selection functions ς, we know that the set of results of PL on I is equal to a singleton: the result of PL ′ on I ′ . Q is defined over the original relations of Sch, hence Q evaluated on I ′ is the same as Q evaluated on I. Thus the result follows.
The equivalence of a schema Sch with result bounds and its variant ElimRB(Sch) easily extends to AMonDet.
Proposition B.4. For any query Q over Sch, the corresponding query is AMonDet over ElimRB(Sch) if and only if Q is AMonDet over Sch.
Proof. Suppose Q is AMonDet over ElimRB(Sch) and consider instances I 1 satisfying the constraints with accessible part A 1 and associated selection functions ς 1 , I 2 having accessible part A 2 with associated selection function ς 2 , where A 1 ⊆ A 2 . We create an instance I ′ 1 for ElimRB(Sch) by interpreting each R mt as the tuples of R in A 1 . Similarly we create I ′ 2 for ElimRB(Sch) by interpreting each R mt as the tuples of R in A 2 . The constraints of ElimRB(Sch) are easily seen to be satisfied. Furthermore, we can modify the selection function ς 1 for I ′ 1 so that in R mt it selects exactly the tuples that ς 1 selected in R. We can similarly modify ς 2 for I ′ 2 , selecting from R mt as ς 2 had selected from R in I 2 . With these modifications, we can see that containment of the accessible parts is preserved. Thus by access monotonic-determinacy over ElimRB(Sch), Q(I ′ 1 ) ⊆ Q(I ′ 2 ). But Q is a function of the relations in Sch, and thus Q(I 1 ) ⊆ Q(I 2 ), and we are done. Conversely, suppose Q is AMonDet over Sch and consider instances I ′ 1 and I ′ 2 for ElimRB(Sch) with selection functions ς ′ 1 and ς ′ 2 giving accessible parts A ′ 1 ⊆ A ′ 2 . We create an instance I 1 for Sch from I ′ 1 by dropping the relations R mt , and similarly create I 2 from I ′ 2 . Clearly both satisfy the constraints of Sch. We modify ς ′ 1 to obtain a selection function ς 1 for I 1 , reversing the process in the paragraph above, selecting from each R as ς ′ 1 did from R mt . We similarly define a selection function ς 2 for I 2 . We can see that the corresponding accessible parts are still contained. Thus by access monotonic-determinacy over Sch, Q(I 1 ) ⊆ Q(I 2 ). Again this implies Q(I ′ 1 ) ⊆ Q(I ′ 2 ).
Putting together Proposition B.3, Proposition B.4 and Theorem B.2, we complete the proof of Theorem 3.2.

B.2. Proof of Proposition 3.1: Equivalence Between Accessible Part Notions
Recall the statement of Proposition 3.1: The following are equivalent: (i) I 1 and I 2 have a common subinstance I Accessed that is access-valid for I 1 . (ii) There are A 1 ⊆ A 2 such that A 1 is an accessible part for I 1 and A 2 is an accessible part for I 2 .
Proof. Suppose I 1 and I 2 have a common subinstance I Accessed that is access-valid for I 1 .
As I Accessed is access-valid for I 1 , we can define an access selection ς 1 that takes any access performed with values of I Accessed and a method of Sch, and maps it to a set of matching tuples in I Accessed that is valid in I 1 . We can extend ς 1 to a function ς 2 which returns a superset of the tuples returned by ς 1 for accesses with values of I Accessed , and returns an arbitrary set of tuples from I 2 otherwise, such that the access results are valid in I 2 . We have AccPart(ς 1 , I 1 ) ⊆ AccPart(ς 2 , I 2 ), and thus the first item implies the second. Conversely, suppose there is an access selection ς 1 for I 1 and ς 2 for I 2 such that AccPart(ς 1 , I 1 ) ⊆ AccPart(ς 2 , I 2 ). Let I 0 := AccPart(ς 1 , I 1 ). Given an access AccBind, mt in I 0 , we know that there is i such that AccBind is in AccPart i (ς 1 , I 1 ), hence by definition of the fixpoint process and of the access selection ς 1 there is a valid access result in AccPart i+1 (ς 1 , I 1 ), hence in I 0 . Thus we can choose a valid response for I 1 in I 0 , and this response must also be a subset of AccPart(ς 2 , I 2 ). Taking the union of all these responses gives us the required I Accessed .
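The fixpoint process behind AccPart(ς, I) can be sketched as follows. This is a hedged illustration, not the paper's formal definition: it assumes that the process starts with no known values, that every tuple of known values is tried as a binding for every method, and that the access selection is given as a Python callable; the function names and the sample data (loosely based on the Profinfo and Udirectory example) are hypothetical.

```python
from itertools import product


def accessible_part(methods, select):
    """Least fixpoint of accesses. `methods` is a list of
    (name, relation, input_positions); `select(name, binding)` plays the role
    of the access selection and returns a set of tuples of `relation`."""
    facts = set()    # accessed facts, as (relation, tuple) pairs
    values = set()   # values seen so far, usable in later bindings
    changed = True
    while changed:
        changed = False
        for name, relation, inputs in methods:
            # an input-free method yields exactly one binding: the empty tuple
            for binding in product(list(values), repeat=len(inputs)):
                for t in select(name, binding):
                    if (relation, t) not in facts:
                        facts.add((relation, t))
                        values.update(t)
                        changed = True
    return facts


def select_all(instance, methods):
    """Selection for methods without result bounds: return every matching tuple."""
    by_name = {name: (rel, inputs) for name, rel, inputs in methods}

    def select(name, binding):
        rel, inputs = by_name[name]
        return {t for t in instance[rel] if tuple(t[p] for p in inputs) == binding}

    return select


instance = {
    "Udirectory": {(1, "addr1", "555"), (2, "addr2", "556")},
    "Profinfo": {(1, "Smith", 100), (3, "Jones", 200)},
}
methods = [("ud", "Udirectory", ()), ("pi", "Profinfo", (0,))]
print(accessible_part(methods, select_all(instance, methods)))
```

In this sample, the Profinfo tuple for id 3 is never reached because 3 is not returned by any access, so it lies outside the accessible part.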

B.3. Details on Decidability for Two-Variable Logic with Counting
As mentioned in the body, for certain classes of constraints the reduction to AMonDet and the formalization of AMonDet as a query containment problem will give decidability of monotone answerability, even without any schema simplification results. An example is the guarded two-variable logic with counting quantifiers, GC 2 . This is a logic over relations with arity at most two, which allows assertions such as "for any x, there are at least 7 y's such that R(x, y)". The only thing the reader needs to know about GC 2 is that query containment under GC 2 constraints is decidable [27], and that if we start with GC 2 constraints and perform the reduction given by the prior results, we still remain in GC 2 .
We now give a few more details on Theorem 3.5: We can decide if a CQ Q is monotone answerable with respect to a schema Sch when relations have arity at most 2 and constraints are expressible in GC 2 .
Proof. We apply Theorem 3.2 to this schema to reduce monotone answerability to deciding query containment with constraints. In the entailment problem for AMonDet we have two copies of the above constraints, and also the additional axioms. The additional axioms are easily seen to be in GC 2 as well. Access monotonic-determinacy in the schema is equivalent to containment of Q ′ by Q w.
Let Sch be a schema with ID constraints, and let Q be a CQ that is monotone answerable in Sch. Then Q is monotone answerable in the existence-check approximation of Sch.
We will show the contrapositive of this statement. Assume that Q does not have a monotone plan in the existence-check approximation Sch ′ of Sch. We will show that this implies that Q is not AMonDet in Sch: this allows us to conclude because, by Theorem 3.2, this implies that Q has no monotone plan in Sch.
Thus it suffices to show: Consider a schema Sch whose constraints are IDs, and let Q be a CQ that is AMonDet with respect to Sch. Then Q is also AMonDet in the existence-check approximation of Sch.
We now prove the theorem, using Lemma 4.3: Proof. Let Sch be the original schema and Sch ′ be the existence-check approximation.
To use the lemma, suppose that we have a counterexample (I 1 , I 2 ) to AMonDet for Q and the approximation Sch ′ , i.e., I 1 and I 2 satisfy the constraints Σ ′ of Sch ′ , I 1 satisfies Q and I 2 violates Q, and I 1 and I 2 have a common subinstance I Accessed that is access-valid for I 1 . As mentioned in the main text of the paper, we will show how to "blow up" each instance to I + 1 and I + 2 which have a common subinstance which is access-valid for I + 1 , i.e., we must ensure that each access to a result-bounded method in I + 1 has either no matching tuples or more matching tuples than the bound. In the blowing-up process we will preserve the constraints Σ ′ and the properties of the I i with respect to the CQ Q.
We now explain how I + 1 and I + 2 are formed. The first step is "obliviously-chasing with the existence-check constraints": for any existence-check constraint δ of the form ∀x 1 . . . x m CheckView mt ( x) → ∃y 1 . . . y n R( x, y) and any homomorphism h of the variables x 1 . . . x m to I Accessed , we extend the mapping by choosing infinitely many fresh witnesses for y 1 . . . y n , naming the j th value for y i in some canonical way depending on (h(x 1 ), . . . h(x m ), δ, j, i), and creating the corresponding facts. We let I * Accessed be I Accessed extended with these facts. The second step is "standard chasing with the original dependencies": we chase I * Accessed in a standard way in rounds with all dependencies of Σ, yielding a possibly infinite result. We let I + Accessed be the result of extending I * Accessed by this chasing process. Note that I + Accessed then satisfies Σ by definition of the chase. We now construct I + 1 := I 1 ∪ I + Accessed . First observe that I 1 ⊆ I + 1 , so that I + 1 still satisfies Q. Further, we argue that I + 1 satisfies Σ. As Σ consists only of IDs, its triggers consist of single facts, so it suffices to check this on I 1 and on I + Accessed separately. For I + Accessed , we know that it satisfies Σ by definition of the chase. For I 1 , we know it satisfies Σ ′ , which is a superset of Σ, hence it satisfies Σ.
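The second step above, standard chasing with IDs in rounds, can be sketched on finite inputs as follows. This is a simplified illustration under stated assumptions: the encoding of an ID as a 5-tuple, the null naming scheme, and the cut-off `max_rounds` are all hypothetical (the chase in the proof may be infinite), and the oblivious first step with infinitely many witnesses is not modelled.

```python
from itertools import count


def chase_ids(facts, ids, max_rounds=10):
    """Standard chase with inclusion dependencies, run for a bounded number
    of rounds. `facts` is a set of (relation, tuple) pairs. Each ID is given as
    (body_rel, body_positions, head_rel, head_positions, head_arity), meaning
    body_rel[body_positions] is included in head_rel[head_positions]."""
    facts = set(facts)
    fresh = count()
    for _ in range(max_rounds):
        new_facts = set()
        for body_rel, body_pos, head_rel, head_pos, head_arity in ids:
            for rel, tup in facts:
                if rel != body_rel:
                    continue
                exported = tuple(tup[p] for p in body_pos)
                # standard chase: fire only if no witness exists yet
                witnessed = any(
                    r == head_rel and tuple(t[p] for p in head_pos) == exported
                    for r, t in facts | new_facts
                )
                if witnessed:
                    continue
                head = [None] * head_arity
                for p, v in zip(head_pos, exported):
                    head[p] = v
                # every non-exported position gets its own fresh null
                head = tuple(v if v is not None else f"_n{next(fresh)}" for v in head)
                new_facts.add((head_rel, head))
        if not new_facts:
            break
        facts |= new_facts
    return facts


start = {("Profinfo", (1, "Smith", 100))}
ids = [("Profinfo", (0,), "Udirectory", (0,), 3)]
print(chase_ids(start, ids))
```

Each null is created at exactly one position of one fact, which is the property used in the proof to extend the homomorphism h consistently.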
We similarly define I + 2 := I 2 ∪ I + Accessed . We see that I + 2 satisfies Σ, using the same argument used for I + 1 in the paragraph above. We must now justify that I + 2 has a homomorphism h to I 2 , which will imply that I + 2 still does not satisfy Q. We will define h to be the identity on I 2 . So it suffices to show the existence of a homomorphism from I + Accessed to I 2 which is the identity on I Accessed , because I + Accessed ∩ I 2 = I Accessed . We next define h on I * Accessed \ I Accessed . Consider a fact F = R( a) of I * Accessed \ I Accessed created by obliviously chasing a trigger on an existence-check constraint δ on I Accessed . Let F ′ = S( b) of I Accessed be the fact in the image of the trigger: that is, the fact that matches the body of δ. We know that δ holds in I 2 and thus there is some fact F ′′ := R( c) in I 2 that serves as a witness for this. Writing Arity(R) to denote the arity of R, we define h(a i ) for each 1 ≤ i ≤ Arity(R) as h(a i ) := c i . In this way, the image of the fact F under h is F ′′ . This is consistent with the stipulation that h is the identity on I Accessed , because whenever a i ∈ I Accessed then it must be exported between F ′ and F , hence a i is also exported between F ′ and F ′′ so we have c i = a i . Further, all these assignments are consistent across the facts of I * Accessed \ I Accessed because all elements of I * Accessed \ I Accessed which do not occur in Adom(I Accessed ) occur at exactly one position in one fact of I * Accessed \ I Accessed . We now define h on facts of I + Accessed \ I * Accessed by extending it on the new elements introduced throughout the chase. Whenever we create a fact F = R( a) in I + Accessed for a trigger mapping to F ′ = S( b) for an ID δ in I + Accessed , we explain how to extend h to the nulls introduced in F . Consider the fact h(F ′ ) = S(h( b)) in I 2 . The body of δ also matches this fact, and as I 2 satisfies Σ ID , the ID δ holds in I 2 , so there must be a fact F ′′ = R( c) in I 2 which extends this match to the head of δ. We define h(a i ) := c i for all 1 ≤ i ≤ Arity(R). We show that this is consistent with the current definition of h. Whenever an element a i of a already occurred in I + Accessed , it must have been exported between F ′ and F , so h(a i ) was also exported between h(F ′ ) and F ′′ , so we already have h(a i ) = c i . Further, this assignment is well-defined for the nulls introduced in F , because each null occurs only at one position. The resulting h is a homomorphism because the image of previous facts is unchanged, and the fact R(h( a)) = F ′′ is a fact of I 2 as required.
This concludes the proof of the fact that there is a homomorphism from I + Accessed to I 2 which is the identity on I Accessed . Hence I + 2 still violates Q. It remains to justify that the common subinstance I + Accessed in I + 1 and I + 2 is access-valid for I + 1 . Consider one access in I + 1 performed with some method mt of a relation R, with a binding AccBind of values in I + Accessed . It is clear by definition of I + Accessed that, if some value of AccBind is not in the domain of I Accessed , it must be a null introduced in the chase to create I + Accessed . In this case the only possible matching facts in I + 1 are in I + Accessed \ I Accessed and there is nothing to show. Hence, we focus on the case when all values of AccBind are in I Accessed . If mt is not a result-bounded access, then we can simply use the fact that I Accessed is access-valid for I 1 to know that all matching tuples in I 1 were in I Accessed , so the matching tuples in I + 1 must be in I Accessed ∪ (I + 1 \ I 1 ), hence in I + Accessed . If mt is a result-bounded access, then consider the fact CheckView mt (AccBind) corresponding to this access in I 1 . We know by definition that I * Accessed , hence I + Accessed , contains infinitely many suitable facts R( x, y) with x = AccBind. Letting k be the bound of mt, we choose k facts among those, and obtain a valid output for the access with AccBind on mt in I + 1 . Hence, we have shown that I + Accessed is access-valid for I + 1 . Using Lemma 4.3, we have completed the proof of Theorem 4.2.

C.2. Proof of Theorem 4.5: Choice Approximation for Equality-Free FO
Recall the statement of Theorem 4.5.
Let Sch be a schema whose constraints are in equality-free first-order logic (e.g., TGDs), and let Q be a CQ that is monotone answerable in Sch. Then Q is monotone answerable in the choice approximation of Sch.
Using our equivalence with AMonDet, we see that it suffices to show: Let schema Sch have constraints given by equality-free first-order constraints, and Q be a CQ that is AMonDet w.r.t. Sch. Then Q is also AMonDet in the choice approximation of Sch.
We will again use the "blowing-up" construction of Lemma 4.3. Fix a counterexample to AMonDet (I 1 , I 2 ) in the approximation, such that I 1 satisfies the query, I 2 violates the query, I 1 and I 2 satisfy the equality-free first order constraints Σ, and I 1 and I 2 have a common subinstance I Accessed which is access-valid for I 1 . We will expand them to I + 1 and I + 2 that have a common subinstance which is access-valid for I + 1 in Sch.
For each element a in the domain of I 1 , introduce infinitely many fresh elements a j for j ∈ N >0 , and identify a 0 := a. Now, define I + 1 := Blowup(I 1 ), where we define Blowup(I 1 ) as the instance with facts {R(a 1 i 1 , . . . , a n in ) | R( a) ∈ I 1 , i ∈ N n }. Define I + 2 from I 2 in the same way.
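As an illustration of the Blowup operation, here is a minimal Python sketch restricted to finitely many copies per element (the proof uses infinitely many). The encoding of facts as `(relation, tuple)` pairs, the representation of the copy a j as the pair `(a, j)`, and the helper names `blowup` and `collapse` are assumptions made for this sketch and are not part of the paper.

```python
from itertools import product

def blowup(instance, copies):
    """Finite truncation of Blowup: element a gets copies (a, 0), ..., (a, copies - 1).

    Each fact R(a_1, ..., a_n) is replaced by all facts obtained by choosing,
    independently for each position, one copy of the element at that position.
    """
    result = set()
    for relation, args in instance:
        for indices in product(range(copies), repeat=len(args)):
            result.add((relation, tuple((a, j) for a, j in zip(args, indices))))
    return result

def collapse(fact):
    """Map a blown-up fact back to the original fact by forgetting copy indices."""
    relation, args = fact
    return (relation, tuple(a for a, _ in args))

# Hypothetical two-fact instance, blown up with three copies per element.
I1 = {("R", ("a", "b")), ("S", ("b",))}
I1_plus = blowup(I1, copies=3)
```

In this sketch `collapse` plays the role of the map sending each a j back to a, which is the correspondence that Duplicator's strategy maintains in the argument that follows.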
We will now show correctness of this construction. We claim that I 1 and I + 1 agree on all equality-free first-order constraints, which we show using a variant of the standard Ehrenfeucht-Fraïssé game without equality [16]. In this game there are pebbles on both structures; play proceeds by Spoiler placing a new pebble on some element in one structure, and Duplicator must respond by placing a pebble with the same name in the other structure. Duplicator loses if the mapping given by the pebbles does not preserve all relations of the signature. If Duplicator has a strategy that never loses, then one can show by induction that the two structures agree on all equality-free first-order sentences.
Duplicator's strategy will maintain the following invariants: 1. if a pebble is on some element a j ∈ I + 1 , then the corresponding pebble in I 1 is on a; 2. if a pebble is on some element in I 1 , then the corresponding pebble in I + 1 is on some element a j for j ∈ N.
These invariants will guarantee that the strategy is winning. Duplicator's response to a move by Spoiler in I + 1 is determined by the strategy above. In response to a move by Spoiler placing a pebble on b in I 1 , Duplicator places the corresponding pebble on b 0 = b in I + 1 . Clearly the same claim can be shown for I 2 and I + 2 . In particular this shows that I 1 still satisfies the query and I 2 still violates the query.
All that remains is to construct the common subinstance. Let I + Accessed := Blowup(I Accessed ). As I Accessed is a common subinstance of I 1 and I 2 , clearly I + Accessed is a common subinstance of I + 1 and I + 2 . To see why I + Accessed is access-valid for I + 1 , given an input tuple t ′ in I + Accessed , let t be the corresponding tuple in I Accessed . If t had no matching tuples in I 1 , then clearly the same is true in I + 1 . If t had at least one matching tuple u in I 1 , then such a tuple exists in I Accessed because it is access-valid for I 1 , and hence sufficiently many copies exist in I + Accessed to satisfy the original result bounds, so that we can find a valid response to the access in I + Accessed . Hence I + Accessed is access-valid for I + 1 , which completes the proof.

C.3. Proof of Theorem 4.6: Choice Approximation for UIDs and FDs
Recall the statement of Theorem 4.6: Let Sch be a schema whose constraints are UIDs and arbitrary FDs, and Q be a CQ that is monotone answerable in Sch. Then Q is monotone answerable in the choice approximation of Sch.
We first formalize the idea of progressively fixing one access at a time. Given an instance I 1 and a subinstance I Accessed ⊆ I 1 which is access-valid for the choice approximation Sch ′ of the schema Sch, we say that one access mt, AccBind with AccBind ∈ I Accessed is valid for Sch if we can construct in I Accessed a response to the access which is valid for Sch (not just for Sch ′ ).
This allows us to define a "progressive" variant of the process of Lemma 4.3, which describes our high-level strategy to prove Theorem 4.6. Remember that Lemma 4.3 said that, if a counterexample to AMonDet in Sch ′ can be expanded to a counterexample in Sch, then Q being AMonDet in Sch implies the same in Sch ′ . The next lemma makes a weaker hypothesis: it assumes that for any counterexample in Sch ′ and for any choice of access, we can expand to a counterexample in Sch ′ where the selected access is made valid for Sch, and all accesses formerly valid for Sch remain valid for Sch. In other words, the assumption is that we can fix the counterexample from Sch ′ to Sch by working one access at a time. We show that this is sufficient to reach the same conclusion:
Lemma C.1. Let Sch be a schema and Sch ′ be its choice approximation, and let Σ be the constraints.
Assume that, for any CQ Q not AMonDet in Sch ′ , for any counterexample I 1 , I 2 of AMonDet for Q and Sch ′ with a common subinstance I Accessed which is access-valid for I 1 on Sch ′ , for any access mt, AccBind in I Accessed , the following holds: we can construct a counterexample I + 1 , I + 2 of AMonDet for Q and Sch ′ , i.e., I + 1 and I + 2 satisfy Σ, I 1 ⊆ I + 1 , I 2 ⊆ I + 2 , I + 1 has a homomorphism to I 1 , I + 2 has a homomorphism to I 2 , and I + 1 and I + 2 have a common subinstance I ′ Accessed which is access-valid for I + 1 on Sch ′ , and we can further impose that: 1. I ′ Accessed is a superset of I Accessed ; 2. the access mt, AccBind is valid for Sch in I ′ Accessed ; 3. any access in I Accessed which is valid for Sch in I 1 is also valid for Sch in I + 1 ; 4. any access in I ′ Accessed which is not an access in I Accessed must be valid for Sch. Then any query which is AMonDet in Sch is also AMonDet in Sch ′ .
Proof. We will again prove the contrapositive. Let Q be a query which is not AMonDet in Sch ′ , and let I 1 , I 2 be a counterexample, with I Accessed the common subinstance of I 1 and I 2 which is access-valid for I 1 on Sch ′ . Enumerate the accesses in I Accessed as a sequence AccBind 1 , . . . , AccBind n , . . .: they are all valid for I 1 and Sch ′ by definition of I Accessed , but initially none are assumed to be valid for Sch as well. We then build an infinite sequence (I 1 1 , I 1 2 ), . . . , (I n 1 , I n 2 ), . . . with the corresponding common subinstances A 1 , . . . , A n , . . ., with each A i being a common subinstance of I i 1 and I i 2 which is access-valid for I i 1 , by applying the process of the hypothesis of the lemma in succession to AccBind 1 , . . . , AccBind n , . . .. In particular, note that whenever AccBind i is already valid for Sch in I i 1 , then we can simply take I i+1 1 , I i+1 2 , I i+1 Accessed to be respectively equal to I i 1 , I i 2 , I i Accessed , without even having to rely on the hypothesis of the lemma.
It is now obvious by induction that, for all i ∈ N, I i 1 and I i 2 satisfy the constraints Σ, we have I 1 ⊆ I i 1 , I i 2 has a homomorphism to I 2 , and I i Accessed is a common subinstance of I i 1 and I i 2 which is access-valid for I i 1 on Sch ′ , where the accesses AccBind 1 , . . . , AccBind i are additionally valid for Sch, and where all the accesses in I i Accessed which are not accesses of I Accessed are also valid for Sch. Hence, the infinite result (I ∞ 1 , I ∞ 2 ), I ∞ Accessed of this process has all accesses valid for Sch in I ∞ 1 . Hence, I ∞ Accessed is actually a common subinstance of I ∞ 1 and I ∞ 2 which is access-valid for I ∞ 1 on Sch, so I ∞ 1 , I ∞ 2 is a counterexample to AMonDet of Q in Sch, which concludes the proof.
Consider the schema Sch and its choice approximation Sch ′ , and let Σ be the constraints.
We now explain how we will fulfill the requirements of Lemma C.1. Let Q be a CQ and assume that it is not AMonDet in Sch ′ , and let I 1 , I 2 be a counterexample to AMonDet, with I Accessed being a common subinstance of I 1 and I 2 which is access-valid for I 1 on Sch ′ . Let (mt, AccBind) be an access on relation R in I Accessed which is not necessarily valid for Sch in I 1 ; remember that, by hypothesis, it is valid for Sch ′ in I 1 , i.e., we can construct a response to the access in I Accessed which is valid for Sch ′ in I 1 . Our goal is to build I + 1 and I + 2 such that I + 1 is a superinstance of I 1 and I + 2 homomorphically maps to I 2 ; we want both I + 1 and I + 2 to satisfy Σ, and want I + 1 and I + 2 to have a common subinstance I ′ Accessed which is access-valid for I + 1 , where AccBind is now valid for Sch (i.e., not only for the choice approximation), all new accesses are also valid for Sch, and no other accesses are affected.
First observe that, if there are no matching tuples in I 1 for the access (mt, AccBind), then the access is already valid in I 1 for Sch and there is nothing to do, i.e., we can just take I + 1 := I 1 , I + 2 := I 2 , and I ′ Accessed := I Accessed . Further, note that if there is only one matching tuple in I 1 for the access, as I Accessed is valid for the choice approximation, then this tuple is necessarily in I Accessed also, so again there is nothing to do. Hence, we know that there is strictly more than one matching tuple in I 1 for the access (mt, AccBind); as I Accessed is valid for Sch ′ , then it contains at least one of these tuples, say t 1 , and as I Accessed ⊆ I 2 , then I 2 also contains t 1 . Let t 2 be a second matching tuple in I 1 which is different from t 1 . Let C be the non-empty set of positions of R where t 1 and t 2 disagree. Note that, since I 1 satisfies the constraints, the constraints cannot imply an FD from the complement of C to a position j ∈ C, as otherwise t 1 and t 2 would witness that I 1 violates this FD.
We claim that I 1 ∪ N does not violate inferred FDs of the schema. If there were a violation of an FD φ, the violation F 1 , F 2 must involve some new fact R( o i ), as I 1 on its own satisfies the constraints. We know that the determiner of φ cannot include a position of C, as all elements in the new facts R( o i ) at these positions are fresh. Hence, the determiner of φ is included in the complement of C, but recall that we argued above that φ cannot then have a determined position in C. Hence, both the determiner and the determined position are in the complement of C. But on this set of positions the facts of the violation F 1 and F 2 agree with the existing fact t 1 of I 1 , a contradiction. So we know that I 1 ∪ N does not violate the FDs. The same argument shows that I 2 ∪ N does not violate the FDs. Now, let W be formed from chasing N with the UIDs, ignoring triggers whose exported element occurs in t 1 . We have argued that I 1 ∪ N and I 2 ∪ N satisfy the FDs. We want to show that both the UIDs and FDs hold of I 1 ∪ W and I 2 ∪ W . Note that as we have t 1 in I 1 and in I 2 we know that any element of the domain of N which also occurs in I 1 or in I 2 must be an element of t 1 . Also note that if any such element occurs at a certain position (R, i) in N , then it also occurs at (R, i) in I 1 . We then conclude that I 1 ∪ W and I 2 ∪ W satisfy the constraints, thanks to the following general lemma:
Lemma C.2. Let Σ ID be a set of UIDs and let Σ FD be a set of FDs. Let I and N be instances, and let ∆ := Adom(I) ∩ Adom(N ). Assume that I satisfies Σ FD ∪ Σ ID , that I ∪ N satisfies Σ FD , and that whenever a ∈ ∆ occurs at a position (R, i) in N then it also occurs at (R, i) in I. Let W denote the chase of N by Σ ID where we do not fire any triggers which map an exported variable to an element of ∆. Then I ∪ W satisfies Σ ID ∪ Σ FD .
Intuitively, the lemma applies to any instance I satisfying the constraints (UIDs and FDs), to which we want to add a set N of new facts, in a way which still satisfies the constraints. We assume that the elements of I that occur in N never do so at new positions relative to where they occur in I, and we assume that I ∪ N satisfies the FDs. We then claim that we can make I ∪ N satisfy the UIDs simply by chasing N by the UIDs in a way which ignores some triggers, i.e., by adding W . (The triggers that we ignore are unnecessary in terms of satisfying the UIDs, and in fact we would possibly be introducing FD violations by firing them, so it is important that we do not fire them. ) We now prove the lemma: Proof. We assume without loss of generality that the UIDs are closed under implication [17]. This allows us to assume that, whenever we chase by the UIDs, after each round of the chase, all remaining violations of the UIDs are on facts involving some null created in the last round. In particular, in W , all remaining violations of Σ ID are on facts of N . We first show that I ∪ W satisfies Σ ID . Assume by way of contradiction that it has an active trigger τ for a UID δ. The range of τ is either in I or in W . The first case is impossible because I satisfies Σ ID so it cannot have an active trigger for δ. The second case is impossible also by definition of the chase, unless the active trigger maps an exported variable to an element of ∆, i.e., it is a trigger which we did not fire in W . Let R( a) be the fact of W in the image of τ . By the above, as the IDs are closed under implication, R( a) is necessarily a fact of N . Let a i be the image of the exported variable in a, with a i ∈ ∆. Hence, a i occurs at position (R, i) in N , so by our assumption on N it also occurs at position (R, i) in I. Let R( b) be a fact of I such that b i = a i . 
As I satisfies Σ ID , for the match of the body of δ to R( b) there is a corresponding fact F in I extending the match to the head of δ. But F also serves as a witness in I ∪ W for the match of the body of δ , so we have reached a contradiction. Hence, we have shown satisfaction of Σ ID .
We now show that I ∪ W satisfies Σ FD . We begin by arguing that W satisfies Σ FD . This is because N satisfies Σ FD ; it is easy to show (and is proven in [15]) that performing the chase with active triggers of UIDs never creates violations of FDs, so this is also true of W as it is a subset of the facts of the actual chase of N by Σ ID . Now, assume by way of contradiction that there is an FD violation {F, F ′ } in I ∪ W . As I and W satisfy Σ FD in isolation, it must be the case that one fact of the violation is in I and one is in W : without loss of generality, assume that we have F ∈ I and F ′ ∈ W . There are three possibilities: F ′ is a fact of N , F ′ is a fact created in the first round of the chase (so one of its elements, the exported element, is in Adom(N ), and the others are not), or F ′ is a fact created in later rounds of the chase. The first case is ruled out by the hypothesis that I ∪ N satisfies Σ FD . In the second case, by definition of W , the element from Adom(N ) in F ′ cannot be from Adom(I), as otherwise we would not have exported this element (i.e., it would be a trigger that we would not have fired); hence F ′ contains only fresh elements and one element in Adom(N ) \ Adom(I), so F and F ′ are on disjoint elements so they cannot be a violation. In the third case, F ′ contains only fresh elements, so again F and F ′ cannot form an FD violation as they have no common element.
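The FD hypothesis of Lemma C.2, and the earlier claim that adding facts which agree with t 1 outside C and are fresh on C creates no FD violation, can be checked mechanically on small instances. The following Python sketch is only an illustration under stated assumptions: facts are `(relation, tuple)` pairs, positions are 0-indexed, an FD is given as a tuple of determining positions plus one determined position, and the names `fd_violations`, `I`, and `N` (here a finite set of three new facts) are hypothetical.

```python
from itertools import combinations

def fd_violations(instance, relation, determinant, determined):
    """Return the pairs of facts of `relation` that agree on all positions of
    `determinant` but differ on the position `determined`."""
    facts = [args for rel, args in instance if rel == relation]
    return [
        (f, g)
        for f, g in combinations(facts, 2)
        if all(f[p] == g[p] for p in determinant) and f[determined] != g[determined]
    ]

# t1 = R(a, b, c1) and t2 = R(a, b, c2) disagree exactly on position 2, so C = {2}.
I = {("R", ("a", "b", "c1")), ("R", ("a", "b", "c2"))}
# New facts agree with t1 outside C and use fresh values on C.
N = {("R", ("a", "b", f"fresh{j}")) for j in range(3)}

# An FD whose positions all lie outside C is preserved by adding N ...
ok = fd_violations(I | N, "R", determinant=(0,), determined=1)
# ... whereas an FD determining a position of C was already violated by t1, t2.
bad = fd_violations(I | N, "R", determinant=(0,), determined=2)
```

Here `ok` is empty, while `bad` is non-empty, which matches the observation that the constraints cannot imply an FD from the complement of C to a position of C.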
So we now know that I + 1 := I 1 ∪ W and I + 2 := I 2 ∪ W satisfy the constraints. Let us then conclude our proof of Theorem 4.6 using the process of Lemma C.1. We first show that (I + 1 , I + 2 ) is a counterexample of AMonDet for Q and Sch ′ : • We have just shown that I + 1 and I + 2 satisfy the constraints.
• We now argue that I + 1 has a homomorphism to I 1 (the proof for I + 2 and I 2 is analogous). We first define the homomorphism from I 1 ∪ N to I 1 by mapping I 1 to itself, and mapping each fact of N to R( t 1 ) (which is consistent with what precedes); it is clear that this is a homomorphism. We then extend this homomorphism inductively on each fact created in W in the following way. Whenever a fact S( b) is created by firing an active trigger R( a) for a UID R( x) → S( y) where x p = y q is the exported variable (so we have a p = b q ), consider the fact R(h( a)) of I 1 (with h defined on a by induction hypothesis). As I 1 satisfies Σ, we can find a fact S( c) with c q = h(a p ), so we can define h( b) to be c, and this is consistent with the existing image of a p .
• We can define I ′ Accessed := I Accessed ∪ W as a common subinstance of I + 1 and I + 2 . We now show that I ′ Accessed is valid for I + 1 and Sch ′ . Let (mt ′ , AccBind ′ ) be an access in I ′ Accessed . The first case is when (mt ′ , AccBind ′ ) includes an element of Adom(I ′ Accessed )\ Adom(I Accessed ), namely, an element of Adom(W ) \ Adom(I 1 ). In this case, clearly all matching facts must be facts that were created in the chase, i.e., they are facts of W . Hence, we can construct a valid response from W ⊆ I ′ Accessed . The second case is when (mt ′ , AccBind ′ ) is only on elements of Adom(I Accessed ), then it is actually an access on I Accessed , so, letting U ⊆ I Accessed be the set of matching tuples which are the valid response to (mt ′ , AccBind ′ ) in I 1 , we can construct a valid response to (mt ′ , AccBind ′ ) in I + 1 from U ∪ W ⊆ I ′ Accessed , because any matching tuples for this access in I + 1 must clearly be either matching tuples of I 1 or they must be matching tuples of W .
We now show the four additional conditions:
1. It is clear by definition that I ′ Accessed ⊇ I Accessed .
2. We must show that the access (mt, AccBind) is valid for Sch in I ′ Accessed . Indeed, there are now infinitely many matching tuples in I ′ Accessed , those of N . Thus this access is valid for Sch in I + 1 : we can choose as many tuples as the value of the bound to obtain a response which is valid in I + 1 .
3. We must verify that, for any access (mt ′ , AccBind ′ ) of I ′ Accessed which is also an access of I Accessed , if the access in I Accessed was valid for Sch in I 1 , then it is still valid in I ′ Accessed for Sch in I + 1 . The argument is the same as in the second case of the third bullet point above: from the valid response to (mt ′ , AccBind ′ ) in I 1 for Sch, we construct a valid response to (mt ′ , AccBind ′ ) in I + 1 for Sch.
4. Consider any access in I ′ Accessed which is not an access in I Accessed . The binding for this access must include some element of Adom(W ), so its matching tuples must be in W , which are all in I ′ Accessed . Hence, by construction any such accesses are valid for Sch.
So we conclude the proof of Theorem 4.6 using Lemma C.1, fixing each access according to the above process.

C.4. Proof of Theorem 4.7: FD-approximation
Recall the statement: Let Sch be a schema whose constraints are FDs, and let Q be a CQ monotone answerable in Sch. Then Q is monotone answerable in the FD approximation of Sch.
Proof. It suffices to assume that we have a counterexample I 1 and I 2 to AMonDet for the FD approximation of Sch, with Q holding in I 1 , Q not holding in I 2 , and I 1 and I 2 having a common subinstance I Accessed which is access-valid for I 1 . We will upgrade these to I + 1 , I + 2 having the same property for Sch, by blowing up accesses one after the other, using Lemma 4.3.
Consider an access using a method mt on relation R with binding AccBind having values in I Accessed . If the matching values within I 1 are a subset of those in I 2 , nothing needs to be done for this access, so we assume that this inclusion does not hold. Recall that DetBy(mt) denotes the positions determined under the FDs by the input positions of mt. Clearly there must be some positions X of R that are not in DetBy(mt), since otherwise the hypotheses would imply that the matching tuples are the same in I 1 and I 2 . We can also conclude that the matching tuples in both I 1 and I 2 must be non-empty and that the matching tuples in I 1 ∪ I 2 must all agree on positions in DetBy(mt).
We can blow up the output of this access in I 1 and I 2 . If k is the result bound of mt, we add k tuples with all positions in DetBy(mt) agreeing with the common value of all matching tuples of I 1 ∪ I 2 . For the other positions we pick values that are disjoint from each other and from other values in I 1 ∪ I 2 . Note that this does not break any FDs with determinant not contained in DetBy(mt), and by definition of X we do not break a determinant contained in DetBy(mt).
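The blow-up of a single access can be sketched in Python as follows. This is a hedged illustration only: DetBy(mt) is computed with the textbook attribute-closure procedure for FDs, positions are 0-indexed, each FD is encoded as a pair (set of determining positions, determined position), and the function names `det_by` and `blow_up_access`, as well as the encoding of fresh values as tagged tuples, are assumptions of this sketch rather than the paper's notation.

```python
def det_by(input_positions, fds):
    """Positions determined by the input positions under the FDs
    (standard closure; the input positions themselves are included)."""
    closure = set(input_positions)
    changed = True
    while changed:
        changed = False
        for determinant, determined in fds:
            if determinant <= closure and determined not in closure:
                closure.add(determined)
                changed = True
    return closure

def blow_up_access(matching_tuple, determined_positions, k, tag):
    """Create k new tuples that copy `matching_tuple` on the determined
    positions and carry pairwise distinct fresh values everywhere else."""
    arity = len(matching_tuple)
    return [
        tuple(
            matching_tuple[p] if p in determined_positions else ("fresh", tag, j, p)
            for p in range(arity)
        )
        for j in range(k)
    ]

# Hypothetical relation of arity 4 with FDs {0} -> 1 and {1, 2} -> 3.
fds = [(frozenset({0}), 1), (frozenset({1, 2}), 3)]
closure_small = det_by({0}, fds)
closure_full = det_by({0, 2}, fds)
new_tuples = blow_up_access(("a", "b", "c", "d"), closure_small, k=3, tag="acc")
```

With input position 0 only, positions 2 and 3 fall outside the closure, so the three new tuples share the values "a" and "b" and differ everywhere else; this is why no FD with determinant inside the closure can be broken by them.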
Iteratively performing this operation over all accesses gives us I + 1 , I + 2 having the required subinstance I + Accessed in Sch which is access-valid for I + 1 . Further, I + i has a homomorphism back to I i for each i ∈ {1, 2}, which implies that ¬Q holds in I + 2 .

D. Proofs for
We can decide whether a CQ Q has a monotone plan with respect to a schema with result bounds whose constraints are FDs. The problem is NP-complete.
By Theorem 4.7 we have FD-approximation, meaning that we can get rid of result bounds in the following way:
• We extend the constraints to add, for each result-bounded method mt, the following ID constraints: R( x, y, z) → R mt ( x, y) and R mt ( x, y) → ∃ z R( x, y, z) where x denotes the input positions of mt, and y denotes the other positions of DetBy(mt).
• The access methods of Sch ′ are the methods of Sch that have no result bounds, plus the following: for each result-bounded method mt on relation R with input positions j 1 . . . j m , a method mt ′ on R mt whose input positions are the first m positions of R mt . Using the FDs on R and the constraints relating R to R mt , we can see that any access to mt ′ is guaranteed to return at most one result.
By Proposition 3.4, we reduce AMonDet to a query containment problem involving two copies of the constraints above, on primed and unprimed copies of the schema, along with accessibility axioms for each access method (including the new methods R mt ). We can observe a few obvious simplifications of these constraints:
• In the restricted chase, the constraint R mt ( x, y) → ∃ z R( x, y, z) will never fire, since a fact R mt ( a, b) is always generated by a corresponding fact R( a, b, c).
• In the restricted chase, constraints of the form R ′ ( x, y, z) → R ′ mt ( x, y) can fire, since it is possible that an R ′ -fact is created by one method mt 1 (result-bounded or not), but then an axiom of the above form is fired by a different method: this is only possible because our model allows more than one access method per relation. However, the creation of such an R ′ mt -fact will not generate any further rule firings.
If we consider the restricted chase with the remaining constraints, we can see that the only rules that create new values are the primed copy of the first constraint above. The new values will never propagate back to the unprimed relations, and will never propagate within unprimed relations. Thus the chase will terminate in polynomially many parallel rounds, which gives the required bound.
For simplicity we give the argument for IDs only. The generalization to Linear TGDs is straightforward.
Consider a chase sequence based on the canonical database I 0 of a conjunctive query Q, using a collection of IDs Σ. The collection of facts generated by this sequence can be given the structure of a tree, where there is a root node associated with I 0 , and one node n F for each generated fact F . If applying a chase step to fact F produces fact F ′ in the sequence, then the node n F ′ is a child of the node n F . We refer to this as the chase tree of the sequence.
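To make the chase tree concrete, here is a minimal Python sketch of a round-based restricted chase that records, for each generated fact, the fact whose trigger produced it. It is deliberately simplified and makes several assumptions that are not in the text: only unary IDs are handled, each encoded as `(source relation, source position, target relation, target position)` with 0-indexed positions; nulls are strings `null0`, `null1`, and so on; the number of rounds is capped; and the relation names reuse the running example purely for illustration.

```python
from itertools import count

def chase_uids(initial_facts, uids, arities, rounds):
    """Restricted chase by unary IDs for at most `rounds` rounds.

    Returns the list of facts and a `parent` map sending each generated fact
    to the fact that triggered it. Facts of the initial instance have no
    parent and jointly play the role of the root of the chase tree.
    """
    facts = list(initial_facts)
    parent = {}
    fresh = count()
    frontier = list(facts)
    for _ in range(rounds):
        created = []
        for fact in frontier:
            relation, args = fact
            for src, p, dst, q in uids:
                if src != relation:
                    continue
                value = args[p]
                # Restricted chase: skip the trigger if a witness already exists.
                if any(r == dst and a[q] == value for r, a in facts + created):
                    continue
                new_args = tuple(
                    value if i == q else f"null{next(fresh)}"
                    for i in range(arities[dst])
                )
                new_fact = (dst, new_args)
                created.append(new_fact)
                parent[new_fact] = fact
        if not created:
            break
        facts.extend(created)
        frontier = created
    return facts, parent

# Every id in Profinfo must occur as an id in Udirectory.
start = [("Profinfo", ("p1", "Smith", "5000"))]
uids = [("Profinfo", 0, "Udirectory", 0)]
arities = {"Profinfo": 3, "Udirectory": 3}
facts, parent = chase_uids(start, uids, arities, rounds=5)
```

With cyclic IDs the real chase may be infinite, which is why this sketch caps the number of rounds; the depth bounds established below indicate how deep one needs to explore when searching for a match.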
Consider nodes n and n ′ in the chase tree, with n a strict ancestor of n ′ . We say that n and n ′ are far apart if there are distinct generated facts F 1 and F 2 whose corresponding nodes are ancestors of n ′ and descendants of n, with F 1 an ancestor of F 2 , F 1 and F 2 generated by the same rule of Σ, and any value of F 1 occurring in F 2 occurs in the same positions within F 2 as in F 1 . If such n and n ′ are not far apart, we say that they are near.
Given a match h of Q in the chase tree, its augmented image is the closure of its image under least common ancestors. If Q has size k then this has size at most k 2 . For nodes n 1 and n 2 in the augmented image, we say n 1 is the image parent of n 2 if n 1 is the lowest ancestor of n 2 in the augmented image.
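The augmented image and the image-parent relation can be computed directly on a tree given by parent pointers. The sketch below is a naive illustration, assuming the tree is a dictionary mapping each non-root node to its parent; the function names and the small example tree are hypothetical, and "lowest ancestor" is read as lowest strict ancestor.

```python
def ancestors(node, parent):
    """Return the path from `node` up to the root, starting with `node` itself."""
    chain = [node]
    while node in parent:  # the root is the only node without a parent
        node = parent[node]
        chain.append(node)
    return chain

def lca(u, v, parent):
    """Least common ancestor of u and v (a node counts as its own ancestor)."""
    above_u = set(ancestors(u, parent))
    for w in ancestors(v, parent):
        if w in above_u:
            return w
    raise ValueError("nodes are not in the same tree")

def augmented_image(image, parent):
    """Close a set of nodes under least common ancestors."""
    closed = set(image)
    changed = True
    while changed:
        changed = False
        for u in list(closed):
            for v in list(closed):
                w = lca(u, v, parent)
                if w not in closed:
                    closed.add(w)
                    changed = True
    return closed

def image_parent(node, closed, parent):
    """Lowest strict ancestor of `node` that belongs to the augmented image."""
    for w in ancestors(node, parent)[1:]:
        if w in closed:
            return w
    return None

# Hypothetical tree: root has children a and d; a has children b and c.
parent = {"a": "root", "b": "a", "c": "a", "d": "root"}
closed = augmented_image({"b", "c", "d"}, parent)
```

On this example the image has three nodes and the closure adds a and root, so the augmented image has five nodes, within the k 2 bound for k = 3.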
The analysis of Johnson and Klug is based on the following lemma:
Lemma E.1. If Q has a match in the chase, then there is a match h with the property that if n 1 is the image parent of n 2 then n 1 and n 2 are near.
Proof. We prove this by induction on the number of violating n 2 's and the sum of the depths of the violations in the tree. If n 1 is far apart from n 2 , then there are witnesses F 1 and F 2 to this, say with F 1 higher than F 2 . We eliminate the interval between F 1 and F 2 (along with the subtrees hanging off of them, which by assumption do not contain any match elements). We adjust h accordingly. In doing this we reduce the sum of the depths, while no new violations are created, since the image parent relationships are preserved. Iterating this operation we must achieve a tree where the nodes corresponding to n 1 and n 2 are near and thus the number of violations decreases.
Call a match h of Q in the chase tight if it has the property given in the lemma above. The depth of the match is the depth of the lowest node in its image. The next observation, also due to Johnson and Klug, is that when the width is bounded, tight matches cannot occur far down in the tree:
Lemma E.2. If Σ is a set of IDs of width w and the schema has arity bounded by m, then a tight match of size k has depth at most k · |Σ| · m w+1 .
Proof. We claim that the length of the path between a match element h(x) and its closest ancestor h(x ′ ) in the image must be at most |Σ| · m w+1 . At most w values from h(x ′ ) are present in any fact on the path, and thus the number of configurations in which they can occur is at most m w+1 . Thus after |Σ| · m w+1 steps there will be two elements which repeat both the rule and the configuration of the values, which would contradict tightness.
Johnson and Klug's result follows from combining the previous two lemmas:
Proposition E.3. [22] For any fixed w, there is an NP algorithm for query containment under IDs of width at most w.
Proof. We guess k branches of depth at most k · |Σ| · m w+1 in the chase and a match in them.
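As a purely numerical illustration of the depth bound of Lemma E.2 (the parameter values below are chosen arbitrarily and do not come from the paper), the bound can be instantiated as follows:

```latex
\[
  \text{depth} \;\le\; k \cdot |\Sigma| \cdot m^{w+1},
  \qquad\text{e.g.}\qquad
  k = 3,\ |\Sigma| = 4,\ m = 2,\ w = 1
  \;\Longrightarrow\;
  3 \cdot 4 \cdot 2^{2} = 48 .
\]
```

For fixed w the bound is polynomial in k, |Σ| and m, which is what keeps the guessed branches of polynomial size.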
We now give the extension of this calculation for bounded semi-width. Recall from the body that a collection of IDs Σ has semi-width bounded by w if it can be decomposed into Σ 1 ∪ Σ 2 where Σ 1 has width bounded by w and the position graph of Σ 2 is acyclic.
An easy modification of Proposition E.3 now completes the proof of Proposition 6.1: Proof. We revisit the argument of Lemma E.2. As in that argument, it suffices to show that the length of the path between a match element h(x) and its closest ancestor h(x ′ ) in the image must be at most |Σ| · m w+1 . As soon as we apply a rule of Σ 1 along the path, at most w values are exported, and so the remaining path is bounded as before.
Since Σ 2 has an acyclic position graph, a value in h(x ′ ) can propagate for at most |Σ 2 | steps when using rules of Σ 2 only. Thus after at most |Σ 2 | edges in a path we will either have no values propagated (if we used only Σ 2 rules) or at most w values (if we used a rule from Σ 1 ). Thus we can bound the path size by the previous bound plus a factor of |Σ 2 |.

E.2. Orderability of truncated chase proofs
For some of our remaining results, we need an observation about proofs using IDs and truncated accessibility axioms: we call such a proof a truncated chase proof. In any such proof, we can arrange the created facts as a tree, with each node n corresponding to a fact F that is generated by an ID: the parent of this node is the node associated to the fact contained in the trigger that was fired to generate F . During the proof additional accessibility facts are also generated. Each such fact A is created by a truncated accessibility axiom firing with a given atom F over the schema. We say that F is the birth fact of the corresponding accessibility fact A = accessible(c).
The birth constants of A are all constants d such that accessible(d) is a hypothesis of the accessibility axiom creating A. Our main goal will be to normalize proofs so that the creation of accessibility facts is "compatible with the tree structure". Consider a truncated chase proof that results in a chase instance I. Such a proof is well-ordered if it has the following property: For any fact F = R( c) generated in the proof by firing a trigger τ for an ID, if accessible(c i ) is generated in I with birth fact in the subtree of F , and c m 1 . . . c m k are all the elements of c that are birth constants of c i and which were exported when firing τ , then each fact accessible(c m 1 ) . . . accessible(c m k ) must already have been present in the chase at the time F was generated.
We now show:
Lemma E.4. For any chase proof from the canonical database of Q using truncated accessibility axioms and IDs, producing instance I, there is a well-ordered chase proof that generates a set of facts isomorphic to those of I over the canonical database.
Proof. Note that in an arbitrary proof, it could well be that A j = accessible(c m j ) appeared after the generation of F . The idea of the proof is that we can "re-generate F ", re-firing the rules generating F and its subtree after all such facts A j are created. Formally, we proceed by induction on the number of counterexample firings. In the inductive step, consider a non-well-ordered proof and the subproof f 1 . . . f k up through the first violation of well-orderedness. That is, there is a fact F = R( c) generated by a rule firing f i using an ID δ from its parent fact E, a fact A j = accessible(c m j ) that was not present in the chase when F was generated, and f k is an accessibility axiom using accessible(c m j ) (as well as other accessibility facts) to generate accessible(c i ) with birth fact F B in the subtree of F . We create a new proof that begins with f 1 . . . f k−1 and then continues by "copying f i ", generating a copy F ′ from E via δ. Doing this cannot introduce a violation of well-orderedness, because it does not generate an accessibility fact, and there are no accessibility facts in the subtree of the new fact F ′ .
We now continue the proof with a copy of the firings f i+1 . . . f k−1 , but the firings that were performed in the subtree of F are now performed instead on the corresponding node in the subtree of F ′ . When we perform the copy of these firings, we do not cause any violation of well-orderedness, because the original firings f i+1 . . . f k−1 did not cause such a violation (by minimality of f k ).
Last, instead of firing f k on the fact F B in the subtree of F , we fire it on the corresponding fact F ′ B in the subtree of F ′ : we call this rule firing f ′ k . We argue that all the necessary accessibility hypotheses for f ′ k have been generated, so that we can indeed fire f ′ k . First, for the accessibility hypotheses of f k that were created in the subtree of F , we know that they were generated by firing f i+1 . . . f k−1 , so the corresponding hypotheses of f ′ k have also been generated in the subtree of F ′ by the copies of these firings. Second, for the accessibility hypotheses of f k that are on elements exported between E and F , they had already been generated when we fired f k , so they are also available when we fire f ′ k . In fact, our construction has ensured that these accessibility hypotheses had already been generated when creating F ′ , which ensures that we can fire f ′ k and not cause a violation of well-orderedness. Hence, the proof that we have obtained by this process generates accessible(c i ) in a well-ordered way, and the number of violations of well-orderedness has decreased.

E.3. Proof of Proposition 6.2: Polynomial time algorithm for derived truncated accessibility axioms
Recall from the body of the paper that given a set of constraints Σ and access methods, a truncated accessibility axiom is a rule of the form: (⋀ i∈P accessible(x i )) ∧ R( x) → accessible(x j ) with x distinct variables and P a subset of the positions of R. A derived truncated accessibility axiom is a rule of this form which is implied by Σ and by the original truncated accessibility axioms (i.e., those that correspond to methods in the schema). The breadth of a truncated accessibility axiom is the size of P . We recall the statement of Proposition 6.2: For any fixed w ∈ N, there is a polynomial time algorithm that takes as input a set of IDs of width w and a set of truncated accessibility axioms, and computes all of the derived truncated accessibility axioms of breadth at most w.
Proof. We will iteratively build up a set O of triples (R, p, j) with p a set of positions of R of size at most w and j a position of R. Such a triple records that position j is accessible given a fact R(x 1 . . . x n ) and accessibility facts on the positions of p, and it represents a truncated accessibility axiom in the obvious way. We first set O := {(R, p, j) | j ∈ p}, representing trivial axioms of the form φ ∧ accessible(x i ) → accessible(x i ). We then repeat the steps below: to O for all j between 1 and the arity of R.
We continue until we reach a fixpoint, which must occur after at most r · a^(w+1) steps, with r the number of relations in the schema and a the maximal arity of a relation. It is clear that the fixpoint can be computed in polynomial time in Σ and in the set of access methods. What remains is to show that this correctly computes all derived truncated accessibility axioms satisfying the breadth bound. For one direction, it is straightforward to see that all rules obtained by this process are in fact derived truncated accessibility axioms. Conversely, we claim that, for every derived truncated accessibility axiom of breadth ≤ w, the corresponding triple (R, {s 1 . . . s l }, i) is added to O.
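The fixpoint computation can be illustrated by a minimal Python sketch. It rests on assumptions that go beyond the text above: the three saturation rules are reconstructed from the way (Transitivity), (Access) and (ID) are used later in this proof, positions are 0-based, and the function name derived_axioms and the input encodings (arity, methods, ids) are hypothetical.

```python
from itertools import combinations

def derived_axioms(arity, methods, ids, w):
    """Saturate the set O of triples (R, p, j) described in the proof.

    arity:   dict mapping each relation name to its arity
    methods: list of (relation, frozenset of input positions), no result bounds
    ids:     list of (R, S, exports), where exports lists pairs
             (position in R, position in S) of exported positions
    The rule shapes below are assumptions, not quoted from the paper.
    """
    def small_subsets(n):
        # All position sets of size at most w.
        for size in range(min(w, n) + 1):
            for p in combinations(range(n), size):
                yield frozenset(p)

    # Initialization: trivial axioms (R, p, j) with j in p.
    O = {(R, p, j) for R, n in arity.items() for p in small_subsets(n) for j in p}
    while True:
        new = set()
        for R, n in arity.items():
            for p in small_subsets(n):
                reach = {j for j in range(n) if (R, p, j) in O}
                # (Access), assumed: if all inputs of a method on R are
                # reachable from p, then every position of R is reachable.
                for M, inputs in methods:
                    if M == R and inputs <= reach:
                        new.update((R, p, j) for j in range(n))
                # (Transitivity), assumed: chain p => q with (R, q, j).
                for R2, q, j in O:
                    if R2 == R and q <= reach:
                        new.add((R, p, j))
        # (ID), assumed: pull a triple on S back along exported positions.
        for R, S, exports in ids:
            s_to_r = {s: r for r, s in exports}
            for S2, q, j in O:
                if S2 == S and j in s_to_r and all(k in s_to_r for k in q):
                    new.add((R, frozenset(s_to_r[k] for k in q), s_to_r[j]))
        if new <= O:
            return O
        O |= new
```

For instance, with an input-free method on S and an ID exporting the first position of R to the first position of S, the sketch derives the triple ("R", frozenset(), 0), which stands for the axiom R(x, y) → accessible(x). The loop is a naive saturation; it makes no attempt to match the step bound stated above.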
We prove this by induction on the length of a chase proof of accessible(c i ) from the hypotheses R( c) and at most w other accessibility facts on positions s 1 . . . s l . Note that by Lemma E.4 we can assume that the proof is well-ordered.
If the proof is trivial, then clearly (R, p, i) ∈ O by the initialization of O. If it is non-trivial then some accessibility axiom fired to produce accessible(c i ), and we can fix a guard atom F and accessibility facts F 1 . . . F l that were hypotheses of the firing: in our earlier terminology, F is the birth fact of accessible(c i ) and the constants occurring in the F 1 . . . F l are the birth constants of accessible(c i ). If F = S( c ′ ) with c ′ a subset of c, then each F i is of the form accessible(c s i ), and by induction (R, p, s i ) ∈ O for each i. Now by (Transitivity) and (Access) we complete the argument.
Otherwise, the guard F = S( a, c) of the accessibility axiom firing was generated by applying an ID δ to some fact E 1 = T 1 ( a, b), with a the subset of the values in E 1 that were exported when firing δ. By well-orderedness, we know that each accessibility fact used in the firing that mentions a value in a was present when δ was fired on E 1 : as the width of the IDs is w, this set has width at most w. Now, we see that there is a subproof of shorter length proving accessible(c i ) from F and this subset of F 1 . . . F l . Therefore by induction we have (S, p ′ , i ′ ) ∈ O for p ′ corresponding to the subset above (of size at most w, so matching the breadth bound) and i ′ corresponding to c i in F . Applying the rule (ID) we have (T 1 , p ′′ , i ′′ ) ∈ O for i ′′ corresponding to c i in E 1 and p ′′ corresponding to the subset in E 1 . The fact E 1 may itself have been generated by a non-full ID applied to some E 2 , and hence may contain values that are not in the original set of constants c. But if so we can iterate the above process on the ID from E 2 to E 1 , noting that E 2 also must contain c i . We thus arrive at a triple (T n , p n , i n ) ∈ O, where i n corresponds to the position of c i in a fact F n that occurs in the original proof with no application of an ID. Thus F n = R( c), and hence T n = R and i n = i. By induction again, we have (R, p, j) ∈ O for each j ∈ p n . Applying (Transitivity) completes the argument.

E.4. Proof of Lemma 6.3: Completeness of the short-cut chase
Recall that, letting Σ consist of IDs of width w and truncated accessibility axioms, a short-cut chase proof on an initial instance I 0 with Σ uses two alternating kinds of steps:
• ID steps, where we fire an ID on a trigger τ to generate a fact F : we put F in a new node n which is a child of the node n ′ containing the fact of τ ; and we copy in n all facts of the form accessible(c) that held in n ′ about any element c that was exported when firing τ .
• Breadth-bounded saturation steps, where we consider a newly created node n and apply all derived truncated accessibility axioms of breadth at most w on that node until we reach a fixpoint and there are no more violations of these axioms on n.
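The two kinds of steps can be sketched in Python as follows. This is only an illustration under simplifying assumptions: each node holds a single fact (whereas the root of an actual proof holds the whole initial instance), derived axioms are encoded as triples (R, p, j) with 0-based positions, and the names Node, id_step and saturation_step are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count

_fresh = count()  # supplies names for fresh nulls

@dataclass
class Node:
    fact: tuple                                   # (relation, tuple of elements)
    accessible: set = field(default_factory=set)  # elements c with accessible(c)
    children: list = field(default_factory=list)

def id_step(parent, target_rel, target_arity, exports):
    """ID step: create a child node for the new fact and copy the
    accessibility facts of the parent that concern exported elements."""
    _, args = parent.fact
    new_args = [None] * target_arity
    for r_pos, s_pos in exports:
        new_args[s_pos] = args[r_pos]
    for k in range(target_arity):
        if new_args[k] is None:
            new_args[k] = f"_null{next(_fresh)}"  # fresh null
    exported = {args[r_pos] for r_pos, _ in exports}
    child = Node((target_rel, tuple(new_args)), parent.accessible & exported)
    parent.children.append(child)
    return child

def saturation_step(node, derived):
    """Breadth-bounded saturation: apply the derived axioms (R, p, j)
    on this node only, until no axiom is violated on it."""
    rel, args = node.fact
    changed = True
    while changed:
        changed = False
        for R, p, j in derived:
            if (R == rel and args[j] not in node.accessible
                    and all(args[k] in node.accessible for k in p)):
                node.accessible.add(args[j])
                changed = True
```

A short-cut chase proof would alternate calls to id_step and saturation_step on the newly created node; the sketch never propagates accessibility facts back to an ancestor node.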
The atoms in the proof are thus associated with a tree structure: it is a tree of nodes that correspond to the application of IDs, and each node also contains accessibility facts that occur in the node where they were generated and in the descendants of those nodes that contain facts to which the elements are exported. We recall the statement of Lemma 6.3: For any set Σ of IDs of width w, given a set of facts I 0 and a chase proof using Σ that produces I, letting I Lin 0 be the closure of I 0 under the original and derived truncated accessibility axioms, there is an instance I ′ produced by a short-cut chase proof from I Lin 0 with Σ such that there is a homomorphism from I to I ′ .
We start with an observation about the closure properties of short-cut chase proofs.
Lemma E.5. Suppose a short-cut chase proof on an initial instance I 0 closed under the derived and original truncated accessibility axioms has a breadth-bounded saturation step producing a fact G = accessible(c i ). Then the node associated with the step is the topmost node where c i appears: the root, if c i is in I 0 , or the node corresponding to the ID-step where c i is generated otherwise.
Proof. We first consider the case where c i is a null introduced in a fact E = R( c) that was created by an ID trigger τ . Let n be the node of E, and let S = accessible(c j 1 ) . . . accessible(c j l ) be the set of accessibility facts that held of the elements of c when firing τ : the facts of S are present in n. Note that S has size at most w, since all but at most w elements were fresh in E when the ID was fired. The node n must be an ancestor of the node where accessible(c i ) is generated, because n is an ancestor of all nodes where c i appears. Thus G = accessible(c i ) is a consequence of E and the hypotheses S under the constraints, since it is generated via derived truncated accessibility axioms or constraints in Σ. But then the rule R( x) ∧ accessible(x j 1 ) ∧ . . . ∧ accessible(x j l ) → accessible(x i ) is a derived truncated accessibility axiom, and it has breadth at most w. Thus this axiom would have applied to generate G from {E} ∪ S when applying the breadth-bounded saturation step to E.
We now consider the case where c i is in I 0 . We know that the saturation step that produced G must have applied to a node which is not the root, as I 0 is closed under the derived and original truncated accessibility axioms. We can assume that the depth of the node n where G is generated is minimal among all such counterexamples. Then G is generated at a node n corresponding to the firing of an ID from a node E to a node F . But then arguing as above, G must already follow from E and the accessibility hypotheses that were present when the ID was fired, of which there are at most w. Thus G would have been derived in the breadth-bounded saturation step that followed E, which contradicts the minimality of n.
We are now ready to complete the proof of Lemma 6.3: Proof. We can extend I to a full chase instance (possibly infinite), denoted I ∞ . Likewise, we can continue the short-cut chase process indefinitely, letting I ′ ∞ be the resulting facts. We claim that I ′ ∞ satisfies the constraints Σ. Assume by contradiction that there is an active trigger in I ′ ∞ . It is necessarily the trigger of an original truncated accessibility axiom, with facts accessible(c m j ) ∧ R( c), whose firing would have produced fact accessible(c i ). Consider the node n where R( c) occurs in the short-cut chase proof. If n is the root node corresponding to I Lin 0 , then we know by Lemma F.8 that any accessibility facts on elements of I Lin 0 must have been generated in I Lin 0 , i.e., must have been already present there, because I Lin 0 is already saturated; hence, we conclude that the trigger is in I Lin 0 , hence it is not active because I Lin 0 is closed under the original truncated accessibility axioms. Now, if n is not the root, then by Lemma F.8, each accessible(c m j ) must have been present at the time R( c) was generated. Hence, the breadth-bounded saturation step at n should have resolved the trigger, so we have a contradiction.
Since instance I ′ ∞ satisfies the constraints, there is a homomorphism h from the full infinite chase I ∞ to that instance, by universality of the chase [19]. Letting I ′ be the image of I, we get the desired conclusion.

F. More details on NP complexity with bounded-width IDs, but without result bounds

F.1. Proof of Theorem 6.5
We recall the statement of Theorem 6.5: Given a CQ Q over a schema that has bounded-width IDs and access methods without result bounds, we can decide in NP if Q is monotone answerable.
Recall from the body of the paper that, in the absence of result bounds, the containment for AMonDet is Q ⊆ Γ Q ′ and that it can be rephrased to: Recall from the body of the paper that Γ Bounded consists of the (Apply ID) rules obtained from linearizing Σ and the truncated accessibility axioms using Theorem 6.4, along with Σ ′ . Let Γ Acyclic consist of the (Breadth-bounded Saturation) axioms of the linearization and also: for any relation R of arity n having an access method.
The only missing detail in the argument is to show: where Q Lin is formed from Q by applying derived truncated accessibility axioms and the original axioms.
Proof. We know that AMonDet is equivalent to the containment with Γ. It is easy to see that the formation of Q Lin proofs with Γ Bounded ∪ Γ Acyclic can be simulated with a proof in Γ, so we focus on showing the converse. We can observe that it suffices to consider chase proofs where we first fire rules of Σ and truncated accessibility axioms to get a set of facts I 0 , we later fire the (Transfer) rules to get I 1 , and finally fire rules of Σ ′ to get I 2 . From Theorem 6.4 we know that using the axioms of the linearization, which are in Γ Bounded ∪ Γ Acyclic , we can derive a set of facts I ′ 0 such that there is a homomorphism from I 0 to UnLin(I ′ 0 ). Applying (Unlinearize and Transfer) to I ′ 0 gives us I ′ 1 which is a homomorphic image of I 1 . Now we can apply the rules of Σ ′ to I ′ 1 to get a set I ′ 2 that is a homomorphic image of I 2 . We conclude that I ′ 2 also has a match of Q ′ as required.

F.2. Proof of Theorem 6.6
Recall the statement: Given a CQ Q over a schema that has bounded-width IDs and access methods (possibly with result bounds), we can decide in NP if Q is monotone answerable.
Proof. By Theorem 4.2, for any schema whose constraints Σ are IDs, we can reduce the monotone answerability problem to the same problem for a schema with no result bounds, by replacing each result-bounded method mt on a relation R with a non-result-bounded access method on a new relation CheckView mt , and expanding Σ to a larger set of constraints Σ 1 , adding additional constraints capturing the semantics of the "existence-check views" CheckView mt : Let us denote constraints of the first form as "relation-to-view" and of the second form as "view-to-relation".
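As a reading aid, here is one plausible shape for the two kinds of constraints, written in LaTeX. This is an assumption inferred from the names "relation-to-view" and "view-to-relation" and from the role of existence-check views, not a quotation of the formulas of the paper. The tuple $\vec{x}$ is assumed to stand for the input positions of mt and $\vec{y}$ for the remaining positions of R.

```latex
% Assumed shapes only: \vec{x} are the input positions of mt,
% \vec{y} the other positions of R.
\begin{align*}
  R(\vec{x}, \vec{y}) &\rightarrow \mathrm{CheckView}_{mt}(\vec{x})
    && \text{(relation-to-view)} \\
  \mathrm{CheckView}_{mt}(\vec{x}) &\rightarrow \exists \vec{y}\; R(\vec{x}, \vec{y})
    && \text{(view-to-relation)}
\end{align*}
```

Under this reading, the second constraint exports every position of the view, and this is how IDs of unbounded width can arise.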
A natural idea is then to apply our NP bound for answerability with bounded width IDs but with non-result-bounded methods (Theorem 6.5). Sadly, we cannot do so directly: although we have eliminated result-bounded methods, in doing so we have introduced new IDs whose width is not bounded. For this reason, instead of simply invoking Theorem 6.5, we will use the idea of its proof, which is to linearize using Theorem 6.4, and then partition the results into two subsets, one of bounded width and the other acyclic.
Consider now the query containment problem for the monotone answerability problem of Σ 1 . This is of the form Q ⊆ Γ Q ′ , where Γ contains Σ 1 , its copy Σ ′ 1 , and the accessibility axioms, which can be factored into: Above S is any of the relations of Σ 1 , including relations R of the original signature and relations CheckView mt . For the CheckView mt relations, there are no input positions, so the associated accessibility axioms just have the form We first observe that in Σ 1 we do not need to include the original copy of the view-to-relation constraints: in the restricted chase they will never fire, since facts over CheckView mt can only be formed from the corresponding R-fact. Similarly, in Σ ′ 1 we do not need to include the relation-to-view constraints.
We next note that we can normalize chase proofs with Γ so that relation-to-view constraints are applied only prior to (Transfer). Thus we can merge relation-to-view rules, the accessibility axioms for CheckView mt , and the primed copy of view-to-relation rules into the following axiom: Let Γ ′ be the resulting axioms. This reduces to a situation similar to that of Theorem 6.5, where we can partition Γ ′ as follows:
• Γ ′ 1 consists of the axioms of Σ and the original truncated accessibility axioms.
• Γ ′ 2 consists of the axioms in Γ that are either (Fact Transfer) axioms for non-result-bounded access methods or (Result-bounded Fact Transfer) axioms for relations R having a result-bounded method.
We apply Theorem 6.4 to Γ ′ 1 , to get an equivalent set of linear constraints Γ ′ 1,Lin and query Q Lin such that the chase of Q Lin with Γ ′ 1,Lin is the same as the chase of Q with Γ ′ 1 . Thus monotone answerability is equivalent to checking whether Q Lin is contained in Q ′ under Γ ′ 1,Lin ∪ Γ ′ 2 ∪ Γ ′ 3 . This set of constraints has semi-width w: indeed, Γ ′ 1,Lin has width bounded by w, while Γ ′ 2 ∪ Γ ′ 3 is acyclic. This completes the proof of Theorem 6.6.

F.3. Details for proof of Theorem 6.11
We complete the proof of Theorem 6.11 from the body: For a schema with access methods (possibly with result bounds), where the constraints involve only UIDs and FDs, monotone answerability is in EXPTIME.
Recall that, in the proof, we had to argue that after applying the FDs to the canonical database and pre-processing the constraints slightly, we could drop the FDs in Σ and in Σ ′ without impacting the entailment. We argued in the body that the constraints for AMonDet would consist of Σ, Σ ′ , and the following: We then modified the second set of axioms so that, in going from R to R ′ , they preserve not only the input positions of mt, but also the positions of R that are determined by input positions of mt. As mentioned in the body, the use of these "expanded result-bounded constraints" does not impact the soundness of the chase, since a chase step with these constraints can be mimicked by a step with an original constraint followed by FD applications.
We now complete the argument to show that after this rewriting, and after applying the FDs to the initial instance, we can apply the TGD constraints while ignoring the FDs.
To argue this, we note that it suffices to consider chase proofs where the primed copies of the UIDs in Σ ′ are never fired prior to constraints in Σ or constraints of either of the forms above. This is because the primed copies of UIDs cannot create triggers for any of those constraints.
We show that in a restricted chase with this additional property, the FDs will never fire. We prove this by induction on the rule firing in the chase.
Observe that the UIDs of Σ or Σ ′ cannot introduce FD violations as long as we perform the restricted chase (i.e., when we fire only active triggers).
We now consider a violation caused by some firing of one of the other rules. We assume that it is the first violation found in the chase.
Consider a fact F ′ 1 = R ′ ( c, d) generated by a constraint in the first item above from F 1 = R( c, d) corresponding to a method without result bounds. Suppose there is another R ′ -fact F ′ 2 that causes a violation with F ′ 1 . Since the Σ ′ UIDs have not yet fired, this fact can only have been generated by an R-fact F 2 via one of the two itemized rules above. If it is a non-result-bounded rule, then F 1 and F 2 are an earlier violation, a contradiction. Thus we can assume F ′ 2 was generated via a result-bounded method. Note that since d is fresh, the determining positions of the violated FD must be positions of c. Thus in the result-bounded rule generating F 2 , the exported positions must agree on these positions. By the definition of the "expansion rules", the determined positions must also be exported, thus the violation must be within the positions of c. This means that we have F 1 = F 2 , so F 1 and F 2 witness an earlier violation, a contradiction. Now, consider a fact generated from the second class of constraints above. If it is an R-fact, the argument by induction is straightforward, using again the definition of the expansion rules. If it is an R ′ -fact F = R ′ ( c, e) then we know that it is generated from R( c, d), and we argue as in the paragraph above.

F.4. Proof of Theorem 6.8
In this appendix, we prove Theorem 6.8, which implies an NP bound for query containment for a class of guarded TGDs (Corollary 6.9), and an EXPTIME bound for query containment for a larger class (Corollary 6.10). The first bound generalizes Johnson and Klug's result on query containment with bounded-width IDs [22]. The second result generalizes a result of Calì, Gottlob and Kifer [10] that query containment with guarded TGDs of bounded arity is in EXPTIME. The construction we use is a refinement of the linearization method given in Section 4.2 of Gottlob, Manna, and Pieris [21].
Of the two corollaries mentioned above, the first one generalizes the technique presented for accessibility axioms in Section 6.1, and the second one is used in the body of the paper to give bounds on the monotone answerability problem. These two results, and the more general Theorem 6.8, are completely independent from access methods or result bounds, and may be of independent interest.
We recall the statement of Theorem 6.8: For any a ′ ∈ N, there are polynomials P 1 , P 2 such that the following is true. Given:
• A signature σ of arity a;
• A subsignature σ ′ ⊆ σ with n ′ relations and arity ≤ a ′ ;
• A CQ Q on σ;
• A set Σ of non-full IDs of width w and full GTGDs with side signature σ ′ and head arity h;
We can compute the following:
• A set Σ ′ of linear TGDs of semi-width ≤ w and arity ≤ a, in time P 1 (|Σ| , 2^(P 2 (w,h,n ′ )) ), independently from Q;
• A CQ Q Lin , in time P 1 (|Σ| , |Q|^(P 2 (w,h,n ′ )) ).
The constraints Σ ′ and the CQ Q Lin ensure that for any CQ Q ′ , we have Q ⊆ Σ Q ′ if and only if Q Lin ⊆ Σ ′ Q ′ .
Before proving Theorem 6.8, we comment on the intuition of why the hardness results of [10] do not apply to the languages described in Corollaries 6.9 and 6.10. For Corollary 6.10, it is shown in [11, Theorem 6.2] that containment of a fixed query into an atomic query under GTGDs is 2EXPTIME-hard when the arity is unbounded, even when the number of relations in the signature is bounded. The proof works by devising a GTGD theory that simulates an EXPSPACE alternating Turing machine, by coding the state of the Turing machine as facts on tuples of elements: specifically, a fact zero(V, X) codes that there is a zero in the cell indexed by the binary vector V in configuration X.
The arity of such relations is unbounded, so they cannot be part of the side signature σ ′ . However, in the simulation of the Turing machine, the GTGDs in the proof use another relation as guard (the g relation), and the bodies contain other high-arity relations, so there is no choice of σ ′ for which the GTGD theory defined in the hardness proof can satisfy the definition of a side signature. For Corollary 6.9, the proof in [11, Theorem 6.2] explicitly writes the state of the i-th tape cell of a configuration X as, e.g., zero i (X). These relations occur in rule bodies where they are not guards, but as Corollary 6.9 assumes that the side signature is fixed, they cannot be part of the side signature. A variant of the construction of the proof (to show EXPTIME-hardness on an unbounded signature arity) would be to code configurations as tuples of elements X 1 , . . . , X n and write, e.g., zero(X i ). However, then the constant width bound on IDs would mean that the proof construction can only look at a constant number of cells when creating one configuration from the previous one.
We now turn to the proof of Theorem 6.8. We say that a full GTGD is single-headed if it has only one head atom, and we will preprocess the input full GTGDs in PTIME to ensure this condition. For every full GTGD, we introduce a new relation of arity at most h to stand for its head, and we add the full IDs which assert that the new head relation creates every fact in the original head. This transformation is in PTIME, and the resulting full GTGDs are single-headed and still satisfy the requirements: the rewritten GTGDs have the same head arity as the original ones, and the new full IDs have a body of size 1, so they respect the condition on side atoms, and their head arity is at most h. Hence, throughout the appendix, when we refer to full GTGDs, we always assume that they are single-headed thanks to this transformation.
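The preprocessing into single-headed rules can be sketched in Python as follows. The encoding is hypothetical: a rule is a pair (body, head) of lists of atoms, an atom is a pair (relation, tuple of variables), and the fresh relation names Head_0, Head_1, ... are assumed not to clash with the signature. The sketch also assumes that the new head relation ranges over the distinct variables of the original head.

```python
def make_single_headed(gtgds):
    """Rewrite each full GTGD body -> A_1, ..., A_m into body -> H(z)
    plus one full ID H(z) -> A_k per original head atom."""
    result = []
    for i, (body, head) in enumerate(gtgds):
        # Distinct head variables, in order of first occurrence.
        head_vars = tuple(dict.fromkeys(v for _, args in head for v in args))
        fresh_atom = (f"Head_{i}", head_vars)  # assumed fresh relation name
        result.append((body, [fresh_atom]))
        for atom in head:
            # Full ID: body of size 1, so no side atoms are needed.
            result.append(([fresh_atom], [atom]))
    return result
```

For example, a rule with body G(x, y) ∧ T(x) and head A(x) ∧ B(x, y) becomes the three single-headed rules G(x, y) ∧ T(x) → Head_0(x, y), Head_0(x, y) → A(x), and Head_0(x, y) → B(x, y).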
The intuition of the proof is to follow the process given in Section 6.1 for linearizing bounded-width IDs and truncated accessibility axioms.
Our proof strategy consists of three steps. We first show that we can compute a form of closure of our IDs and full GTGDs for a bounded domain of side conditions, which we call bounded breadth (generalizing the notion in Section 6.1). This intuitively ensures that, whenever a full GTGD generates a fact about earlier elements, then this generation could already have been performed when these earlier elements had been generated, using an implied full GTGD. This first step is the main part of the proof, and its correctness relies on a notion of well-ordered chase that we introduce, generalizing the analogous notion presented in Appendix E.2 for the specific case of accessibility axioms.
Once this closure has been done, the second step is to structure the chase further, by enforcing that we only fire the full GTGDs and their small-breadth closure just after having fired an ID. As in the earlier proof, we call this the short-cut chase, since we shortcut certain derivations that go up and down the chase-tree via the firing of derived axioms. The third step is to argue that the short-cut chase can be linearized with IDs, again generalizing the previous constructions.
We now embark on the proof of Theorem 6.8, which will conclude at the end of this section of the appendix.
Bounded breadth closure. The first step of our proof is to show how to compute a closure of the constraints Σ. We will now consider the side signature σ ′ with its fixed arity bound a ′ , and will consider the bounds w, h on width and head arity respectively. We will reason about full GTGDs with side signature σ ′ that obey a certain breadth restriction: Definition F.2. Let b ∈ N, let σ ′ be a side signature, and let γ be a full GTGD on side signature σ ′ . We say that γ has breadth ≤ b if there exists a guard atom A in the body of γ such that, writing the body as A( x, y) ∧ φ( x) with φ the conjunction of the remaining atoms (which use relations of σ ′ ), we have that | x| ≤ b.
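Definition F.2 can be checked mechanically. The following Python sketch uses a hypothetical encoding (a body is a list of atoms, an atom is a pair (relation, tuple of variables)) and hypothetical function names. It takes a guard to be an atom containing every variable of the body, and it counts the guard variables that also occur in the remaining atoms.

```python
def breadth(body, side_signature):
    """Smallest b such that the body has breadth <= b in the sense of
    Definition F.2, or None if no choice of guard atom qualifies."""
    all_vars = {v for _, args in body for v in args}
    best = None
    for idx, (_, args) in enumerate(body):
        if set(args) != all_vars:
            continue  # this atom does not guard the body
        rest = body[:idx] + body[idx + 1:]
        if any(rel not in side_signature for rel, _ in rest):
            continue  # remaining atoms must use relations of the side signature
        shared = {v for _, a in rest for v in a}  # the tuple x of the definition
        if best is None or len(shared) < best:
            best = len(shared)
    return best

def has_breadth_at_most(body, side_signature, b):
    value = breadth(body, side_signature)
    return value is not None and value <= b
```

For the body R(x, y, z) ∧ T(x) ∧ U(x, y) with side signature {T, U}, the only guard is the R-atom and the side atoms use x and y, so the breadth is 2.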
It will be useful to reason about the possible full GTGDs on the side signature σ ′ that satisfy the head arity bound h, have breadth at most the width w of the IDs, and have a guard atom which occurs in the body of a rule of Σ or where there are no repeated variables. (Specifically, the latter condition means that the guard atom either has no repeated variables, or that the atom with its pattern of variable repetitions already occurs in a rule of Σ.) We will call such full GTGDs the suitable GTGDs.
Lemma F.3. The number of possible suitable GTGDs is at most where: b is the breadth bound, r is the number of relations in the full signature σ, g is the number of atoms in bodies of Σ, a is the maximal arity of any relation in σ, n ′ is the number of relations in σ ′ , and a ′ is the maximal arity of the relations of σ ′ .
Proof. This is computed by noting that a suitable GTGD is determined by choosing a guard atom, choosing some additional body atoms (called the side conditions), and choosing a head. Several of these choices may end up yielding the same full GTGD, but this does not matter, because we want to determine an upper bound on the number. Observe that, when b := w and when h, n ′ , and a ′ are bounded, the above quantity is polynomial in the input signature σ. Further, when only a ′ is bounded, then the quantity is singly exponential in the input.
Letting Σ be our set of constraints (with non-full IDs of bounded width and full GTGDs), we say that a full GTGD γ is a derived GTGD if it is suitable and if γ is entailed by Σ: that is, any instance that satisfies Σ also satisfies γ. Note that the derived GTGDs do not include all the full GTGDs that we started with, because some of them may have breadth larger than w, so they are not suitable. (The same was true in the previous proof: some original truncated accessibility axioms were not completely reflected in the derived truncated accessibility axioms, but the width bound ensured that this did not matter except on the initial instance: this will also be the case here.) The second step of our proof of Theorem 6.8 is to show that the set of derived GTGDs can be computed in PTIME. We already know that this set is of polynomial size, but we further show that there is an efficient procedure to compute it.
Definition F.4. We say that a suitable GTGD γ is trivial if its head atom already occurs in its body. Given a set of non-full IDs and suitable GTGDs Σ, the b-closure Σ b is obtained by starting with the suitable GTGDs in Σ plus the trivial suitable GTGDs, and applying the following inference rules until we reach a fixpoint:
• (Transitivity) Suppose that there is a GTGD body β : R( x, y) ∧ ⋀ i A i ( x) and heads B 1 ( z 1 ), . . . , B n ( z n ) such that, for each 1 ≤ j ≤ n, the GTGD β → B j ( z j ) is suitable and is in Σ ∪ Σ b . Suppose that there is a GTGD β ′ → ρ in Σ ∪ Σ b , and that there is a unifier υ mapping β ′ to β ∧ ⋀ j B j ( z j ). Then add to Σ b the following, which is clearly a suitable GTGD: β → υ(ρ).
• (ID) Suppose we have an ID in Σ from R( x) to S( y) of width w ′ ≤ w, which exports where all variables from λ( y) and in ⋀ i A i ( y) are in {y k 1 , . . . , y k w ′ }, and there are no repeated variables in S( y). Then add to Σ b the following, which is clearly a suitable GTGD: where A ′ i is the result of replacing y k i with x j i in A i and similarly with λ ′ .
We claim that if this procedure is performed with b set to the width bound w, then it computes the set of all derived GTGDs: Proposition F.5. For any set Σ of non-full IDs of width ≤ w and full GTGDs on side signature σ ′ with head arity ≤ h, the closure Σ w is the set of derived GTGDs.
Before we prove this claim, we state and prove that the computation of the b-closure can be performed efficiently: Lemma F.6. For any set Σ of non-full IDs of width ≤ w and full GTGDs on side signature σ ′ with head arity ≤ h, letting b := w, we can compute Σ b in time polynomial in |Σ| × 2^polynomial(w,h) .
Proof. From our bound in Lemma F.3, we know that the maximal size of Σ b satisfies our running time bound. We can compute it by iterating the possible production of rules until we reach a fixpoint, so it suffices that at each intermediate state of Σ b , testing every possible rule application is in PTIME.
For (ID), this is straightforward: we simply try every ID from Σ and every rule from the current Σ b unioned with Σ, and we check whether we can perform the substitution (which is clearly PTIME), in which case we add the result to Σ b .
For (Transitivity), we enumerate all possible bodies β of a suitable GTGD: there are polynomially many, up to variable renamings. For each β, we then find all rules in Σ and in Σ b that have body β up to variable renaming: if β only contains σ ′ -atoms, this is easy, because it is guarded and the arity of σ ′ is constant so the body is on a constant-size domain and we can just test the homomorphism; if β contains a guard atom not in σ ′ , by assumption there is only one, and we can just consider the GTGDs whose body contains this atom, try to unify it with β in PTIME, and check if the candidate body achieves exactly the side atoms of β. Once all suitable rules are identified, clearly we can take 1 ≤ j ≤ n to range over all such GTGDs, and consider the union H of their heads. Now, we enumerate all GTGDs in Σ ∪ Σ b and we must argue that we can test in PTIME whether their body β ′ unifies to β ∪ H. If β ′ contains only atoms from σ ′ , then as it is guarded and the arity of σ ′ is fixed, its domain size is constant, so we can simply test in PTIME all possible mappings of β ′ to see if they are homomorphisms. If β ′ contains an atom A not in σ ′ , by assumption there is only one and it is a guard, so we can simply consider all atoms in β ∪ H which are not in σ ′ : for each of them, we test in PTIME whether A unifies with it, and if yes we test whether the mapping thus defined is a homomorphism from β ′ to β ∪ H. If yes, we add the new full GTGD to Σ b . This concludes the proof.
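The exhaustive test over all mappings of β ′ can be illustrated by the following brute-force Python sketch. The encoding is hypothetical (atoms are pairs (relation, tuple of variables)), as is the function name. The enumeration takes time exponential in the number of variables of the source, which is why it is only used in the proof when the domain of β ′ has constant size.

```python
from itertools import product

def homomorphisms(source, target):
    """Enumerate all maps h from the variables of `source` to the terms
    of `target` such that the image of every source atom is in target."""
    src_vars = sorted({v for _, args in source for v in args})
    tgt_terms = sorted({t for _, args in target for t in args})
    target_set = {(rel, tuple(args)) for rel, args in target}
    for image in product(tgt_terms, repeat=len(src_vars)):
        h = dict(zip(src_vars, image))
        if all((rel, tuple(h[v] for v in args)) in target_set
               for rel, args in source):
            yield h
```

With the source T(u, v) and the target R(x, y) ∧ T(x, y), the only homomorphism maps u to x and v to y; the source T(u, u) has none.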
There remains to prove Proposition F.5. For this, we will need additional machinery.
Well-ordered chase. To prove Proposition F.5, we will need to study the chase by IDs and full GTGDs. We will first define a notion of well-ordered chase (which generalizes the notion studied in Appendix E.2), show that we can ensure that the chase satisfies this condition, and conclude the proof of the proposition.
In the chase, we will distinguish between the ID-facts, which are the facts created in the chase by firing an ID, and the full facts, the ones created by applying a full GTGD. The parent of an ID-fact is the fact to which we applied an ID to create it. This allows us to consider a tree structure on the ID-facts, with the root corresponding to the initial instance I 0 used in the chase (for now I 0 can be arbitrary).
We further observe by an immediate induction that, for each full fact F generated in the chase, the guard of the trigger τ used to generate F must be guarded by some ID-fact: this is vacuously preserved when applying an ID, and it is preserved when applying a full GTGD δ because the ID-fact that guards the guard of δ also guards the generated head fact.
We accordingly define the ID-guard of a full fact F created in the chase by applying a trigger τ as the ID-fact G that guards τ and is the topmost one in the chase: observe that G is uniquely defined, because whenever two ID-facts guard τ then their lowest common ancestor also does.
These definitions allow us to introduce the notion of a well-ordered chase. We say that the chase is well-ordered if it satisfies the following condition: Whenever we create an ID-fact F = R( c, d), where c are the elements shared between F and its parent fact, there is at most one fact H that uses only elements of c such that the following is true: H will be created later in the chase by firing a trigger whose ID-guard is a descendant of F .
We can now show the analogue of Lemma E.4. The specific proof technique is different (we create multiple child facts at once instead of re-firing dependencies as needed), but the spirit is the same.
Lemma F.7. For any instance I 0 , we can perform the chase in a well-ordered way.
Proof. We fix I 0 , and we will perform the chase under full GTGDs and IDs, in a way which chooses which triggers to fire in a special way, and instantiates the heads of violations multiple times. Further, in this chase variant, for all ID-facts F = R( c, d) generated in the chase, where we let c be the elements shared between F and its parent fact, the fact F will carry a label, which is some fact on domain c (not necessarily a fact which holds in the chase). Intuitively, this fact will be the one additional fact on c which is allowed to be created in the subtree. Further, for any ID-fact in the chase, there will be an equivalence relation on its children: we will inductively impose that, for any two sibling ID-facts F 1 and F 2 that are equivalent, at any state of the chase, there is an isomorphism between the restriction of the chase to the domain of the subtree rooted at F 1 , and that of F 2 , which is the identity on the elements shared between F 1 , F 2 and their parent fact F . In particular, F 1 and F 2 are facts of the same relation and share the same elements at the same positions with F .
We now explain how to perform the chase in a way which does not violate well-orderedness and satisfies this inductive invariant.
Whenever we fire an ID in a chase proof, we cannot violate the well-orderedness property because all IDs are non-full. We will explain how we change the usual definition of chase steps to create multiple equivalent facts by instantiating the heads of violations multiple times. Whenever we want to fire an ID to create a fact R( c, d) that shares elements c with its parent, we create multiple copies R( c, d 1 ), . . . , R( c, d n ), each copy being labeled with one of the possible facts over c. It is clear that these additional facts make no difference to the chase. Further, all these one-fact subtrees are currently isomorphic. We must also argue that this ID firing can be performed in a way that does not break the isomorphism between subtrees rooted at equivalent ID-facts elsewhere in the chase, but it is clear, as they are currently isomorphic by induction hypothesis, so we can simply perform an analogous chase step by the same ID in all of them.
Whenever we fire a full GTGD in a chase proof with ID-guard G to create a fact H, we consider the topmost ancestor F of G that contains all elements of H (the topmost ancestor exists because the set of suitable ancestors is non-empty: indeed, G is a suitable choice). We then consider the path from F to G, and let G ′ be the ID-fact which can be found by following the same path but taking at each ID-fact the equivalent child labeled with H: this path clearly exists by definition of how IDs are fired. Now, by isomorphism of the rooted subtrees at each traversed ID-fact in the path, it is clear that G ′ is the ID-guard to a trigger τ ′ which is isomorphic to τ and will also create H (as the elements of H are shared between the subtrees). We fire the full GTGD on τ ′ , and doing so does not violate the well-orderedness condition: all possible choices for a counterexample F are on the path from G ′ to F , so they are labeled by the fact H, and it is clear throughout the construction that we cannot have created any other fact than H in their subtree. Further, given the domain of H, creating it does not break the inductive isomorphism condition for any subtrees rooted at an ID-fact below F . The only thing left is to preserve the isomorphism for subtrees rooted at ancestors of F , which we do by firing the analogous trigger in these subtrees exactly in the same way: note that this creates different facts, because at least one element of H does not appear outside of the descendants of F .
We have described a chase variant that produces a well-ordered chase proof, so we have established the desired result.
We are now ready to prove Proposition F.5:
Proof. One direction is straightforward: we can immediately show by induction on the derivation that any GTGD produced in the closure is indeed a derived GTGD. Hence, we focus on the converse direction.
We prove that every derived GTGD is produced in the closure, by induction on the length of a well-ordered chase proof of its head. Specifically, let the derived GTGD be γ : R( x, y) ∧ ⋀_i A_i( x) → λ( z), with z ⊆ x ∪ y. Let K 0 be a set of constants, and I 0 be the initial instance which consists of the instantiation of the body of γ on K 0 : we let a 0 , b 0 , c 0 be the tuples of K 0 corresponding to x, y, z, and call F 0 the instantiation of the guard atom (which we will see as the root ID-fact in the chase). We show that γ is produced in the closure by induction on the number of chase steps required in a well-ordered chase proof to produce λ 0 := λ( c 0 ) from I 0 . We can assume a well-ordered chase proof thanks to Lemma F.7.
The base case is when there is a well-ordered chase proof of length 0, i.e., λ 0 is in I 0 . In this case, γ is a trivial GTGD, so it is in the closure by construction.
We now show the induction case. We consider a well-ordered chase proof that produces λ 0 in as few chase steps as possible. The firing that produces λ 0 cannot be the firing of an ID, because the IDs are non-full, so they produce facts that contain some null whereas λ 0 is a fact on K 0 . Hence, λ 0 is produced by firing a full GTGD γ ′ on a trigger τ . We consider the ID-guard F of this firing. Either F = F 0 or F is a strict descendant of F 0 .
If F is F 0 , then it means that F 0 guards this firing. Hence, all facts of τ are facts over K 0 . Hence, those which are ID-facts cannot be another fact than F 0 , as the other ID-facts contain nulls not in K 0 . As for those that are full facts, for each such fact φ(K 0 ), we know that φ(K 0 ) was derived in the chase from I 0 , which means that the GTGD γ φ : R( x, y) ∧ ⋀_i A_i( x) → φ( z) is a derived GTGD. Now, for each such φ(K 0 ), as it is produced earlier than λ 0 in the chase, it means that there is a well-ordered chase that produces it in strictly fewer steps. Hence, by induction hypothesis, each γ φ is in the closure. We can now see that, as γ ′ ∈ Γ, as the γ φ are in the closure for φ(K 0 ) a full fact, and as the other φ(K 0 ) must be F 0 , we can apply the (Transitivity) rule and conclude that γ is also in the closure.
Now, if the ID-guard F of the firing is a strict descendant of F 0 in the chase tree, we consider the path in the chase from F 0 to F . Let F ′ be the ID-fact which is the first element of this path after F 0 : it is a child of F 0 and an ancestor of F . Let c be the elements shared between F ′ and F 0 . Observe that λ 0 is a fact on c, because it is a fact on K 0 with ID-guard F , so the elements of λ 0 are shared between F and F 0 , hence between F ′ and F 0 . Now, the deduction of λ 0 creates a new fact on c with ID-guard in the subtree rooted at F ′ : by definition of the well-ordered chase, it is the only such firing for F ′ . Thus, we know that, when we create F ′ , we had already derived all facts on c that are derived in the chase, except for λ. Let Φ be the set of these facts.
We now observe that, if we had started the chase with the ID-fact F ′ plus the set Φ, then we would also have deduced λ. Indeed, we can reproduce all chase steps that happened in the subtree rooted at F ′ , specifically, all steps where we applied IDs to a descendent of F ′ , and all full GTGD steps with ID-guard in the subtree rooted at F ′ . We show this by induction: the base case corresponds to the facts of {F ′ } ∪ Φ, the induction step is trivial for ID applications, and for full GTGD applications we know that all hypotheses to the firing are in guarded tuples of the subtree rooted at F ′ , so they were all generated previously in that subtree or were part of Φ. Thus, letting β ′ be the result of renaming the constants of {F ′ }∪ Φ by nulls in a manner compatible with the mapping from γ to I 0 , this shows that γ ′ : β ′ → λ is entailed by Σ. Now, this is a GTGD with breadth at most w because the width of IDs (and hence the width of Φ) is at most w. Hence, it is a derived GTGD, and the proof for this derived GTGD is shorter than that of γ. Hence, by induction hypothesis, we have γ ′ ∈ Σ b . We can now apply (ID) to γ ′ with the IDs that generated F ′ from F 0 , thanks to the fact that Φ and λ 0 are on K 0 so they are on exported positions of the ID. This yields γ ′′ : R( x, y) ∧ Φ ′ ( z) → λ( z), where Φ ′ is the result of renaming the elements of Φ to variables as in the definition of β ′ .
We now argue as in the base case that, as each fact of Φ was derived from I 0 with a shorter proof than the proof of γ, by induction hypothesis, for each φ ∈ Φ, the derived GTGD R( x, y) ∧ ⋀_i A_i( x) → φ( z) is in Σ b . We conclude, by applying (Transitivity) to γ ′′ and these derived GTGDs, that γ ∈ Σ b .
Hence, we have shown that γ was derived, which concludes the induction and finishes the completeness proof.
Normalization. We are now ready for the second stage of our proof: normalizing the chase to add short-cuts. The short-cut chase works by constructing bags: a bag is a set of facts consisting of one fact generated by an ID (called an ID-fact) and facts on the domain of the ID-fact (called full facts). We will have a tree structure on bags that corresponds to how they are created.
The short-cut chase then consists of two alternating kinds of steps:
• The ID steps, where we fire an ID on an ID-fact F where it is applicable. Let g be the bag of F . The ID step creates a new bag g ′ which is a child of g, which contains the result F ′ of firing the ID, along with a copy of the full facts of g which only use elements shared between F and F ′ .
• The full saturation steps, which apply to a bag g, only once per bag, precisely at the moment where it is created by an ID step (or on the root bag). In this step, we apply all the full GTGDs of Σ w to the facts of g, and add the consequences to g (they are still on the domain of g because the rules are full).
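As an illustration of the ID step, the following Python sketch builds a child bag from a parent bag. The representation is an assumption made for this example and does not come from the proof: a fact is a pair (relation name, tuple of elements), a bag is a dictionary with keys id_fact and full_facts, and an ID is described by its body relation, head relation, head arity, and a map from exported head positions to body positions. The names Null, fire_id and id_step are hypothetical. The sketch does not check whether the ID is already satisfied, and the full saturation step is not shown.

```python
import itertools


class Null:
    """A fresh labelled null; two nulls are equal only if they are the same object."""

    _ids = itertools.count()

    def __init__(self):
        self.id = next(Null._ids)

    def __repr__(self):
        return f"_n{self.id}"


def fire_id(fact, id_rule):
    """Fire an ID on a fact: exported positions are copied, the others get fresh nulls."""
    rel, args = fact
    body_rel, head_rel, head_arity, exported = id_rule
    if rel != body_rel:
        raise ValueError("the ID does not apply to this fact")
    new_args = tuple(
        args[exported[p]] if p in exported else Null()
        for p in range(head_arity)
    )
    return (head_rel, new_args)


def id_step(parent_bag, id_rule):
    """Create the child bag: the new ID-fact plus the parent's full facts
    that only use elements shared between the parent ID-fact and the new one."""
    child_fact = fire_id(parent_bag["id_fact"], id_rule)
    shared = set(parent_bag["id_fact"][1]) & set(child_fact[1])
    copied = {f for f in parent_bag["full_facts"] if set(f[1]) <= shared}
    return {"id_fact": child_fact, "full_facts": copied}


# Hypothetical example: root bag with ID-fact R(a, b) and the ID R(x, y) -> exists z S(x, z).
root = {
    "id_fact": ("R", ("a", "b")),
    "full_facts": {("A", ("a",)), ("B", ("b",)), ("C", ("a", "b"))},
}
child = id_step(root, ("R", "S", 2, {0: 0}))
```

In this example only the element a is shared with the parent, so the child bag keeps A(a) and drops the full facts that mention b.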
Our goal is to argue that the short-cut chase is equivalent to the usual chase. The short-cut chase consists of chase steps in the usual sense, so it is still universal. What is not obvious is that the infinite result of the short-cut chase satisfies Σ: indeed, we must argue that all violations are solved. This is not straightforward, because the short-cut chase does not consider all triggers: specifically, for full GTGDs, it only considers triggers that are entirely contained in a bag, and only for full GTGDs in Σ w (not those in Σ). So what we must do is argue that the short-cut chase does not leave any violations unsolved. The intuition for this is that the closure of Σ w suffices to ensure that all violations can be seen within a bag.
To show this formally, we will rely on the following observation, which uses closure of Σ w . In the statement of this lemma, we talk of the topmost bag that contains a guarded tuple: it is obvious by considering the domains of the bags that this is well-defined:
Lemma F.8. Consider the short-cut chase on an instance I 0 which is closed under the full GTGDs of Σ and of Σ w (we see I 0 as a root bag). Assume that a full saturation step on some bag g creates a fact F . Let g ′ be the topmost bag that contains the elements of F . Then F already holds in g ′ .
Proof. The claim is vacuous if g = g ′ . Hence, assume that g is a strict descendant of g ′ . Consider the moment in the chase where g ′ was created in an ID step. At this moment, g ′ consists of its ID-fact plus full facts copied over from the parent of g ′ (or none, in the case where g ′ is the root): as the IDs have width ≤ w, these full facts are on a domain c of size at most w. We let β be the set of these facts. As the short-cut chase proceeds entirely downwards in the tree, and it constructs a subset of the usual chase, by starting the chase with a root bag containing β, we know that F is deduced. Hence, letting β ′ and F ′ be the result of renaming the elements of β and F to variables, we know that the GTGD γ : β ′ → F ′ is entailed by Σ w . Hence, γ is a derived GTGD, so we must have γ ∈ Σ w . Thus, we have also applied γ in the full saturation step just after the moment where we created g ′ ; or, if g ′ is the root bag, it is closed under Σ and under Σ w by hypothesis. Hence, F already holds in g ′ .
This immediately implies the following:
Corollary F.9. For any fact F created in the short-cut chase, and for any bag g containing all elements of F , the fact F appears in the bag g.
Proof. If F is an ID-fact, it contains a null, and the topmost bag g containing all elements of F is the bag where this null was introduced, so it also contains F . Now, the bags that contain the elements of F form a subtree of the tree on bags rooted at g, and F is copied in all these bags.
If F is a full fact, we use Lemma F.8 to argue that F occurs in the topmost bag containing all its elements, and again F is copied in all other bags.
This allows us to show that the short-cut chase is equivalent to the full chase. Specifically, let I 0 be an arbitrary set of facts, which we consider as a root bag, and which has been closed under the full GTGDs of Σ and of Σ w . Let I be the result of the short-cut chase of I 0 by Σ w . We claim that I satisfies Σ.
Proof. Consider a trigger τ in I; we show that it is not active. All rules in Σ are guarded, so the domain of τ is guarded, and there is a topmost bag g where all elements of τ appear. By Corollary F.9, all facts of τ are reflected in all bags containing their elements, in particular they are all reflected in g. Hence, τ is included in a single bag g.
It is clear that any trigger for an ID would have been solved by an ID step, so we can assume that τ is a trigger for a full GTGD of Σ. Either g is the root bag, or it is not. If g is the root bag, then the trigger is satisfied by our assumption that I 0 is closed. If g is not the root bag, we need to argue that τ is a trigger for a full GTGD of Σ w . Indeed, when we created g by an ID step, g contained only an ID-fact plus σ ′ -facts on a domain of size at most w, thanks to the fact that the IDs have width at most w: and all further facts created in g are created by the full saturation step. Hence, if a trigger for Σ ∪ Σ w is active, it means that its head is entailed by the initial contents of g, so the corresponding full GTGD is suitable because its breadth is bounded by b. Hence, the trigger is also a trigger for the corresponding derived GTGD in Σ w . Now, as we have applied a full saturation step on g, any remaining trigger there for a full TGD of Σ w would have been solved in this step, because Σ w is closed so it does not leave any trigger by full TGDs unsatisfied in g. Hence, τ is no longer active in I.
We now know that the result I of the short-cut chase satisfies Σ, and as it only applies chase steps, it is actually equivalent to the chase, i.e., for any CQ Q, we have that Q is satisfied in I iff I 0 , Σ |= Q. All that remains now is to translate the short-cut chase to a set of IDs.
Linearization. We now describe the third and last stage of the proof of Theorem 6.8, by describing the translation. For every relation R, for every subset P = {p 1 , . . . , p k } of its positions of size at most w, for every instance χ of the relations of σ ′ on the x p 1 , . . . , x p k , we create a copy R P,χ of relation R. Observe that this creates a singly exponential number of relations when the arity a ′ of σ ′ is fixed, and it creates only polynomially many relations when we further fix w and σ ′ . We let Θ consist of the following IDs:
• Forget: for all R, P, χ, the full ID:
• Instantiate: for all R, P, χ, letting χ ′ be the instance on x 1 , . . . , x n obtained by computing the closure by the full TGDs of Σ w (a full saturation step), for every fact S( y) of χ ′ (with y ⊆ x), the full ID:
• Lift: for all R, P, χ, letting χ ′ be as above, for every fact S( y) of χ ′ (with y ⊆ x) such that there is an ID S( y) → T ( z), letting P ′ be the exported positions of this ID in z and χ ′′ be the restriction of χ ′ to P ′ , the full ID:
The result of this transformation is clearly a set of linear TGDs. Further, they are of semi-width w: indeed, the Lift rules have width bounded by w, and the other rules have an acyclic position graph. The last point is to see that these rules can clearly be computed in polynomial time in their number: this relies on the fact that the closure of a bag by the full GTGDs of Σ w contains a number of facts which is singly exponential in w, in the number n ′ of relations in σ ′ , and in the head arity bound. We have described the construction of Σ ′ . Now, to construct the rewriting Q Lin of Q, we simply take the query whose canonical database is obtained by closing the canonical database of Q under the full GTGDs of Σ and of Σ w .
To see why the result of this can be computed in the prescribed bound, we can assume that we have computed Σ w as we already know that it can be computed in the given time bound, so we need only reason about the complexity of applying the rules. Now, note that the domain size does not increase, and the number of possible new facts is bounded by |Adom(CanonDB(Q))|^h . Now, testing each possible rule application is in PTIME. Indeed, it amounts to testing homomorphism of the body of a full GTGD with I, for which it suffices to consider the guard atom, trying to map it to every fact of I, and then check whether the function that this defines is a homomorphism from the entire body to I.
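The guard-first test described above can be sketched in Python as follows. This is a minimal illustration under assumptions that are not in the paper: a fact is a pair (relation name, tuple of elements), an atom is a pair (relation name, tuple of variable names), a full GTGD is a triple (guard, side atoms, head) containing only variables, and the names match_atom, triggers and saturate are hypothetical. Because the guard contains every body variable, matching the guard against one fact fixes the whole candidate homomorphism, and the side atoms are then checked by lookup.

```python
def match_atom(atom, fact):
    """Try to map the atom onto the fact; return the variable assignment or None."""
    rel, variables = atom
    fact_rel, args = fact
    if rel != fact_rel or len(variables) != len(args):
        return None
    assignment = {}
    for var, value in zip(variables, args):
        # A repeated variable must be mapped to the same element each time.
        if assignment.setdefault(var, value) != value:
            return None
    return assignment


def triggers(rule, facts):
    """Yield every assignment that maps the whole body of the rule into facts."""
    guard, side_atoms, _head = rule
    for fact in facts:
        h = match_atom(guard, fact)
        if h is None:
            continue
        # The guard binds all body variables, so h is defined on every side atom.
        if all((rel, tuple(h[v] for v in vs)) in facts for rel, vs in side_atoms):
            yield h


def saturate(facts, rules):
    """Close a set of facts under full guarded rules (no new elements are created)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            head_rel, head_vars = rule[2]
            # Materialize the triggers before adding facts to the set.
            for h in list(triggers(rule, facts)):
                new_fact = (head_rel, tuple(h[v] for v in head_vars))
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


# Hypothetical rules: R(x, y) & A(x) -> B(y), and R(x, y) & B(y) -> C(x, y).
rules = [
    (("R", ("x", "y")), [("A", ("x",))], ("B", ("y",))),
    (("R", ("x", "y")), [("B", ("y",))], ("C", ("x", "y"))),
]
closed = saturate({("R", ("a", "b")), ("A", ("a",))}, rules)
```

Each pass tries every rule against every fact as a guard image, which matches the polynomial per-rule test in the argument; the loop ends because full rules only produce facts over the existing domain.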
The last thing to show is that, on the resulting Q Lin , the short-cut chase is equivalent to the chase by Θ. To see this, consider the short-cut chase where each bag is annotated by the relation for the ID-fact that created it, the subset of positions of the elements that it shared with its parent, and the subinstance that was copied by the parent; and consider the ID-chase by the rules of the form Lift in Θ. We can observe by a straightforward induction that the tree structure on the bags of the short-cut chase with the indicated labels is isomorphic to the tree of facts of the form R P,χ ( x) created in the ID chase by the Lift rules. Now, the application of the rules Forget and Instantiate creates precisely the facts contained in these bags, so this shows that the ID chase by Θ and the short-cut chase create precisely the same facts.
This shows that Θ indeed satisfies the hypotheses of Theorem 6.8, and concludes the proof.

G. Additional results from summary table

G.1. Failure of approximation for GC 2
Tables 1 and 2 state that GC 2 constraints do not admit choice approximation, and hence do not admit existence check approximation or FD approximation.
We explain why this is true. In GC 2 one can write a constraint saying that U has at most two elements, namely: ∃ ≤2 x U (x). Now suppose we have a free access to U with result bound 2. Note that the result bound has no impact given the constraint. Suppose we also have a free access to a unary relation V with no result bound, and consider the query Q : ∃x U (x) ∧ V (x). Then Q has a monotone plan that answers it: simply access U and V by their respective methods and compare.
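The effect of the result bound on this naive plan can be checked with a short Python sketch. It rests on two assumptions that should be read as such: Q is taken to be the Boolean query ∃x U(x) ∧ V(x), and a result bound of k is taken to mean that an access returns all matching tuples when there are at most k of them and exactly k of them otherwise. The names valid_outputs and plan_results are hypothetical, and unary relations are represented as plain sets of values.

```python
from itertools import combinations


def valid_outputs(matches, bound):
    """All outputs that an access may return under the assumed result-bound semantics."""
    matches = frozenset(matches)
    if bound is None or len(matches) <= bound:
        return [matches]
    return [frozenset(c) for c in combinations(sorted(matches), bound)]


def plan_results(u, v, bound_u):
    """Possible answers of the plan 'access U, access V, compare' for the assumed Q."""
    return {bool(out & frozenset(v)) for out in valid_outputs(u, bound_u)}


# With |U| <= 2 and result bound 2, the access to U is complete.
with_bound_2 = plan_results({1, 2}, {1}, 2)
# With result bound 1, the answer depends on which tuple the access returns.
with_bound_1 = plan_results({1, 2}, {1}, 1)
```

With bound 2 the plan has a single possible answer on every instance satisfying the constraint. With bound 1 the same plan can return either answer on U = {1, 2}, V = {1}. This only shows that the naive plan breaks; it is not an argument that no plan exists.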
However, if the result bound for U is replaced with 1, then Q is not access-determined, and hence there is no plan (monotone or not). Indeed, consider an instance I 1 in which U = {1, 2}, V = {1}, and another instance I 2 in which U = {0, 2}, V = {1}. There is a common accessible part for the choice approximation, by taking the common value 2 for U and the full contents of V . But Q holds in I 1 but not in I 2 , which shows that Q is not access-determined.

G.2. Undecidability for equality-free first-order constraints
Table 1 states the following:
Proposition G.1. It is undecidable, given a CQ Q and a schema Sch with arbitrary constraints expressed in equality-free first-order logic, to decide whether Q has a monotone plan with respect to Sch.
This result is true even without result bounds, and follows from results in [7]: we give an argument here for completeness. Satisfiability for equality-free first-order constraints is undecidable [1]. We will reduce from this to show undecidability of answerability. Let us prove Proposition G.1:
Proof. Assume that we are given a satisfiability problem consisting of equality-free first-order constraints Σ. We produce from this an answerability problem where the schema has no access methods and constraints Σ, and we have a CQ Q consisting of a single 0-ary relation A not mentioned in Σ.
We claim that this gives a reduction from unsatisfiability to answerability, and thus shows that the latter problem is undecidable for equality-free first-order constraints.
If Σ is unsatisfiable, then vacuously any plan answers Q: since answerability is a condition where we quantify over all instances satisfying the constraints, this is vacuously true when the constraints are unsatisfiable because we are quantifying over the empty set.
Conversely, if there is some instance I satisfying Σ, then we let I 1 be formed from I by setting A to be true and I 2 be formed by setting A to be false. I 1 and I 2 both satisfy Σ and have the same accessible part, so they form a counterexample to AMonDet. Thus, there cannot be any monotone plan for Q. This establishes the correctness of our reduction, and concludes the proof of Proposition G.1.

H. Generalization of results to RA-plans
In the body of the paper we dealt with monotone answerability. However, at the end of Section 2 and in Section 7, we claimed that most of the results in the paper, including reduction to query containment and simplifying schemas, generalize in the "obvious way" to answerability where general relational algebra expressions are allowed. In addition, the results on complexity for monotone answerability that are shown in the body extend to answerability with RA-plans, with one exception: we do not have a decidability result for UIDs and FDs analogous to Theorem 6.11. We explain the generalizations now. In addition, in Proposition H.8 we show that answerability and monotone answerability correspond for IDs (this generalizes results known for views).

H.1. Variant of reduction results for RA-answerability
We first formally define the analog of AMonDet for RA-answerability. In the absence of result bounds, this is the notion of access-determinacy [7,8], which states that two instances with the same accessible part must agree on the query result. Here we generalize this to the presence of result bounds.
For a schema Sch, a common subinstance I A of I 1 and I 2 is jointly access-valid if, for any access performed with a method of Sch in I A , there is a set of matching tuples in I A which is a valid result for the access in I 1 and in I 2 . In other words, there is an access selection ς for I A whose results are valid in I 1 and in I 2 .
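For small finite instances, this definition can be checked by brute force. The Python sketch below is an illustration under stated assumptions: an instance is a dictionary from relation names to sets of tuples, a method is a triple (relation, input positions, result bound or None), bindings range over the active domain of I A, and a result bound of k means that an access returns all matching tuples when there are at most k and exactly k of them otherwise. The function names are hypothetical, and the code does not check that I A is a subinstance of I 1 and I 2.

```python
from itertools import combinations, product


def matching(instance, rel, positions, binding):
    """Tuples of rel that agree with the binding on the input positions."""
    return {
        t for t in instance.get(rel, set())
        if all(t[p] == b for p, b in zip(positions, binding))
    }


def is_valid(out, all_matches, bound):
    """Is out a valid access result, under the assumed result-bound semantics?"""
    if not out <= all_matches:
        return False
    if bound is None or len(all_matches) <= bound:
        return out == all_matches
    return len(out) == bound


def subsets(items):
    items = list(items)
    return (set(c) for r in range(len(items) + 1) for c in combinations(items, r))


def jointly_access_valid(i_a, i_1, i_2, methods):
    """Check that every access in i_a has a response inside i_a valid in i_1 and i_2."""
    adom = {v for tuples in i_a.values() for t in tuples for v in t}
    for rel, positions, bound in methods:
        for binding in product(adom, repeat=len(positions)):
            m_a = matching(i_a, rel, positions, binding)
            m_1 = matching(i_1, rel, positions, binding)
            m_2 = matching(i_2, rel, positions, binding)
            if not any(
                is_valid(out, m_1, bound) and is_valid(out, m_2, bound)
                for out in subsets(m_a)
            ):
                return False
    return True


# The instances of Section G.1, with input-free methods on U and V.
i_1 = {"U": {(1,), (2,)}, "V": {(1,)}}
i_2 = {"U": {(0,), (2,)}, "V": {(1,)}}
i_a = {"U": {(2,)}, "V": {(1,)}}
bound_1 = jointly_access_valid(i_a, i_1, i_2, [("U", (), 1), ("V", (), None)])
bound_2 = jointly_access_valid(i_a, i_1, i_2, [("U", (), 2), ("V", (), None)])
```

On these instances the common subinstance is jointly access-valid when U has result bound 1 and is not when the bound is 2. The enumeration of subsets is exponential and is only meant for toy examples.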
The following shows that this notion of I 1 and I 2 having one consistent view of the data agrees with a definition via the notion of accessible part. This is the analogue of Proposition 3.1:
Proposition H.1. The following are equivalent:
1. I 1 and I 2 have a common subinstance I A that is jointly access-valid.
2. There is an I Accessed that is an accessible part for I 1 and for I 2 .
Proof. Suppose I 1 and I 2 have a common subinstance I A that is jointly access-valid. Let ς be the function that maps each access performed in I A to a set of matching tuples of I A that is a valid result in I 1 and in I 2 . We can see that ς can be used as a selection function in I 1 and I 2 , and AccPart(ς, I 1 ) = AccPart(ς, I 2 ). Thus the first item implies the second.
Conversely, suppose there is a selection function ς 1 for I 1 and ς 2 for I 2 such that AccPart(ς 1 , I 1 ) = AccPart(ς 2 , I 2 ). Let I Accessed := AccPart(ς 1 , I 1 ). Given a binding AccBind, mt in I Accessed , we know that there is i such that AccBind is in AccPart i (ς 1 , I 1 ). Thus we can choose a valid response for I 1 in I Accessed . But this response must also be in AccPart(ς 2 , I 2 ), and thus it is valid in I 2 as well. Thus, I Accessed is jointly access-valid, and it is clearly a subinstance of I 1 and I 2 .
Given a schema Sch with constraints and result-bounded methods, a query Q is said to be access-determined if for any two instances I 1 , I 2 satisfying the constraints of Sch, if I 1 and I 2 have a common subinstance that is jointly access-valid then Q(I 1 ) = Q(I 2 ).
The following analogue of Proposition B.1 justifies the definition:
Proposition H.2. If Q has a plan that answers it w.r.t. Sch, then Q is access-determined over Sch.
Proof. Consider instances I 1 and I 2 with a common accessible part I, and a plan PL that answers Q making use of access methods from the schema. We argue that there are access selection functions ς 1 on I 1 , ς 2 on I 2 and ς on I such that PL evaluated with ς 1 , I 1 , PL evaluated with ς 2 , I 2 , and PL evaluated with ς, I all yield the same output for each temporary table of PL. We prove this by induction on PL. Inductively, it suffices to look at an access command T ⇐ mt ⇐ E with mt an access method on some relation. E can be assumed (using the induction hypothesis) to evaluate to the same set of tuples E 0 on I as on I 1 and I 2 . Given a tuple t in E 0 , consider the set O t of "matching tuples" (tuples for the relation R extending t) in I. Suppose that this set has cardinality j where j is strictly smaller than the result bound of mt. Then we can see that the set of matching tuples in I 1 and in I 2 must be exactly O t , and we can take O t to be the result of the access on t in all 3 structures. Suppose now O t has size at least that of the result bound. Then the other structures may have additional results, but we are again free to take a subset of O t of the appropriate size to be the response to the access to mt with t in all 3 structures. Unioning the tuples for all t in E 0 completes the induction. Now assume that PL answers Q. Then it must give the same result for any access selection function on I, I 1 or I 2 . In particular Q cannot distinguish I 1 and I 2 .
Analogously to Theorem 3.2, we can show access-determinacy is equivalent to RA-answerability. The proof starts the same way as that of Theorem 3.2, noting that in the absence of result bounds, this equivalence was shown in prior work:
Theorem H.3. [7,8] For any CQ Q and schema Sch (with no result bounds) whose constraints Σ are expressible in active-domain first-order logic, the following are equivalent:
1. Q has an RA-plan that answers it over Sch.
2. Q is access-determined over Sch.
The extension to result bounds is shown using the same reduction as for Theorem 3.2, by just "axiomatizing" the additional selections. This gives the immediate generalization of Theorem H.3 to schemas that may include result bounds:
Theorem H.4. For any CQ Q and schema Sch whose constraints Σ are expressible in active-domain first-order logic, the following are equivalent: where above x denotes the input positions of mt in R.
The only difference from the AMonDet containment is that the additional constraints are now symmetric in the two signatures, primed and unprimed. The following proposition follows immediately from Theorem H.4 and the definition of access-determinacy:
Proposition H.5. For any conjunctive query Q and schema Sch with constraints expressible in active-domain first-order logic and result bounds, the following are equivalent:
• Q has an RA-plan that answers it over Sch
• Q is access-determined over Sch
• The containment corresponding to access-determinacy holds:
From the proposition we can conclude an analog of Theorem 3.5:
Theorem H.6. We can decide if a CQ Q is RA-answerable with respect to a schema Sch where all relations have arity at most 2 and whose constraints are expressible in GC 2 .
Elimination of result upper bounds for RA-plans. As with monotone answerability, it suffices to consider only result lower-bounds.
Proposition H.7. Consider a schema Sch with arbitrary constraints and access methods, some of which may be result-bounded. A query Q is answerable in Sch if and only if it is answerable in Relax(Sch).
Proof. By Theorem H.4, answerability is equivalent to access-determinacy. So the proof now follows exactly the lines of Proposition 3.3, considering instances I 1 and I 2 that satisfy the constraints, and showing that any common subinstance I Accessed of I 1 and I 2 is jointly access-valid for I 1 and I 2 on Sch iff it is jointly access-valid for I 1 and I 2 on Relax(Sch). The argument for this is essentially that of Proposition 3.3. In the forward direction, suppose we have an access with binding in I Accessed , and there is a set of matching tuples that is a set of results in I Accessed that is valid in both I 1 and I 2 for Sch. We consider only the case of accesses with result-bounded methods, since the non-result-bounded case is even easier. Then clearly the set is valid for both I 1 and I 2 in Relax(Sch). Conversely, consider a set of results that is valid in Relax(Sch) for both I 1 and I 2 . If the number of tuples returned is ≤ k, we see immediately that the same response is a valid result for the corresponding access in Sch. If it is greater than k, clearly any choice of k tuples among the response gives a valid result for the corresponding access in Sch.
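The two directions of this argument can be illustrated on a single access with a short Python sketch. The semantics used here are assumptions read off from the proof rather than definitions quoted from the paper: in Sch, an access with result bound k returns all matching tuples if there are at most k and exactly k of them otherwise; in Relax(Sch), it returns all matching tuples if there are at most k and at least k of them otherwise. The function names are hypothetical.

```python
def valid_in_sch(out, matches, k):
    """Validity of a response for a method with result bound k (assumed semantics)."""
    if not out <= matches:
        return False
    return out == matches if len(matches) <= k else len(out) == k


def valid_in_relax(out, matches, k):
    """Validity of a response when only the result lower bound k is kept."""
    if not out <= matches:
        return False
    return out == matches if len(matches) <= k else len(out) >= k


def truncate(out, k):
    """Keep the response if it has at most k tuples, else pick k of them."""
    ordered = sorted(out)
    return set(ordered) if len(ordered) <= k else set(ordered[:k])


matches = {1, 2, 3, 4}
relaxed_response = {1, 2, 3}          # valid for Relax(Sch) with k = 2, too large for Sch
bounded_response = truncate(relaxed_response, 2)
```

Every response that is valid in Sch passes valid_in_relax unchanged, and truncating a response that is valid in Relax(Sch) yields one that is valid in Sch, which mirrors the case split on the number of returned tuples.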

H.2. Full-answerability and monotone answerability
We show that there is no difference between full-answerability and monotone answerability when constraints consist of IDs only. This is a generalization of an observation that is known for views (see, e.g., Proposition 2.15 in [7]):
Proposition H.8. Let Sch be a schema with access methods and constraints consisting of inclusion dependencies, and Q be a CQ that is access-determined. Then Q is AMonDet.
Proof. Towards proving AMonDet, assume we have instances I 1 and I 2 satisfying the IDs, an accessible part A 1 of I 1 , an accessible part A 2 of I 2 such that A 1 ⊆ A 2 , while Q holds in I 1 but not in I 2 . Modify I 2 by replacing each element that is in I 1 but not in A 1 by a copy that is not in I 1 . Since the resulting instance is isomorphic to I 2 , we do not impact the fact that A 2 is a valid accessible part for I 2 . We also do not impact that I 2 satisfies the constraints and that Q fails in I 2 . Thus we can assume that if a value in I 2 is not in A 1 , then it is not in I 1 .
Let I ′ 1 = I 1 ∪ I 2 . We see that Q holds in I ′ 1 and the constraints also hold, since IDs are preserved under taking unions. We claim that A 2 is a valid accessible part for I ′ 1 . Consider an access with a binding AccBind from A 2 to a method mt; we need to show that there is a valid response in A 2 . We look first at the case where mt has result lower bound k. Suppose that for some i ≤ k there are at least i matches in I ′ 1 ; we need to show that we can find this many matches in A 2 . If there are at least i matches in I 2 , then we are done because A 2 is a valid accessible part in I 2 . So assume there are i ′ < i matches in I 2 . Again, since A 2 is a valid accessible part, we know that all of these are in A 2 . Further, there must be at least i − i ′ > 0 matches in I 1 − I 2 . But then the values of AccBind must be in A 1 , since if they were not in A 1 then (by the pre-processing step above) they would not be in I 1 at all, and hence there could be no matches in I 1 . Now if the number of matches in I 1 is at least i, we are done, since A 1 is a valid accessible part of I 1 . Otherwise, the number of matches in I 1 is strictly less than i, and hence A 1 , and thus A 2 , must contain all of them. We conclude that A 2 has i matches as required.
Suppose now that mt has no result bound, and consider the set of matches for AccBind in I ′ 1 . All matches that lie in I 2 are necessarily accounted for in A 2 , since A 2 is a valid accessible part. But as above, if there are any matches in I 1 , then the binding must be in A 1 , and thus all matches in I 1 must be in A 1 and thus in A 2 . Either way, we conclude that all matches must lie in A 2 as required.
We have shown that I ′ 1 and I 2 have a common accessible part. Since Q holds in I ′ 1 , by access-determinacy Q holds in I 2 , a contradiction.
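The proof above uses the fact that IDs are preserved under taking unions. As a minimal sketch of that step, assuming a toy encoding that is not part of the paper (instances as dictionaries from relation names to sets of tuples, and an ID as a pair of projections with 0-based positions), one can check the property on the referential constraint of the running example. All facts below are made up for illustration.

```python
from typing import Dict, Set, Tuple

Instance = Dict[str, Set[tuple]]
# Hypothetical encoding of an inclusion dependency R[src_positions] ⊆ S[dst_positions].
ID = Tuple[str, Tuple[int, ...], str, Tuple[int, ...]]


def project(instance: Instance, rel: str, positions: Tuple[int, ...]) -> Set[tuple]:
    """Project every fact of `rel` onto the given positions."""
    return {tuple(fact[p] for p in positions) for fact in instance.get(rel, set())}


def satisfies_ids(instance: Instance, ids) -> bool:
    """Check that each ID holds in the instance."""
    return all(
        project(instance, r, r_pos) <= project(instance, s, s_pos)
        for r, r_pos, s, s_pos in ids
    )


def union(i1: Instance, i2: Instance) -> Instance:
    """Relation-wise union of two instances."""
    return {rel: i1.get(rel, set()) | i2.get(rel, set()) for rel in set(i1) | set(i2)}


# Every id in Profinfo is in Udirectory (the constraint of Example 1.1).
ids = [("Profinfo", (0,), "Udirectory", (0,))]
# Illustrative facts only.
i1 = {"Profinfo": {(1, "Ng", 5000)}, "Udirectory": {(1, "addr1", "555-0001")}}
i2 = {"Profinfo": {(2, "Lee", 6000)},
      "Udirectory": {(2, "addr2", "555-0002"), (3, "addr3", "555-0003")}}

assert satisfies_ids(i1, ids) and satisfies_ids(i2, ids)
assert satisfies_ids(union(i1, i2), ids)  # the union still satisfies the ID
```

The check succeeds because any witness for an ID in I 1 or in I 2 is still present in the union. The same is not true of FDs in general, which is why the FD case is handled separately in this appendix.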
From Proposition H.8 we immediately see that in the case where the constraints consist of IDs only, all the results about monotone answerability with result bounds transfer to answerability. This includes approximation results and complexity bounds.

H.3. Enlargement for RA-answerability
We now explain how the method of "blowing up counterexamples" introduced in the body extends to work with access-determinacy. We consider a counterexample to access-determinacy in the approximation (i.e., a pair of instances that satisfy the constraints and have a common subinstance that is jointly access-valid, but where one satisfies the query and the other does not), and we show that it can be enlarged to a counterexample to access-determinacy in the original schema.
Definition H.9. A counterexample to access-determinacy for a CQ Q and a schema Sch on a signature σ, integrity constraints Σ, and access methods, is a pair of instances I 1 , I 2 on σ such that I 1 satisfies Q, I 2 satisfies ¬Q, and I 1 and I 2 have a common subinstance I Accessed that is jointly access-valid for I 1 and I 2 .
It is clear that, whenever there is a counterexample to access-determinacy for schema Sch and query Q, then Q is not access-determined w.r.t. Sch.
We now state the enlargement lemma that we use, which is the direct analogue of Lemma 4.3:
Lemma H.10. Let Sch and Sch ′ be schemas and Q be a CQ that is not access-determined in Sch ′ . Suppose that for some counterexample I 1 , I 2 to access-determinacy for Q in Sch ′ we can construct instances I + 1 ⊇ I 1 and I + 2 ⊇ I 2 that satisfy the constraints of Sch and have a common subinstance I Accessed that is jointly access-valid for Sch, and such that I + 2 has a homomorphism to I 2 . Then Q is not access-determined in Sch.
Proof. We prove the contrapositive of the claim. Let Q be a query which is not access-determined in Sch ′ , and let {I 1 , I 2 } be a counterexample. Using the hypothesis, we construct I + 1 and I + 2 . It suffices to observe that they are a counterexample to access-determinacy for Q and Sch, which we show. First, they satisfy the constraints Σ and have a common subinstance which is jointly access-valid. Second, as I 1 satisfies Q and I 1 ⊆ I + 1 , we know that I + 1 satisfies Q. Last, as I 2 does not satisfy Q and I + 2 has a homomorphism to I 2 , we know that I + 2 does not satisfy Q. Hence, I + 1 , I + 2 is a counterexample to access-determinacy of Q in Sch, which concludes the proof.

H.4. Choice approximability for RA-answerability
We say that a schema is RA choice approximable if any CQ that has an RA-plan over the schema has one over its choice approximation. The following result is the counterpart to Theorem 4.5:
Theorem H.11. Let schema Sch have constraints given by equality-free first-order constraints, and Q be a CQ that is access-determined w.r.t. Sch. Then Q is also access-determined in the choice approximation of Sch. In particular, the result holds for schemas with constraints given by TGDs.
The proof follows that of Theorem 4.5 with no surprises. This time we fix (I 1 , I 2 ) that is a counterexample to access-determinacy in the approximation: that is, I 1 satisfies the query, I 2 violates the query, I 1 and I 2 satisfy the equality-free first-order constraints Σ, and I 1 and I 2 have a common subinstance I Accessed which is jointly access-valid in the approximation. We expand them to I + 1 and I + 2 that have a common subinstance that is jointly access-valid in Sch. Our construction is identical to the blow-up used in Theorem 4.5: for each element a in the domain of I 1 , introduce infinitely many fresh elements $a_j$ for $j \in \mathbb{N}_{>0}$, and identify $a_0 := a$. Now, define I + 1 := Blowup(I 1 ), where we define Blowup(I 1 ) as the instance with facts $\{R(a^1_{i_1}, \ldots, a^n_{i_n}) \mid R(\vec{a}) \in I_1,\ \vec{i} \in \mathbb{N}^n\}$, writing $\vec{a} = (a^1, \ldots, a^n)$ and $\vec{i} = (i_1, \ldots, i_n)$. Define I + 2 from I 2 in the same way. The proof of Theorem 4.5 already showed that I 1 and I + 1 agree on all equality-free first-order constraints (and likewise I 2 and I + 2 ), that I + 1 still satisfies the query, and that I + 2 still violates the query.
We need to construct a common subinstance that is jointly access-valid in Sch, and we do this as in Theorem 4.5, setting I + Accessed := Blowup(I Accessed ). The argument for correctness is given in the proof of Theorem 4.5, but substituting jointly access-valid for access-valid. This completes the proof of Theorem H.11, using Lemma H.10.
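To make the blow-up concrete, here is a minimal sketch of a finite truncation of Blowup, which is an assumption made for illustration: the construction in the proof uses infinitely many copies of each element, whereas the code below uses only k copies (indices 0 to k − 1). The helper names copy_of, blowup and collapse, and the tagged-tuple encoding of copies, are hypothetical.

```python
from itertools import product


def copy_of(a, j):
    """The j-th copy of element a; copy 0 is identified with a itself."""
    return a if j == 0 else ("copy", a, j)


def blowup(instance, k):
    """Finite truncation of Blowup: each fact R(a^1, ..., a^n) yields one fact
    per choice of copy indices (i_1, ..., i_n) in {0, ..., k-1}^n."""
    return {
        rel: {
            tuple(copy_of(a, j) for a, j in zip(fact, idx))
            for fact in facts
            for idx in product(range(k), repeat=len(fact))
        }
        for rel, facts in instance.items()
    }


def collapse(x):
    """Map every copy back to the element it was copied from."""
    is_copy = isinstance(x, tuple) and len(x) == 3 and x[0] == "copy"
    return x[1] if is_copy else x


# Illustrative instance with a single binary fact (assumes no element is
# itself a tuple of the form ("copy", ..., ...)).
i1 = {"R": {("a", "b")}}
big = blowup(i1, 3)

assert i1["R"] <= big["R"]                      # the original instance is contained
assert len(big["R"]) == 9                       # 3 copies per position, arity 2
# Collapsing copies is a homomorphism back to the original instance.
assert {tuple(collapse(x) for x in f) for f in big["R"]} == i1["R"]
# An access on the first position with binding "a" now has 3 matches instead of 1.
assert len([f for f in big["R"] if f[0] == "a"]) == 3
```

The last check illustrates why the blow-up helps: every access that had at least one match gets many matches, so with infinitely many copies any finite result bound behaves like a bound of 1.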
As with choice approximation for AMonDet, this result can be applied immediately to TGDs, and is particularly useful for TGDs with decidable query containment. If we consider frontier-guarded TGDs, the above result says that we can assume any result bounds are 1, and thus the query containment problem produced by Proposition H.5 will involve only frontier-guarded TGDs. We thus get the following analog of Theorem 5.1:
Theorem H.12. We can decide whether a CQ Q has an RA-plan with respect to a schema with result bounds whose constraints are frontier-guarded TGDs. The problem is 2EXPTIME-complete.

H.5. FD approximability for RA-plans
A schema is FD approximable for RA-plans if every CQ having an RA-plan over the schema has an RA-plan in its FD approximation.
We now show that schemas whose constraints consist only of FDs are FD approximable.
Theorem H.13. If Sch has all of its constraints as FDs, and Q has an RA-plan over Sch, then Q has an RA-plan over the FD approximation of Sch.
Proof. It suffices to assume that we have I 1 and I 2 agreeing on an accessible part A in the FD approximation of Sch, with I 1 |= Q, I 2 |= ¬Q, and to upgrade them to J 1 , J 2 having the same property for Sch.
Consider an access using a method mt on relation R with binding AccBind having values in A, in which I 1 and I 2 disagree on the output. Let in mt be the input positions of mt, and DetBy(in mt ) be the positions determined by them under the FDs. Clearly there must be some positions X of R that are not in DetBy(in mt ), since otherwise the hypotheses would imply that the matching tuples are the same in I 1 and I 2 . We can also conclude that the matching tuples in both I 1 and I 2 must be non-empty and that the matching tuples in I 1 ∪ I 2 must all agree on positions in DetBy(in mt ).
We can blow up the output of this access in I 1 and I 2 . If k is the result bound of mt, we add k tuples with all positions in DetBy(in mt ) agreeing with the common value of all matching tuples of I 1 ∪ I 2 . For the other positions we pick values that are disjoint from each other and from other values in I 1 ∪ I 2 . Note that in doing this we do not break any FD whose determinant is not contained in DetBy(in mt ), because the new tuples carry fresh values outside DetBy(in mt ); and by definition of X we do not break any FD whose determinant is contained in DetBy(in mt ).
Iteratively performing this operation on each access gives us J 1 , J 2 having a common accessible part in Sch. Further, J i has a homomorphism back to I i , which implies that J 2 |= ¬Q.
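The blow-up step in this proof can be sketched as follows. This is a minimal illustration under assumptions that are not from the paper: FDs are encoded as pairs (determinant positions, determined position) with 0-based positions, DetBy is computed by the usual attribute-closure fixpoint, and fresh values are tagged tuples ("null", n). The function names are hypothetical.

```python
from itertools import count


def det_by(input_positions, fds):
    """Positions determined by the input positions under the FDs (closure)."""
    closure = set(input_positions)
    changed = True
    while changed:
        changed = False
        for determinant, determined in fds:
            if set(determinant) <= closure and determined not in closure:
                closure.add(determined)
                changed = True
    return closure


def padding_tuples(template, det_positions, k, fresh):
    """k new tuples that agree with `template` on det_positions and carry
    pairwise distinct fresh values on every other position."""
    return [
        tuple(template[p] if p in det_positions else next(fresh)
              for p in range(len(template)))
        for _ in range(k)
    ]


def satisfies_fds(facts, fds):
    """Check that no two facts agree on a determinant and differ on the determined position."""
    return not any(
        all(s[p] == t[p] for p in determinant) and s[determined] != t[determined]
        for s in facts for t in facts for determinant, determined in fds
    )


# Illustrative ternary relation with the FD "position 0 determines position 1",
# and a method whose only input position is 0 with result bound k = 2.
fds = [((0,), 1)]
det = det_by({0}, fds)
fresh = (("null", n) for n in count())
i1 = {(1, "x", "u")}   # matching tuples in I_1
i2 = {(1, "x", "v")}   # matching tuples in I_2: they differ outside DetBy
extra = padding_tuples((1, "x", "u"), det, 2, fresh)
j1, j2 = i1 | set(extra), i2 | set(extra)  # the same tuples are added to both

assert det == {0, 1}
assert satisfies_fds(j1, fds) and satisfies_fds(j2, fds)
```

Mapping each fresh value back to the corresponding value of an original matching tuple gives the homomorphism from J i to I i used at the end of the proof.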

H.6. Complexity of RA-answerability for FDs
In Theorem 5.4 we showed that monotone answerability with FDs was decidable in the lowest possible complexity, namely, NP.
The argument involved first showing FD-approximability, which allowed us to eliminate result bounds at the cost of adding additional IDs. We then simplified the resulting rules so that they are acyclic, ensuring that the chase would terminate. This relied on the fact that the axioms for AMonDet would include rules going from R mt to R ′ mt , but not vice versa. Hence, the argument does not generalize for the rules that axiomatize RA plans.
However, we can repair the argument at the cost of adding an additional assumption. A schema Sch with access methods is single method per relation, abbreviated SMPR, if for every relation there is at most one access method. This assumption was made in many papers on access methods [25, 25], although we do not make it by default elsewhere in this work.
Theorem H.14. We can decide whether a CQ Q has a plan with respect to an SMPR schema with result bounds whose constraints are FDs. The problem is NP-complete.
We will actually show something stronger: for SMPR schemas with constraints consisting of FDs only, there is no difference between full-answerability and monotone answerability. Given Theorem 5.4, this immediately implies H.14.
Proposition H.15. Let Sch be a schema with access methods satisfying SMPR and constraints Σ consisting of functional dependencies, and Q be a CQ that is access-determined. Then Q is AMonDet.
Proof. We know from Theorem H.13 that the schema is FD-approximable. Thus we can eliminate result bounds, obtaining a schema Sch ′ , as follows:
• We extend the relations of Sch by adding, for each result-bounded method mt on relation R, a relation R mt of arity |DetBy(mt)|, where DetBy(mt) denotes the positions determined by input positions of mt under the FDs.
• We extend the constraints to add, for each result-bounded method mt, the following ID constraints: R( x, y, z) → R mt ( x, y) and R mt ( x, y) → ∃ z R( x, y, z) where x denotes the input positions of mt, and y denotes the other positions of DetBy(mt).
• The access methods of Sch ′ are the methods of Sch that have no result bounds, plus the following: for each result-bounded method mt on relation R with input positions j 1 . . . j m , a method mt ′ on R mt whose input positions are the first m positions of R mt . Using the FDs on R and the constraints relating R to R mt , we can see that any access to mt ′ is guaranteed to return at most one result.
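The three steps above can be sketched as a small schema transformation. The encoding is an assumption made for illustration and does not come from the paper: methods are dictionaries with hypothetical keys name, relation, inputs and bound (None meaning no result bound), and an ID is a tuple (source relation, source positions, target relation, target positions) with 0-based positions.

```python
def det_by(input_positions, fds):
    """Positions determined by the input positions under the FDs (closure)."""
    closure = set(input_positions)
    changed = True
    while changed:
        changed = False
        for determinant, determined in fds:
            if set(determinant) <= closure and determined not in closure:
                closure.add(determined)
                changed = True
    return closure


def eliminate_result_bounds(arities, fds, methods):
    """Return the relations, added IDs, and access methods of the new schema."""
    new_arities, ids, new_methods = dict(arities), [], []
    for m in methods:
        if m["bound"] is None:
            new_methods.append(m)          # methods without result bounds are kept
            continue
        rel, inputs = m["relation"], list(m["inputs"])
        closure = det_by(inputs, fds.get(rel, []))
        # Input positions come first, then the other positions of DetBy(mt).
        kept = inputs + sorted(closure - set(inputs))
        view = f"{rel}_{m['name']}"        # plays the role of R_mt
        new_arities[view] = len(kept)
        view_positions = tuple(range(len(kept)))
        ids.append((rel, tuple(kept), view, view_positions))   # R -> R_mt
        ids.append((view, view_positions, rel, tuple(kept)))   # R_mt -> exists z. R
        # New method mt' on R_mt: its inputs are the first m positions, no bound.
        new_methods.append({"name": m["name"] + "'", "relation": view,
                            "inputs": tuple(range(len(inputs))), "bound": None})
    return new_arities, ids, new_methods


# Illustrative schema: ternary R, FD "position 0 determines position 1",
# a result-bounded method mt on position 0 and an unbounded method mt2.
arities = {"R": 3}
fds = {"R": [((0,), 1)]}
methods = [{"name": "mt", "relation": "R", "inputs": (0,), "bound": 5},
           {"name": "mt2", "relation": "R", "inputs": (0, 1, 2), "bound": None}]
new_arities, ids, new_methods = eliminate_result_bounds(arities, fds, methods)

assert new_arities == {"R": 3, "R_mt": 2}
assert ids == [("R", (0, 1), "R_mt", (0, 1)), ("R_mt", (0, 1), "R", (0, 1))]
```

In this example the new relation R_mt keeps only positions 0 and 1 of R, so by the FD on R an access to the new method on position 0 can return at most one tuple, matching the remark in the last step above.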
By Proposition H.5 we know that Q is access-determined exactly when Q ⊆ Γ Q ′ , where Γ contains two copies of the above schema and also the (Forward) and (Backward) accessibility axioms, stated for each relation S that has an access method on the positions j 1 . . . j m . Note that S may be one of the original relations, or one of the relations R mt produced by the transformation above.
We now show that chase proofs with Γ must in fact be very simple under the SMPR assumption:
Claim H.16. Assume that our schema is SMPR, and consider any restricted chase sequence for Γ. Then:
• Rules of the form R mt ( x, y) → ∃ z R( x, y, z) will never fire.
• Rules of the form R ′ ( x, y, z) → R ′ mt ( x, y) will never fire.
• FDs will never fire (assuming they were applied to the initial instance).
• (Backward) axioms will never fire.
Note that the last item implies that the proposition holds, so it suffices to prove the claim.
We prove this by induction on the chase sequence. We consider the first item. Consider a fact R mt ( c, d). Since by the inductive assumption the (Backward) axioms never fire, the fact must have been produced from a fact R( c, d, e). Hence the rule cannot fire on this fact in the restricted chase, because a witness already exists.
We move to the second item, considering a fact R ′ ( c, d, e). By SMPR and the inductive assumption that FDs do not fire, this fact can only have been produced by R ′ mt ( c, d). Thus the rule in question will not fire in the restricted chase.
Turning to the third item, we first consider a potential violation of an FD D → r on relation R. This consists of tuples R( c) and R( d) agreeing on positions in D and disagreeing on position r. By the assumption that the FDs are applied in the initial instance, these tuples are not both in the initial instance. By the inductive assumption, they could not have been otherwise produced. We now turn to tuples that are potential violations of the primed copies of the FDs. We know by induction that these are produced by the rule going from R ′ mt to R ′ . Thus the facts are R ′ ( c 1 , d 1 , e 1 ) and R ′ ( c 2 , d 2 , e 2 ). Assume that R ′ ( c 2 , d 2 , e 2 ) was the later of the two facts to be created; then e 2 would have been chosen fresh. Hence the violation must occur within the positions corresponding to c 1 , d 1 and c 2 , d 2 . But by induction, and by the SMPR assumption, these facts must have been created from facts R ′ mt ( c 1 , d 1 ) and R ′ mt ( c 2 , d 2 ), which in turn must have been created by R mt ( c 1 , d 1 ) and R mt ( c 2 , d 2 ). These last must (again, by induction) have been created from facts R( c 1 , d 1 , f 1 ) and R( c 2 , d 2 , f 2 ). But then we have an earlier violation of the FDs on these two facts, a contradiction.
Turning to the last item, we consider a fact R ′ mt ( c, d). By induction, it can only have been generated by a fact R mt ( c, d), and thus (Backward) could not fire, which establishes the desired result.
Without SMPR, we can still argue that RA-answerability is decidable, and show a singly exponential complexity upper bound:
Theorem H.17. For general schemas with access methods and constraints Σ consisting of FDs, RA-answerability is decidable in EXPTIME.
Proof. We consider again the query containment problem for answerability, letting Γ be the corresponding constraints, as in Proposition H.15.
Instead of claiming that neither the FDs nor the backward axioms will fire, as in the case of SMPR, we argue only that the FDs will not fire. From this it follows that the constraints consist only of IDs and accessibility axioms, leading to an EXPTIME complexity upper bound: one can apply either Corollary 6.10 from the body of the paper, or the EXPTIME complexity result without result bounds from [5].
We consider a chase proof with Γ, and claim the following invariant, for each relation R and each result-bounded method mt on R:
• Every R mt -fact and every R ′ mt -fact is a projection of some R-fact or some R ′ -fact.
• All the FDs are satisfied in the chase instance, and further for any relation R, R ∪ R ′ satisfies the FDs. That is: for any FD D → r, we cannot have an R-fact and an R ′ -fact that agree on positions in D and disagree on r.
The second item implies that the FDs do not fire. The invariant is initially true, by assumption that FDs are applied initially. When firing an R-to-R mt axiom or an R ′ -to-R ′ mt axiom, the first item is preserved by definition, and the second is trivially preserved since there are no FDs on R mt or R ′ mt . When firing an accessibility axiom, either forward or backward, again the first and the second item are clearly preserved. Now, consider the firing of an R mt -to-R axiom. The first item is trivially preserved, so we must only show the second.
Consider the fact R mt (a 1 , . . . , a m ) and the generated fact F = R(a 1 , . . . , a m , b 1 , . . . , b n ) created by the rule firing. Assume that F is part of an FD violation with some other fact F ′ . We consider the case where F ′ is of the form R ′ (a ′ 1 , . . . , a ′ m , b ′ 1 , . . . , b ′ n ). We know that the determiner of the FD cannot contain any of the positions of the b i , because they are fresh nulls. Hence, the FD determiner is included in the positions of a 1 , . . . , a m . But now, by definition of the FD approximation, the determined position cannot correspond to one of the b 1 , . . . , b n , since otherwise that position would have been included in R mt . So the determined position is also one of the positions of a 1 , . . . , a m . Now we use the first item of the inductive invariant: there was already a fact F ′′ , either an R-fact or an R ′ -fact, with tuple of values (a 1 , . . . , a m , b ′′ 1 , . . . , b ′′ n ). Looking at F and F ′ , we know that they differ on the determined position, so there is i such that a ′ i ≠ a i . This implies that F ′ ≠ F ′′ . But now, as F and F ′ are an FD violation on the positions of a 1 , . . . , a m , the facts F ′ and F ′′ are seen to also witness an FD violation in R ∪ R ′ that existed before the firing. This contradicts the second item of the invariant.
When firing R ′ mt -to-R ′ rules, the symmetric argument applies. This completes the proof of the invariant, which completes the proof of Theorem H.17.

H.7. Choice approximability for RA-plans with UIDs and FDs
Theorem H.18. Let schema Sch have constraints given by unary IDs and arbitrary FDs, and Q be a CQ that is access-determined w.r.t. Sch. Then Q is also access-determined in the choice approximation of Sch.
is jointly access-valid for both with Sch ′ ). Let (mt, AccBind) be an access on relation R
• All accesses of I + Accessed which are not accesses of I Accessed are valid for Sch in I + 1 and I + 2 . As before, such accesses must include an element of W , so by the first bullet point all matching tuples are in W , so they are all in I + Accessed .
Hence, we have explained how to fix the access (mt, AccBind), so we can conclude using Lemma H.19 that we obtain a counterexample to access-determinacy of Q in Sch by fixing all accesses. This concludes the proof.

H.8. Summary of extensions to answerability with RA-plans
Table 2 summarizes the expressiveness and complexity results for RA-plans. The differences from the corresponding table for monotone answerability, Table 1 in the body, lie in the complexity for FDs without SMPR and in the case of FDs and UIDs. While for monotone answerability we could use a separability argument to show that FDs could be ignored for FDs and UIDs, we do not have such an argument for answerability with RA-plans. Thus decidability of answerability for the UID and FD case with RA-plans is open. When the SMPR assumption is dropped, we also do not have tight bounds for FDs in isolation.