STRONGLY NORMALIZING HIGHER-ORDER RELATIONAL QUERIES

Abstract. Language-integrated query is a powerful programming construct allowing database queries and ordinary program code to interoperate seamlessly and safely. Language-integrated query techniques rely on classical results about the nested relational calculus, stating that its queries can be algorithmically translated to SQL, as long as their result type is a flat relation. Cooper and others advocated higher-order nested relational calculi as a basis for language-integrated queries in functional languages such as Links and F#. However, the translation of higher-order relational queries to SQL relies on a rewrite system for which no strong normalization proof has been published: a previous proof attempt does not deal correctly with rewrite rules that duplicate subterms. This paper fills the gap in the literature, explaining the difficulty with a previous proof attempt, and showing how to extend the ⊤⊤-lifting approach of Lindley and Stark to accommodate duplicating rewrites. We also show how to extend the proof to a recently-introduced calculus for heterogeneous queries mixing set and multiset semantics.


Introduction
The Nested Relational Calculus (NRC) [BNTW95] provides a principled foundation for integrating database queries into programming languages. Wong's conservativity theorem [Won96] generalized the classic flat-flat theorem [PG92] to show that for any nesting depth d, a query expression over flat input tables returning collections of depth at most d can be expressed without constructing intermediate results of nesting depth greater than d. In the special case d = 1, this implies the flat-flat theorem, namely that a nested relational query mapping flat tables to flat tables can be expressed in a semantically equivalent way using the flat relational calculus. In addition, Wong's proof technique was constructive, and gave an easily-implemented terminating rewriting algorithm for normalizing NRC queries to flat relational queries, which can in turn be easily translated to idiomatic SQL queries. The basic approach has been extended in a number of directions, including to allow for (nonrecursive) higher-order functions in queries [Coo09b], and to allow for translating queries that return nested results to a bounded number of flat relational queries [CLW14].
Normalization-based techniques are used in language-integrated query systems such as Kleisli [Won00] and Links [CLWY07], and can improve both the performance and reliability of language-integrated query in F# [CLW13]. However, most work on normalization considers homogeneous queries in which there is a single collection type (e.g. homogeneous sets or multisets). Currently, language-integrated query systems such as C# and F# [MBB06] support duplicate elimination via a Distinct() method, which is translated to SQL in an ad hoc way and, as far as we know, comes with no guarantees regarding completeness or expressiveness. Database-Supported Haskell (DSH) [UG15] supports duplicate elimination, but gives all operations list semantics and relies on more sophisticated SQL:1999 features to accomplish this. Fegaras and Maier [FM00] propose optimization rules for a nested object-relational calculus with set and bag constructs, but do not consider the problem of conservativity with respect to flat queries.
Recently, we considered a heterogeneous calculus for mixed set and bag queries [RC19], and conjectured that it too satisfies strong normalization and conservativity theorems. However, in attempting to extend Cooper's proof of normalization we discovered a subtle problem, which makes the original proof incomplete.
Most techniques to prove the strong normalization property for higher-order languages employ logical relations; among these, the Girard-Tait reducibility relation is particularly influential: reducibility interprets types as certain sets of strongly normalizing terms enjoying desirable closure properties with respect to reduction, called candidates of reducibility [GLT89]. The fundamental theorem then proves that every well-typed term is reducible, hence also strongly normalizing. In its traditional form, reducibility has a limitation that makes it difficult to apply to certain calculi: the elimination form of every type is expected to be a neutral term or, informally, an expression that, when placed in an arbitrary evaluation context, does not interact with it by creating new redexes. However, some calculi possess commuting conversions, i.e. reduction rules that apply to nested elimination forms: such rules usually arise when the elimination form for a type (say, pairs) is constructed by means of an auxiliary term of an arbitrary, unrelated type. In this case, we expect nested elimination forms to commute; for example, we could have the following commuting conversion hoisting the elimination of pairs out of case analysis on disjoint unions:

cases (let (a, b) = p in t) of inl(x) ⇒ u; inr(y) ⇒ v
  ⇝ let (a, b) = p in cases t of inl(x) ⇒ u; inr(y) ⇒ v

where p has type A × B, t has type C + D, u and v have type U, and the bound variables a, b are chosen fresh for u and v. Since in the presence of commuting conversions elimination forms are not neutral, a straightforward adaptation of reducibility to such languages is precluded.
1.1. ⊤⊤-lifting and NRC λ. Cooper's NRC λ [Coo09a, Coo09b] extends the simply typed lambda calculus with collection types whose elimination form is expressed by comprehensions {M | x ← N}, where M and N have a collection type, and the bound variable x can appear in M: if Γ ⊢ N : {S} and Γ, x : S ⊢ M : {T}, then Γ ⊢ {M | x ← N} : {T} (we use bold-style braces {•} to indicate collections as expressions or types of NRC λ). In the rule above, we typecheck a comprehension destructuring collections of type {S} to produce new collections in {T}, where T is an unrelated type: semantically, this corresponds to the union of all the collections M[V/x], such that V is in N. According to the standard approach, we should attempt to define the reducibility predicate for the collection type {S} as the set of those terms N such that, for every collection type {T} and every body M satisfying M[V/x] ∈ Red_{T} for all V ∈ Red_S, we have {M | x ← N} ∈ Red_{T} (we use roman-style braces {•} to express metalinguistic sets). Of course the definition above is circular, since it uses reducibility over collections to express reducibility over collections; however, this inconvenience could in principle be circumvented by means of impredicativity, replacing Red_{T} with a suitable, universally quantified candidate of reducibility (an approach we used in [RC17] in the context of justification logic). Unfortunately, the arbitrary return type of comprehensions is not the only problem: they are also involved in commuting conversions, such as:

{M | x ← {N | y ← L}} ⇝ {{M | x ← N} | y ← L}

Because of this rule, comprehensions are not neutral terms, thus we cannot use the closure properties of candidates of reducibility (in particular, CR3 [GLT89]) to prove that a collection term is reducible. To address this problem, Lindley and Stark proposed a revised notion of reducibility based on a technique they called ⊤⊤-lifting [LS05].
⊤⊤-lifting, which derives from Pitts's related notion of ⊤⊤-closure [Pit98], involves quantification over arbitrarily nested, reducible elimination contexts (continuations); the technique is actually composed of two steps: ⊤-lifting, used to define the set Red_T^⊤ of reducible continuations for collections of type T in terms of Red_T, and ⊤⊤-lifting proper, defining Red_{T} = Red_T^⊤⊤ in terms of Red_T^⊤. In our setting, if we use SN to denote the set of strongly normalizing terms, the two operations can be defined as follows:

Red_T^⊤ = {K : for all M ∈ Red_T, K[{M}] ∈ SN}
Red_T^⊤⊤ = {N : for all K ∈ Red_T^⊤, K[N] ∈ SN}

Notice that, in order to avoid a circularity between the definitions of reducible collection continuations and reducible collections, the former are defined by lifting a reducible term M of type T to a singleton collection {M}.
In NRC λ, besides commuting conversions, we come across an additional problem concerning the property of distributivity of comprehensions over unions, represented by the following reduction rule:

{M ∪ N | x ← L} ⇝ {M | x ← L} ∪ {N | x ← L}

One can immediately see that in {M ∪ N | x ← [ ]} the reduction above duplicates the hole, producing a multi-hole context that is not a continuation in the Lindley-Stark sense.
Cooper, in his work, attempted to reconcile continuations with duplicating reductions. While considering extensions to his language, we discovered that his proof of strong normalization presents a nontrivial lacuna which we could only fix by relaxing the definition of continuations to allow multiple holes. This problem affected both the proof of the original result and our attempt to extend it, and has an avalanche effect on definitions and proofs, yielding a more radical revision of the ⊤⊤-lifting technique which is the subject of this paper.
The contribution of this paper is to place previous work on higher-order programming for language-integrated query on a solid foundation. As we will show, our approach also extends to proving normalization for a higher-order heterogeneous collection calculus NRC λ(Set, Bag) [RC19], and we believe our proof technique can be extended further. This article is a revised and expanded version of a conference paper [RC20]. Compared with the conference paper, this article refines the notion of ⊤⊤-lifting by omitting a harmless, but unnecessary generalization, includes details of proofs that had to be left out, and expands the discussion of related work. In addition, we fully comment on the extension of our result to a language allowing set queries and bag queries to be freely mixed and composed, which was only marginally discussed in the conference version. We also solved a subtle problem with the treatment of variable capture in contexts by reformulating the statement of Lemma 3.19.

1.2. Summary. Section 2 reviews NRC λ and its rewrite system. Section 3 presents the refined approach to reducibility needed to handle rewrite rules with branching continuations. Section 4 presents the proof of strong normalization for NRC λ. Section 5 outlines the extension to a higher-order calculus NRC λ(Set, Bag) providing heterogeneous set and bag queries. Sections 6 and 7 discuss related work and conclude.

Higher-order NRC
NRC λ, a nested relational calculus with non-recursive higher-order functions, is defined by the following grammar:

Types  S, T ::= A | B | ⟨ℓ₁ : T₁, …, ℓₙ : Tₙ⟩ | S → T | {T}
Terms  M, N ::= x | c(M₁, …, Mₙ) | λx.M | M N | ⟨ℓ₁ = M₁, …, ℓₙ = Mₙ⟩ | M.ℓ
              | ∅ | {M} | M ∪ N | {M | x ← N} | empty M | where M do N

where x, ℓ, and c range over countably infinite and pairwise disjoint sets of variables, record field labels, and constants.
Types include atomic types A, B, … (among which we have Booleans B), record types with named fields ⟨ℓ⃗ : T⃗⟩, function types, and collections {T}; we define relation types as those of the form {⟨ℓ⃗ : A⃗⟩}, i.e. collections of records of atomic types. Terms include applied constants c(M⃗), records with named fields and record projections (⟨ℓ⃗ = M⃗⟩, M.ℓ), various collection terms (empty, singleton, union, and comprehension), the emptiness test empty, and one-sided conditional expressions for collection types where M do N; we allow the type ⟨⟩ of records with no fields, consisting of a single, empty record ⟨⟩. Notice that λx.M and {M | x ← N} bind the variable x in M.
We will allow ourselves to use sequences of generators in comprehensions, which are syntactic sugar for nested comprehensions, e.g.:

{M | x ← N, y ← L} ≜ {{M | y ← L} | x ← N}

The typing rules, shown in Figure 1, are largely standard, and we only mention those operators that are specific to our language: constants are typed according to a fixed signature Σ, prescribing the types of the n arguments and of the returned expression to be atomic; we assume that Σ assigns the type B to the constants true and false (representing the two Boolean values), and the type (B, B) → B to the constant ∧ (representing logical 'and'), and we will allow ourselves to write M ∧ N instead of ∧(M, N). The operation empty takes a collection and returns a Boolean indicating whether its argument is empty; where takes a Boolean condition and a collection and returns the second argument if the Boolean is true, otherwise the empty set. (Conventional two-way conditionals, at any type, are omitted for convenience but can be added without difficulty.) Overall, our presentation of NRC λ is very similar to the language of queries used by Cooper in [Coo09a]: two minor differences are that NRC λ does not have a specific construct for input tables (these can be simulated as free variables with relation type) and uses one-armed conditionals instead of an if-then-else construct. Additionally, Cooper provided a type-and-effect system to track the use of primitive operations that may not be translated to SQL; the issue of translating to SQL is not addressed directly in this paper (equivalently, we may assume that all the primitive operations of NRC λ may be translated to SQL).
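The desugaring of generator sequences described above is a simple right-to-left recursion. As an illustration only, here is a minimal sketch in Python; the tuple encoding of terms (('comp', body, var, source) for a single-generator comprehension) is our own and not part of the paper.

```python
# Hypothetical tuple encoding: ('comp', body, var, source) stands for the
# single-generator comprehension {body | var <- source}. A sequence of
# generators desugars into nested comprehensions:
#   {M | x <- N, y <- L}  ==  {{M | y <- L} | x <- N}

def comp(body, *gens):
    """Desugar {body | g1, ..., gn} into nested single-generator comprehensions."""
    if not gens:
        return body
    (var, source) = gens[0]
    return ('comp', comp(body, *gens[1:]), var, source)
```

Note that the innermost generator ends up innermost in the nesting, matching the sugar's left-to-right scoping: y is bound inside the scope of x.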
2.1. Reduction and normalization. NRC λ is equipped with a rewrite relation ⇝ whose purpose is to convert expressions of relation type into a sublanguage isomorphic to a fragment of SQL, even when the original expression contains subterms whose type is not available in SQL, such as nested collections. This rewrite relation is obtained from the basic contraction relation shown in Figure 2, by taking its congruence closure (Figure 3).
We will allow ourselves to say "induction on the derivation of reduction" to mean the structural induction induced by the notion of congruence closure, followed by a case analysis on the basic reduction rules used as its base case.
We now discuss the basic reduction rules in more detail. 0-ary constants are values of atomic type and do not reduce. Applied constants (with positive arity) reduce when all of their arguments are (0-ary) constants: the reduction rule relies on a fixed semantics ⟦·⟧ which assigns to each constant c of signature Σ(c) = (A₁, …, Aₙ) → A a function mapping constants c₁, …, cₙ of types A₁, …, Aₙ to values of type A. The rules for collections and conditionals are mostly standard. The reduction rule for the emptiness test is triggered when the argument M is not of relation type (but, for instance, of nested collection type) and employs comprehension to generate a (trivial) relation that is empty if and only if M is.
The normal forms of queries under these rewriting rules construct no intermediate nested structures, and are straightforward to translate to syntactically isomorphic (up to notational differences) and semantically equivalent SQL queries.

Figure 2: Query normalization (basic contraction rules)

For example, consider the following NRC λ query which, given a table t, first wraps the id field of every tuple of t into a singleton, yielding a collection of singletons (i.e. a nested collection), then converts it back to a flat collection by performing the grand union of all of its elements:

{z | z ← {{{⟨id = x.id⟩}} | x ← t}}

The normal form of this query does not create the unnecessary intermediate nested collection:

{{⟨id = x.id⟩} | x ← t}

Such a query is easily translated to SQL as:

SELECT x.id AS id FROM t AS x

Cooper [Coo09b] and Lindley and Cheney [LC12] give a full account of the translation from NRC λ normal forms to SQL. Cheney et al. [CLW13] showed how to improve the performance and reliability of LINQ in F# using normalization and gave many examples showing how higher-order queries support a convenient, compositional language-integrated query programming style.
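To make the rewriting concrete, the following is an executable sketch of four of the standard comprehension contraction rules (generator-empty, generator-singleton, generator-union, and unnesting); it is not the full system of Figure 2. The tuple encoding of terms, the rule selection, and the innermost normalization strategy are our own simplifications, and bound variables are assumed pairwise distinct so that substitution needs no alpha-renaming.

```python
# A toy normalizer for four comprehension rules; terms are tuples:
# ('table',t) ('var',x) ('empty',) ('singleton',M) ('union',M,N)
# ('comp',M,x,N) = {M | x <- N}   ('record',...) ('proj',M,l)
# Assumption: all bound variables are distinct (no alpha-renaming needed).

def subst(term, var, val):
    """term[val/var], assuming distinct bound variable names."""
    if term == ('var', var):
        return val
    if isinstance(term, tuple):
        if term[0] == 'comp' and term[2] == var:
            # var is rebound in the body; substitute only in the generator
            return ('comp', term[1], term[2], subst(term[3], var, val))
        return tuple(subst(t, var, val) if isinstance(t, tuple) else t
                     for t in term)
    return term

def step(t):
    """Try one contraction at the root; None if no rule applies."""
    if isinstance(t, tuple) and t[0] == 'comp':
        body, x, gen = t[1], t[2], t[3]
        if gen == ('empty',):                  # {M | x <- {}}  ~>  {}
            return ('empty',)
        if gen[0] == 'singleton':              # {M | x <- {N}} ~>  M[N/x]
            return subst(body, x, gen[1])
        if gen[0] == 'union':                  # generator unions distribute
            return ('union', ('comp', body, x, gen[1]),
                             ('comp', body, x, gen[2]))
        if gen[0] == 'comp':                   # {M | x <- {N | y <- L}}
            return ('comp', ('comp', body, x, gen[1]),  # ~> {{M|x<-N}|y<-L}
                    gen[2], gen[3])
    return None

def normalize(t):
    """Normalize subterms first, then contract at the root to a fixpoint."""
    if isinstance(t, tuple):
        t = tuple(normalize(s) if isinstance(s, tuple) else s for s in t)
    r = step(t)
    return normalize(r) if r is not None else t
```

On the nested example above, normalize eliminates the intermediate nested collection, yielding the flat comprehension whose body is a singleton record; this is the shape that maps directly to a SELECT-FROM query.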

Reducibility with branching continuations
We introduce here the extension of ⊤⊤-lifting we use to derive a proof of strong normalization for NRC λ. The main contribution of this section is a refined definition of continuations with a branching structure and multiple holes, as opposed to the linear structure with a single hole used by standard ⊤⊤-lifting. In our definition, continuations (as well as the more general notion of context) are particular forms of terms: in this way, the notion of term reduction can be used for continuations as well, without need for auxiliary definitions.
3.1. Contexts and continuations. We start our discussion by introducing contexts, i.e. terms with multiple, labelled holes [p] that can be instantiated by plugging other terms (including other contexts) into them; we write supp(C) for the set of hole indices occurring in a context C (it suffices to use FV(C), as holes are never used as bound variables). When a term does not contain any hole [p], we say that it is a pure term; when it is important that a term be pure, we will refer to it by using overlined metavariables L̄, M̄, N̄, R̄, … We introduce a notion of permutable multiple context instantiation: a context instantiation η is permutable whenever, for all p, q ∈ dom(η), the hole [q] does not occur in η(p).
The word "permutable" is explained by the following properties:

Lemma 3.4. Let η be permutable, and let p₁, …, pₖ be an enumeration of all the elements of dom(η) without repetitions, in any order. Then, for all contexts C, we have:

Cη = C[p₁ ↦ η(p₁)] ⋯ [pₖ ↦ η(pₖ)]

Proof. By structural induction on C. The relevant case is when C = [pᵢ], for some i ∈ {1, …, k}. By definition, the left-hand side rewrites to η(pᵢ); we can express the right-hand side as [pᵢ][pⱼ ↦ η(pⱼ)] for j = 1, …, k. Then we prove that it also rewrites to η(pᵢ): the instantiations with j ≠ i leave [pᵢ] unchanged because pᵢ ≠ pⱼ; moreover, by the permutability hypothesis, for all j we have that [pⱼ] ∉ FV(η(pᵢ)), so once [pᵢ] has been replaced by η(pᵢ), the remaining instantiations leave η(pᵢ) unchanged. Then the right-hand side also rewrites to η(pᵢ), proving the thesis.
All the other cases are trivial, applying induction hypotheses where needed.
Lemma 3.5. Let η be permutable, and let us denote by η_¬p the restriction of η to indices other than p. Then for all p ∈ dom(η) we have:

Cη = (C[p ↦ η(p)])η_¬p

Proof. Immediate, by Lemma 3.4.
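The content of Lemmas 3.4 and 3.5 can be checked on a small executable model of contexts; the tuple encoding below (holes as ('hole', p), instantiations as Python dicts) is our own illustration, not the paper's formalism.

```python
# Contexts as tuples with labelled holes ('hole', p); eta maps hole indices
# to contexts. Instantiation replaces every hole in dom(eta) simultaneously.

def instantiate(ctx, eta):
    """C eta: replace each hole ('hole', p) with eta[p], simultaneously."""
    if isinstance(ctx, tuple):
        if ctx[0] == 'hole' and ctx[1] in eta:
            return eta[ctx[1]]
        return tuple(instantiate(c, eta) if isinstance(c, tuple) else c
                     for c in ctx)
    return ctx
```

When η is permutable (no hole of dom(η) occurs in its codomain), plugging holes one at a time, in either order, coincides with simultaneous instantiation, which is exactly the situation described by Lemma 3.4.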
We can now define continuations as certain contexts that capture how one or more collections can be used in a program.

Definition 3.6 (continuation). Continuations K are defined as the following subset of contexts:

K ::= [p] | K ∪ K | {M̄ | x ← K} | where B̄ do K | M̄

where for all indices p, [p] can occur at most once.
This definition differs from the traditional one in two ways: first, holes are decorated with an index; secondly, and most importantly, the production K ∪ K allows continuations to branch and, as a consequence, to use more than one hole.Note that the grammar above is ambiguous, in the sense that certain expressions like where B do N can be obtained either from the production where B do K with K = N , or as pure terms by means of the production M : we resolve this ambiguity by parsing these expressions as pure terms whenever possible, and as continuations only when they are proper continuations.
An additional complication of NRC λ when compared to the computational metalanguage for which ⊤⊤-lifting was devised lies in the way conditional expressions can reduce when placed in an arbitrary context: continuations in the grammar above are not liberal enough to adapt to such reductions; therefore, like Cooper, we will need an additional definition of auxiliary continuations allowing holes to appear in the body of a comprehension (in addition to comprehension generators).

Definition 3.7 (auxiliary continuation). Auxiliary continuations Q are defined as the following subset of contexts:

Q ::= [p] | Q ∪ Q | {Q | x ← Q} | where B̄ do Q | M̄

where for all indices p, [p] can occur at most once.
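The side condition shared by Definitions 3.6 and 3.7 — each hole index occurs at most once — is straightforward to state executably. The sketch below, with holes encoded as ('hole', p) tuples, is our own illustration.

```python
def hole_indices(ctx):
    """All hole indices occurring in ctx, in order, with repetitions."""
    if isinstance(ctx, tuple):
        if ctx[0] == 'hole':
            return [ctx[1]]
        return [p for c in ctx if isinstance(c, tuple)
                  for p in hole_indices(c)]
    return []

def is_linear(ctx):
    """True iff no hole index occurs more than once (Definitions 3.6/3.7)."""
    ps = hole_indices(ctx)
    return len(ps) == len(set(ps))
```

Under the duplicating rewrite {M ∪ N | x ← [p]} ⇝ {M | x ← [p]} ∪ {N | x ← [p]}, is_linear rejects the contractum; this is precisely the failure of closure under reduction that motivates the renaming reduction of Section 3.2.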
We can then see that regular continuations are a special case of auxiliary continuations; however, an auxiliary continuation is allowed to branch not only with unions, but also with comprehensions.¹ We use the following definition of frames to represent certain continuations with a distinguished shallow hole denoted by □.
¹ It is worth noting that Cooper's original definition of auxiliary continuation does not use branching comprehensions (nor branching unions), but is linear just like the original definition of continuation. The only difference between regular and auxiliary continuations in his work is that the latter allowed nesting not just within comprehension generators, but also within comprehension bodies (in our notation, this would correspond to two separate productions {M̄ | x ← Q} and {Q | x ← N̄}).
Definition 3.8 (frame). Frames F are defined by the following grammar:

F ::= {Q | x ← □} | {□ | x ← N̄} | where B̄ do □ | □ ∪ Q | Q ∪ □

where □ does not occur in Q, and for all indices p, [p] can occur in Q at most once. The operation F^p, lifting a frame to an auxiliary continuation with a distinguished hole [p], is defined by replacing the shallow hole □ with [p]:

F^p := F[□ ↦ [p]]

The composition operation Q ∘_p F is defined as:

Q ∘_p F := Q[p ↦ F^p]

We generally use frames in conjunction with continuations or auxiliary continuations when we need to partially expose their leaves: for example, if we write K = K₀ ∘_p {M̄ | x ← □}, we know that instantiating K at index p with (for example) a singleton term {N̄} will create a redex:

K[p ↦ {N̄}] = K₀[p ↦ {M̄ | x ← {N̄}}] ⇝ K₀[p ↦ M̄[N̄/x]]

We say that such a reduction is a reduction at the interface between the continuation and the instantiation (we will make this notion formal in Lemma 3.25).
In certain proofs by induction that make use of continuations, we will need a measure on continuations to show that the induction is well-founded. We introduce here two measures |·|_p and ‖·‖_p denoting the nesting depth of a hole [p]: the two measures differ in the treatment of nesting within the body of a comprehension.

Definition 3.9. The measures |Q|_p and ‖Q‖_p are defined as follows: We will also use |Q| and ‖Q‖ to refer to the derived measures: The definitions of frames and measures are designed in such a way that the following property holds.
Lemma 3.10. Let Q be an auxiliary continuation such that p ∈ supp(Q); then for all frames F:

Proof. By induction on the structure of Q. When examining the forms Q can assume, we will have to consider subexpressions Q′ for which p may or may not be in supp(Q′): in the first case, we can apply the induction hypothesis; otherwise, we prove the statement directly.

NRC λ reduction can be used immediately on contexts (including regular and auxiliary continuations), since these are simply terms with distinguished free variables; we will also abuse notation to allow ourselves to specify reduction on context instantiations: whenever η(p) ⇝ N and η′ = η_¬p[p ↦ N], we can write η ⇝ η′. We will denote the set of strongly normalizing terms by SN. Strongly normalizing applied contexts satisfy the following property:

For strongly normalizing terms (and by extension for context instantiations containing only strongly normalizing terms), we can introduce the concept of maximal reduction length.

Definition 3.11 (maximal reduction length). Let M ∈ SN: we define ν(M) as the maximum length of all reduction sequences starting with M. We also define ν(η) as Σ_{p ∈ dom(η)} ν(η(p)), whenever all the terms in the codomain of η are strongly normalizing.
Since each term can only have a finite number of contracta, it is easy to see that ν(M) is defined for any strongly normalizing term M. Furthermore, ν(M) is strictly decreasing under reduction.

Lemma 3.12. For all strongly normalizing terms M, if M ⇝ M′, then ν(M′) < ν(M).
Proof. If ν(M′) ≥ ν(M), then by pre-composing M ⇝ M′ with a reduction chain of maximal length starting at M′ we obtain a new reduction chain starting at M with length strictly greater than ν(M); this contradicts the definition of ν(M).
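Definition 3.11 and Lemma 3.12 can be observed on any finite-branching, terminating reduction relation. The sketch below computes ν by memoized search; the example relation in the usage is an arbitrary toy abstract reduction system of our own, not an NRC λ term system.

```python
def nu(term, contracta, memo=None):
    """Maximal reduction length nu(term) for a strongly normalizing term:
    0 on normal forms, otherwise 1 + the maximum nu over all one-step
    contracta (Definition 3.11). 'contracta' maps a term to its reducts."""
    if memo is None:
        memo = {}
    if term not in memo:
        cs = contracta(term)
        memo[term] = 0 if not cs else 1 + max(nu(c, contracta, memo)
                                              for c in cs)
    return memo[term]
```

For instance, on the toy system a ⇝ b, a ⇝ c, b ⇝ d, c ⇝ d, c ⇝ e (with d, e normal), every contractum of a has a strictly smaller ν than a itself, as Lemma 3.12 predicts.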
3.2. Renaming reduction. According to Definitions 3.6 and 3.7, in order for a context to be a continuation or an auxiliary continuation, it must on one hand agree with the respective grammar, and on the other hand satisfy the condition that no hole occurs more than once. We immediately see that, since holes can be duplicated under reduction, the sets of plain and auxiliary continuations are not closed under reduction. For instance:

K = {M̄ ∪ N̄ | x ← [p]} ⇝ {M̄ | x ← [p]} ∪ {N̄ | x ← [p]} = C

where K is a continuation, but C is not, due to the two occurrences of [p]. For this reason, we introduce a refined notion of renaming reduction, which we can use to rename holes in the results so that each of them occurs at most one time.
Definition 3.13.Given a term M with holes and a finite map σ : P → P, we write M σ for the term obtained from M by replacing each hole [p] such that σ(p) is defined with [σ(p)].
Even though finite renaming maps are partial functions, it is convenient to extend them to total functions by taking σ(p) = p whenever p ∉ dom(σ); we will write id to denote the empty renaming map, whose total extension is the identity function on P.

Definition 3.14 (renaming reduction). M σ-reduces to N (notation: M ⇝_σ N) whenever M ⇝ Nσ.

Terms only admit a finite number of redexes and consequently, under regular reduction, any given term has a finite number of possible contracta. However, under renaming reduction, infinitely many contracta are possible: if M ⇝ N, there may be infinitely many R, σ such that N = Rσ. When a strongly normalizing term M admits infinitely many contracta, it does not necessarily have a maximal reduction sequence (just as the maximum of an infinite set of finite numbers is not necessarily defined). Fortunately, we can prove (Lemma 3.16) that to every renaming reduction chain there corresponds a plain reduction chain of the same length, and vice-versa.
Lemma 3.15. If M ⇝ N, then for all σ we have Mσ ⇝ Nσ.
Proof. Routine induction on the derivation of M ⇝ N.
Lemma 3.16. For every finite plain reduction sequence, there is a corresponding renaming reduction sequence of the same length (using the identity renaming id); and conversely, for every finite renaming reduction sequence, there is a corresponding plain reduction sequence of the same length involving renamed terms. More precisely:

(1) if M₀ ⇝ M₁ ⇝ ⋯ ⇝ Mₙ, then M₀ ⇝_id M₁ ⇝_id ⋯ ⇝_id Mₙ;
(2) if M₀ ⇝_σ₁ M₁ ⇝_σ₂ ⋯ ⇝_σₙ Mₙ, then M₀ ⇝ M₁σ₁ ⇝ M₂σ₂σ₁ ⇝ ⋯ ⇝ Mₙσₙ⋯σ₁.

Proof. The first part of the lemma is trivial. For the second part, proceed by induction on the length of the reduction chain: in the inductive case, we have Mₙ ⇝_σₙ₊₁ Mₙ₊₁ by hypothesis and M₀ ⇝ M₁σ₁ ⇝ ⋯ ⇝ Mₙσₙ⋯σ₁ by the induction hypothesis; to obtain the thesis, we only need to prove that Mₙσₙ⋯σ₁ ⇝ Mₙ₊₁σₙ₊₁σₙ⋯σ₁. In order for this to be true, by Lemma 3.15, it is sufficient to show that Mₙ ⇝ Mₙ₊₁σₙ₊₁; this is by definition equivalent to Mₙ ⇝_σₙ₊₁ Mₙ₊₁, which we know by hypothesis.
Corollary 3.17. For all strongly normalizing terms M, if M ⇝_σ M′, then ν(M′) < ν(M).

Proof. By Lemma 3.16, for any plain reduction chain there exists a renaming reduction chain of the same length, and vice-versa. Thus, since plain reduction lowers the length of the maximal reduction chain (Lemma 3.12), the same holds for renaming reduction.
The results above prove that the set of strongly normalizing terms is the same under the two notions of reduction, thus ν(M ) can be used to refer to the maximal length of reduction chains starting at M either with or without renaming.
Our goal is to describe the reduction of pure terms expressed in the form of applied continuations. A first difficulty we need to overcome is that, as we noted, the sets of continuations (both regular and auxiliary) are not closed under reduction: the duplication of holes performed by reduction can produce contexts that are not continuations or auxiliary continuations because they do not satisfy the linearity condition on holes. Thankfully, renaming reduction allows us to restore the linearity of holes, as we show in the following lemma.
Lemma 3.18. (1) For all continuations K, if K ⇝ C, there exist a continuation K′ and a finite map σ such that K ⇝_σ K′ and K′σ = C.
(2) For all auxiliary continuations Q, if Q ⇝ C, there exist an auxiliary continuation Q′ and a finite map σ such that Q ⇝_σ Q′ and Q′σ = C.

Furthermore, the σ, K′, Q′ in the statements above can be chosen so that dom(σ) is fresh with respect to any given finite set of indices S.
Proof.Let S be a finite set of indices and C a contractum of the continuation we wish to reduce.This contractum will not, in general, satisfy the linearity condition of holes that is mandated by the definitions of plain and auxiliary continuations; however we can show that, for any context with duplicated holes, there exists a structurally equal context with linear holes.
Operationally, if C contains n holes, we generate n different indices that are fresh for S, and replace the index of each hole in C with a different fresh index to obtain a new context C′: this induces a finite map σ : supp(C′) → supp(C) such that C′σ = C. By the definition of renaming reduction, we have K ⇝_σ C′ (resp. Q ⇝_σ C′). To prove that C′ is a continuation (resp. auxiliary continuation), we need to show that it satisfies the linearity condition and that it meets the grammar of Definition 3.6 (resp. Definition 3.7). The first part holds by construction; the proof that C′ satisfies the required grammar is obtained by structural induction on the derivation of the reduction, with a case analysis on the structure of K (or on the structure of Q).
By construction of σ, we also have that dom(σ) ∩ S = ∅, as required by the statement of the lemma.
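The operational construction used in this proof — replace each hole occurrence with a distinct fresh index and record the renaming σ back to the original indices, so that C′σ = C — can be sketched executably. Holes are encoded as ('hole', p) tuples; both the encoding and the fresh-name scheme are our own illustration.

```python
from itertools import count

def freshen(ctx, _fresh=None, _sigma=None):
    """Return (ctx2, sigma): ctx2 has pairwise distinct hole indices, and
    renaming every hole of ctx2 through sigma recovers ctx exactly."""
    if _fresh is None:
        _fresh, _sigma = count(), {}
    if isinstance(ctx, tuple):
        if ctx[0] == 'hole':
            q = 'q%d' % next(_fresh)   # fresh index, assumed unused in ctx
            _sigma[q] = ctx[1]
            return ('hole', q), _sigma
        parts = []
        for c in ctx:
            if isinstance(c, tuple):
                c, _sigma = freshen(c, _fresh, _sigma)
            parts.append(c)
        return tuple(parts), _sigma
    return ctx, _sigma

def rename(ctx, sigma):
    """Apply the hole renaming sigma (identity outside its domain)."""
    if isinstance(ctx, tuple):
        if ctx[0] == 'hole':
            return ('hole', sigma.get(ctx[1], ctx[1]))
        return tuple(rename(c, sigma) if isinstance(c, tuple) else c
                     for c in ctx)
    return ctx
```

Applied to a contractum with a duplicated hole [p], freshen yields a linear context together with the (non-injective) map σ sending each fresh index back to p.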
A further problem concerns variable capture: if we reduce C ⇝ C′, there is no guarantee that Cη ⇝ C′η for a given context instantiation η. This happens for two reasons: the first one is that reducing a context C may cause a hole to move within the scope of a new binder, as in:

{[p] | y ← {N̄ | x ← M̄}} ⇝ {{[p] | y ← N̄} | x ← M̄}

If we instantiate [p] with the term x, this reduction is no longer allowed, because the first term is equal to {x | y ← {N̄ | x ← M̄}}, where the x in the head of the outer comprehension is free, and the reduction is blocked until we rename the bound x of the inner comprehension.
The second reason for which the reduction of a context may be disallowed when we apply it to a context instantiation is that, due to variable capture, the reduction may involve the context instantiation in a non-trivial way: for instance, the context (λz.[p]) N̄ contracts to [p], but instantiating [p] with z on both sides gives an incorrect contraction, because the left-hand term is equal to (λz.z) N̄, and the right-hand one is z, and the former does not reduce to the latter, but to N̄. While we understand that the first of the two problems should be handled with a suitable alpha-renaming of the redex, the other is more complicated. Fortunately, in most cases we are not interested in the reduction of generic contexts, but only in that of auxiliary continuations: due to their restricted term shape, auxiliary continuations only allow some reductions, most of which do not present the problem above; the exception is when a reduction is obtained by contracting a subterm using the comprehension-singleton rule:

{Q | z ← {L̄}} ⇝ Q[L̄/z]

By applying a context instantiation η to both sides, we obtain an incorrect contraction:

{Qη | z ← {L̄}} ⇝ (Q[L̄/z])η

where the left-hand term does not reduce to the right-hand one, because in the latter the codomain of η might contain free instances of z that have not been replaced by L̄. In the rest of the paper, we will call reductions using the comprehension-singleton rule special reductions.
When Q ⇝ Q′ by means of a special reduction, we know that in general Qη does not reduce to Q′η (not even after alpha-renaming), and we will have to handle such a case differently.
If however Q ⇝ Q′ is not a special reduction, to mimic that reduction within Qη we may start by renaming the bound variables of this term in such a way that no reduction is blocked: we obtain a term of the form Oθ, where O is an auxiliary continuation alpha-equivalent² to Q, and θ is obtained from η by replacing some of its free variables with other free variables, consistently with the renaming of Q to O, as shown by the following lemma (a more general result applying to all reductions C ⇝ C′ and all context instantiations η, which however gives weaker guarantees on the result of contracting Cη, will be provided as Lemma 3.29).
Lemma 3.19. For all auxiliary continuations Q and for all permutable context instantiations η, there exist an auxiliary continuation O and a context instantiation θ such that:

(1) O =α Q and Oθ =α Qη;
(2) θ is equal to η up to a renaming of the free variables in its codomain;
(3) for all C such that Q ⇝ C, where the reduction is not special, there exists D =α C such that O ⇝ D and Oθ ⇝ Dθ.

Proof. We proceed by induction on the size of Q, followed by a case analysis on its structure; for each case, after considering all possible reductions starting in that particular shape of Q (where we are allowed, by the hypothesis, to ignore special reductions), we perform a renaming of Qη to Oθ that is guaranteed to allow us to prove the thesis. Particular care is needed when context instantiations cross binders, as variable capture is allowed to happen:

• Case Q = [p]: no reduction of Q is possible, so we can choose O := Q, θ := η, and the thesis holds trivially.
• Case Q = M̄: for all reductions M̄ ⇝ M̄′, we have M̄η = M̄ and M̄′η = M̄′, because context instantiation is ineffective on pure terms; so we can choose O := Q, θ := η, and the thesis holds trivially.
• Case Q = Q₁ ∪ Q₂: since the holes in Q are linear and η is permutable, we can decompose η = η₁ ⊎ η₂ such that Qη = Q₁η₁ ∪ Q₂η₂; we apply the induction hypothesis twice on the two subterms, to obtain O₁, O₂, θ₁, θ₂ such that for i = 1, 2 we have: Oᵢ =α Qᵢ and Oᵢθᵢ =α Qᵢηᵢ; θᵢ is equal to ηᵢ up to a renaming of the free variables in its codomain; and for all Cᵢ such that Qᵢ ⇝ Cᵢ, where the reduction is not special, there exists Dᵢ =α Cᵢ such that Oᵢ ⇝ Dᵢ and Oᵢθᵢ ⇝ Dᵢθᵢ. Now, to prove the thesis, we fix O := O₁ ∪ O₂ and θ := θ₁ ⊎ θ₂; we easily show that O =α Q, Oθ =α Qη, and that θ is equal to η up to a renaming of the free variables in its codomain. To conclude the proof, we consider any given reduction Q ⇝ C′, and we see by case analysis that the reduction either takes place within Q₁ or within Q₂; in either case, we know that there exists a corresponding Dᵢ, from which the thesis follows.

• Case Q = {Q₁ | x ← {Q₂ | z ← Q₃}}: since the holes in Q are linear and η is permutable, we can decompose η = η₁ ⊎ η₂ ⊎ η₃ such that Qη = {Q₁η₁ | x ← {Q₂η₂ | z ← Q₃η₃}}, which can be contracted by applying the unnesting reduction; however, that reduction might be blocked in Qη, if z ∈ FV(Q₁η₁). For this reason, we start by choosing a non-hole variable z* ∉ FV(Q₁η₁) and renaming Q accordingly: setting Q₂* := Q₂[z*/z], η₂* := η₂[z*/z] and η* := η₁ ⊎ η₂* ⊎ η₃, we also have Qη = {Q₁η₁ | x ← {Q₂*η₂* | z* ← Q₃η₃}}. We apply the induction hypothesis three times on the subterms, to obtain:

– O₁, θ₁ such that O₁ =α Q₁, O₁θ₁ =α Q₁η₁, θ₁ is equal to η₁ up to a renaming of the free variables in its codomain, and for all C₁ such that Q₁ ⇝ C₁ where the reduction is not special, there exists D₁ =α C₁ such that O₁ ⇝ D₁ and O₁θ₁ ⇝ D₁θ₁;
– O₂, θ₂ such that O₂ =α Q₂*, O₂θ₂ =α Q₂*η₂*, θ₂ is equal to η₂* (and thus to η₂) up to a renaming of the free variables in its codomain, and for all C₂ such that Q₂* ⇝ C₂ where the reduction is not special, there exists D₂ =α C₂ such that O₂ ⇝ D₂ and O₂θ₂ ⇝ D₂θ₂;
– O₃, θ₃ such that O₃ =α Q₃, O₃θ₃ =α Q₃η₃, θ₃ is equal to η₃ up to a renaming of the free variables in its codomain, and for all C₃ such that Q₃ ⇝ C₃ where the reduction is not special, there exists D₃ =α C₃ such that O₃ ⇝ D₃ and O₃θ₃ ⇝ D₃θ₃.

Note that the first and third cases are similar, but the second one has slightly different properties, to account for the renaming of z to z*. Now we fix O := {O₁ | x ← {O₂ | z* ← O₃}} and θ := θ₁ ⊎ θ₂ ⊎ θ₃. We easily show that O =α Q, Oθ =α Qη, and that θ is equal to η up to a renaming of the free variables in its codomain. To conclude the proof, we consider any given reduction Q ⇝ C′, and we see by case analysis that either the reduction corresponds to a reduction in the subcontinuations Qᵢ (in which case we conclude by a reasoning on the subterms and induction hypotheses, similarly to the union case above), or it is a reduction at the top level, such as the unnesting reduction, for which the renaming above allows us to exhibit the required D and prove the thesis.

² In our setting, contexts are defined as a particular case of terms, allowing special "hole" free variables that are not used in binders; thus, we only have a single notion of alpha-equivalence for terms, which we also apply to contexts (just like our notion of reduction works on terms and contexts alike). This may look surprising and perhaps suspicious, considering that in some formal treatments of contexts (e.g. [BdV01]) the alpha-renaming of contexts is forbidden; however, our work does not need to provide a general treatment of alpha-renaming in contexts: we only use it under special conditions that ensure its consistency.
• Case … where Q2 is not a comprehension: this is similar to the case above, but instead of comprehension unnesting we have to consider two possible reductions: … However, these reductions do not require us to perform renamings and do not pose any problems.
• Case Q = where B do Q1: besides reductions in the subterms, we have to consider the following cases: … In all these cases, no renaming is required (besides those produced by applying the induction hypothesis); in particular, the first reduction is always possible without renaming because B is a pure term, so z ∉ FV(Bη), since Bη = B. Therefore, we prove the thesis using the induction hypothesis and an exhaustive case analysis on the possible reductions, as we did above, without particular problems.
Remark 3.20. It is important to understand that, unlike all other operations on terms, context instantiation is not defined on the abstract syntax, independently of the particular choice of names, but on the concrete syntax. In other words, all operations and proofs that do not use context instantiation work on alpha-equivalence classes of terms; but when context instantiation is used, say on a context C, we need to choose a representative of the alpha-equivalence class of C.
Thanks to Lemma 3.19, whenever we need to reduce Q with a non-special reduction in a term of the form Qη, we may assume without loss of generality that the representative of the alpha-equivalence class of Qη is chosen so that if Q ⇝ C′, then Qη ⇝ C′η. Technically, we prove that there exist O, D′, θ such that Q =α O, C′ =α D′, θ is equal to η up to renaming, and Oθ ⇝ D′θ; but after the context instantiation is completed, we return to considering terms as equal up to alpha-equivalence. The result will be used in the proof of the following Lemma 3.23, where we have clarified the technical parts.
Finally, given a non-special renaming reduction Q ⇝σ Q′, we want to be able to express the corresponding reduction on Qη: due to the renaming σ, it is not enough to change Q to Q′; we also need to construct some η′ containing precisely those mappings [q → M] such that, if σ(q) = p, then p ∈ dom(η) and η(p) = M. This construction is expressed by means of the following operation.
Definition 3.21. For all context instantiations η and renamings σ, we define η^σ as the context instantiation such that:
• if σ(p) ∈ dom(η), then η^σ(p) = η(σ(p));
• in all other cases, η^σ(p) = [p].
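Definition 3.21 can be read operationally as precomposition of η with the renaming σ. The following executable sketch is ours, not the paper's formalism: instantiations and renamings are modelled as dicts, and we assume the "other cases" clause returns the bare hole [p].

```python
# Minimal sketch (not the paper's formal machinery): context instantiations are
# dicts from hole indices to terms, renamings are dicts between indices.
# We assume eta_sigma(p) = eta(sigma(p)) when defined, and the hole [p]
# otherwise, matching our reading of Definition 3.21.

def hole(p):
    """The term consisting of the bare hole [p]."""
    return ("hole", p)

def compose_instantiation_renaming(eta, sigma, indices):
    """Build eta^sigma over the given universe of hole indices."""
    out = {}
    for p in indices:
        q = sigma.get(p, p)          # renamings act as the identity off their domain
        out[p] = eta[q] if q in eta else hole(p)
    return out

eta = {"p": ("term", "M")}
sigma = {"q1": "p", "q2": "p"}       # a duplicating renaming: both q1, q2 map to p
eta_sigma = compose_instantiation_renaming(eta, sigma, ["q1", "q2", "r"])
```

Note how a duplicating renaming makes the single mapping for p available at both q1 and q2, which is exactly the situation arising when a reduction duplicates holes.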
The results above allow us to express what happens when a reduction duplicates the holes in a continuation which is then combined with a context instantiation.

Lemma 3.23. For all auxiliary continuations Q, renamings σ, and permutable context instantiations η such that, for all p ∈ dom(η), supp(η(p)) ∩ dom(σ) = ∅, there exist an auxiliary continuation O and a context instantiation θ such that: …

Remark 3.24. In [Coo09a], Cooper attempts to prove strong normalization for NRCλ using a similar, but weaker result: if K ⇝ C, then for all terms M there exists … Since he does not have branching continuations and renaming reductions, whenever a hole is duplicated, e.g.
he resorts to obtaining a continuation from C simply by filling one of the holes with the term M: … Unfortunately, subsequent proofs rely on the fact that ν(K) must decrease under reduction: since we have no control over ν(M), which could potentially be much greater than ν(K), it may be that ν(K_M) ≥ ν(K), where K_M is the continuation obtained by the filling.
In our setting, by combining Lemmas 3.18 and 3.23, we can find a K′ which is a proper contractum of K. By Lemma 3.12, we get ν(K′) < ν(K), as required by subsequent proofs.
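The role of the measure ν, and the fact that it strictly decreases on proper contracta, can be illustrated on a toy finite reduction relation. The sketch below is ours and only mirrors the standard definition of ν as maximal reduction length; it is not the paper's calculus.

```python
# Toy illustration (ours): for a finite, strongly normalizing reduction
# relation, nu(t) is the length of the longest reduction sequence from t.
# Any proper contractum t' of t then satisfies nu(t') < nu(t), which is the
# property the surrounding proofs rely on.
from functools import lru_cache

REDUCES = {            # a small acyclic reduction relation
    "K": ["K1", "K2"],
    "K1": ["K2"],
    "K2": [],
}

@lru_cache(maxsize=None)
def nu(t):
    """Longest reduction sequence starting from t (0 for normal forms)."""
    return max((1 + nu(u) for u in REDUCES[t]), default=0)
```

The well-founded induction used throughout the section is induction on exactly this kind of measure.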
More generally, the following lemma will help us in performing case analysis on the reduction of an applied continuation.

Lemma 3.25 (classification of reductions in applied continuations). Suppose Qη ⇝ N, where η is permutable and dom(η) ⊆ supp(Q); then one of the following holds:
(1) there exist an auxiliary continuation Q′ and a finite map … and, for all p ∈ dom(η), supp(η(p)) ∩ dom(σ) = ∅: we say that this is a reduction of the continuation Q;
(2) there exist auxiliary continuations Q1, Q2, an index q ∈ supp(Q1), a variable x, and a term … : in this case we say that this is a special reduction of the continuation Q;
(3) there exists a permutable η′ such that N = Qη′ and η ⇝ η′: in this case we say the reduction is within η;
(4) there exist an auxiliary continuation Q0, an index p such that p ∉ supp(Q0) and p ∈ dom(η), a frame F, and a term M such that … : in this case we say the reduction is at the interface.
Furthermore, if Q is a regular continuation K, then the Q′ in case 1 can be chosen to be a regular continuation K′, and case 2 cannot happen.
Proof. By induction on Q, with a case analysis on the reduction rule applied. In case 1, to satisfy the property relating η and σ, we use Lemma 3.18 to generate a σ such that the indices of its domain are fresh with respect to the codomain of η. To see that this partition of reductions is exhaustive, the most difficult part is to check that, whenever we are in the case of a reduction at the interface, there is a suitable F such that Q can be decomposed as Q0 ∘p F; while there are some reduction rules for which we cannot find a suitable F, the structure of Q prevents these from happening at the interface between Q and η: for example, in a reduction (…), (□ L) is not a valid frame; but we do not have to consider this case, because Q cannot be of the form Q0 ∘p (□ L), since the latter is not a valid auxiliary continuation.
• θ is permutable.

Proof sketch. If C has n hole occurrences, we generate n distinct indices p1, …, pn (which we take to be fresh with respect to S and the free variables of the codomain of η) and replace each hole occurrence within C with a different [pi]: this induces a context D and a renaming σ such that Dσ = C. By Lemma 3.22 we prove Cη = (Dσ)η = (Dη^σ)σ (we can apply this lemma thanks to the careful choice of the pi). We take θ := η^σ, and the remaining properties follow easily (the permutability of θ, again, descends from choosing sufficiently fresh indices pi).
Lemma 3.28. Let C1, C2 be contexts and η1, η2 context instantiations. Then, for all free variables x and sets of hole indices S, there exist a context D, a permutable context instantiation θ, and a hole renaming σ such that:
• …
• the holes in D are linear and fresh with respect to S;
• for all q ∈ dom(θ) ∪ dom(σ), [q] ∈ FV(D);
• θ is permutable.

Proof. The proof is by induction on the size of C1, followed by a case analysis on its structure. Here we consider the variable cases, lambda as a template for binder cases, and application as a template for cases with multiple subterms.
• Case C1 = x: by Lemma 3.27, we find D, θ, σ such that Dσ = C2, Dθσ = C2η2, the holes in D are linear and arbitrarily fresh, for all q ∈ dom(θ) ∪ dom(σ) we have [q] ∈ FV(D), and θ is permutable; this proves the thesis.
• Case C1 = λy.C0: we choose y* ∉ {x} ∪ FV(C2η2) such that y* is not a hole; let us define the following abbreviations: … Since C*0 is equal to C0 up to a renaming, it is smaller than C1, and by the induction hypothesis we get that there exist D0, θ0, σ0 such that …, where the holes in D0 are linear and arbitrarily fresh, for all q ∈ dom(θ0) ∪ dom(σ0) we have [q] ∈ FV(D0), and θ0 is permutable. Then we can choose D = λy*.D0, θ = θ0, σ = σ0 and show that: … We can easily show that the other required properties of D, θ, σ are verified, thus proving the thesis.
Furthermore, the induction hypothesis provides enough information on the Di, θi, σi to guarantee that the holes in D are linear and arbitrarily fresh, that for all q ∈ dom(θ) ∪ dom(σ) we have [q] ∈ FV(D), and that θ is permutable, as required.
Lemma 3.29. Let C, C′ be contexts such that C ⇝ C′; then, for all context instantiations η, there exist a context D, a context instantiation θ, and a renaming σ such that Dσ =α C′ (and consequently C ⇝σ D) and Cη ⇝σ Dθ.
Proof. By structural induction on the derivation of C ⇝ C′. The property we need to prove essentially states that any reduction of C can still be performed after applying any instantiation η; however, due to variable capture and the possibility that some redexes of C may be blocked in Cη, the statement is complicated by explicit alpha-conversions and hole renamings. We present here the two interesting cases of the proof: … By repeated applications of Lemma 3.27, we obtain contexts C1, C2, C3, context instantiations η1, η2, η3, and renamings σ1, σ2, σ3, such that {C1 | y ← C3, x ← C2} has linear, arbitrarily fresh holes, for i = 1, 2, 3 we have dom(ηi) ∪ dom(σi) ⊆ FV(Ci) and ηi is permutable, and such that: … We prove: …

The following result, like many others in the rest of this section, proceeds by well-founded induction; we will use the following notation to represent well-founded relations:
• < stands for the standard less-than relation on ℕ, which is well-founded;
• the lexicographic extension of < to k-tuples in ℕᵏ (for a given k) is also well-founded;
• ≺ will be used to provide a decreasing metric that depends on the specific proof: such metrics are defined as subsets of … and are thus well-founded.
Lemma 3.30. Let C be a context and η a context instantiation such that Cη ∈ SN. Then we have: …

Proof. Property 3 follows immediately by induction on ν(Cη), by noticing that, since [p] ∈ FV(C) implies that η(p) appears as a subexpression of Cη, and since reduction is defined by congruence closure, every reduction of η(p) can be mimicked by a corresponding reduction within Cη.
To prove the first two properties, we proceed by well-founded induction on (C, η) using the metric: … We consider all the possible contractions C ⇝ C′. By Lemma 3.29, we find D, σ, θ such that Cη ⇝σ Dθ …

A similar property about the composition of continuations and frames follows immediately.
N by means of a reduction at the interface. By Lemma 3.31 we know ν(Q0) ≤ ν(Q); by Lemma 3.10 we prove Q0 < Q. We take η′ := [p → N]η0: since Qη reduces to Q0η′, and both terms are strongly normalizing, ν(η′) is defined. Then we observe that (Q0, η′, θ) ≺ (Q, η, θ) and obtain the thesis by the induction hypothesis. A symmetric case, with p ∈ dom(θ), is proved similarly.
Proof.By the definition of [p → M ] σ , using Lemma 3.32 to decompose the resulting context instantiation.
3.3. Candidates of reducibility. We here define the notion of candidates of reducibility: sets of strongly normalizing terms enjoying certain closure properties, which can be used to overapproximate the sets of terms of a certain type. Our version of candidates for NRCλ is a straightforward adaptation of the standard definition given by Girard and, like that one, is based on a notion of neutral terms, i.e. those terms that, when placed in an arbitrary context, do not create additional redexes.

Definition 3.34 (neutral term). A term M is neutral if it belongs to the following grammar: … where n ≥ 1. The set of neutral terms is denoted by NT.
Let us introduce the following notation for Girard's CRx properties of sets [GLT89]: … The set CR of the candidates of reducibility is then defined as the collection of those sets of terms which satisfy all the CRx properties. Some standard results include the non-emptiness of candidates (in particular, every free variable belongs to every candidate) and the fact that SN ∈ CR.
3.4. Reducibility sets. In this section we introduce reducibility sets: sets of terms that we will use to provide an interpretation of the types of NRCλ; we will then prove that reducibility sets are candidates of reducibility, hence that they only contain strongly normalizing terms. The following notation will be useful as a shorthand for certain operations on sets of terms that are used to define reducibility sets: … The sets C⊤p and C⊤⊤ are called the ⊤-lifting and ⊤⊤-lifting of C. These definitions refine the ones used in the literature by using indices: ⊤-lifting is defined with respect to a given index p, while the definition of ⊤⊤-lifting uses any index (in the standard definitions, continuations only contain a single hole, and no indices are mentioned).

Definition 3.35 (reducibility). For all types T, the set Red_T of reducible terms of type T is defined by recursion on T by means of the rules: … Let us use metavariables S, S′, … to denote finite sets of indices: we provide a refined notion of ⊤-lifting C⊤S, depending on a set of indices rather than a single index, defined by pointwise intersection. This notation is useful to track a ⊤-lifted candidate under renaming reduction.
Definition 3.36. C⊤S ≜ ⋂_{p∈S} C⊤p.

Definition 3.37. Let C and S be sets of terms and indices, respectively, and σ a finite renaming: then we define (C⊤S)^σ := C⊤σ⁻¹(S), where σ⁻¹(S) = {q : σ(q) ∈ S}.

We now proceed with the proof that all the sets Red_T are candidates of reducibility: we will only focus on collections, since for the other types the result is standard. The proofs of CR1 and CR2 do not differ much from the standard ⊤⊤-lifting technique.
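Definitions 3.36 and 3.37 are purely combinatorial and can be mirrored by a small executable sketch. The sketch is ours: candidates are modelled as mere predicates on continuations, which ignores all the rewriting structure, but it shows the pointwise intersection over S and the preimage action of a renaming.

```python
# Sketch (illustrative, not the paper's formalism): we model membership in a
# ⊤-lifted candidate C⊤p as a predicate on (continuation, index) pairs; the
# refined lifting C⊤S is the pointwise intersection over p ∈ S (Def. 3.36),
# and renaming acts by taking the preimage of S under σ (Def. 3.37).

def preimage(sigma, S):
    """sigma^{-1}(S) = {q : sigma(q) in S}, for a finite renaming as a dict."""
    return {q for q, p in sigma.items() if p in S}

def lifted(member_of_C_top, S):
    """C⊤S as a predicate on continuations: K ∈ C⊤S iff K ∈ C⊤p for all p ∈ S."""
    return lambda K: all(member_of_C_top(K, p) for p in S)

# A toy instance: pretend K ∈ C⊤p whenever p is among K's declared holes.
member = lambda K, p: p in K["holes"]
sigma = {"q1": "p1", "q2": "p2", "q3": "p1"}
```

Under this reading, Lemma 3.42 is immediate: membership in (C⊤S)^σ is checked one hole of σ⁻¹(S) at a time.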
Proof. To prove the lemma, it is sufficient to show that for all M ∈ C we have [p][q → {M}] ∈ SN. This term is equal either to {M} (if p = q) or to [p] (otherwise); both terms are s.n. (in the case of {M}, this is because CR1 holds for C, thus M ∈ SN). This proves the thesis.
In order to prove CR2 for all types (and particularly for collections), we do not need to establish an analogous property on continuations; however, such a property is still useful for subsequent results (particularly CR3). Its statement must, of course, consider that reduction may duplicate (or indeed delete) holes, and thus employs renaming reduction. We can show that whenever we need to prove a statement about n-ary permutable instantiations of n-ary continuations, we can simply consider each hole separately, as stated in the following lemma.
Lemma 3.42. K ∈ (C⊤S)^σ if, and only if, for all q ∈ σ⁻¹(S), we have K ∈ C⊤q. In particular, K ∈ (C⊤p)^σ if, and only if, for all q such that σ(q) = p, we have K ∈ C⊤q.
This is everything we need to prove CR3.

Proof. By definition, we need to prove K[p → M] ∈ SN whenever K ∈ C⊤p for some index p. By Lemma 3.39, knowing that C, being a candidate, is non-empty, we have K ∈ SN. We can thus proceed by well-founded induction on ν(K) to prove the strengthened statement: for all indices q, if K ∈ C⊤q, then K[q → M] ∈ SN. Equivalently, we prove that all the contracta of K[q → M] are s.n., by cases on the possible contracta:
• K′[q → M]^σ (where K ⇝σ K′): to prove that this term is s.n., by Corollary 3.33, we need to show K′[q′ → M] ∈ SN whenever σ(q′) = q; by Lemmas 3.43 and 3.42, we know K′ ∈ C⊤q′, and naturally ν(K′) < ν(K) (Lemma 3.12), thus the thesis follows by the IH.
…

Proof. In this proof, we assume the names of bound variables are chosen so as to avoid duplicates, and to be distinct from the free variables. We proceed by well-founded induction on (K, p, N, L) using the following metric: … Now we show that every contractum must be strongly normalizing:
• K[p → N[L/x]]: this term is s.n. by hypothesis.
• K′[p → {N | x ← {L}}]^σ, where K ⇝σ K′: Lemma 3.12 allows us to prove ν(K′) < ν(K) …; for all q such that σ(q) = p, by means of Lemma 3.30 (because [q → N[L/x]] is a subinstantiation of [p → N[L/x]]^σ), we can apply the IH to obtain K′[q → {N | x ← {L}}] ∈ SN; by Corollary 3.33, this implies the thesis.

Reducibility for conditionals is proved in a similar manner. However, to make the induction work under all the conversions commuting with where, we cannot prove the strong normalization statement within regular continuations K, but need to generalize it to auxiliary continuations. A minor complication with the merging of nested where is handled by a separate lemma. Additionally, due to the more complicated structure of auxiliary continuations, we will need to ensure that the free variables of the Boolean guard of the where expression do not get captured: the assumption uses an auxiliary operation BV, denoting the set of variables bound over holes:

Heterogeneous Collections
SQL allows a user to write queries that evaluate to relations that are bags of tuples, by means of constructs including SELECT statements and UNION ALL operations; additionally, it also allows constructs like SELECT DISTINCT and UNION to produce sets of tuples (more precisely, bags without duplicates); both kinds of constructs can be freely mixed in the same query. In contrast, the language NRCλ we have discussed in the previous sections can only deal with one kind of collection (either sets or bags).
In a short paper [RC19], we introduced a generalization of NRC called NRC(Set, Bag) that makes up for this shortcoming by allowing both set-valued and bag-valued collections (with distinct types denoted by {T} and {|T|}), along with mappings from bags to sets (deduplication δ) and from sets to bags (promotion ι). We conjectured that this language also satisfies a normalization property, allowing its normal forms to be translated to SQL. Here, we prove that NRC(Set, Bag) is, indeed, strongly normalizing, even when extended to a richer language NRCλ(Set, Bag) with higher-order (nonrecursive) functions. Its syntax is a straightforward extension of NRCλ:

types S, T ::= … | {|T|}
terms L, M, N ::= …

We use {|T|} to denote the type of bags containing elements of type T; similarly, the notations {||}, {|M|}, M ⊎ N, and {|M | x ← N|} denote empty and singleton bags, bag disjoint union, and bag comprehension; the language also includes conditionals on bags. The notations ιM and δN stand, respectively, for the bag containing exactly one copy of each element of the set M, and for the set containing the elements of the bag N, forgetting their multiplicity. We do not need to provide a primitive emptiness test for bags, since it can be defined anyway as empty_bag M := empty (δM).
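The intended semantics of δ and ι can be illustrated with a small executable multiset model. This is our own sketch (sets as frozensets, bags as `collections.Counter` multisets), not part of NRCλ(Set, Bag) itself.

```python
# Illustrative semantics sketch (names are ours, not the paper's):
# delta deduplicates a bag into a set, iota promotes a set to a bag with
# multiplicity one per element, and bag union adds multiplicities.
from collections import Counter

def delta(bag: Counter) -> frozenset:
    """δ: forget multiplicities, yielding a set."""
    return frozenset(bag.keys())

def iota(s: frozenset) -> Counter:
    """ι: one copy of each set element, as a bag."""
    return Counter({x: 1 for x in s})

def bag_union(m: Counter, n: Counter) -> Counter:
    """Bag disjoint union ⊎ adds multiplicities."""
    return m + n

def empty_bag(m: Counter) -> bool:
    """Derived emptiness test: empty_bag M := empty (δ M)."""
    return len(delta(m)) == 0

b = bag_union(Counter({1: 2, 2: 1}), Counter({2: 3}))
```

The derived `empty_bag` shows why no primitive bag-emptiness test is needed: a bag is empty exactly when its deduplication is.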
The type system for NRCλ(Set, Bag) is obtained from the one for NRCλ by adding the unsurprising rules of Figure 4: these largely replicate, at the bag level, the corresponding set-based rules; additionally, the rules for δ and ι describe how these operators turn bag-typed terms into set-typed ones, and vice versa. Similarly, the rewrite system for NRCλ(Set, Bag) is an extension of the one for NRCλ, with additional reduction rules for the new bag operators that mimic the corresponding set-based operations; there are also simplification rules involving δ, stating that the deduplication of empty or singleton bags yields empty or singleton sets, and that deduplication commutes with bag union and comprehension. …

The example we have shown of this phenomenon is: … The reason for this discrepancy lies in the fact that, while beta reduction yields a substitution replacing z with N, once this substitution meets the hole [p], it is completely lost. If we replaced the meta-operation of substitution with new syntax L⟨x := M⟩, denoting a (suspended) explicit substitution that will eventually replace all the free occurrences of x within L with M, we could write: … where the final term correctly reduces to N. Holes with explicit substitutions have been studied in the context of dependently-typed lambda calculi, where they are more often known as metavariables, with applications to proof assistants [Muñ01]. We could study strong normalization in such an extended calculus; however, explicit substitutions are known to require a careful treatment of reduction in order to simultaneously preserve confluence and strong normalization (see [Mel95] for a counterexample); more recent explicit substitution calculi (e.g. [DG01, KL05]) often employ ideas from linear logic to ensure strong normalization is preserved. Another approach, introduced by Bognar and De Vrijer, employs a context calculus [BdV01], i.e.
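The δ simplification rules described above can be sanity-checked against the multiset semantics. The following check is our own, illustrative code (reusing a `Counter`-based model of bags), not the paper's rewrite system.

```python
# A small sanity check (ours) that the δ simplification rules are sound for a
# multiset semantics: δ maps empty/singleton bags to empty/singleton sets,
# and commutes with bag union.
from collections import Counter

delta = lambda bag: frozenset(bag.keys())

def check_delta_rules(m: Counter, n: Counter) -> bool:
    empty_ok = delta(Counter()) == frozenset()
    singleton_ok = all(delta(Counter({x: 1})) == frozenset({x}) for x in m)
    union_ok = delta(m + n) == delta(m) | delta(n)   # δ(M ⊎ N) = δM ∪ δN
    return empty_ok and singleton_ok and union_ok
```

Such semantic checks justify the rewrite rules as equivalences, but of course say nothing about termination, which is what the strong normalization proof establishes.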
an extension of the lambda calculus with additional operators to express context-building and instantiation, along with interfaces describing the evolution of contexts under reduction ("communication"). Under this approach, the context (λz.[p]) N would be expressed as δ[p].(λz.[p]⟨z⟩) N, where the operator δ[p].− (unrelated to the deduplication operator of Section 5) builds a context by abstracting over a hole variable [p], and the syntax [p]⟨z⟩ expresses the fact that, once [p] is instantiated with a term, this term will communicate with the context by means of the (captured) variable z. The term z to be plugged into the context would be represented as Λz.z, where the abstraction Λz.− is provided to express the fact that this term can communicate with the context over the variable z. To apply this term to the context, we use the syntax −⟨⟨−⟩⟩: (δ[p].(λz.[p]⟨z⟩) N)⟨⟨Λz.z⟩⟩. Here as well, we are allowed to beta-reduce the context both in the unapplied and in the applied form: δ[p].(λz.[p]⟨z⟩) N ⇝ δ[p].[p]⟨N⟩ and (δ[p].(λz.[p]⟨z⟩) N)⟨⟨Λz.z⟩⟩ ⇝ (δ[p].[p]⟨N⟩)⟨⟨Λz.z⟩⟩, where the final term can be further reduced to (Λz.z) N, and finally to N, as expected. Like explicit substitutions, the context calculus allows contexts to be reduced independently of an applied instantiation, potentially simplifying technical results such as those of Lemmas 3.19 and 3.29. Both techniques require fairly substantial extensions to the language, type system, and rewrite system, and will be considered in future work.

Since the fundamental work of Wong and others on the Kleisli system, language-integrated query has gradually made its way into other systems, most notably Microsoft's .NET framework languages C# and F# [MBB06], and the Web programming language Links [CLWY07]. Cheney et al. [CLW13] formally investigated the F# approach to language-integrated query and showed that normalization results due to Wong and Cooper could be adapted to improve it further; however, their work considered only homogeneous collections. In subsequent work, Cheney et al. [CLW14] showed how to use normalization to perform query shredding for multiset queries, in which a query returning a type with n nested collections can be implemented by combining the results of n flat queries; this has been implemented in Links [CLWY07].
Higher-order relational queries have also been studied by Benedikt et al. [BPV15], where the focus was mostly on the complexity of the evaluation and containment problems. Their calculus focuses on higher-order expressions composing operations over flat relational algebra operators only, where the base types are records listing the fields of the relations. Thus, modulo notational differences, their calculus is a sublanguage of NRC. In their setting, normalization up to β-reduction follows as a special case of normalization for the typed lambda calculus; in our setting, the same approach would not work, because collection and record types can be combined arbitrarily in NRC, and normalization involves rules that nontrivially rearrange comprehensions and other collection operations.
Several recent efforts to formalize and reason about the semantics of SQL are complementary to our work. Guagliardo and Libkin [GL17] presented a semantics for SQL's actual behaviour in the presence of set and multiset operators (including bag intersection and difference) as well as incomplete information (nulls), and related the expressiveness of this fragment of SQL to that of an algebra over bags with nulls. Chu et al. [CWCS17] presented a formalized semantics for reasoning about SQL (including set and bag semantics as well as aggregation/grouping, but excluding nulls) using nested relational queries in Coq, while Benzaken and Contejean [BC19] presented a semantics including all of these SQL features (set, multiset, aggregation/grouping, nulls), formalized in Coq. Kiselyov et al. [KK17] have proposed language-integrated query techniques that handle sorting operations (SQL's ORDER BY).
However, the above work on semantics has not considered query normalization, and, to the best of our knowledge, normalization results for query languages with more than one collection type were previously unknown even in the first-order case. We are interested in extending our results for mixed set and bag semantics to handle nulls, grouping/aggregation, and sorting, thus extending higher-order language-integrated query to cover all of the most widely-used SQL features. Normalization of higher-order queries in the presence of all of these features simultaneously remains an open problem, which we plan to consider next. In addition, fully formalizing such normalization proofs also appears to be a nontrivial challenge.

Conclusions
Integrating database queries into programming languages has many benefits, such as type safety and the avoidance of common SQL injection attacks, but also imposes limitations that prevent programmers from constructing queries dynamically, as they could by (unsafely) concatenating SQL strings. Previous work has demonstrated that many useful dynamic queries can be constructed safely using higher-order functions inside language-integrated queries; provided such functions are not recursive, it was believed that query expressions can be normalized. Moreover, while it is common in practice for language-integrated query systems to provide support for SQL features such as mixed set and bag operators, it is not well understood in theory how to normalize these queries in the presence of higher-order functions. Previous work on higher-order query normalization has considered only homogeneous (that is, pure set or pure bag) queries, and in the process of attempting to generalize this work to a heterogeneous setting, we discovered a nontrivial gap in the previous proof of strong normalization. We therefore prove strong normalization for both homogeneous and heterogeneous queries for the first time.
As next steps, we intend to extend the Links implementation of language-integrated query with heterogeneous queries and normalization, and to investigate (higher-order) query normalization and conservativity for the remaining common SQL features, such as nulls, grouping/aggregation, and ordering.

Figure 3: Query normalization (congruence closure of ⇝)

Definition 3.1 (context). Let us fix a countably infinite set P of indices: a context C is a term that may contain distinguished free variables [p], also called holes, where p ∈ P. Holes are never bound by any of the binders (we disallow terms of the form λ[p].M or {M | [p] ← N}). Given a finite map from indices to terms [p1 → M1, …, pn → Mn] (context instantiation), the notation C[p1 → M1, …, pn → Mn] (context application) denotes the term obtained by simultaneously plugging M1, …, Mn into the holes [p1], …, [pn]. Notice that the Mi are allowed to contain holes. We will use metavariables η, θ to denote context instantiations.

Definition 3.2 (support). Given a context C, its support supp(C) is defined as the set of the indices p such that [p] occurs in C: supp(C) ≜ {p : [p] ∈ FV(C)}.
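The simultaneous plugging of Definition 3.1 can be illustrated with a small sketch. The sketch is ours (terms modelled as nested tuples, holes as tagged pairs); the key point it demonstrates is that the Mi are plugged simultaneously, so holes occurring inside a plugged term are left untouched by the same instantiation.

```python
# Illustrative sketch of context application (Definition 3.1): instantiation
# plugs all listed holes of C simultaneously, so terms substituted for one
# hole are NOT re-scanned for the other holes (the M_i may contain holes).

def apply_context(term, inst):
    """Simultaneously replace each hole [p] with inst[p], leaving other holes."""
    if isinstance(term, tuple) and term and term[0] == "hole":
        return inst.get(term[1], term)       # plug, but do not recurse into M_i
    if isinstance(term, tuple):
        return tuple(apply_context(t, inst) for t in term)
    return term

# C = union of hole p and hole q; plugging p with a term that itself mentions q
C = ("union", ("hole", "p"), ("hole", "q"))
result = apply_context(C, {"p": ("singleton", ("hole", "q"))})
```

Here the hole q inside the plugged term survives, exactly as allowed by "the Mi are allowed to contain holes".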

Lemma 3.39 (CR1 for continuations). For all p and all non-empty C, C⊤p ⊆ SN.

Proof. We assume K ∈ C⊤p and M ∈ C: by definition, we know that K[p → {M}] ∈ SN; then we have K ∈ SN by Lemma 3.30.

Lemma 3.40 (CR1 for collections). If CR1(C), then CR1(C⊤⊤).

Proof. We need to prove that if M ∈ C⊤⊤, then M ∈ SN. By the definition of C⊤⊤, we know that for all p, K[p → M] ∈ SN whenever K ∈ C⊤p. Now assume any p and, by Lemma 3.38, choose K = [p]: then K[p → M] = M ∈ SN, which proves the thesis.

Lemma 3.41 (CR2 for collections). If M ∈ C⊤⊤ and M ⇝ M′, then M′ ∈ C⊤⊤.

Proof. Let p be an index, and take K ∈ C⊤p: we need to prove K[p → M′] ∈ SN. By the definition of M ∈ C⊤⊤, we have K[p → M] ∈ SN; if p ∉ supp(K), then K[p → M′] = K[p → M] and the thesis trivially holds; otherwise the instantiation is effective and we have K[p → M] ⇝ K[p → M′], and this last term, being a contractum of a strongly normalizing term, is strongly normalizing as well.

Lemma 3.44 (CR3 for collections). Let C ∈ CR, and M a neutral term such that for all reductions M ⇝ M′ we have M′ ∈ C⊤⊤. Then M ∈ C⊤⊤.
• K[p → M′] (where M ⇝ M′): this is s.n. because M′ ∈ C⊤⊤ by hypothesis.
• Since M is neutral, there are no reductions at the interface.

Theorem 3.45. For all types T, Red_T ∈ CR.

Proof. Standard, by induction on T. For T = {T′}, we use Lemmas 3.40, 3.41, and 3.44.
… and all other metrics do not increase), we prove K[p → {Ni | x ← {L}}] ∈ SN (for i = 1, 2), and consequently obtain the thesis by Lemma 4.4.
• K0[p → {{M | y ← N} | x ← {L}}], where K = K0 ∘p {M | y ← □}: since we know, by the hypothesis on the choice of bound variables, that x ∉ FV(M), we note that K0[p → {M | y ← N}[L/x]] = K[p → N[L/x]]; furthermore, by Lemma 3.10 we know K0 < K; then we can apply the IH to obtain the thesis.
• K0[p → {where B do N | x ← {L}}] (when K = K0 ∘p where B do □): since we know, from the hypothesis on the choice of bound variables, that x ∉ FV(B), we note that K0[p → (where B do N)[L/x]] = K[p → N[L/x]]; furthermore, by Lemma 3.10 we know K0 < K; then we can apply the IH to obtain the thesis.
• Reductions within N or L follow from the IH, since they decrease the induction metric.
Lemma 4.8 (reducibility for comprehensions). Assume CR1(C), CR1(D), M ∈ C⊤⊤, and, for all L ∈ C, N[L/x] ∈ D⊤⊤. Then {N | x ← M} ∈ D⊤⊤.

Proof. We assume p and K ∈ D⊤p, and prove K[p → {N | x ← M}] ∈ SN. We start by showing that K′ = K ∘p {N | x ← □} ∈ C⊤p, or equivalently that, for all L ∈ C, K′[p → {L}] = K[p → {N | x ← {L}}] ∈ SN: since CR1(C), we know L ∈ SN, and since N[L/x] ∈ D⊤⊤, K[p → N[L/x]] ∈ SN; then we can apply Lemma 4.7 to obtain K′[p → {L}] ∈ SN and consequently K′ ∈ C⊤p. But then, since M ∈ C⊤⊤, we have K′[p → M] = K[p → {N | x ← M}] ∈ SN, which is what we needed to prove.