Galois connecting call-by-value and call-by-name

We establish a general framework for reasoning about the relationship between call-by-value and call-by-name. In languages with computational effects, call-by-value and call-by-name executions of programs often have different, but related, observable behaviours. For example, if a program might diverge but otherwise has no effects, then whenever it terminates under call-by-value, it terminates with the same result under call-by-name. We propose a technique for stating and proving properties like these. The key ingredient is Levy's call-by-push-value calculus, which we use as a framework for reasoning about evaluation orders. We show that the call-by-value and call-by-name translations of expressions into call-by-push-value have related observable behaviour under certain conditions on computational effects, which we identify. We then use this fact to construct maps between the call-by-value and call-by-name interpretations of types, and identify further properties of effects that imply these maps form a Galois connection. These properties hold for some computational effects (such as divergence), but not others (such as mutable state). This gives rise to a general reasoning principle that relates call-by-value and call-by-name. We apply the reasoning principle to example computational effects including divergence and nondeterminism.


Introduction
Suppose that we have a language in which terms can be statically tagged either as using call-by-value evaluation or as using call-by-name evaluation. Each program in this language would therefore use a mix of call-by-value and call-by-name at runtime. Given any such program M, we can construct a new program M′ by changing call-by-value to call-by-name for some subterm. The question we consider in this paper is: what is the relationship between the observable behaviour of M and the observable behaviour of M′?
For a language with computational effects (such as divergence), changing the evaluation order in this way will in general change the behaviour of the program, but for some effects we can often say something about how we expect the behaviour to change:
• If there are no effects at all (in particular, programs are normalizing), the choice of evaluation order is irrelevant: M and M′ terminate with the same result.
• If there are diverging terms (for instance, via recursion), then the behaviour may change: a program might diverge under call-by-value and return a result under call-by-name. However, we can say something about how the behaviour changes: if M terminates with some result, then M′ terminates with the same result.
• If nondeterminism is the only effect, every result of M is a possible result of M′.
These three instances of the problem are intuitively obvious, and each can be proved separately. We develop a general technique for proving these properties.
The idea is to use a calculus that captures both call-by-value and call-by-name, as a setting in which we can reason about both evaluation orders (this is where M and M′ live). The calculus we use is Levy's call-by-push-value (CBPV) [Lev99]. Levy describes how to translate (possibly open) expressions e into CBPV terms V e and N e, which respectively correspond to call-by-value and call-by-name. We study the relationship between the behaviour of V e and the behaviour of N e in a given program context.
The main obstacle is that V e and N e have different types: the former has a "call-by-value type" F (V τ) and the latter a "call-by-name type" N τ, defined in Section 2.1. They therefore cannot be compared directly. Our solution is inspired by Reynolds's work relating direct and continuation semantics of the λ-calculus [Rey74].
The first step is to define a family of (set-theoretic) relations (in the style of a logical relation) that compare the observable behaviour of a term of call-by-value type with the observable behaviour of a term of call-by-name type. We can then ask whether V e is related in this sense to N e. This is not the case in general: in the presence of arbitrary computational effects, we cannot expect to say anything useful about how the behaviour of V e relates to the behaviour of N e. However, under certain conditions, which only some effects satisfy, V e is related to N e. These conditions say roughly that we can discard, duplicate, and reorder effects. The main result of the first step is a theorem relating the two translations of e when these conditions hold (Theorem 4.7). This does not quite say what happens if we were to replace call-by-value with call-by-name within some program; that is the goal of the second step.
The second step is to identify maps between the call-by-value and call-by-name interpretations, forming Galois connections (one for each source-language type) between the two interpretations. We compose these maps with the translations of expressions, to arrive at two terms that can be compared directly. For this step we assume a stronger condition on computational effects than in the first, saying informally that effects can be thunked. Under this condition we show that the maps between call-by-value and call-by-name represent the relations from the first step. By combining this fact with Theorem 4.7 we prove a result that directly relates the two terms we construct by composition with the Galois connections.
We therefore arrive at a general reasoning principle (Theorem 6.2) that we use to compare call-by-value with call-by-name. Given any preorder ≼ that captures the property we wish to show about programs, our reasoning principle gives conditions that imply M ≼ M′, where M′ is constructed as above by replacing call-by-value with call-by-name. We apply our reasoning principle to examples by choosing different relations ≼; each of these relations indicates the extent to which changing evaluation order affects the behaviour of the program. In the divergence example, N ≼ N′ is defined to mean that termination of N implies termination of N′ with the same result; in the other examples ≼ similarly mirrors the properties described informally above.
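The defining law of a Galois connection can be illustrated concretely. The sketch below (Python; a hypothetical toy connection between two copies of the integers, not the connection constructed in this paper) checks the adjunction law f(x) ≤ y iff x ≤ g(y) on sample points.

```python
# A Galois connection between posets (X, <=) and (Y, <=) is a pair of
# monotone maps f : X -> Y and g : Y -> X satisfying
#     f(x) <= y   iff   x <= g(y).
# Toy instance on the integers (illustrative only): f doubles, g halves,
# rounding down.

def f(x):
    return 2 * x

def g(y):
    return y // 2  # floor division

# Check the adjunction law on a grid of sample points.
for x in range(-10, 11):
    for y in range(-10, 11):
        assert (f(x) <= y) == (x <= g(y))
```

The paper's connections relate call-by-value and call-by-name interpretations of each type rather than integers, but they are verified against the same law.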
• We define a family of relations for comparing the observable behaviour of a term of call-by-value type with that of a term of call-by-name type (Section 4). We prove that, for effects satisfying certain conditions, the call-by-value and call-by-name translations of expressions are related by these (Theorem 4.7). As a corollary, we directly relate the call-by-value and call-by-name translations of closed expressions of type bool (Corollary 4.8).
• We define the Galois connections between the call-by-value and call-by-name translations (Section 5), and show that they represent the relations from the first step (Lemma 5.8).
• We use the Galois connections to prove a novel reasoning principle (Theorem 6.2) that relates the call-by-value and call-by-name translations of expressions (Section 6).
We apply our reasoning principle to three different examples: no effects, divergence, and nondeterminism. In this way we establish all three of the facts listed at the beginning of this introduction. Our motivation is partly to demonstrate the Galois connection technique as a way of reasoning about different semantics of a given language. Call-by-value and call-by-name is one example of this (and Reynolds's original application to direct and continuation semantics is another). This paper is a revised and extended version of [MM22]. The primary difference is the addition of Section 4, containing the first step outlined above. The conference version [MM22] skips this step and goes directly to the Galois connections. The first step in particular enables us to prove a statement about closed terms of type bool (Corollary 4.8) under weaker assumptions than in the conference version [MM22, Corollary 22]. We also add an extra example (immutable state), add products to the source language, and include more detailed proofs than in the conference version.

Vol. 20:1 GALOIS CONNECTING CALL-BY-VALUE AND CALL-BY-NAME
We give an operational semantics for CBPV. This consists of a big-step evaluation relation M ⇓ R, which means the computation M evaluates to R. Here R ranges over terminal computations, which are the subset of computations with an introduction form on the outside:

R ::= λ{1. M1, 2. M2} | λx : A. M | return V

We only evaluate closed, well-typed computations, so when we write M ⇓ R we assume M : C for some C (this implies R : C). Reduction therefore cannot get stuck. The rules defining ⇓ are given in Figure 2. All terminal computations evaluate to themselves. Products of computations are lazy: to evaluate a projection i'M, only the ith component of the pair M is evaluated. Since we have not yet included any way of forming impure computations, the semantics is deterministic and normalizing: given any M : C, there is exactly one terminal computation R such that M ⇓ R. Section 2.2 extends the semantics in ways that violate these properties. We are primarily interested in evaluating computations of returner type.
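The big-step rules can be sketched as a runnable evaluator for a closed fragment of CBPV (returns, sequencing, thunks and force, functions, lazy pairs of computations). The tuple-based constructors below are our own illustrative encoding, not the paper's formal syntax.

```python
# Environment-based big-step evaluator for a fragment of CBPV.
# Computations evaluate to terminal computations; thunks and lambdas
# close over their environment.

def eval_val(v, env):
    tag = v[0]
    if tag == "var":
        return env[v[1]]
    if tag == "thunk":                   # thunks are values; close over env
        return ("thunk", v[1], env)
    return v                             # true, false, ...

def eval_comp(m, env):
    tag = m[0]
    if tag == "return":                  # return V is terminal
        return ("return", eval_val(m[1], env))
    if tag == "to":                      # M to x. N: evaluate M, bind result
        _, mm, x, n = m
        r = eval_comp(mm, env)           # r has the form ("return", value)
        return eval_comp(n, {**env, x: r[1]})
    if tag == "lam":                     # λx. M is terminal
        _, x, body = m
        return ("clo", x, body, env)
    if tag == "apply":                   # V ' M: pass value V to computation M
        _, v, mm = m
        _, x, body, cenv = eval_comp(mm, env)
        return eval_comp(body, {**cenv, x: eval_val(v, env)})
    if tag == "force":                   # force a thunked computation
        _, th, tenv = eval_val(m[1], env)
        return eval_comp(th, tenv)
    if tag == "cpair":                   # λ{1.M1, 2.M2} is terminal (lazy)
        return ("cpair", m[1], m[2], env)
    if tag == "proj":                    # i'M evaluates only component i
        _, i, mm = m
        r = eval_comp(mm, env)
        return eval_comp(r[i], r[3])
    raise ValueError(tag)

# (return true) to x. return x  evaluates to  return true:
assert eval_comp(("to", ("return", ("true",)), "x",
                  ("return", ("var", "x"))), {}) == ("return", ("true",))
# Forcing a thunk runs the thunked computation:
assert eval_comp(("force", ("thunk", ("return", ("true",)))), {}) \
       == ("return", ("true",))
# Laziness of computation pairs: projecting component 2 only evaluates M2.
proj2 = ("proj", 2, ("cpair", ("return", ("true",)), ("return", ("false",))))
assert eval_comp(proj2, {}) == ("return", ("false",))
```

The semantics is deterministic and normalizing exactly as stated above, since no effectful constructs are included yet.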
A CBPV program is a closed computation M : F bool. The reasoning principle we give for call-by-value and call-by-name relates open terms in program contexts. A program relation consists of a preorder² ≼ on programs. For example, we could use one of the program relations defined in the examples of Section 2.2 below. We could also use, for example, the total relation for ≼ (and in this case apply our reasoning principle for call-by-value and call-by-name even if we include e.g. mutable state as a side effect; but then of course the conclusion of our reasoning principle would be trivial). Given any program relation ≼, we define a contextual preorder M ≼Γctx M′ on arbitrary well-typed computations (in typing context Γ) by considering the behaviour of M and M′ in programs, as follows. A computation context E is a computation term with a single hole □ where a computation term is expected. We write E[M] for the computation that results from replacing □ with M (which may capture some of the free variables of M). For example, if E is the computation context N to x. □, then E[return x] is the computation N to x. return x, where x is captured. We use computation contexts to define ≼Γctx.

Definition 2.1 (Contextual preorder). Suppose that ≼ is a program relation, and that Γ ⊢c M : C and Γ ⊢c M′ : C are two computations of the same type. We write M ≼Γctx M′ when E[M] ≼ E[M′] for every computation context E such that E[M] and E[M′] are programs. We write M ≃Γctx M′, and say that M and M′ are contextually equivalent, when both M ≼Γctx M′ and M′ ≼Γctx M hold. We sometimes omit Γ, and write just ≼ctx and ≃ctx.

2.1. Call-by-value and call-by-name. We use CBPV (instead of e.g. Moggi's monadic metalanguage [Mog91]) because it captures both call-by-value and call-by-name in a strong sense (see the introduction of [Lev99] for a detailed discussion of this). Levy [Lev99] gives two compositional translations from a source language into CBPV: one for call-by-value and one for call-by-name. We recall both translations in this section; our goal is to reason about the relationship between them.
For the source language, we use the following syntax of types τ and expressions e:

τ ::= unit | bool | τ1 × τ2 | τ → τ′
e ::= x | () | (e1, e2) | fst e | snd e | true | false | if e0 then e1 else e2 | λx : τ. e | e e′

We include two base types unit and bool to be used in examples.³ The source language has a typing judgement of the form Γ ⊢ e : τ, defined by the usual rules. The two translations from the source language to CBPV are defined in Figure 3. For call-by-value, each source language type τ is mapped to a CBPV value type V τ that contains the results of call-by-value computations. For call-by-name, τ is translated to a computation type N τ, which contains the computations themselves. Products in call-by-value use the value-type products of CBPV (which means they are necessarily strict: both components of a pair are always evaluated). For call-by-name we give a lazy interpretation of binary products, using products of CBPV computation types. (Though note that we do not interpret unit as a nullary product of computation types. We instead treat unit as a base type, so that effects can happen at type unit, which matches typical functional languages.) Functions under the call-by-value translation accept values of type V τ as arguments; arguments are evaluated before being passed to the function. Under the call-by-name translation, functions accept thunks of computations as arguments; instead of being evaluated, arguments are thunked before being passed to call-by-name functions. Source-language typing contexts Γ are translated to CBPV typing contexts V Γ and N Γ: in call-by-value they contain values, in call-by-name they contain thunks of computations. Source-language expressions e are mapped to CBPV computations V e and N e. The translation uses some auxiliary program variables, which are assumed fresh.
For call-by-value we arbitrarily choose left-to-right evaluation for both pairing and function application.Under the call-by-name translation, computational effects occur only at the base types unit and bool (since this is where the returner types appear).
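The two translations can be made concrete as functions on syntax trees. The sketch below covers a fragment (variables, booleans, λ, application); the tuple constructors are our own illustrative encoding of CBPV terms, not the paper's formal syntax.

```python
# Levy's call-by-value (V) and call-by-name (N) translations, sketched on a
# tiny expression fragment. CBPV terms are nested tuples.

def cbv(e):
    tag = e[0]
    if tag == "var":                     # V x = return x
        return ("return", e)
    if tag in ("true", "false"):
        return ("return", e)
    if tag == "lam":                     # V (λx.e) = return thunk (λx. V e)
        return ("return", ("thunk", ("lam", e[1], cbv(e[2]))))
    if tag == "app":                     # V (e e') = V e to y. V e' to z. z ' force y
        return ("to", cbv(e[1]), "y",
                ("to", cbv(e[2]), "z",
                 ("apply", ("var", "z"), ("force", ("var", "y")))))
    raise ValueError(tag)

def cbn(e):
    tag = e[0]
    if tag == "var":                     # N x = force x (variables hold thunks)
        return ("force", e)
    if tag in ("true", "false"):
        return ("return", e)
    if tag == "lam":                     # N (λx.e) = λx. N e
        return ("lam", e[1], cbn(e[2]))
    if tag == "app":                     # N (e e') = thunk (N e') ' N e
        return ("apply", ("thunk", cbn(e[2])), cbn(e[1]))
    raise ValueError(tag)

ident_app = ("app", ("lam", "x", ("var", "x")), ("true",))
assert cbv(ident_app) == (
    "to", ("return", ("thunk", ("lam", "x", ("return", ("var", "x"))))), "y",
    ("to", ("return", ("true",)), "z",
     ("apply", ("var", "z"), ("force", ("var", "y")))))
assert cbn(ident_app) == (
    "apply", ("thunk", ("return", ("true",))),
    ("lam", "x", ("force", ("var", "x"))))
```

Note how the call-by-value translation sequences the argument with `to` before the call, while the call-by-name translation passes the argument as an unevaluated thunk.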
Of course, we have to justify that these translations actually capture call-by-value and call-by-name. There are two semantics of interest for the source language: a call-by-value semantics (that evaluates left-to-right), and a call-by-name semantics (with lazy products). Since we consider the observable behaviour of CBPV terms, the properties we want are that if the call-by-value translations V e and V e′ have the same observable behaviour, then e and e′ have the same observable behaviour with respect to the call-by-value semantics, and similarly for call-by-name. Levy [Lev99] proves both of these properties (though without products in the source language). We take this as the required justification, and do not give the details.

2.2. Examples. We consider three collections of (allowable) effects as examples throughout the paper.
Example 2.2 (No effects). We include the simplest possible example: the case where there are no computational effects at all. For this example, call-by-value and call-by-name turn out to have identical behaviour. We define the program relation M ≼pure M′ (for closed computations M, M′ : F bool) as:

M ≼pure M′ ⇔ there exists V such that M ⇓ return V and M′ ⇓ return V

In other words, M and M′ both evaluate to the same result V. Since evaluation is deterministic, V is necessarily unique. The contextual preorder M ≼Γctx M′ means that if we construct two programs by wrapping M and M′ in the same computation context, then the two programs evaluate to the same result.

³ Unlike in Levy [Lev99], we do not include general sum types, only bool. We expect that including arbitrary sum types would complicate Section 4, because it is difficult to extend logical relations of varying arity with sums. The difficulty, and techniques for dealing with it, are discussed e.g. in [AS19, FS99, Kat08].
(Figure 3, excerpt: the call-by-value translation of expressions)
fst e ↦ V e to z. match z with (z1, z2). return z1
snd e ↦ V e to z. match z with (z1, z2). return z2
true ↦ return true
false ↦ return false
if e0 then e1 else e2 ↦ V e0 to z. if z then V e1 else V e2
λx : τ. e ↦ return thunk λx : V τ. V e
e e′ ↦ V e to y. V e′ to z. z ' force y

Example 2.3 (Divergence). For our second example, the only effect is divergence (via recursion). In this case, call-by-value and call-by-name do not have identical behaviour (they are not related by ≼ctx as it is defined in our no-effects example). We instead show that replacing call-by-value with call-by-name does not change a terminating program into a diverging one. We extend our two languages with recursion. For CBPV we extend the syntax of computations with fixed points rec x : U C. M, and correspondingly extend the type system and operational semantics: rec x : U C. M has type C whenever M has type C in the context extended with x : U C, and rec x : U C. M ⇓ R whenever M[thunk (rec x : U C. M)/x] ⇓ R. The variable x is bound to a thunk of the recursive computation, so recursion is done by forcing x. (This is not the only way to add recursion to CBPV [DCL18], but is the most convenient for our purposes.) Of course, by adding recursion we lose normalization (but the semantics is still deterministic). We extend the source language, and the two translations into CBPV, with recursive functions. Again, the translations are the same as those given by Levy [Lev99], except that Levy has general fixed points for call-by-name, rather than just recursive functions. The expression Ωτ = ((rec f : bool → τ. λx. f x) false) : τ enables us to distinguish between call-by-value and call-by-name: (λx : τ. true) Ωτ diverges in call-by-value but not in call-by-name. In particular, we have N (λx : τ. true) Ωτ ⇓ return true, but there is no R such that V (λx : τ. true) Ωτ ⇓ R.
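The distinguishing example can be replayed in a small denotational sketch, modelling divergence with an option type (None stands for a diverging computation; this is an illustrative model, not the paper's ωCpo semantics).

```python
# Divergence modelled with an option type: None means "diverges",
# ("just", v) means "returns v".

OMEGA = None                          # denotation of the diverging expression Ω

def cbv_apply(f, arg):
    # Call-by-value: evaluate the argument first; diverge if it diverges.
    if arg is None:
        return None
    return f(("just", arg[1]))

def cbn_apply(f, arg):
    # Call-by-name: pass the (possibly diverging) argument unevaluated.
    return f(arg)

const_true = lambda _arg: ("just", True)   # denotation of λx. true

assert cbv_apply(const_true, OMEGA) is None             # diverges under CBV
assert cbn_apply(const_true, OMEGA) == ("just", True)   # returns true under CBN
```

This exhibits one direction of the relationship ≼div formalizes: whenever the call-by-value side terminates, the call-by-name side terminates with the same result, but not conversely.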
For this example, we define the program relation ≼div by:

M ≼div M′ ⇔ for all V, if M ⇓ return V then M′ ⇓ return V

The contextual preorder then captures the property that if a program containing M terminates with some result, then the same program with M′ instead of M terminates with the same result.
Example 2.4 (Nondeterminism). For our third example, we consider finite nondeterminism. Again call-by-value and call-by-name have different behaviour, but any result of a call-by-value execution is also a result of a call-by-name execution (if suitable nondeterministic choices are made).
We consider CBPV without recursion, but augmented with computations fail C for nullary nondeterministic choice and M or N for binary nondeterministic choice between computations; the typing and evaluation rules are standard: fail C has computation type C, M or N has type C whenever M and N both do, M or N evaluates to R whenever either M or N does, and there is no evaluation rule for fail C. The computation fail C is the unit for or, so fail C or M and M or fail C have the same behaviour as M. For each closed computation M : F A there might be zero, one, or several values V : A such that M ⇓ return V.
We similarly include nullary and binary nondeterminism in the source language, and extend the call-by-value and call-by-name translations accordingly. As an example, evaluating the expression e = (λx. if x then x else true) (true or false) under call-by-value necessarily results in true, but under call-by-name we can also get false.
(We have V e ̸⇓ return false but N e ⇓ return false.) For nondeterminism, we define ≼nd in the same way as our divergence example:

M ≼nd M′ ⇔ for all V, if M ⇓ return V then M′ ⇓ return V

This captures the property that any result that arises from an execution of M (which may involve call-by-value) might arise from an execution of M′ (which may involve call-by-name).
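The example expression can be checked in a small denotational sketch using the finite powerset monad, where a computation denotes the set of its possible results (an illustrative model; the paper's order-enriched version appears in Section 3).

```python
# Finite nondeterminism via the powerset monad.

def ret(v):
    return frozenset([v])

def bind(s, f):                           # Kleisli extension for powerset
    return frozenset(x for v in s for x in f(v))

coin = frozenset([True, False])           # denotation of: true or false

def body(xcomp):
    # if x then x else true, where x denotes the computation xcomp
    return bind(xcomp, lambda b: (xcomp if b else ret(True)))

# CBV: evaluate the argument once, then substitute the resulting value.
cbv_results = bind(coin, lambda v: body(ret(v)))
# CBN: substitute the unevaluated choice; each use of x re-runs it.
cbn_results = body(coin)

assert cbv_results == frozenset([True])
assert cbn_results == frozenset([True, False])
assert cbv_results <= cbn_results   # every CBV result is a CBN result (≼nd)
```

The final subset check is exactly the informal statement above: any call-by-value result might arise under call-by-name, but not conversely.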
Example 2.5 (Immutable state). Finally, we consider the basic languages enriched with an extra construct for reading the value of an immutable state whose value is either true or false. Once again we do not expect there to be any difference between call-by-value and call-by-name, and it is indeed the case that if e is a closed expression of type bool, then call-by-value and call-by-name evaluations of e have the same behaviour (this is an instance of Corollary 4.8). Notably, however, the model we use for this example fails to satisfy the assumptions of our main theorem (Theorem 6.2). We augment CBPV with a computation get. This gets the value of the state, producing either true or false. Again we extend the source language, and also the call-by-value and call-by-name translations, accordingly. We define the program relation ≼get by requiring that, for each value of the state, M and M′ evaluate to the same result.
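Immutable state can be sketched with the reader monad: a computation denotes a function from the fixed state (a boolean) to its result. This is a small illustrative model of the idea, not the paper's formal semantics.

```python
# Reader-monad sketch of immutable state.

def ret(v):
    return lambda s: v

def bind(m, f):
    # Sequencing threads the same, unchanged state into both computations.
    return lambda s: f(m(s))(s)

get = lambda s: s        # the computation `get` reads the state

# `if get then false else true` as a reader computation:
prog = bind(get, lambda b: ret(not b))
assert prog(True) is False
assert prog(False) is True

# Reading twice gives the same answer as reading once: the state is
# immutable, so duplicating or reordering this effect is harmless.
twice = bind(get, lambda a: bind(get, lambda b: ret((a, b))))
assert twice(True) == (True, True)
assert twice(False) == (False, False)
```

The last assertions hint at why call-by-value and call-by-name coincide here: the get effect can be discarded, duplicated, and reordered freely.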

Order-enriched denotational semantics
We give a denotational semantics for CBPV, which we use to prove instances of ≼ctx. Since ≼ctx is not in general symmetric, we use order-enriched models, which come with partial orders ⊑ between denotations. In an adequate model, ⟦M⟧ ⊑ ⟦N⟧ implies M ≼ctx N. Our semantics is based on Levy's algebra models [Lev06] for CBPV, in which each computation type is interpreted as a monad algebra. (We restrict to algebra models for simplicity. Other forms of model, such as adjunction models [Lev03], can be used for the same purpose.)

3.1. Order-enriched categories and strong monads. We define the basic categorical notions we need for the rest of the paper. We assume no knowledge of enriched category theory; instead we give the relevant order-enriched (specifically Poset-enriched) definitions here. (We do however assume some basic ordinary category theory.)

Definition 3.1. A Poset-category C is an ordinary category, together with a partial order ⊑ on each hom-set C(X, Y), such that composition is monotone.
If C is a Poset-category, we refer to the ordinary category as the underlying ordinary category, and write |C| for the class of objects. Our main examples of Poset-categories are Set, Poset, and ωCpo; in each case, composition and identities are defined in the usual way. For Set, since the hom-posets Set(X, Y) are discrete, all of the Poset-enriched definitions coincide with the ordinary (unenriched) definitions. The objects of ωCpo are posets (X, ⊑) for which ⊑ is ω-complete, i.e. for which every ω-chain x0 ⊑ x1 ⊑ · · · has a least upper bound ⨆n xn.
Morphisms are ω-continuous functions, i.e. monotone functions that preserve least upper bounds of ω-chains.
Let C be a Poset-category. We say that C is cartesian when its underlying category has a terminal object 1 and binary products X1 × X2, such that the pairing functions ⟨−, −⟩ : C(W, X1) × C(W, X2) → C(W, X1 × X2) are monotone. We write πi : X1 × X2 → Xi for the projections from a product, and write ⟨⟩X : X → 1 for the unique map into the terminal object. In every cartesian category, there are canonical associativity isomorphisms assoc : (X1 × X2) × X3 ≅ X1 × (X2 × X3). We say that C is cartesian closed when it is cartesian and its underlying category has exponentials X ⇒ Y for which the currying functions Λ : C(W × X, Y) → C(W, X ⇒ Y) are monotone. Binary coproducts in C are just binary coproducts in the underlying ordinary category, except that the copairing functions [−, −] : C(X1, W) × C(X2, W) → C(X1 + X2, W) are required to be monotone. We write inl : X1 → X1 + X2 and inr : X2 → X1 + X2 for the coprojections. The Poset-categories Set, Poset, and ωCpo are all cartesian closed, and have binary coproducts given by disjoint union.
Above we ask for monotonicity of the bijections ⟨−, −⟩ and Λ. We do not need to require monotonicity of their inverses explicitly, because this holds automatically. In particular, the uncurrying functions can be written as Λ⁻¹(g) = ev ∘ (g × id), and ∘ and × are both monotone.
We interpret computation types as (Eilenberg-Moore) algebras for an order-enriched monad T, which we need to be strong (just as models of Moggi's monadic metalanguage [Mog91] use a strong monad). The definitions of strong Poset-monad and of T-algebra we give are slightly non-standard, but are equivalent to the standard ones (see for example [MU22]). In particular, it is more convenient for us to bake the strength into the (Kleisli) extension of the monad instead of having a separate strength.
These are required to satisfy four laws:⁴ naturality of extension in W, two unit laws, and an associativity law. Specializing the Kleisli extension of T to W = 1 produces a (non-strong) extension operator (−)† : C(X, T Y) → C(T X, T Y), satisfying the usual monad laws. We use this to define, for every f : X → Y, a morphism T f = (η ∘ f)† : T X → T Y. The latter definition makes T into a Poset-functor: the mapping f ↦ T f is monotone, and preserves identities and composition. The definition of T on morphisms also ensures that the unit and Kleisli extension of T satisfy the expected naturality laws. In the notation f†W×□ : W × T X → T Y, the square □ indicates the position of T in the domain. Since products are symmetric, choosing to put T to the right of W is arbitrary; we analogously construct a Kleisli extension operator f†□×W : T X × W → T Y with the square to the left. We also define two natural transformations for sequencing of computations: seqL for left-to-right and seqR for right-to-left evaluation.
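The strength-baked-in presentation of Kleisli extension can be made concrete. The sketch below instantiates it at the finite powerset monad (our own illustrative choice) and checks the unit and associativity laws on sample data, with the parameter W represented by an extra argument w.

```python
# Kleisli extension with strength baked in: from f : W × X -> T Y we form
# f†_{W×□} : W × T X -> T Y, here with T the finite powerset monad.

def eta(x):
    return frozenset([x])

def ext(f):                     # f takes (w, x) and returns a frozenset
    return lambda w, s: frozenset(y for x in s for y in f(w, x))

f = lambda w, x: frozenset([w + x, w * x])
g = lambda w, y: frozenset([y, -y])
w, x, s = 2, 3, frozenset([1, 4])

# Unit laws:
assert ext(f)(w, eta(x)) == f(w, x)                  # extend, then apply to η
assert ext(lambda w_, x_: eta(x_))(w, s) == s        # extending η does nothing
# Associativity (the parameter w is threaded through both stages):
lhs = ext(g)(w, ext(f)(w, s))
rhs = ext(lambda w_, x_: ext(g)(w_, f(w_, x_)))(w, s)
assert lhs == rhs
```

Passing `w` to both stages is exactly what a separate strength would accomplish; baking it into the extension avoids manipulating the strength explicitly.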
We further define an effectful pairing operation ⟨⟨−, −⟩⟩ : C(W, T X) × C(W, T Y) → C(W, T (X × Y)). This evaluates from left to right; we do not need the right-to-left version.
For each T-algebra Z, we write U T Z for the carrier Z ∈ |C|.
Just as for the extension operator of a strong Poset-monad, we specialize the extension operator of a T-algebra to W = 1 and obtain a (non-strong) extension operator (−)‡ : C(X, Z) → C(T X, Z). We also have an extension operator with reversed products, written (−)‡□×W : C(X × W, Z) → C(T X × W, Z). The following constructions of algebras are standard. We use these constructions to interpret CBPV computation types: returner types F A are interpreted as free T-algebras, product types C1 × C2 are interpreted as product T-algebras, and function types A → C are interpreted as power T-algebras.
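The three algebra constructions can be sketched concretely. Below they are instantiated (illustratively) at the finite powerset monad, whose algebras behave like sets with finite joins; each construction is shown by its extension operator.

```python
# Extension operators of the free, product, and power T-algebras, with
# T the finite powerset monad.

def ext_free(f):                         # free algebra F A: carrier T A
    return lambda s: frozenset(y for x in s for y in f(x))

def ext_product(ext1, ext2):             # product algebra: componentwise
    def ext(f):                          # f : X -> Z1 × Z2
        return lambda s: (ext1(lambda x: f(x)[0])(s),
                          ext2(lambda x: f(x)[1])(s))
    return ext

def ext_power(extZ):                     # power algebra A ⇒ Z: pointwise
    def ext(f):                          # f : X -> (A -> Z)
        return lambda s: (lambda a: extZ(lambda x: f(x)(a))(s))
    return ext

s = frozenset([1, 2])
assert ext_free(lambda x: frozenset([x, x + 1]))(s) == frozenset([1, 2, 3])

pair_ext = ext_product(ext_free, ext_free)
assert pair_ext(lambda x: (frozenset([x]), frozenset([-x])))(s) == \
       (frozenset([1, 2]), frozenset([-1, -2]))

fun_ext = ext_power(ext_free)
assert fun_ext(lambda x: (lambda a: frozenset([x + a])))(s)(10) == \
       frozenset([11, 12])
```

These correspond to interpreting F A, C1 × C2, and A → C respectively; products extend componentwise and powers extend pointwise, as in the standard constructions.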

Models of CBPV.
We define the notion of (order-enriched, algebra) model as follows. Programs ⋄ ⊢c M : bool are therefore interpreted as morphisms ⟦M⟧ : 1 → T 2. To interpret if, we use the fact that, since C is cartesian closed, products distribute over the coproduct 2 = 1 + 1. This means that for every W ∈ |C|, the coproduct W + W also exists in C, and the canonical morphism W + W → W × 2 is an isomorphism; we write dist for its inverse. By composing the semantics of CBPV with the two translations of the source language, we obtain a call-by-value semantics V⟦−⟧ = ⟦V −⟧ and a call-by-name semantics N⟦−⟧ = ⟦N −⟧ of the source language. For convenience, we spell out these composed semantics in Figure 5.
We use the denotational semantics as a tool for proving instances of contextual preorders; for this we need adequacy.

Definition 3.7. A model of CBPV is adequate with respect to a given program relation ≼ if for all computations Γ ⊢c M : C and Γ ⊢c M′ : C we have that ⟦M⟧ ⊑ ⟦M′⟧ implies M ≼Γctx M′.

Example 3.9. For divergence, we use C = ωCpo. The strong Poset-monad T freely adjoins a least element ⊥ to each ωCpo. The unit ηX is the inclusion X → T X, while Kleisli extension extends a morphism strictly: f†W×□(w, x) = f(w, x) for x ∈ X, and f†W×□(w, ⊥) = ⊥. A T-algebra Z is equivalently an ωCpo Z with a least element ⊥ ∈ Z. The extension operator is completely determined once the carrier is fixed; it is analogous to (−)†.
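Example 3.9's monad can be sketched over plain Python values: T X freely adjoins a least element to X. This ignores the ωCpo order structure and only illustrates the unit and the (strict) Kleisli extension.

```python
# Lift-style monad for divergence: T X adjoins a fresh least element BOT
# (standing for ⊥) to X.

BOT = "⊥"                     # sentinel for the adjoined least element

def eta(x):                   # unit: the inclusion X -> T X
    return x

def ext(f):                   # Kleisli extension: strict in BOT
    return lambda v: BOT if v == BOT else f(v)

succ = lambda n: eta(n + 1)
assert ext(succ)(eta(41)) == 42
assert ext(succ)(BOT) == BOT      # divergence propagates through sequencing
```

The second assertion is the algebraic content of divergence as an effect: once a computation is ⊥, everything sequenced after it is ⊥ too.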
In this case, the product Z1 × Z2 is the set of pairs ordered componentwise, and the exponential Y ⇒ Z is the set of ω-continuous functions ordered pointwise. Hence Z1 × Z2 has a least element (⊥, ⊥) (so forms a T-algebra) whenever Z1 and Z2 have least elements, and similarly for Y ⇒ Z.

Example 3.10. For nondeterminism, we use C = Poset. The strong Poset-monad T sends a poset X to the set of subsets of X of the form ↓S (the downwards closure of S), where Pfin X is the set of finite subsets of X and S ranges over Pfin X; T X is ordered by inclusion. Each T-algebra is again completely determined by its carrier; a T-algebra Z is equivalently a poset Z that has finite joins. The extension operator is necessarily given by f‡W×□(w, S) = ⋁x∈S f(w, x). (The latter join exists because S is the downwards closure of a finite set, even though S itself might not be finite.) The product Z1 × Z2 is the set of pairs ordered componentwise, with joins given by ⋁i (zi, z′i) = (⋁i zi, ⋁i z′i). The power Y ⇒ Z is the set of monotone functions ordered pointwise, with joins given pointwise. We interpret nondeterministic computations using nullary and binary joins: ⟦fail⟧ is the empty join, and ⟦M or N⟧ is the binary join of ⟦M⟧ and ⟦N⟧. The interpretation ⟦M⟧ of a closed computation M : F bool is one of the four subsets of 2.
Example 3.11. For immutable state, we use C = Set, with the reader monad T X = (2 ⇒ X), where 2 = {true, false}. The CBPV computation get is interpreted as the morphism 1 → T 2 that sends the state s to itself.

The relation between call-by-value and call-by-name
We now return to the main contribution of this paper: relating call-by-value with call-by-name. Recall the first step outlined in the introduction. We define a family of relations ⋉ (Definition 4.1) that compare the observable behaviour of a denotation of call-by-value type with a denotation of call-by-name type. The main result of this section is that, under certain conditions on computational effects, we have V⟦e⟧ ⋉ N⟦e⟧ for all Γ ⊢ e : τ (Theorem 4.7). Here we work with the denotational semantics, instead of with the syntax directly, so the relations ⋉ are defined with respect to a fixed model M that we assume to be given. There is one relation ⋉ for each source-language context Γ and type τ. To define ⋉, we first give a family of relations R τ W (Figure 6) that relate elements fv of T (V τ) with elements fn of U T (N τ). Here by element we mean generalized element, so fv and fn are morphisms from some W. The definition of R τ W is in the style of a logical relation, by induction on the type τ. The cases are listed in Figure 6. Informally, we have the following.
• For τ = unit and τ = bool, we compare fv and fn directly using the order relation ⊑ on morphisms in C. (We can do this because T (V τ) = U T (N τ).)
• For a product type τ1 × τ2, we compare the first components and compare the second components. We get these components by composing with the call-by-value and call-by-name interpretations of the projections fst and snd.
• For a function type τ → τ′, we relate fv to fn when these give related results when applied to related arguments. Here we use the call-by-value and call-by-name interpretations of application. Note that in the function case we quantify over morphisms w : W′ → W to permit varying arities W (cf. the Kripke logical relations of varying arity of [JT93]). (Precisely, this ensures that R τ is closed under precomposition with morphisms w : W′ → W, as in Lemma 4.5(2) below.)

We define ⋉ in terms of R −. To state the definition, we need some more notation. Let Γ′ = x1 : A1, . . ., xk : Ak be a CBPV context. Given a morphism fi : W → Ai for each i ≤ k, we obtain a morphism ⟨fi⟩i : W → Γ′ by iterated pairing. Given instead a morphism fi : W → T Ai for each i, we obtain a morphism ⟨⟨fi⟩⟩i : W → T Γ′ by iterated pairing and left-to-right evaluation. Let M be a CBPV model, and let fv and fn be morphisms of call-by-value and call-by-name type respectively, where Γ = x1 : τ1, . . ., xk : τk. We write fv ⋉ fn when, for all objects W ∈ |C| and all families of morphisms related at the types in Γ, the corresponding instantiations of fv and fn are related by R τ W.

This suggests we should assume that computations can be discarded, copied, and reordered with respect to other computations. Precisely, we want the following properties.

Definition 4.2. Let T be a strong Poset-monad. A morphism f : X → T Y is: lax discardable when running f and discarding its result is below (⊑) doing nothing; lax copyable when running f once and duplicating its result is below running f twice and pairing the results; and lax central when f laxly commutes with every other computation. The non-lax versions of these properties were first defined by Führmann [Füh99].
Example 4.3.For each of our examples from Section 3.3, every morphism f : X → T Y is lax discardable, lax copyable, and lax central.
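Example 4.3 can be checked concretely for the powerset model, taking the order on morphisms into T Y to be pointwise subset inclusion. The sketch below tests the informal readings of Definition 4.2's three conditions (that effects can be discarded, duplicated, and reordered), not their precise categorical statements.

```python
# Lax discardability, lax copyability, and centrality for the powerset
# model, with the order given by pointwise subset inclusion.

f = lambda x: frozenset([x, x + 1])
g = lambda x: frozenset([x * 2])
UNIT = "*"

for x in range(5):
    # Lax discardable: run f and discard the result  ⊑  do nothing.
    assert frozenset(UNIT for _ in f(x)) <= frozenset([UNIT])
    # Lax copyable: run f once and copy the result  ⊑  run f twice and pair.
    once = frozenset((v, v) for v in f(x))
    twice = frozenset((v, w) for v in f(x) for w in f(x))
    assert once <= twice
    # Centrality: sequencing f then g gives the same pairs as g then f.
    lr = frozenset((v, w) for v in f(x) for w in g(x))
    rl = frozenset((v, w) for w in g(x) for v in f(x))
    assert lr == rl
```

Note that the discardability and copyability checks hold only laxly (as inclusions, not equalities), which is exactly why the lax versions of Führmann's conditions are the right ones here.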
Here we define these three properties for morphisms in the model M, but there are similar notions for CBPV computations, as the following lemma shows.

Lemma 4.4. Let Γ ⊢c M : F A be a CBPV computation. The following hold for every CBPV model that is adequate with respect to a program relation ≼.

Proof. Since we assume adequacy, in each case we can reason inside the model.
We turn to the proof that lax discardability, lax copyability and lax centrality are sufficient to relate call-by-value to call-by-name. The following two lemmas are useful for this. The first lemma says that (even without assuming these properties of effects) the relations R τ are closed under various operations.

Lemma 4.5. Let M be a CBPV model. For each τ, the family of relations R τ has the following closure properties.
(1) For all fv, gv, fn, gn, the relation respects the order ⊑ on both sides.
(2) For all w : W′ → W, and all fv, fn, the relation is closed under precomposition with w.
(3) The relation is closed under the extension operators.
(4) For all W1, W2 such that the coproduct W1 + W2 exists, the relation is closed under copairing.
The proof of each property is by induction on the type τ.
(1) The unit and bool cases are trivial, while the cases for product and function types follow from the inductive hypothesis by monotonicity of composition and pairing.
(2) The unit and bool cases follow from monotonicity of composition. The case for product types follows from the inductive hypothesis. For a function type τ → τ′ we need to show the required instance for every w′ : W″ → W′; this follows immediately from the assumption.
(3) The unit and bool cases follow from monotonicity of extension operators. The case for product types follows from the inductive hypothesis, by naturality of extension operators and the definition of the product T-algebra. For a function type τ → τ′, consider any w and any related pair of arguments. By property (2) above, and then by applying (2) and the inductive hypothesis for τ′, we have hv R τ′ W′ hn, where hv and hn are defined using β as in Definition 3.5. It then remains to show that hv and hn are the two sides of the required instance of R τ′ W′, which we prove as follows. To prove we have the correct left-hand side, we use the associativity law, naturality of Kleisli extension, and the associativity law again.
To prove we have the correct right-hand side, we use naturality of extension, and the definition of power T-algebras, as follows.
(4) The unit and bool cases are immediate from monotonicity of the copairing operator [−, −]. For product types, it is enough to note that copairing is compatible with the projections, and then apply the inductive hypothesis. For a function type τ → τ′, we show the required instance from the assumption, using suitably defined auxiliary morphisms. We can therefore apply the inductive hypothesis for τ, and then (2), to obtain the required relation; the result follows.

The second lemma consists of some technical consequences of lax discardability, lax copyability and lax centrality; we state them here for use in the proof of Theorem 4.7 below. Each of the inequalities in the statement of the lemma can also be rendered in the syntax of CBPV; we will not need the syntactic inequalities in what follows, so we omit their precise statements and proofs.
• For a variable x j , we are required to prove the corresponding instance of the relation. A simple induction on j, using Lemma 4.6(1), gives the required equality, and the result then follows from the assumption f v j R τ j W f n j via Lemma 4.5(1).
• For the expression (), we are required to show the corresponding instance.
• For a pair (e 1 , e 2 ), the inductive hypothesis relates the translations of e 1 and e 2 , so that Lemma 4.5 implies the required instance.
• For fst e, we need to show the corresponding instance.
• The snd case is similar to the fst case.
• For the expression true, we need to show the required inequality; this follows from lax discardability and naturality of η.
• The false case is similar to the true case.
• For if e 0 then e 1 else e 2 , the inductive hypothesis relates the translations of the subexpressions, where we define the auxiliary morphisms as indicated. By applying all of the closure properties of Lemma 4.5, we therefore obtain an instance of the relation. This is not quite what we need, but it does imply the result via another use of Lemma 4.5(1), as follows. We obtain the required equality by precomposing with dist −1 and using the universal property of the coproduct T (V Γ ) + T (V Γ ), and the result follows.
• For a λ-abstraction λx : τ. e of type τ → τ ′ , consider arbitrary w : W ′ → W and g v , g n related as in the definition of the relation. We need to show the corresponding instance, so that, by the inductive hypothesis, we have the required relation. We show that this is the required instance of R τ ′ W ′ by rewriting both sides as follows; for the left-hand side, we have the stated equality.
• For an application e e ′ , where e has type τ → τ ′ , the inductive hypothesis for e ′ relates the translations of e ′ , so that, by the inductive hypothesis for e with w = id W , we have the required relation. We conclude by rewriting both sides.

As a corollary, we can directly compare the call-by-value and call-by-name translations of source-language programs (closed expressions of type bool). The conclusion of this corollary, namely V e ≼ N e, is independent of the choice of model M. In contrast, the conclusion of Theorem 4.7 is not independent of M. Theorem 4.7 should therefore be viewed as a result about the denotations ⟦V e⟧ and ⟦N e⟧, rather than about the translations V e and N e. We rectify this in Section 6 below, where the conclusion of our main result, Theorem 6.2, relates V e with N e via the contextual preorder, which is independent of M.

4.1. Examples. To conclude this section, we discuss the consequences of the results above for each of our examples.
We first note that we can in fact simplify the definition of ⋉ for each of these examples, by using the fact that, in each of the three Poset-categories Set, Poset, ωCpo, morphisms are in particular functions, and are ordered pointwise. (We treat a set as a discrete poset here.) It follows that instead of considering generalized elements f : W → X, it is enough to consider ordinary elements t ∈ X (which we can identify with morphisms t : 1 → X). The simplification of R − we obtain is defined as follows.

The ith projection of the left-hand side evaluates both N 1 and N 2 , but the ith projection of the right is just N i . Thus moving from left to right discards effects. Similarly, converting a strict pair M of type F (V unit × unit ) = F (unit × unit) to call-by-name and back duplicates the effects of M . These observations suggest that lax discardability and lax copyability will be useful, and indeed we use both of these properties in the proof of Lemma 5.8 below.
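The discarding and duplicating behaviours can be simulated directly. In the following Python sketch (our own illustration, not the paper's translation; the names `M`, `cbn_pair`, and `back_to_cbv` are hypothetical), effects are modelled by appending to a list: the call-by-name pair thunks the whole computation in each component, so projecting runs the effect once, while converting back to call-by-value runs both components and therefore duplicates the effect.

```python
effects = []

def M():
    """An effectful CBV computation producing a pair
    (the analogue of a computation of type F (unit x unit))."""
    effects.append("effect of M")
    return ((), ())

# Converting to call-by-name yields a pair of thunked computations,
# each of which re-runs M and projects one component:
cbn_pair = (lambda: M()[0], lambda: M()[1])

# Projecting discards the effect belonging to the other component:
effects.clear()
cbn_pair[0]()
assert effects == ["effect of M"]

# Converting back to call-by-value forces both components, so the
# effect of M is duplicated:
def back_to_cbv(pair):
    return (pair[0](), pair[1]())

effects.clear()
back_to_cbv(cbn_pair)
assert effects == ["effect of M", "effect of M"]
```

The two assertions correspond to the two inequalities in the text: the round-trip may drop an effect in one direction and repeat it in the other.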
For function types we need even more. Consider what happens when we convert a CBPV computation M : F (V unit → unit ) = F (U (unit → F unit)) to call-by-name and then back to call-by-value. By doing this we obtain the denotation of a computation that immediately returns. The round-trip from call-by-value to call-by-name and back thunks the computational effects of M , suspending them until the function is applied. The property we ask the model to satisfy in order to make this a valid inequality is lax thunkability of morphisms. In particular, it follows from this proposition that lax discardability, lax copyability, and lax centrality are not enough. Our immutable state example satisfies all three of those properties, but the morphism get : 1 → T 2 is not lax thunkable, so we do not have V e ⊑ ψ τ • N e • φ Γ for every e.
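The thunking behaviour at function type can also be simulated. In the Python sketch below (our own illustration; `M` and `round_trip` are hypothetical names), a call-by-value computation of function type performs its effects immediately and then returns a function, whereas the call-by-name round-trip suspends those effects inside the returned function, performing them only at application time.

```python
effects = []

def M():
    """A CBV computation of function type: performs an effect now, then
    returns a function (the analogue of F (U (unit -> F unit)))."""
    effects.append("effect of M")
    return lambda _: ()

def round_trip(m):
    """The analogue of converting m to call-by-name and back: the
    effects of m are thunked inside the returned function."""
    def thunked(arg):
        f = m()          # effects of m happen here, at application time
        return f(arg)
    return thunked

effects.clear()
f = M()                  # running M performs its effect immediately
assert effects == ["effect of M"]

effects.clear()
g = round_trip(M)        # the round-trip itself performs no effects...
assert effects == []
g(None)                  # ...they only happen once we apply the function
assert effects == ["effect of M"]
```

This is exactly the sense in which the round-trip "suspends" the effects of M until the function is applied.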

The reasoning principle
We now use the Galois connections defined in the previous section to relate the call-by-value and call-by-name translations of expressions, and arrive at our main reasoning principle.
Recall that the problem with comparing V e with N e directly is that they have different types. We render the Galois connections defined in the previous section in the syntax of CBPV, and then construct from N e a computation that we can directly compare with V e :
V e ≼ ctx Ψ τ (N e [ Φ Γ ])
More precisely, we render ϕ τ and ψ τ in the syntax as maps Φ τ from call-by-value computations to call-by-name computations, and Ψ τ from call-by-name to call-by-value. These are defined, again by induction on τ , in Figure 8. (We use some auxiliary variables in the definition, which are assumed to be fresh.) We further define, for each source-language context Γ = x 1 : τ 1 , . . . , x k : τ k , a substitution
Φ Γ = x 1 → thunk Φ τ 1 (return x 1 ) , . . . , x k → thunk Φ τ k (return x k )
(We define Φ and Ψ directly as maps from computations to computations, but we could instead have defined corresponding computations and then recovered Φ and Ψ modulo βη-laws for thunks, by substitution. This definition is slightly less convenient to work with, however.)
Both of these inequalities are in general proper (they are not contextual equivalences). To see this, consider our divergence example, for which the above inequalities hold. For each C, let Ω C be the diverging computation rec x : U C. force x (which has type C). Then if τ = bool → bool and M = Ω F (V τ ) , we do not have M ≽ ctx Ψ τ (Φ τ M ), because for E = (□ to f. return false) the computation E[M ] diverges but E[Ψ τ (Φ τ M )] ⇓ return false.
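This counterexample can be simulated in Python (our own illustration; `Omega`, `round_trip`, and `E` are hypothetical names standing in for Ω, Ψτ ∘ Φτ, and the context □ to f. return false). The round-trip thunks the divergence of Ω inside a function value, so the computation itself returns immediately, and the context, which discards the function, terminates.

```python
def Omega():
    """A diverging computation (the analogue of Ω at type F (V τ))."""
    while True:
        pass

def round_trip(m):
    """The analogue of Ψτ(Φτ m): the effects (here, divergence) of m are
    suspended inside the returned function, so the computation itself
    returns a function value immediately."""
    def thunked(arg):
        f = m()
        return f(arg)
    return lambda: thunked

def E(computation):
    """The context (□ to f. return false): run the computation to get a
    function f, discard f, and return False."""
    f = computation()
    return False

# E[Ψτ(Φτ Ω)] terminates: the divergence stays suspended in the thunk.
assert E(round_trip(Omega)) == False

# E[Ω] would diverge: E(Omega) never returns (do not run it).
```

So the two sides are genuinely incomparable in one direction, matching the claim that the inequality is proper.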
In particular, our maps between call-by-value and call-by-name are merely Galois connections, and not sections or retractions. This contrasts with Reynolds [Rey74], who obtains a retraction between direct and continuation semantics.

Related work
Comparing evaluation orders. Plotkin [Plo75] and many others (e.g. [IT16]) relate call-by-value and call-by-name. Crucially, they consider λ-calculi with no effects other than divergence. This makes a significant difference to the techniques that can be used, in particular because in this case the equational theory for call-by-name is strictly weaker than for call-by-value. This is not necessarily true for other effects. Other evaluation orders (such as call-by-need) have also been compared in similarly restricted settings [MOTW95, MM19, HH19]. We suspect our technique could also be adapted to these. Here we use CBPV as a calculus in which to reason about both call-by-value and call-by-name, but other calculi (e.g. the modal calculus of [ESPU22]) may be suitable for this purpose.
It might also be possible to recast some of our work in terms of the duality between call-by-value and call-by-name [Fil89, CH00, Wad03, Sel01]. In particular, this may shed some light on our definitions of Φ and Ψ. It is not clear to us what the precise connection is, however. While Selinger [Sel01] defines translations between call-by-value and call-by-name versions of Parigot's λµ-calculus [Par92], these translations behave differently from ours; in particular, they are semantics-preserving.
Relating semantics of languages. The technique we use here to relate call-by-value and call-by-name is based on the idea used first by Reynolds [Rey74] to relate direct and continuation semantics of the λ-calculus, and later used by others (e.g. [MW85, Kuč98, CF94, Fil96]). Reynolds constructs a relation between the two semantics, and uses this to establish a retraction between direct and continuation semantics, just as we construct a relation between call-by-value and call-by-name and then use this to establish a Galois connection. A minor difference is that Reynolds relies on continuations with a large-enough domain of answers (e.g. a solution to a particular recursive domain equation); our maps exist for any choice of model. We are the first to use this technique to relate call-by-value and call-by-name. There has been some work [SF92, LD93, SW96] on soundness and completeness properties of translations (similar to the translations into CBPV), in particular using Galois connections (and similar structures) for which the order is reduction of programs. Our results would fail if we used reduction of programs directly, so we consider only the observable behaviour of programs.
There are some similarities between our work and the work of New et al. [NL20, NLA21] on gradual typing. In particular, [NLA21] has embedding-projection pairs (a special case of Galois connections) for casting from a more dynamic type to a less dynamic type, and vice versa. Their application is quite different, however. The double category perspective used in [NL20] may also be illuminating here.

Conclusions
In this paper, we give a general reasoning principle (Theorem 6.2) that relates the observable behaviour of terms under call-by-value and call-by-name. The reasoning principle works for various collections of computational effects; in particular, it enables us to obtain theorems about divergence and nondeterminism. It is about open expressions, and enables us to change evaluation order within programs.
The technique we use involves first relating the observable behaviour of the call-by-value and call-by-name translations of expressions via a logical relation (Theorem 4.7). We obtain a result about call-by-value and call-by-name evaluations of programs as a corollary (Corollary 4.8). Applying this to divergence, we show that if the call-by-value execution terminates with some result then the call-by-name execution terminates with the same result. For nondeterminism, we show that all possible results of call-by-value executions are possible results of call-by-name executions. There may be other collections of effects we can apply our technique to, including combinations of divergence and nondeterminism.
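The divergence result has a familiar concrete reading in terms of thunking. In the Python sketch below (our own illustration; the function names are hypothetical), call-by-name is simulated by passing an unevaluated thunk: whenever the call-by-value run terminates, the call-by-name simulation terminates with the same result, and a program that discards a diverging argument terminates only under call-by-name.

```python
def discard_arg_cbv(x):
    # Call-by-value: x was already evaluated by the caller.
    return True

def discard_arg_cbn(x_thunk):
    # Call-by-name simulation: the argument is a thunk, never forced here.
    return True

def converging():
    return False

def diverging():
    while True:
        pass

# When call-by-value terminates, call-by-name terminates with the same
# result:
assert discard_arg_cbv(converging()) == discard_arg_cbn(converging)

# With a diverging argument, discard_arg_cbv(diverging()) would loop
# forever, while the call-by-name simulation still terminates:
assert discard_arg_cbn(diverging) == True
```

Note the asymmetry: the converse implication fails, which is why the relationship is an inequality rather than an equivalence.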
We expect that our technique can be applied to other evaluation orders. Two evaluation orders can be related by giving translations into some common language (here we use CBPV), constructing maps between the two translations, and showing that (for some models) these maps form Galois connections. A major advantage of the technique is that it allows us to identify axiomatic properties of computational effects (thunkable, etc.) that give rise to relationships between evaluation orders.

Figure 3. Translations from the source language into CBPV

Γ ⊢ c get : F bool

Big-step evaluation has a slightly different form in this case. We write M ⇓ b R to mean M evaluates to R when the state is b ∈ {true, false}. The rules are those of Figure 2 (with the subscript b added), plus
get ⇓ true return true        get ⇓ false return false

Example 3.2. We use the following three Poset-categories:
• Set: objects are sets, morphisms are functions, and the order f ⊑ f ′ is equality;
• Poset: objects are posets, morphisms are monotone functions, and the order is pointwise;
• ωCpo: objects are ωcpos, morphisms are ω-continuous functions, and the order is pointwise.

Definition 3.4 (Eilenberg-Moore algebra). Let T be a strong Poset-monad on a cartesian Poset-category C. A T-algebra Z = (Z, (−) ‡ ) is a pair of:
• an object Z ∈ |C| (the carrier);
• a monotone function (−) ‡ W ×□ : C(W × X, Z) → C(W × T X, Z) (the extension operator) for each W, X ∈ |C|.
These are required to satisfy three laws, including naturality in W .

Definition 3.5. Let T be a strong Poset-monad on a cartesian closed Poset-category C.
• The free T-algebra F T X on an object X ∈ |C| has carrier T X; the extension operator is Kleisli extension (−) † .
• If Z 1 and Z 2 are T-algebras, then their product Z 1 × Z 2 is the T-algebra with carrier Z 1 × Z 2 , with the extension operator given componentwise.

Definition 3.6. A model M = (C, T) of CBPV consists of:
• a cartesian closed Poset-category C that admits the coproduct 2 = 1 + 1;
• a strong Poset-monad T on C.
Given a model M = (C, T), the interpretation ⟦−⟧ of CBPV is defined in Figure 4. Value types A are interpreted as objects ⟦A⟧ ∈ |C|, while computation types C are interpreted as T-algebras. The value type U C is interpreted as the carrier U T ⟦C⟧ of the T-algebra ⟦C⟧. Typing contexts Γ are interpreted as objects ⟦Γ⟧ ∈ |C| using the cartesian structure of C; if (x : A) ∈ Γ then we write π x for the corresponding projection ⟦Γ⟧ → ⟦A⟧. Values Γ ⊢ V : A (respectively computations Γ ⊢ c M : C) are interpreted as morphisms ⟦Γ ⊢ V : A⟧ (resp. ⟦Γ ⊢ c M : C⟧) in C; we often omit the typing context and type when writing these.

Figure 5. Denotational semantics of call-by-value and call-by-name

As we mention above, our goal is to relate the observable behaviour of V e to the observable behaviour of N e . Precisely, we want to prove V e ⋉ N e . By considering what this means for specific expressions e, we can see that this is not true in general, for three reasons:
• Consider the expression e = (λx : bool. ()) : bool → unit. If we apply this to an argument, then in call-by-value we evaluate the argument but in call-by-name we do not.
• Consider the expression e = (λx : bool. if x then x else x) : bool → bool. If we apply this, then in call-by-value the argument is evaluated once, but in call-by-name the argument is evaluated twice.
• Consider the expression e = (λx : bool. λy : bool. if y then x else x) : bool → bool → bool. In call-by-value the argument to the outer function is evaluated first, and the argument to the inner function is evaluated second. In call-by-name the arguments are evaluated in the opposite order.
Corollary 4.8. Let M = (C, T) be a CBPV model that is adequate with respect to a program relation ≼. If every morphism f : X → T Y is lax discardable, lax copyable, and lax central, then for every closed expression e : bool, we have V e ≼ N e.

Proof. By Theorem 4.7 we have V e ⋉ N e , so in particular V e R bool 1 N e . By definition, the latter means V e ⊑ N e , which implies the result by adequacy.

Figure 7. Semantic morphisms ϕ from call-by-value to call-by-name and ψ from call-by-name to call-by-value

Figure 8. Syntactic maps Φ from call-by-value to call-by-name and Ψ from call-by-name to call-by-value