Modular session types for objects

Session types allow communication protocols to be specified type-theoretically so that protocol implementations can be verified by static type checking. We extend previous work on session types for distributed object-oriented languages in three ways. (1) We attach a session type to a class definition, to specify the possible sequences of method calls. (2) We allow a session type (protocol) implementation to be modularized, i.e. partitioned into separately-callable methods. (3) We treat session-typed communication channels as objects, integrating their session types with the session types of classes. The result is an elegant unification of communication channels and their session types, distributed object-oriented programming, and a form of typestate supporting non-uniform objects, i.e. objects that dynamically change the set of available methods. We define syntax, operational semantics, a sound type system, and a sound and complete type checking algorithm for a small distributed class-based object-oriented language with structural subtyping. Static typing guarantees that both sequences of messages on channels, and sequences of method calls on objects, conform to type-theoretic specifications, thus ensuring type-safety. The language includes expected features of session types, such as delegation, and expected features of object-oriented programming, such as encapsulation of local state.


Introduction
Computing infrastructure has become inherently concurrent and distributed, from the internals of machines, with the generalisation of many- and multi-core architectures, to data storage and sharing solutions, with "the cloud". Both hardware and software systems are now not only distributed but also collaborative and communication-centred. Therefore, the precise specification of the protocols governing the interactions, as well as the rigorous verification of their correctness, are critical factors in ensuring the reliability of such infrastructures.
Software developers need to work with technologies that may provide correctness guarantees and are well integrated with the tools usually used. Since Java is one of the most widely used programming languages, the incorporation of support to specify and implement correct software components and their interaction protocols would be a step towards more reliable systems.
Behavioural types represent abstractly and concisely the interactive conduct of software components. Types of this kind form simple yet expressive languages that characterise the permitted interactions within a distributed system. The key idea is that some aspects of dynamic behaviour can be verified statically, at compile-time rather than at run-time. In particular, when working with a programming language equipped with a behavioural type system, one can statically ensure that an implementation of a distributed protocol, specified with types, is not only safe in the standard sense ("will not go wrong"), but also that the sequences of interactions foreseen by the protocol are realizable by its distributed implementation. Thorough descriptions of the state of the art of research on the topic have been prepared by COST Action IC1201 (Behavioural Types for Reliable Large-Scale Software Systems, "BETTY") [3,46].
The problem we address herein is the following: can one specify the full interactive behaviour of a given protocol as a collection of types and check that a Java implementation of that protocol realises safely such behaviour?
The solution we present works like this: first, specify as a session type (a particular idiom of behavioural types) the behaviour of each party involved in the protocol; second, using each such term to type a class, implement the protocol as a (distributed, using channel-based communication where channels are objects) Java program; third, simply compile the code and if the type checker accepts it, then the code is safe and realises the protocol. Details follow.
Session types [43,70] allow communication protocols to be specified type-theoretically, so that protocol implementations can be verified by static type checking. The underlying assumption is that we have a concurrent or distributed system with bi-directional point-to-point communication channels. These are implemented in Java by TCP/IP socket connections. A session type describes the protocol that should be followed on a particular channel; that is to say, it defines the permitted sequences, types and directions of messages. For example, the session type S = ![Int].?[Bool].end specifies that an integer must be sent and then a boolean must be received, and there is no further communication. More generally, branching and repetition can also be specified. A session type can be regarded as a finite-state automaton whose transitions are annotated with types and directions, and whose language defines the protocol.
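The automaton reading of S = ![Int].?[Bool].end can be made concrete with a small sketch. The class below is not part of the paper's language or type system: it is a hypothetical run-time monitor (the names SessionMonitor, Dir, Step and forS are assumptions made for illustration) that accepts exactly the sequence "send an integer, then receive a boolean". The point of session typing is that this check is performed statically instead; the monitor only illustrates which sequences the type permits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical run-time monitor for S = ![Int].?[Bool].end (illustration only;
// the paper's type system performs this check at compile time).
final class SessionMonitor {
    enum Dir { SEND, RECEIVE }

    // One transition of the automaton: a direction and the payload type.
    static final class Step {
        final Dir dir;
        final Class<?> payload;
        Step(Dir dir, Class<?> payload) { this.dir = dir; this.payload = payload; }
    }

    private final List<Step> protocol;
    private int position = 0; // index of the next permitted transition

    SessionMonitor(List<Step> protocol) { this.protocol = protocol; }

    // Builds the monitor for S: send an Integer, then receive a Boolean.
    static SessionMonitor forS() {
        List<Step> steps = new ArrayList<>();
        steps.add(new Step(Dir.SEND, Integer.class));
        steps.add(new Step(Dir.RECEIVE, Boolean.class));
        return new SessionMonitor(steps);
    }

    // Advances and returns true only if the action is the next one permitted.
    boolean step(Dir dir, Object value) {
        if (position >= protocol.size()) return false; // already at "end"
        Step expected = protocol.get(position);
        if (expected.dir != dir || !expected.payload.isInstance(value)) return false;
        position++;
        return true;
    }

    // True once the whole protocol has been followed.
    boolean atEnd() { return position == protocol.size(); }
}
```

A linear sequence suffices here because S has no branching or repetition; a type with those features would need a general transition graph.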
In previous work [36] we proposed a new approach to combining session-typed communication channels and distributed object-oriented programming. Our approach extends earlier work and allows increased programming flexibility. We adopted the principle that it should be possible to store a channel in a field of an object and allow the object's methods to use the field like any other; we then followed the consequences of this idea. For example, consider a field containing a channel of type S above, and suppose that method m sends the integer and method n receives the boolean. Because the session type of the channel requires the send to occur first, it follows that m must be called before n. We therefore need to work with non-uniform objects, in which the availability of methods depends on the state of the object: method n is not available until after method m has been called. In order to develop a static type system for object-oriented programming with session-typed channels, we use a form of typestate [69] that we have previously presented under the name of dynamic interfaces [74]. In this type system, the availability of a class's methods (i.e., the possible sequences of method calls) is specified in a style that itself resembles a form of session type, giving a pleasing commonality of notation at both the channel and class levels.
The result of this combination of ideas is a language that allows a natural integration of programming with session-based channels and with non-uniform objects. In particular, the implementation of a session can be modularized by dividing it into separate methods that can be called in turn. This is not possible in SJ [45], the most closely related approach to combining sessions and objects (we discuss related work thoroughly in Section 9). We believe that we have achieved a smooth and elegant combination of three important high-level abstractions: the object-oriented abstraction for structuring computation and data, the typestate abstraction for structuring state-dependent method availability, and the session abstraction for structuring communication.
Contributions In the present paper we formalize a core distributed class-based object-oriented language with a static type system that combines session-typed channels and a form of typestate. The language is intended to model programming with TCP/IP sockets in Java. The formal language differs from that introduced in our previous work [36] by using structural rather than nominal types. This allows several simplifications of the type system. We have also simplified the semantics, and revised and extended the presentation. We prove that static typing guarantees two runtime safety properties: first, that the sequence of method calls on every non-uniform object follows the specification of its class's session type; second, as a consequence (because channel operations are implemented as method calls), that the sequence of messages on every channel follows the specification of its session type. This paper includes full statements and proofs of type safety, in contrast to the abbreviated presentation in our conference paper. We also formalize a type checking algorithm and prove its correctness, again with a revised and expanded presentation in comparison with the conference paper.
There is a substantial literature of related work, which we discuss in detail in Section 9. Very briefly, the contributions of our paper are the following.
• In contrast to other work on session types for object-oriented languages, we do not require a channel to be created and completely used (or delegated) within a single method. Several methods can operate on the same channel, thus allowing effective encapsulation of channels in objects, while retaining the usual object-oriented development practice. This is made possible by our integration of channels and non-uniform objects. This contribution was the main motivation for our work.
• In contrast to other typestate systems, we use a global specification of method availability, inspired by session types, as part of a class definition. This replaces pre- and post-condition annotations on method definitions, except in the particular case of recursive methods.
• When an object's typestate depends on the result (in an enumerated type) of a method call, meaning that the result must be case-analyzed before using the object further, we do not force the case-analysis to be done immediately by using a combined "switch-call" primitive. Instead, the method result can be stored in a field and the case-analysis can happen at any subsequent point. Although this feature significantly increases the complexity of the formal system and could be omitted for simplicity, it supports a natural programming style and gives more options to future programming language designers.
• Our structural definition of subtyping provides a flexible treatment of relationships between typestates, which can also support inheritance; this is discussed further in Section 4.7.
Figure 1: A class describing a file in some API
The remainder of the paper is structured as follows. In Section 2 we illustrate the concept of dynamic interfaces by means of a sequential example. In Section 3 we formalize a core sequential language and in Section 4 we describe some extensions. In Section 5 we extend the sequential example to a distributed setting and in Section 6 we extend the formal language to a core distributed language. In Section 7 we state and prove the key properties of the type system. In Section 8 we present a type checking algorithm and prove its soundness and completeness, and describe a prototype implementation of a programming language based on the ideas of the paper. Section 9 contains a more extensive discussion of related work; Section 10 outlines future work and concludes.

A Sequential Example
A file is a natural example of an object for which the availability of its operations depends on its state. The file must first be opened, then it can be read repeatedly, and finally it must be closed. Before reading from the file, a test must be carried out in order to determine whether or not any data is present. The file can be closed at any time.
There is a variety of terminology for objects of this kind. Ravara and Vasconcelos [67] refer to them as non-uniform. We have previously used the term dynamic interface [74] to indicate that the interface, i.e. the set of available operations, changes with time. The term typestate [69] is also well established.
Figure 1 defines the class File, which we imagine to be part of an API for using a file system. The definition does not include method bodies, as these would typically be implemented natively by the file system. What it does contain is method signatures and, crucially, a session type definition which specifies the availability of methods. We will refer to a skeleton class definition of this kind as an interface, using the term informally to mean that method definitions are omitted. The figure shows the method signatures in the session type and not as part of the method definitions as is normal in many programming languages. This style is closer to the formal language that we define in Section 3.
Line 3 declares the initial session type Init for the class. This and other session types are defined on lines 4-7. We will explain them in detail; they are types of objects, indicating which methods are available at a given point and which is the type after calling a method. In a session type, the constructor {...}, which we call branch, indicates that certain methods are available. In this example, Init declares the availability of one method (open), states Open and Read allow for two methods each, and state Close for a single method (close). For technical convenience, the presence of data is tested by calling the method hasNext, in the style of a Java iterator, rather than by calling an endOfFile method. If desired, method hasNext could also be included in state Read. The constructor ⟨...⟩, which we call variant, indicates that a method returns a value from an enumeration, and that the subsequent type depends on the result. For example, from state Init the only available method is open, and it returns a value from an enumeration comprising the constants (or labels) OK and ERROR. If the result is OK then the next state is Open; if the result is ERROR then the state remains Init. It is also possible for a session type to be the empty set of methods, meaning that no methods are available; this feature is not used in the present example, but would indicate the end of an object's useful life.
The session type can be regarded as a finite state automaton whose transitions correspond to method calls and results. This is illustrated in Figure 2. Notice the two types of nodes ({...} and ⟨...⟩) and the two types of labels on arcs (method names issuing from {...} nodes and enumeration constants issuing from ⟨...⟩ nodes).
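As a rough illustration of this automaton view, the following sketch encodes the session type of class File as a transition table and replays sequences of calls against it. The transitions are a reading of the prose description of Figure 1 (which is not reproduced here); in particular, the result labels TRUE and FALSE for hasNext and the class name FileProtocol are assumptions made for the example. The check is dynamic, whereas the type system of this paper establishes the same conformance statically.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the File session type as a finite-state automaton.
// Label names TRUE/FALSE for hasNext are assumed, not taken from the text.
final class FileProtocol {
    enum State { INIT, OPEN, READ, CLOSE }

    // Key: state + "." + method, plus "/" + result label for variant results.
    private static final Map<String, State> NEXT = new HashMap<>();
    static {
        NEXT.put("INIT.open/OK", State.OPEN);       // open succeeded
        NEXT.put("INIT.open/ERROR", State.INIT);    // open failed: still Init
        NEXT.put("OPEN.hasNext/TRUE", State.READ);  // data present
        NEXT.put("OPEN.hasNext/FALSE", State.CLOSE);
        NEXT.put("OPEN.close", State.INIT);
        NEXT.put("READ.read", State.OPEN);
        NEXT.put("READ.close", State.INIT);
        NEXT.put("CLOSE.close", State.INIT);
    }

    // Returns the state after the call, or null if the call is not available.
    static State next(State current, String method, String label) {
        String key = current + "." + method + (label == null ? "" : "/" + label);
        return NEXT.get(key);
    }

    // Replays calls from Init; each entry is {method} or {method, resultLabel}.
    static boolean accepts(String[][] calls) {
        State s = State.INIT;
        for (String[] c : calls) {
            s = next(s, c[0], c.length > 1 ? c[1] : null);
            if (s == null) return false; // method not available in this state
        }
        return true;
    }
}
```

For instance, the sequence open/OK, hasNext/TRUE, read, hasNext/FALSE, close is accepted, while calling read directly after a successful open is rejected because hasNext has not been tested.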
Our language does not include constructor methods as a special category, but the method open must be called first and can therefore be regarded as doing initialisation that might be included in a constructor. Notice that open has the filename as a parameter. Unlike a typical file system API, creating an object of class File does not associate it with a particular file; instead this happens when open is called.
The reader might expect a declaration void close() rather than Null close(); for simplicity, we do not address procedures in this paper, instead working with the type Null inhabited by a single value, null. Methods open and hasNext return a constant from an enumeration. Enumerations are simply sets of labels, and do not need to be declared with names.
Figure 3 defines the class FileReader, which uses an object of class File. FileReader has a session type of its own, defined on lines 2-3. It specifies that methods must be called in the sequence init, read, toString, toString, .... Line 5 defines the fields of FileReader. The formal language does not require a type declaration for fields, since fields always start with type Null, and are initialised to value null. Fields are always private to a class, even if we do not use a corresponding keyword. Lines 7-10 define the method init, which has initialisation behaviour typical of a constructor. Lines 12-19 illustrate the switch construct. In this particular case the switch is on the result of a method call. One of the distinctive features of our language is that it is possible, instead, to store the result of the method call in a field and later switch on the field; we will explain this in detail later. This contrasts with, for example, Sing# [31], in which the call/switch idiom is the only possibility. The while loop (lines 16-17) is similar in the sense that the result of file.hasNext must be tested in order to find out whether the loop can continue, calling file.read, or must terminate. Line 21 defines the method toString which simply accesses a field.
Typechecking the class FileReader according to our type system detects many common mistakes. Each of the following code fragments contains a type error. Here, a switch is correctly used to find out whether or not the file was successfully opened. However, if file is in state Open, the read method cannot be called immediately. First, hasNext must be called, with a corresponding switch.
• result = file.open(filename); ... switch (result) case ERROR: file.close(); In the ERROR case, method close is not available (because the file was not opened successfully). The only available method is open.
• file.close(); if (file.hasNext()) ...
After calling close, the file is in state Init, so the method hasNext is not available. Only open is available.
Clearly, correctness of the code in Figure 3 requires that the sequence of method calls on field file within class FileReader matches the available methods in the session type of class File, and that the appropriate switch or while loops are performed when prescribed by session types of the form ⟨...⟩ in class File. Our static type system, defined in Section 3, enables this consistency to be checked at compile-time.
A distinctive feature of our type system is that methods are checked in a precise order: that prescribed by the session type (init, read, toString in class FileReader, Figure 3). As a result, the private reference file always has the right type (and no further annotations, such as pre/post conditions, are required in the presence of non-recursive methods). Also, in order to check statically that an object with a dynamic interface such as file is used correctly, our type system treats the reference linearly so that aliases to it cannot be created. This restriction is not a problem for a simple example such as this one, but there is a considerable literature devoted to more flexible approaches to unique ownership. We discuss this issue further in Sections 4.6, 9 and 10.
In order to support separate compilation we require only the interface of a class, including the class name and the session type (which in turn includes the signature of each method). For example, in order to typecheck classes that are clients of FileReader, we only need its interface. Similarly, to typecheck class FileReader, which is a client of File, it suffices to use the interface for class File, thus effectively supporting typing clients of classes containing native methods.
Figure 4 defines the interface for a class FileReadToEnd. This class has the same method definitions as File, but the close method is not available until all of the data has been read.
According to the subtyping relation defined in Section 3.4, type Init of File is a subtype of type Init of FileReadToEnd, which we express as File.Init <: FileReadToEnd.Init. Subtyping guarantees safe substitution: an object of type File.Init can be used whenever an object of type FileReadToEnd.Init is expected, by forgetting that close is available in more states. As it happens, FileReader reads all of the data from its File object and could use a FileReadToEnd instead.
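The intuition behind File.Init <: FileReadToEnd.Init can be sketched as a simulation check between the two automata. The code below is not the subtyping relation of Section 3.4: it ignores parameter and return types and only checks that every method offered by the supertype is offered by the subtype, and that every result label the subtype may return is anticipated by the supertype, with the continuations related in turn. The transition tables are a reading of the prose (Figures 1 and 4 are not reproduced here); the TRUE/FALSE labels, the exact shape of FileReadToEnd, and the name TypestateSubtyping are assumptions.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Simplified, assumption-laden sketch of typestate subtyping as a simulation.
final class TypestateSubtyping {
    // Edges are {state, method, resultLabel, nextState}; the label is ""
    // when the method result does not select the next state.
    static final String[][] FILE = {
        {"Init", "open", "OK", "Open"}, {"Init", "open", "ERROR", "Init"},
        {"Open", "hasNext", "TRUE", "Read"}, {"Open", "hasNext", "FALSE", "Close"},
        {"Open", "close", "", "Init"},
        {"Read", "read", "", "Open"}, {"Read", "close", "", "Init"},
        {"Close", "close", "", "Init"}
    };
    // Assumed shape: close becomes available only after hasNext returns FALSE.
    static final String[][] FILE_READ_TO_END = {
        {"Init", "open", "OK", "Open"}, {"Init", "open", "ERROR", "Init"},
        {"Open", "hasNext", "TRUE", "Read"}, {"Open", "hasNext", "FALSE", "Close"},
        {"Read", "read", "", "Open"},
        {"Close", "close", "", "Init"}
    };

    // Methods available in a given state.
    static Set<String> methods(String[][] p, String state) {
        Set<String> ms = new LinkedHashSet<>();
        for (String[] e : p) if (e[0].equals(state)) ms.add(e[1]);
        return ms;
    }

    // Map from result label to next state for one method in one state.
    static Map<String, String> results(String[][] p, String state, String method) {
        Map<String, String> r = new LinkedHashMap<>();
        for (String[] e : p)
            if (e[0].equals(state) && e[1].equals(method)) r.put(e[2], e[3]);
        return r;
    }

    // Coinductive check: a pair already under consideration is assumed to hold.
    static boolean isSubtype(String[][] sub, String s, String[][] sup, String t,
                             Set<String> assumed) {
        if (!assumed.add(s + "<:" + t)) return true;
        for (String m : methods(sup, t)) {
            Map<String, String> subResults = results(sub, s, m);
            if (subResults.isEmpty()) return false; // method missing in subtype
            Map<String, String> supResults = results(sup, t, m);
            for (Map.Entry<String, String> e : subResults.entrySet()) {
                String supNext = supResults.get(e.getKey());
                if (supNext == null) return false;  // unexpected result label
                if (!isSubtype(sub, e.getValue(), sup, supNext, assumed)) return false;
            }
        }
        return true;
    }

    static boolean isSubtype(String[][] sub, String[][] sup) {
        return isSubtype(sub, "Init", sup, "Init", new HashSet<String>());
    }
}
```

Under these assumptions the check succeeds for File against FileReadToEnd and fails in the other direction, because FileReadToEnd does not offer close in state Read.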

A Core Sequential Language
We now present the formal syntax, operational semantics, and type system of a core sequential language. As usual, the formal language makes a number of simplifications with respect to the more practical syntax used in the examples in Section 2. We summarise below the main differences with what was discussed in the previous section; in Section 4, we will discuss in more detail how some usual programming idioms, which would be expected in a full programming language, can be encoded into this formal core.
• Every method has exactly one parameter. This does not affect expressivity, as multiple parameters can be passed within an object, and a dummy parameter can be added if necessary: we consider a method call of the form f.m() as an abbreviation for f.m(null).
• Field access and assignment are defined in terms of a swap operation f ↔ e which puts the value of e into the field f and evaluates to the former content of f. This operation is formally convenient because our type system forbids aliasing. In Java, the expression f = e computes the result of e and then both puts it into f and evaluates to it, allowing expressions such as g = (f = e) which create aliases; reading a field without removing its content also allows creation of aliases. The swap operation is a combined read-write which does not permit aliasing. The normal assignment operation f = e is an abbreviation for f ↔ e; null (where the sequence operator explicitly discards the former content of f) and field read as the standalone expression f is an abbreviation for f ↔ null. They differ from the usual semantics in that field read is destructive and the assignment expression evaluates to null.
• In the examples, all method signatures appearing in a branch session type indicate both a return type and a subsequent session type. In general, those types are two separate things. However, when the subsequent behaviour of the object depends on the returned value, like the case of Open in type Init on line 3 of Fig. 1, the return type is an enumerated set of labels and the subsequent session type is a variant which must provide cases for exactly these labels. To simplify definitions, we avoid this redundant specification in the formal language, and when the subsequent session type is a variant, the return type of the method is always the special type variant-tag, which indicates that the method will return a label from the variant.
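The swap operation and its two abbreviations can be mimicked in plain Java as follows. This is only a sketch: SwapCell and its method names are hypothetical, and Java itself does not prevent the caller from keeping a second reference to the value passed in, so the aliasing guarantee comes from the type system of this paper, not from this class.

```java
// Hypothetical cell modelling a field accessed only through swap (f ↔ e).
final class SwapCell<T> {
    private T content; // like fields in the formal language, starts as null

    // f ↔ e: store the new value and evaluate to the former content.
    T swap(T newValue) {
        T old = content;
        content = newValue;
        return old;
    }

    // f = e abbreviates (f ↔ e; null): the former content is discarded
    // and the assignment itself evaluates to null.
    Void assign(T newValue) {
        swap(newValue);
        return null;
    }

    // A standalone field read f abbreviates f ↔ null: the read is
    // destructive and leaves null behind in the field.
    T read() {
        return swap(null);
    }
}
```

Reading the cell twice in a row therefore yields the stored value the first time and null the second time, which is what rules out duplicating a reference through field access.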
3.1. Syntax. We separate the syntax into the top-level language (Figure 5) and the extensions required by the type system and operational semantics (Figure 6). Identifiers C, m, f and l are taken from disjoint countable sets representing names of classes, methods, fields and labels respectively. The vector arrow indicates a sequence of zero or more elements of the syntactic class it is above. Similarly, constructs indexed by a set denote a finite sequence.
Figure 6: Extended syntax, used only in the type system and semantics. The productions for types, values and expressions extend those in Figure 5. Session types may never contain types of the form link f, even in the extended syntax.
We use E to specifically denote finite sets of labels l, whereas I is any finite indexing set. Field names always refer to fields of the current object; there is no qualified field specification o.f . In other words, all fields are private. Method call is only available on a field, not an arbitrary expression. This is because calling a method changes the session type of the object on which the method is called, and in order for the type system to record this change, the object must be in a specified location (field).
Conversely, there is no unqualified method call in the core language. Calling a method on the current object this, which we call a self-call, behaves differently from external calls with respect to typing and will be discussed as an extension in Section 4.5.
A program consists of a sequence of class declarations D. In the core language, types in a top-level program only occur in the session part of a class declaration: no type is declared for fields because they can vary at run-time and are always initially null, and method declarations are also typeless, as explained earlier.
A session type S corresponds to a view of an object from outside. It shows which methods can be called, and their signatures, but the fields are not visible. We refer to $\{T'_i\ m_i(T_i) : S_i\}_{i \in I}$ as a branch type and to $\langle l : S_l \rangle_{l \in E}$ as a variant type. Session type end abbreviates the empty branch type {}. The core language does not include named session types, or the session and where clauses from the examples; we just work with recursive session type expressions of the form $\mu X.S$, which are required to be contractive, i.e. containing no subexpression of the form $\mu X_1.\cdots\mu X_n.X_1$. We require contractivity so that every session type defines some behaviour. The $\mu$ operator is a binder, giving rise, in the standard way, to notions of bound and free variables and alpha-equivalence. A type is closed if it includes no free variables. We denote by $T\{U/X\}$ the capture-avoiding substitution of U for X in T.
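The contractivity condition is easy to check syntactically. The sketch below works on a deliberately simplified syntax tree in which branch and variant types are collapsed into a single node kind with a list of continuations; the class names SType, Node, Var and Mu are assumptions for illustration and do not correspond to the grammar of Figure 5. For every µ-binder it skips any directly nested binders and rejects the type if what remains is the variable bound by the outermost one.

```java
import java.util.Arrays;
import java.util.List;

// Simplified session-type syntax tree (hypothetical; not the paper's grammar).
abstract class SType {
    // Stands for a branch or variant type; only the continuations are kept.
    static final class Node extends SType {
        final List<SType> children;
        Node(SType... children) { this.children = Arrays.asList(children); }
    }

    // A type variable X.
    static final class Var extends SType {
        final String name;
        Var(String name) { this.name = name; }
    }

    // A recursive type µX.S.
    static final class Mu extends SType {
        final String var;
        final SType body;
        Mu(String var, SType body) { this.var = var; this.body = body; }
    }

    // True if t has no subexpression of the form µX1.···µXn.X1.
    static boolean isContractive(SType t) {
        if (t instanceof Mu) {
            Mu mu = (Mu) t;
            SType inner = mu.body;
            // Skip the directly nested binders µX2 ... µXn, if any.
            while (inner instanceof Mu) inner = ((Mu) inner).body;
            if (inner instanceof Var && ((Var) inner).name.equals(mu.var)) {
                return false; // µX1.···µXn.X1 defines no behaviour
            }
            return isContractive(mu.body); // also checks the nested binders
        }
        if (t instanceof Node) {
            for (SType c : ((Node) t).children) {
                if (!isContractive(c)) return false;
            }
        }
        return true; // a bare variable or an empty branch is fine
    }
}
```

With this encoding, µX.{m : X} is accepted because the variable occurs under a branch, whereas µX.X and µX.µY.X are rejected.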
Value types which can occur either as parameter or return type for a method are: Null which has the single value null, a session type S which is the type of an object, or an enumerated type E which is an arbitrary finite set of labels l. Additionally, the specific return type variant-tag is used for method occurrences after which the resulting session type is a variant, and means that the method result will be the tag of the variant. The set of possible labels appears in the variant construct of the session type, so it is not necessary to specify it in the return type of the method. However, in the example code, the set of labels is written instead of variant-tag, so that the method signature shows the return type in the usual way.
The type system, which we will describe later, enforces the following restrictions on session types: the immediate components of a variant type are always branch types, and the session type in a class declaration is always a branch. This is because a variant type is used only to represent the effect of method calls and the dependency between a method result and the subsequent session type, so it only makes sense immediately within a branch type.
Figure 6 defines additional syntax that is needed for the formal system but is not used in top-level programs. This includes:
• some extra forms of types, which are used internally to type some subexpressions but cannot be the argument type or return type of a method, and thus are never written in a program;
• intermediate expressions that cannot appear in a program but arise from the operational semantics;
• syntax for the heap.
Internal types. The first internal type we add is the type link f, where f is the name of a field. This type is related to variant session types and the variant-tag type, in the way illustrated by the following example: suppose that, in some context, field f of the current object contains an object whose type is $\{\text{variant-tag}\ m(\text{Null}) : \langle l : S_l \rangle_{l \in E}\}$. This means that the expression f.m(null) is allowed in this context and will both change the abstract state of the object in f to one of the $S_l$, and return the label l corresponding to that particular state. Thus, there is a link between the value of the expression and the type of field f after evaluating the expression, and the type system needs to keep track of this link; to this end, the expression f.m(null) is given, internally, the type link f, rather than variant-tag which does not contain enough information. The use of link f is also illustrated in Figure 13.
The second internal type is an alternative form of object type, C[F], which has a field typing instead of a session type. Recall that in our language, all object fields are private; therefore, normally the type of an object is a session type which only refers to methods. However, an object has access to its own fields, so for typechecking a method definition, the type environment needs to provide types for the fields of the current object (this). Since the types of the fields change throughout the life of the object, the class definition is not enough to know their types at a particular point. We thus use a field typing F, which is usually a record type associating one type to each field of the object. For example, $C[\{\text{Null}\ f_1; S\ f_2\}]$ represents an object of class C with exactly two fields, $f_1$, which currently contains a value of type Null, and $f_2$, which currently contains an object in state S. Note that in C[F], F must provide types for exactly the fields of class C.
The other form of field typing is a variant field typing, which groups together several possible sets of field types indexed by labels. For example, a variant field typing can represent an object of class C, whose two fields are $f_1$ and $f_2$, and where either $f_1$ has type Null and $f_2$ has type S, or $f_1$ has type E and $f_2$ has type $S'$, depending on the value of the label.
These field typings cannot be the type of expressions (which cannot evaluate to this); they only represent the type of the current object in type environments. The relation between field typings and session types will be discussed in Section 3.5.2.
Internal expressions, heap, and states. These other additions are used to define the operational semantics. A heap h maps object identifiers o, taken from yet another countable set of names, to object records R. We write dom(h) for the set of object identifiers in h. The identifiers are values, which may occur in expressions. The operation h, {o = R} represents adding a record for identifier o to the heap h and we consider it to be associative and commutative, that is, h is essentially an unordered set of bindings. It is only defined if o ∉ dom(h), since an identifier that is already bound cannot be added again. Paths r represent locations in the heap. A path consists of a top-level object identifier followed by an arbitrary number of field specifications. We use the following notation to interpret paths relative to a given heap.
• For any value v and any j ∈ I, we also define
• In any other case, these operations are not defined. Note in particular that h(r) is not defined if r is a path that exists in h but does not point to an object identifier.
There is a new form of expression, return e, which is used to represent an ongoing method call.
Finally, a state consists of a heap and an expression, and the operational semantics will be defined as a reduction relation on states; E are evaluation contexts in the style of Wright and Felleisen [76], used in the definition of reduction.
The semantic and typing rules we will present next are implicitly parameterized by the set of declarations D which constitute the program. It is assumed that the whole set is available at any point and that any class is declared only once. We do not require the sets of method or field names to be disjoint from one class to another. We will use the following notation: if class $C\ \{S; \vec{f}; \vec{M}\}$ is one of the declarations, C.session means S and C.fields means $\vec{f}$, and if $m(x)\ \{e\} \in \vec{M}$ then C.m is e.

3.2. Operational Semantics. Figure 7 defines an operational semantics on states $(h * r, e)$ consisting of a heap h, a path r in the heap indicating the current object, and an expression e. In general, e is an expression obtained by a series of reduction steps from a method body, where the method was called on the object identified by the path r. All rules have the implicit premise that the expressions appearing in them must be defined. For example, f ↔ v only reduces if h(r) is an object record containing a field named f. An example of reduction, together with typing, is presented in Figure 13 and discussed at the end of the present section.
The current object path r is used to resolve field references appearing in the expression e. It behaves like a call stack: as shown in R-Call, when a method call on a field f (relative to the current object located at r) is entered, the object in r.f becomes the current object; this is indicated by changing the path to r.f. Additionally, the method body, with the actual parameter substituted for the formal parameter, is wrapped in a return expression and replaces the method call. When the body has reduced to a value, this value is unwrapped by R-Return which also pops the field specification f from the path, recovering the previous current object r. This is illustrated in Figure 13, which also shows the typing of expressions in a series of reductions.
R-New creates a new object in the heap, with null fields. R-Swap updates the value of a field and reduces to its former value. R-Switch is standard. R-Seq discards the result of the first part of a sequential composition. R-Context is the usual rule for reduction in contexts.
To complete the definition of the semantics we need to define the initial state. The idea is to designate a particular method m of a particular class C as the main method, which is called in order to begin execution. The most convenient way to express this is to have an initial heap that contains an object of class C, which is also chosen as the current object, and an initial expression e which is the body of m. The initial state is therefore where top is the identifier of the top-level object of class C, which is the only object in the heap, and the current object path is also top. Strictly speaking, method m must have a parameter x; we take x to be of type Null and assume that it does not occur in e.

3.3. Example of reduction. Assume that the top-level class is C, containing fields f and g. Assume also that there is another class C′ which defines the set of methods {m_i | i ∈ I}. Finally, assume that the body of the main method of C is where j is some element of I.
The initial state is where for simplicity we have ignored the parameter of m j . Figure 8 shows the sequence of reduction steps until one of the cases of the switch is reached. The first step is expansion of the syntactic sugar for assignment, translating it into a swap followed by null. Another similar translation step occurs later. The first real reduction step is R-New, creating an object o, followed by R-Swap to complete the assignment into field f and then R-Seq to tidy up. Next, the step labelled "simplify" informally removes null in order to avoid carrying it through to an uninteresting R-Seq reduction later. Now assume that the body of method m j is e. Reduction by R-Call changes the current object path to top.f because the current object is now o. Several reduction steps convert e to a particular element of the enumerated type E, which we call l 0 . After that, R-Return changes the current object path back to top, and then some R-Swap and R-Seq steps bring l 0 into the guard of the switch, finally allowing the appropriate case e l 0 to be selected.
We will return to this example in Section 3.6, to show how each state is typed.
3.4. Subtyping. Subtyping is an essential ingredient of the theory of session types. Originally proposed by Gay and Hole [37], it has been widely used in other session-based systems, with subject-reduction and type-safety holding. The guiding principle is the "safe substitutability principle" of Liskov and Wing [51], which states that, if S is a subtype of T , then objects of type T in a program may be safely replaced with objects of type S. Two kinds of types in the top-level core language are subject to subtyping: enumerated types and session types. The internal language also has field typings; subtyping on them is derived from subtyping on top-level types by the rules in Figure 9.
Subtyping for enumerated types is defined as simple set inclusion: E <: E ′ if and only if E ⊆ E ′ . We refer to subtyping for session types as the sub-session relation. Because session types can be recursive, the sub-session relation is defined coinductively, by defining necessary conditions it must satisfy and taking the largest relation satisfying them. The definition involves checking compatibility between different method signatures, which itself is dependent on the whole subtyping relation. We proceed as follows: given a candidate sub-session relation R, we define an R-compatibility relation between types and between method signatures which uses R as a sub-session relation. We then use R-compatibility in the structural conditions that R must satisfy in order to effectively be a sub-session relation.
Let S denote the set of contractive, closed, class session types. We deal with recursive types using the following unfold operator:
We now define the two compatibility relations we need.
Definition 3.3 (R-Compatibility (Types)). Let R be a binary relation on S. We say that type T is R-compatible with type T′ if one of the following conditions is true.
(1) T = T′
(2) T and T′ are enumerated types and T ⊆ T′
(3) T, T′ ∈ S and (T, T′) ∈ R.
The compatibility relation on method signatures is, as expected, covariant in the return type and the subsequent session type and contravariant in the parameter type, but with one addition: if a method has an enumerated return type E and subsequent session type S, then it can always be used as if it had a return type of variant-tag and were followed by the uniform variant session type ⟨l : S⟩_{l∈E}. Indeed, both signatures mean that the method can return any label in E and will always leave the object in state S.
We can now state the necessary conditions for a sub-session relation.
Definition 3.5 (Sub-session). Let R be a binary relation on S. We say that R is a sub-session relation if (S, S′) ∈ R implies:
For the sake of simplicity, whenever we refer to this definition from now on we make the unfolding step implicit by assuming, without loss of generality, that neither S nor S′ is of the form µX.S′′.
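For orientation, the unfold operator and the two conditions of a sub-session relation can be spelled out as follows. This is a reconstruction, not a quotation: it is assembled from the informal description of subtyping given later in this section (branches against branches, variants against variants) and from the way the definition is used in the transitivity proof, and the index sets I, J and the primed names are chosen here for illustration.

```latex
% Unfolding (standard for contractive recursive types; assumed form)
\mathrm{unfold}(\mu X.S) = \mathrm{unfold}\bigl(S\{(\mu X.S)/X\}\bigr),
\qquad
\mathrm{unfold}(S) = S \ \text{ otherwise.}

% Conditions on a sub-session relation R, for (S, S') \in R:
% (1) branch against branch: the supertype offers fewer methods
\mathrm{unfold}(S) = \{U_i\ m_i(T_i) : S_i\}_{i \in I}
\ \Longrightarrow\
\mathrm{unfold}(S') = \{U'_j\ m_j(T'_j) : S'_j\}_{j \in J},\quad J \subseteq I,
\quad
\forall j \in J.\ \ U_j\ m_j(T_j) : S_j
\ \text{ is $R$-compatible with }\
U'_j\ m_j(T'_j) : S'_j

% (2) variant against variant: the supertype allows more labels
\mathrm{unfold}(S) = \langle l : S_l \rangle_{l \in E}
\ \Longrightarrow\
\mathrm{unfold}(S') = \langle l : S'_l \rangle_{l \in E'},\quad E \subseteq E',
\quad
\forall l \in E.\ (S_l, S'_l) \in R
```

Read this way, condition (1) is contravariant in the set of methods and condition (2) is covariant in the set of labels.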
Lemma 3.6. The union of several sub-session relations is a sub-session relation.
Proof. Let R = ⋃_{i∈I} R_i, where the R_i are sub-session relations. Let (S, S′) ∈ R. Then there is j in I such that (S, S′) ∈ R_j. This implies that (S, S′) satisfies the conditions in Definition 3.5 with respect to R_j. Just notice that, because R_j ⊆ R, the conditions are satisfied with respect to R as well; in particular, R_j-compatibility implies R-compatibility. Indeed, the conditions for R only differ from those for R_j by requiring particular pairs of session types to be in R rather than in R_j, so they are looser.
We now define the subtyping relation <: on session types to be the largest sub-session relation, i.e. the union of all sub-session relations. The subtyping relation on general top-level types is just <:-compatibility.
Subtyping on session types means that either both are branches or both are variants. In the former case, the supertype must allow fewer methods and their signatures must be compatible; in the latter case, the supertype must allow more labels and the common cases must be in the subtyping relation. Like the definition of subtyping for channel session types [37], the type that allows a choice to be made (the branch type here, the ⊕ type for channels) has contravariant subtyping in the set of choices.
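Because the relation is the largest one satisfying its defining conditions, it can be decided on finite representations by assuming a pair while its conditions are being checked. The following Python sketch illustrates that idea; it is not the type checking algorithm of this paper. The encoding is an assumption made for the example: recursive session types are given as a dictionary of named states instead of µ-binders, enumerated types are `frozenset`s of labels, and the names `subsession`, `compatible`, `sig_compatible` and `DEFS` are hypothetical.

```python
def subsession(defs, s, t, assumed=None):
    """Check s <: t coinductively: a pair already under examination is assumed to hold."""
    assumed = set() if assumed is None else assumed
    if s == t or (s, t) in assumed:
        return True
    assumed.add((s, t))
    (kind_s, cases_s), (kind_t, cases_t) = defs[s], defs[t]
    if kind_s != kind_t:
        return False                      # both branches or both variants
    if kind_s == "variant":               # supertype allows more labels
        return set(cases_s) <= set(cases_t) and all(
            subsession(defs, cases_s[l], cases_t[l], assumed) for l in cases_s)
    if not set(cases_t) <= set(cases_s):  # supertype allows fewer methods
        return False
    return all(sig_compatible(defs, cases_s[m], cases_t[m], assumed)
               for m in cases_t)

def compatible(defs, a, b, assumed):
    """Compatibility on types: equality, inclusion of enumerations, or sub-session."""
    if a == b:
        return True
    if isinstance(a, frozenset) and isinstance(b, frozenset):
        return a <= b
    if a in defs and b in defs:
        return subsession(defs, a, b, assumed)
    return False

def sig_compatible(defs, sig, sig2, assumed):
    """Signatures are (return type, parameter type, subsequent session type)."""
    (u, t, s), (u2, t2, s2) = sig, sig2
    if not compatible(defs, t2, t, assumed):          # contravariant parameter
        return False
    if isinstance(u, frozenset) and u2 == "variant-tag":
        # Enumerated return E followed by S, used as the uniform variant <l : S> for l in E.
        kind, cases = defs[s2]
        return kind == "variant" and u <= set(cases) and all(
            subsession(defs, s, cases[l], assumed) for l in u)
    return compatible(defs, u, u2, assumed) and subsession(defs, s, s2, assumed)

# Hypothetical session types, one entry per state.
DEFS = {
    "A": ("branch", {"read": (frozenset({"eof"}), "Null", "A"),
                     "close": ("Null", "Null", "End")}),
    "B": ("branch", {"read": (frozenset({"eof", "data"}), "Null", "B")}),
    "C": ("branch", {"next": (frozenset({"yes"}), "Null", "A")}),
    "D": ("branch", {"next": ("variant-tag", "Null", "V")}),
    "V": ("variant", {"yes": "B", "no": "End"}),
    "End": ("branch", {}),
}
```

With these definitions, `subsession(DEFS, "A", "B")` holds because A offers more methods and returns fewer labels, while `subsession(DEFS, "B", "A")` fails because B lacks `close`. Sharing one `assumed` set is adequate here only because every check is a conjunction, so a failed sub-check makes the whole query fail.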
The following lemma shows that the necessary conditions of Definition 3.5 are also sufficient in the case of <:.
Finally, we prove that this subtyping relation provides a preorder on types. Proof. First note that session types can only be related by subtyping to other session types; the same applies to enumerated types and, in the internal system, field typings. Since the relation for enumerated types is just set inclusion, we already know the result for it. We now prove the properties for session types; the fact that they hold for field typings is then a straightforward consequence.
For reflexivity, just notice that the diagonal relation {(S, S) | S ∈ S} is a sub-session relation, hence included in <:.
For transitivity, what we need to prove is that the relation R = {(S, S ′ ) | ∃S ′′ , S <: S ′′ ∧ S ′′ <: S ′ } is a sub-session relation. Let (S, S ′ ) ∈ R and let S ′′ be as given by the definition of R.
In case (1) where we have S = {U_i m_i(T_i) : S_i}_{i∈I}, we know that:
We deduce it from the two <:-compatibilities we know by looking into the definition of compatibility point by point:
• We have T′′_k <: T_k and T′_k <: T′′_k. Either these types are all session types, and then (T′_k, T_k) ∈ R by definition of R, or none of them is and we have T′_k <: T_k by transitivity of subtyping on base types. In both cases, T′_k is R-compatible with T_k.
• We also have either: In this case, the former two conditions imply, similarly to the above, that which is all we need.
- Or, finally, U_k is an enumerated type E, U′′_k is an enumerated type E′′ such that E ⊆ E′′, S_k <: S′′_k and ⟨l : S′′_k⟩_{l∈E′′} <: S′_k. Then from S_k <: S′′_k and E ⊆ E′′ we deduce, using case (2) of Lemma 3.7, ⟨l : S_k⟩_{l∈E} <: ⟨l : S′′_k⟩_{l∈E′′}. We thus have, again, (⟨l : S_k⟩_{l∈E}, S′_k) ∈ R, which is the required condition.
In case (2) where we have S = ⟨l : S_l⟩_{l∈E}, we obtain S′′ = ⟨l : S′′_l⟩_{l∈E′′} and S′ = ⟨l : S′_l⟩_{l∈E′}, with E ⊆ E′′ ⊆ E′ and for any l in E, S_l <: S′′_l and S′′_l <: S′_l, which imply by definition of R that (S_l, S′_l) is in R.
Definition 3.9 (Type equivalence). We define equivalence of session types S and S′ as S <: S′ and S′ <: S. This corresponds precisely to S and S′ having the same infinite unfoldings (up to the ordering of cases in branches and variants). Henceforth types are understood up to type equivalence, so that, for example, in any mathematical context, types µX.S and S{(µX.S)/X} can be used interchangeably, effectively adopting the equi-recursive approach [63, Chapter 21].
3.5. Type System. We introduce a static type system whose purpose is to ensure that typable programs satisfy a number of safety properties. As usual, we make use of a type preservation theorem, which states that reduction of a typable expression produces another typable expression. Therefore the type system is formulated not only for top-level expressions but for the states (i.e. (heap, expression) pairs) on which the reduction relation is defined.
An important feature of the type system is that the method definitions within a particular class are not checked independently, but are analyzed in the order specified by the session type of the class. This is expressed by rule T-Class, the last rule in Figure 10, which uses a consistency relation $\overrightarrow{\mathsf{Null}\ f} \vdash C : S$ between field typings and session types, defined in Figure 10.
In the following sections we describe the type system in several stages.

3.5.1. Typing expressions.
Definition 3.10 (Type environments). We use type environments of the form Γ = α 1 : T 1 , . . . , α n : T n where each α is either a method parameter x or an object identifier o.
The typing judgement for expressions is Γ * r ⊲ e : T ⊳ Γ′ * r′. In such a judgement, Γ and Γ′ are type environments and r and r′ are paths. The path parameters are in fact only necessary for typing runtime expressions; they are needed for our type preservation theorem, but not for type-checking a program, where they always have the value this. They will be discussed in Section 3.5.4.
The expression e and its type T appear in the central part of the judgement. The Γ ′ on the right hand side shows the change, if any, that e causes in the type environment. There are several reasons for Γ ′ to differ from Γ; the most important is that if e contains a method call on an object, then the session type of that object is different in Γ ′ than it was in Γ.
Another one is linearity: if a linear parameter x is used in e, then x does not appear in Γ ′ because it has been consumed.
When type-checking a program (as opposed to typing a runtime expression, which need not be implemented), the judgements for expressions always have the particular form this : C[F ], V * this ⊲ e : T ⊳ this : C[F′], V′ * this, with only one object identifier in the environment, this, representing the object to which fields referred to in e belong. The rest of the environment, V , is either empty or has the form x : U where U is the type of the parameter x of the method currently being type-checked, and is thus a top-level type. The initial type of this is the internal type C[F ], where F is a field typing; the final type is C[F′], as e may change the types of the fields (for example, by calling methods on them). The final parameter typing V′ is either the same as V or empty, depending on whether the parameter was consumed by e.
We extend subtyping to a relation on type environments, as follows.
Essentially Γ <: Γ ′ if Γ is more precise (contains more information) than Γ ′ : it contains types for everything in Γ ′ (and possibly more) and those types are more specific.
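The informal reading above can be written as a formula. The following is a sketch based only on that reading ("types for everything in Γ′, and possibly more, and those types are more specific"); the use of dom(·) for the set of names bound in an environment is an assumption of this sketch.

```latex
\Gamma <: \Gamma'
\quad\Longleftrightarrow\quad
\mathrm{dom}(\Gamma') \subseteq \mathrm{dom}(\Gamma)
\ \text{ and }\
\forall \alpha \in \mathrm{dom}(\Gamma').\ \ \Gamma(\alpha) <: \Gamma'(\alpha)
```

For example, under this reading x : U, this : C[F ] <: this : C[F′] whenever C[F ] <: C[F′], which is how an unused parameter can be dropped from a final environment.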

3.5.2. Consistency between field typings and session types. There are two possible forms for the type of an object. One is a session type S, which describes the view of the object from outside, i.e. from the perspective of code in other classes. The session type specifies which methods may be called, but does not reveal information about the fields. The other form, C[F ], contains a field typing F , and describes the internal view of the object, i.e. from the perspective of code in its own methods. Consider a sequence of method calls in a particular class. There are two senses in which it may be considered correct or incorrect. (1) In the sense that it is allowed, or not allowed, by the session type of the class. (2) In the sense that each call in the sequence leaves the fields of the object in a state which ensures the next call does not produce a type error. For example, if we consider the class FileReader of Figure 3, we see that the session type allows calling read() just after init(), making the sequence init(); read() correct in sense (1). It is correct in sense (2) if and only if the body of read() typechecks under the precondition that the fields file and text have the types produced by the evaluation of init().
In order to type a class definition, these two senses of correctness must be consistent according to the following coinductive definition.
Definition 3.12. Let C be a class and let R be a relation between field typings F and session types S. We say that R is a C-consistency relation if (F, S) ∈ R implies:
In clause (1), S is the session type (external view) before calling one of the methods m_i, and F is the field typing (internal view). If a particular m_i is called then the subsequent session type is S_i, and the subsequent field typing, arising from the typing judgement for the method body, is F_i. These types must be related. Clause (2) requires variant session types and field typings, arising from a method call that returns an enumeration label, to match. The inclusion E′ ⊆ E allows the method to return labels from a smaller set than the one defined by the session type.
Lemma 3.13. The union of several C-consistency relations is a C-consistency relation.
Proof. Similar to (but simpler than) the proof of Lemma 3.6.
For any class C, we define the relation F ⊢ C : S between field typings F and session types S to be the largest C-consistency relation, i.e. the union of all C-consistency relations.
The relation F ⊢ C : S represents the fact that an object of class C with internal (private) field typing F can be safely viewed from outside as having type S. Clause (2) in Definition 3.12 accounts for correspondence between variant types. The main clause is clause (1): if the object's fields have type F and its session type allows a certain method to be called, then it means that the method body is typable with an initial field typing of F and the declared type for the parameter. Furthermore, the type of the expression must match the declared return type and the final type of the fields must be compatible with the subsequent session type. The parameter may or may not be consumed by the method, but T-SubEnv at the end of Figure 10 (see next section) allows discarding it silently in any case, hence its absence from the final environment.
The definition implies that a method with return type variant-tag must be followed by a variant session type, for the following reason. Suppose that in clause (1), some T_i is variant-tag. The only way for e_i to have type variant-tag is by using rule T-VarF (discussed in the next section), which implies that F_i must be a variant field typing. The condition relating F_i and S_i then forces S_i, by clause (2), to be a variant session type.
The rule T-Class, last rule in Figure 10, checks that the initial session type of a class is consistent with the initial Null field typing. It refers to the above definition of consistency, which itself refers to typing judgements built using the other rules in the figure.
3.5.3. Typing rules for top-level expressions. The typing rules for top-level expressions (the syntax in Figure 5) are in Figure 10. They use the following notation for interpreting paths relative to type environments, analogously to Definition 3.1 for heaps.
Definition 3.14 (Locations in environments).
We now comment on these rules:
T-Null and T-Label type constants. A label is given a singleton enumerated type, which is the smallest type it can have, but subsumption can be used to increase its type. T-New types a new object, giving it the initial session type from the class declaration. T-LinVar and T-Var are used to access a method's parameter, removing it from the environment if it has an object type (which is linear). For simplicity, this is the only way to use a parameter. In particular, we do not allow calling methods directly on parameters: to call a method on a parameter, it must first be assigned to a field. This is just a simplification for this formal presentation and does not limit expressivity; this will be discussed in Section 4.
T-Swap types the combined read-write field access operation, exchanging the types of the field and expression. There are two restrictions on its use. T is not allowed to be variant-tag, because this particular type only makes sense as the return type of a method. This condition effectively forbids the use of rule T-VarF in the typing derivation for e, because e is not what the method returns. It also has the consequence that Γ ′ (r ′ ) is not a variant type, because the only rule that could produce one is T-VarF; hence Γ ′ (r ′ .f ) is defined. The other condition is that T ′ is not allowed to be a variant type, because it is not allowed in our system to extract from a field a variantly-typed value without having switched on the associated tag first. Indeed, link types refer to fields by name, so moving variantly-typed values around would lose the connection between the value and its tag.
T-Call checks that field f has a session type that allows method m j to be called. The type of the parameter is checked as usual, and the final type environment is updated to contain the new session type of the object in f . If the return type of the method is variant-tag (method open() in Figure 2 is one such example), it means that the value returned is a label describing the state of this object; since the object is in f , it is changed into link f . Because the return type appears in the session type and is therefore expressed in the top-level syntax, it cannot already be of the form link f . Observe that although types of the form link f are not written by the programmer, they can nevertheless occur in a typechecking derivation as the types of top-level expressions (as in the example in Figure 13).
T-Seq accounts for the effects of the first expression on the environment and checks that a label is not discarded, which would leave the associated variant unusable.
T-Switch types a switch whose expression e does not have a link type. All relevant branches are required to have the same type and final environment, and the whole switch expression inherits them. A typical example is if the branches just contain different labels: in that case they are given singleton types by T-Label and then T-Sub is used to give all of them the same enumerated type. If the type E ′ of the parameter expression is strictly smaller than the set E of case labels in the switch expression, branches corresponding to the extra cases are ignored.
T-SwitchLink is the only rule for deconstructing variants. It types a switch, similarly to the previous one, but the type of e must be a link to a field f with a variant session type. The relevant branches are then typed with initial environments containing the different case types for f according to the value of the label. As before, they must all have the same type and final environment, and if the switch expression defines extra branches for labels which do not appear in the variant type of f , they are ignored.
T-VarF constructs a variant field typing for the current object. Here E is typically, but not necessarily, a singleton type, and e is typically a literal label. The field typing before applying the rule must be a record as nested variants are not permitted, and the rule transforms it into a variant with identical cases for all labels in E. It can then be extended to a variant with arbitrary other cases using rule T-SubEnv. This rule is used for methods leading to variant session types, which, as Definition 3.12 implies, must finish with a variant field typing. As a simple example, consider the following expression, which could end a method body in some class D:
If S is the declared session type of class C, we have, using rules T-New, T-Swap, T-Label and T-Seq, the following judgements (T is just the initial type of f ):
[Figure 11: Typing rules for expressions in the internal language]
and T-SubEnv can thus be applied to both judgements to increase the final type of this to this common supertype. It is then possible to use T-Switch to type the whole expression. Note that the final type of the expression is always variant-tag: as T-VarF is the only rule for constructing variants, this is the only possible return type for a method leading to a variant.
T-Sub is a standard subsumption rule, and T-SubEnv allows subsumption in the final environment. The main use of the latter rule, as illustrated above, is to enable the branches of a switch to be given the same final environments.
3.5.4. Typing rules for internal expressions, heaps and states. The type system described so far is all we need to type check class declarations and hence programs, which are sequences of class declarations. In order to describe the runtime consequences of well-typedness, we now introduce an extended set of typing rules for expressions that occur only at runtime ( Figure 11) and for program states including heaps ( Figure 12). When typing an expression e as part of a runtime state, the path r, which was always this when typing programs, varies and indicates the currently active object (the one the method at the top of the call stack belongs to). Any difference between r and r ′ means that e contains return; in that case, r and r ′ represent the call stack during and after a method call. Recall that return expressions are what a method call reduces to and are introduced by R-Call and suppressed by R-Return. Therefore, these expressions can be nested and can appear, at runtime, in any part of the expression in which reduction can happen. For example they can appear in the argument of a switch or of a function call, but not in the second term of a sequence (which does not reduce until the sequence itself has reduced). In the rules of Figure 10, the difference between r and r ′ in some, but not all, of the premises, accounts for that fact.

Runtime expressions may contain object identifiers, typed by T-Ref.
In this rule, the current object path r must not be within o, meaning that the current object or any object containing it cannot be used within an expression. This is part of the linear control of objects: somewhere there must be a reference to the object at r, in order for a method to have been called on that object, which is what gives rise to the evaluation of an expression whose current object path is r. So obtaining another reference to the object at r, within the active expression, would violate linearity.
Another new rule is T-VarS, which constructs a variant session type for a field of the current object. At the top level, the only expression capable of constructing a variant session type is a method call, but once the method call has reduced into something else this rule is necessary for type preservation.
The last additional rule for expressions is T-Return, which types a return expression representing an ongoing method call. The subexpression e represents an intermediate state in a method of object r ′ .f . If e itself does not contain return, we have r = r ′ .f ; otherwise they are different. For example, if we have an expression of the form f.m(), the body of m is f ′ .m ′ (), and the body of m ′ is e ′ (omitting parameters for simplicity), then (h * r ′ , f.m()) reduces to (h * r ′ .f.f ′ , return (return e ′ )) in two steps of R-Call. The typing derivation for this last expression would look like: As e is an intermediate state in a method of r ′ .f , it is typed with final current object r ′ .f and a final environment where the type of r ′ .f is of the form C[F ], representing an inside view of the object, where the fields are visible. This T-Return rule then steps out of the object, hides its fields and changes its type into the outside view of a session type, which must be consistent with the internal type (F ⊢ C : S). The particular case where T is variant-tag is the same as in T-Call. T is not allowed to already be of the form link f ′ since it would break encapsulation (f ′ would refer to a field of r ′ .f which is not known outside of the object).
An important point is that the only expression that changes the current object is return. Several rules besides T-Return can inherit in the conclusion a change of current object from a subexpression in a premise, but they do not add further changes. Thus the final current object path is always a prefix of the initial one, and the number of field specifications removed is equal to the number of returns contained in the expression. Also note that the second part of a sequence and the branches of a switch are not reduction contexts; therefore, they should not contain return and are not allowed by the rules to change the current object.
As we saw in Section 3.2, a runtime state consists of a heap, a current object path, and a runtime expression. Figure 12 describes how these parts are related by typing: by rule T-State, a typing judgement for the expression gives one for the state provided the current object is the same and the initial environment reflects the content of the heap; this last constraint is represented by the judgement ⊢ h : Γ. Such a judgement is constructed starting from the axiom T-HEmpty which types an empty heap and adding objects into the heap one by one with rule T-HAdd, converting their types into sessions using T-Hide as needed. As T-HAdd is the only rule that adds to Γ, we have the property that ⊢ h : Γ implies that every identifier in Γ also appears in h. T-HAdd essentially says that adding a new object with given field values to the heap affects the environment in the same way as an expression that starts from an empty object and puts the values into the fields one by one. The most important feature of this rule is that whenever a v i is an object identifier, the typing derivation for the expression has to use T-Ref, which implies both that the initial environment contains v i and that the final one, which represents the type of the extended heap, does not. This means that a type environment corresponding to a heap never contains entries for object identifiers that appear in fields of other objects, and it also implies that a heap with multiple references to the same object is not typable. The numbering of the fields in the rightmost premise is arbitrary, meaning it must not be interpreted as requiring the sequence of swaps to be done in any particular order; all possible orders are valid instances of the premise. 
This is important if the type of the object being added is to contain links and variants: suppose that field f contains an object o and field g a label l; it must be possible to attribute a variant type to f and the type link f to g, but this can only be done as a result of typing the sequence of swaps if f ↔ o occurs before g ↔ l.
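The two consequences of T-HAdd noted above (no entry in Γ for an identifier stored in a field, and no typable heap with two references to one object) suggest a simple necessary check on heaps. The sketch below is not the typing judgement ⊢ h : Γ itself; it only tests these two structural properties. The heap encoding, with references written as `("ref", oid)` pairs, and the function names are assumptions of this example.

```python
from collections import Counter

def reference_counts(heap):
    """Count how many fields in the heap hold a reference to each object identifier."""
    return Counter(
        value[1]
        for fields in heap.values()
        for value in fields.values()
        if isinstance(value, tuple) and value[0] == "ref"
    )

def is_linear(heap):
    """True if no object identifier is stored in more than one field."""
    return all(n == 1 for n in reference_counts(heap).values())

def environment_domain(heap):
    """Identifiers that may appear in an environment typing the heap:
    those not stored in a field of another object."""
    referenced = reference_counts(heap)
    return {oid for oid in heap if oid not in referenced}

# Field f of top holds object o1, and field g holds a label.
ok_heap = {"top": {"f": ("ref", "o1"), "g": "l0"}, "o1": {}}
# Two fields alias the same object, so the heap cannot be typable.
aliased_heap = {"top": {"f": ("ref", "o1"), "g": ("ref", "o1")}, "o1": {}}
```

Here `is_linear(ok_heap)` holds and `environment_domain(ok_heap)` contains only `top`, whereas `is_linear(aliased_heap)` fails.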
3.6. Example of reduction and typing. We now return to the example from Section 3.3, to illustrate the way in which the environment used to type an expression changes as the expression reduces (see Theorem 3.19). To shorten the series of steps in which the current object path does not change, Figure 13 starts from the point at which the initial expression has reduced to Recall that this expression is an abbreviation for The initial typing environment is with top as the current object, where S_j = ⟨l : S_l⟩_{l∈E}. The body of method m_j is e with the typing this : C′[F ] * this ⊲ e : variant-tag ⊳ this : C′[⟨l : F_l⟩_{l∈E}] * this and we assume that m_j returns l_0 ∈ E. According to Definition 3.12 and the typing of the declaration of class The figure shows the environment in which each expression is typed; the environment changes as reduction proceeds, for several reasons explained below. The typing of an expression is Γ ⊲ e : T ⊳ Γ′ but we only show Γ because Γ′ does not change and T is not the interesting part of this example. We also omit the heap, showing the typing of expressions instead of states. However, an important point to keep in mind is that Γ corresponds to the typing environment obtained after typing the heap: ⊢ h : Γ is obtained after a number of T-HAdd steps and corresponds to the final typing environment for the heap.
Calling f.m j () changes the type of field f to C ′ [F ] because we are now inside the object; the current object path changes from top to top.f . As e reduces to l 0 the type of f may change, finally becoming C ′ [F l 0 ] so that it has the component of the variant field typing C ′ [ l : F l l∈E ] corresponding to l 0 . The reduction by R-Return changes the type of f to S l 0 because we are now outside the object again, but the type is still the component of a variant typing corresponding to l 0 . At this point f is popped from the current object path.
Here the type of l 0 is link f (which was the expected result type for f.m j ()), and this type is obtained by applying T-VarS, so in the intermediate typing environment after typing l 0 , f has the variant type l : S l l∈E . The next step, swap, moves l 0 from the expression to the heap. Therefore the application of T-VarS needed to type it now occurs in the derivation for typing the heap, of which Γ is the result. This is why in Γ the type of f is now l : S l l∈E , which is S j , the type we were expecting after the method call. At this point the information about which component of the variant typing we have is stored in top.g, the field the label was swapped into: the type of the expression f.m j () is link f , which appears as the type of top.g after the swap is executed. When extracting the value of g in order to switch on it, the type link f disappears from the environment and becomes the type of the subexpression g ↔ null, at the same time resolving the variant type of f according to the particular enumerated value l 0 .

3.7. Typing the initial state. Recall the discussion of the initial state for execution of a program, from the end of Section 3.2. The initial state is where class C has a designated main method m with body e. In order to type this initial state, we require that m is immediately available in C.session, and assume that the program is typable, i.e. that rule T-Class is applicable to every class definition. If C.fields
3.8. Properties of the type system. The main results in this sequential setting are standard: type preservation under reduction (also known as Subject Reduction) and absence of stuck states for well-typed programs. Furthermore, the system also enjoys a conformance property: all executions of well-typed programs follow what is specified by the classes' session types.
3.8.1. Soundness of subtyping. In this section we prove that the subtyping relation is sound with respect to the type system, in the sense that it preserves not only typing judgements but also consistency between field typings and session typings, reflecting the safe substitution property.
Proof. Straightforward induction on the derivation. Before proving the case where subtyping is on the right, we first remark that, similarly to sub-session, the necessary conditions in the definition of C-consistency (Definition 3.12) become sufficient once we consider the largest relation: Lemma 3.17. Let C be a class and F a field typing for that class.
(1) Suppose F is not a variant, and suppose there is a set of method definitions {m_i(x_i) {e_i}}_{i∈I} in the declaration of class C such that, for all i, we have:
(2) Suppose F = ⟨l : F_l⟩_{l∈E′} and let (S_l)_{l∈E} be a family of session types such that E′ ⊆ E and F_l ⊢ C : S_l for all l ∈ E′. Then F ⊢ C : ⟨l : S_l⟩_{l∈E} holds.
Proof. For any class C, we define the following relation: and prove that it is a C-consistency relation (Definition 3.12). Let (F, S′) ∈ R_C, and let S be as given by the definition of the relation. We have two cases depending on the form of S′ (branch or variant).
The first one is
Let $j \in J$. We know from $F \vdash C : S$ that C contains a method declaration $m_j(x)\,\{e\}$ such that the judgement $x : T_j,\ \mathsf{this} : C[F] * \mathsf{this} \lhd e : U_j \rhd \mathsf{this} : C[F_j] * \mathsf{this}$ holds, with $F_j \vdash C : S_j$. <:-compatibility between the two signatures of $m_j$ (Definition 3.4) gives us, first, $T'_j <: T_j$, which allows us to apply Lemma 3.15 to this judgement and replace $T_j$ by $T'_j$ in it, and second, either:
(1) $U_j <: U'_j$ and $S_j <: S'_j$. The former allows us to use T-Sub to replace $U_j$ by $U'_j$ in the typing judgement for e, fulfilling the first condition in the definition of C-consistency. The latter, together with $F_j \vdash C : S_j$, implies $(F_j, S'_j) \in R_C$, fulfilling the second one.
(2) $U_j$ is an enumerated type E, $U'_j$ = variant-tag and $\langle l : S_j \rangle_{l \in E} <: S'_j$. In this case we first apply T-VarF to the judgement, yielding:
Theorem 3.19 (Subject Reduction). Let D be a set of well-typed declarations, that is, such that for every class declaration D in D we have ⊢ D.
If, in a context parameterised by D,
Proof. This theorem is a particular case of Theorem 7.16, which will be proved in Section 7.

Type safety.
Theorem 3.20 (No Stuck Expressions). Let D be a set of well-typed declarations, that is, such that for every class declaration D in D we have ⊢ D.
If, in a context parameterised by D,
Proof. This theorem is also a consequence of Theorem 7.16, so we postpone its proof until Section 7.
3.8.4. Conformance. We show that, in well-typed programs, a sequence of method calls (interleaved with their respective return labels) of a given class is a path of its session type. In order to state this property precisely, we introduce a few definitions.
then the derivation of this reduction consists of a number of applications of R-Context, preceded by another rule which forms a unique leaf node in the derivation. We say that the rule at the leaf node is the original reduction rule for the reduction, or that the reduction originates from this rule.
Definition 3.27 (Extension of call traces). Suppose tr is a call trace mapping for h and $(h * r, e) \longrightarrow (h' * r', e')$. Define a call trace mapping $tr'$ for $h'$ as follows:
• If the reduction originates from R-Call with method m and field f then $tr' = tr\{h(r.f) \mapsto tr(h(r.f))\,m\}$.
• If the reduction originates from R-Return with value v, and v is a label l, then $tr' = tr\{h(r) \mapsto tr(h(r))\,l\}$.
• If the reduction originates from R-New and the fresh object is o then
The conformance property is the following: in a sequence of reductions starting from the initial state of a well-typed program, the call traces built using the extension mechanism defined above are valid throughout the sequence. We need a couple of lemmas to properly relate call traces and typings in the case of variant types.
Proof. It suffices to show that rule T-VarF cannot be used in the derivation of ⊢ h : Γ, since it is the only rule that introduces variant-tag or variant field typings.
This rule can only be used on an expression of enumerated type, and the only place where such an expression can occur in the derivation of ⊢ h : Γ is as the right member of a swap in the second premise of T-Hadd (the swap expression itself has type Null because of the initial environment). It corresponds to the first premise of T-Swap. However, the third premise of T-Swap forbids that the type of the expression be variant-tag, hence T-VarF cannot be used there.
Proof. Consider how a variant session type can be introduced in the derivation of ⊢ h : Γ. (2) follows from the structure of the derivation: the label T-VarS is applied to is then swapped into a field of the same object.
Definition 3.30 (Actual session type). Let Γ and h be such that ⊢ h : Γ. For any r in Γ such that Γ(r) is a session type S, we define S ′ , the actual session type of r in h according to Γ, as follows:
where r ′ and f ′ are as given by Lemma 3.29.
Let (h 1 * r 1 , e 1 ) be a program state together with a valid call trace mapping tr 1 , and suppose that (h 1 * r 1 , e 1 ) −→ · · · −→ (h n * r n , e n ) is a reduction sequence such that r 1 is a prefix of all r i . Definition 3.27 gives a corresponding sequence of call traces tr i .
If there exists Γ such that tr 1 is consistent with Γ and Γ * r 1 ⊲ (h 1 * r 1 , e 1 ) : T ⊳ Γ ′ * r ′ then for all i, tr i is valid.
Proof. Postponed, again, to Section 7 as it makes use of the proof of Theorem 7.16 which will be proved there.
Corollary 3.33. Given a well-typed program, starting from the initial state described at the end of Section 3.2 with the initial call trace mapping {top → m}, and given a reduction sequence from there, the call trace mappings obtained by Definition 3.27 following the reductions are valid throughout the sequence.
Proof. We just have to see that:
(1) the initial call trace mapping is valid, as the main method m is required to appear in the initial session type of the main class;
(2) it is also consistent with the initial typing given in Section 3.7, as the initial Γ contains no session type;
(3) the initial current object path is reduced to an object identifier and, therefore, stays a prefix of the current object path throughout any reduction sequence.
[Figure 14: Rules for while. Top-level syntax (add to Figure 5): e ::= . . . | while (e) {e}. Reduction rule (add to Figure 7). Top-level typing rules (add to Figure 10).]
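The notion of a call trace being a path of a session type can be illustrated with a small executable sketch. It is only an illustration under stated assumptions: the state names, the dictionary encoding, and the function `is_valid_trace` are hypothetical, and the protocol is loosely modelled on the File example rather than copied from it. A trace is a list of method calls interleaved with returned labels, and it is accepted when each step is allowed by the current state.

```python
# Hypothetical encoding of a class session type as a finite state machine.
# "branch" states offer methods; "variant" states wait for a returned label.
SESSION = {
    "Init":    ("branch",  {"open": "Opened?"}),
    "Opened?": ("variant", {"OK": "Open", "ERROR": "Init"}),
    "Open":    ("branch",  {"hasNext": "Next?", "close": "Init"}),
    "Next?":   ("variant", {"TRUE": "Read", "FALSE": "Close"}),
    "Read":    ("branch",  {"read": "Open", "close": "Init"}),
    "Close":   ("branch",  {"close": "Init"}),
}


def is_valid_trace(session, start, trace):
    """Return True if the trace is a path of the session type from `start`.

    Each trace entry is ("call", method) or ("ret", label).
    """
    state = start
    for kind, name in trace:
        state_kind, transitions = session[state]
        wanted = "call" if state_kind == "branch" else "ret"
        if kind != wanted or name not in transitions:
            return False
        state = transitions[name]
    return True


good = [("call", "open"), ("ret", "OK"), ("call", "hasNext"),
        ("ret", "TRUE"), ("call", "read"), ("call", "close")]
bad_start = [("call", "read")]
missing_label = [("call", "open"), ("call", "close")]
```

In this sketch, `good` is accepted, while `bad_start` and `missing_label` are rejected because they call a method that the current state does not offer.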

Towards a Full Programming Language
In this section, we show how the core calculus presented in the previous section can be extended towards a full programming language. The extensions include constructs which can be considered abbreviations and may be translated into the core calculus without changing it, and actual extensions to the formal system.
4.1. Assignment and Field Access. As explained in the introduction of Section 3, we add to expressions:
• the field access expression f (not followed by a dot or by ↔), which translates into the core expression f ↔ null. This expression evaluates to the content of f and has the side effect of setting f to null;
• the assignment expression f = e, which translates into the core expression f ↔ e; null. This expression stores the value of e in field f and evaluates to null.
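The two translations can be mimicked in a few lines of Python. This is a minimal sketch, assuming an object is modelled as a dictionary from field names to values and that `None` plays the role of null; the helper names `swap`, `field_access` and `assign` are hypothetical and not part of the calculus.

```python
def swap(obj, field, value):
    """Core expression f <-> e: store `value` in the field, return the old content."""
    old = obj[field]
    obj[field] = value
    return old


def field_access(obj, field):
    """f translates to f <-> null: reading empties the field (linear read)."""
    return swap(obj, field, None)


def assign(obj, field, value):
    """f = e translates to (f <-> e; null): the old content is dropped, result is null."""
    swap(obj, field, value)
    return None


obj = {"file": "handle"}
first = field_access(obj, "file")    # yields the stored value
second = field_access(obj, "file")   # the field is now null
result = assign(obj, "file", "new-handle")
```

The second read returns null because the first one moved the value out of the field, which is the behaviour a linear type system relies on.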

Multiple Parameters.
It is straightforward to generalise the reduction and typing rules so that methods have multiple parameters. In rule T-Call, the environments would be threaded through a series of parameter expressions, in the same way as in rule T-Seq.
The language can easily be extended to include while loops, by adding the rules in Figure 14. The reduction rule defines while recursively in terms of switch. There are two typing rules, derived from T-Switch and T-SwitchLink. The first deals with a straightforward while loop that has no interaction with session types, and the second deals with the more interesting case in which the condition of the loop is linked to the session type of an object.
[Figure 15: reduction rule (add to Figure 7); top-level typing rules (add or replace in Figure 10)]
The rules in Figure 15 extend the language to include self-calls (method calls on this). This extension also supports recursive calls, which are necessarily self-calls. Self-calls do not check or advance the session type, and a method that is only self-called does not appear in the session type. A method that is self-called and called from outside appears in the session type, and calls from outside do check and advance the session type. The reason why it is safe to not check the session type for self-calls is that the effect of the self-call on the field typing is included in the effect of the method that calls it. All of the necessary checking of session types is done because of the original outside call that eventually leads to the self-call. Because they are not in the session type, self-called methods must be explicitly annotated with their initial (req) and final (ens) field typings. The annotations are used to type self-calls (T-SelfCall) and method definitions (T-AnnotMeth). The result type and parameter type are also specified as part of the method definition, again because the method is not in the session type.
If a method is in the session type then its body is checked by the first hypothesis of T-Class, but the annotations (if present) are ignored except when they are needed to check recursive calls. If a method has an annotation then its body is checked by the second hypothesis of T-Class. If both conditions apply then the body is checked twice. An implementation could optimize this.
An annotated method cannot produce a variant field typing or have a link type, because T-SwitchLink can only analyze a variant session type, not a variant field typing.
4.6. Shared Types and Base Types. The formal language described in this paper has a very strict linear type system. It is straightforward to add non-linear classes as an orthogonal extension: they would not have session types and their instances would be shared objects, treated in a completely standard way. Including them in the formalisation, however, would only complicate the typing rules.
More interesting, and more challenging, is the possibility of introducing a more refined approach to aliasing and ownership, for example along the lines of the systems discussed in Section 9. We intend to investigate this in the future. Base types such as int are also straightforward to add, and would be treated non-linearly.

4.7. Inheritance. The formal language uses a structural type system in which class names are only used in order to obtain their session types; method availability is determined solely by the session type, and method signatures are also in the session type. In particular, the subtyping relation is purely structural and makes no reference to class names. It is straightforward to adapt the language to include features associated with nominal subtyping, such as an explicitly declared inheritance hierarchy for classes with inheritance and overriding of method definitions. In this case, if class C is declared to inherit from class D, and both define session types (alternatively, C might inherit its session type from D), then the condition C.session <: D.session would be required in order for the definition of C to be accepted.

A Distributed Example
We now present an example of a distributed system, illustrating the way in which our language unifies session-typed channels and more general typestate. Recall that our programming model is based on communication over TCP/IP-style socket connections, which we refer to as channels. The scenario is a file server, which clients can communicate with via a channel. The file server uses a local file, represented by a File object as defined in Section 2, and responds to requests such as OPEN and HASNEXT on the channel. On the client side, the remote file is represented by an object of class RemoteFile, whose interface is similar to File. In this "stub" object, methods such as open are implemented by communicating with the file server.
The channel between the client and the server has a session type in the standard sense [70], which defines a communication protocol. In our language, each endpoint of the channel is represented by an object of class Chan, with a class session type derived from the channel session type. This class session type also expresses the definition of the communication protocol, by specifying when the methods send and receive are available.
For the purpose of this example, we imagine that the communication protocol (channel session type) is defined by the provider of the file server, while the class session type of RemoteFile is defined by the implementor of a file system API. We therefore present two versions of the example: one in which the channel session type, and the class session type of RemoteFile, have the same structure; and one in which they have different structures.
5.1. Distributed Example Version 1. Figure 16 defines a channel session type for interaction between a file server and a client. The type of the server's endpoint is shown, and the type FileReadCh is the starting point of the protocol. The type constructor & means that the server offers a choice, in this case between OPEN and QUIT; the client makes a choice by sending one of these labels. If OPEN is selected, the server receives (constructor ?) a String
The structure of the channel session type is similar to that of the class session type of File from Section 2, in the sense that HASNEXT is used to discover whether or not data can be read.
We regard each endpoint of a channel as an object with send and receive methods. For every channel session type there is a corresponding class session type that specifies the availability and signatures of send and receive. The general translation is defined in Section 6, Figure 26. For the particular case of FileReadCh, the client and server class session types are as defined in Figure 17: FileRead_cl for the client and FileRead_s for the server.
The requirement to make a choice (⊕) in the channel session type corresponds to availability of send with a range of signatures, each with a parameter type representing one of the possible labels; here we are taking advantage of overloading, disambiguated by parameter type. The requirement to offer a choice (&) in the channel session type corresponds to availability of receive, with the subsequent session depending on the label that is received. Sending (!) and receiving (?) data in the channel session type correspond straightforwardly to send and receive with appropriate signatures. Figure 18 defines the class RemoteFile, which acts as a local proxy for a remote file server. Its interface is similar to that of the class File from Section 2; the only difference is that RemoteFile has an additional method connect, which must be called in order to establish a connection to the file server. The types RemoteFile.Init and File.Init are equivalent (Definition 3.9): each is a subtype of the other, and they can be used interchangeably.
The methods of RemoteFile are implemented by communicating over a channel to a file server. The connect method has a parameter of type FileReadCh . A value of this type represents an access point, analogous to a URL, on which a connection can be requested by calling the request method (line 11); the resulting channel endpoint has type FileRead_cl.
The remaining methods communicate on the channel, and thus advance the type of the field channel. The similarity of structure between the channel session type FileReadCh and the class session type Init is reflected in the simple definitions of the methods, which just copy information between their parameters and results and the channel. There is one point of interest in relation to the close method. It occurs three times in the class session type, and according to our type system, its body is type checked once for each occurrence. Each time, the initial type environment in which the body is checked has a different type for the
[Figure 19: Remote file server version 1: server code]
FileChannel = &{OPEN: ?String.⊕{OK: CanRead, ERROR: FileChannel}, QUIT: End}
CanRead = &{READ: ⊕{EOF: FileChannel, DATA: !String.CanRead}, CLOSE: FileChannel}
Figure 19 defines the class FileServer, which accesses a local file system and uses the server endpoint of a channel of type FileReadCh. The session type of this class contains the single method main, with a parameter of type FileReadCh . We imagine this main method to be the top-level entry point of a stand-alone application, with the parameter value (the access point or URL for the server) being provided when the application is launched. The server uses accept to listen for connection requests, and when a connection is made, it obtains a channel endpoint of type FileRead_s.
The remaining methods of FileServer are mutually recursive in a pattern that matches the structure of FileRead_s. The methods are self-called, and do not appear in the class session type; instead, they are annotated with pre- and post-conditions on the types of the fields channel and file. The direct correspondence between the structure of the channel session type and the class session type of File is again reflected in the code, for example on lines 29 and 30 where the result of calling hasNext on file directly answers the HASNEXT query on channel.
Most systems of session types support delegation, which is the ability to send a channel as a message on another channel. It is indicated by the occurrence of a session type as the type of the message in a send (!) or receive (?) constructor. In our language, delegation is realised by sending an object representing a channel endpoint; it corresponds to a send method with a parameter of type, for example, Chan[FileRead_cl]. Transfer of channel endpoints from one process to another is supported by the operational semantics in Section 6.
Figure 20, which does not match the class session type FileRead. The difference is that there is no HASNEXT option; instead, the READ option is always available. If there is no more data then EOF is returned in response to READ; alternatively, DATA is returned, followed by the desired data. The corresponding class session types for the client and server endpoints are defined in Figure 21.
The implementation of RemoteFile must now mediate between the different structures of the class session type FileRead and the channel session type FileChannel. The new definition is in Figure 22. The main point is that the definition of the close method must depend on the state of the channel. For example, if close is called immediately after a call of hasNext that returns TRUE, then the channel session type requires data to be read before CLOSE can be sent. We therefore introduce the field state, which stores a value of the enumerated type {EOF, READ, DATA}. This field represents the state of the channel (equivalently, the session type of the channel field): EOF corresponds to ClientCh, READ corresponds to CanRead, and DATA corresponds to the point after the DATA label in CanRead. The definition of close contains a switch on state, with appropriate behaviour for each possible value. It is also possible for state to be null, but this only occurs before open has been called, and at this point close is not available.
[Figure 23: Remote file server version 2: server code]
In order to type check this example we take advantage of the fact that the body of the close method is repeatedly checked, according to its occurrence in the class session type. The value of state always corresponds to the state of channel. This correspondence is not represented in the type system (that would require some form of dependent type), but whenever the body of close is type checked, the type of channel is compatible with the value of state, and so typechecking succeeds. More precisely, each possible value of state corresponds to a different singleton type for state (typing rule T-Label), and rule T-Switch only checks the branches that correspond to possible values in the enumerated type of the condition. So each time the body of close is type checked, only one branch (because the type of state is a singleton) of the switch is checked, corresponding to the value of state for that occurrence of close.
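The role of the state field can be sketched in Python. This is not the code of Figure 22; it is a hypothetical reconstruction from the prose and from the FileChannel protocol, with `FakeChannel` as an assumed stand-in for a channel endpoint and the final state update as an assumption. Python checks nothing statically, so the sketch only shows the run-time behaviour that each branch of the switch would have.

```python
class FakeChannel:
    """Assumed stand-in for a channel endpoint: records sends, replays scripted receives."""

    def __init__(self, incoming=()):
        self.incoming = list(incoming)
        self.sent = []

    def send(self, message):
        self.sent.append(message)

    def receive(self):
        return self.incoming.pop(0)


class RemoteFile:
    """Hypothetical client stub; `state` mirrors the protocol state of `channel`."""

    def __init__(self, channel, state):
        self.channel = channel
        self.state = state  # one of "EOF", "READ", "DATA"

    def close(self):
        if self.state == "DATA":
            self.channel.receive()       # pending data must be consumed first
            self.channel.send("CLOSE")
        elif self.state == "READ":
            self.channel.send("CLOSE")   # CLOSE is directly available
        elif self.state == "EOF":
            pass                         # channel is already back at its initial type
        else:
            raise RuntimeError("close is not available in this state")
        self.state = "EOF"               # assumption: closing returns to the initial state


pending = FakeChannel(incoming=["chunk"])
RemoteFile(pending, "DATA").close()

idle = FakeChannel()
RemoteFile(idle, "EOF").close()
```

In the typed language, each occurrence of close in the class session type gives state a singleton type, so only the matching branch above would be checked for that occurrence.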

A Core Distributed Language
We now define the core of the distributed language illustrated in Section 5. For simplicity, communication is synchronous. Formalising asynchronous communication is well understood (for example, Gay and Vasconcelos [39] define a functional language with similar communication primitives but add the complication of message buffers to the operational semantics). Our integration of channel session types and the typestate system of this paper is based on binary session types [70] (actually, we adopt the now standard constructs of session types). It should be straightforward to adapt the technique to multi-party session types [42], because that system also depends on specifying the sequence of send and receive operations on each channel endpoint.
The only additions to the top-level language are access points and their types T , channel session types and their translation to class session types, and the spawn primitive. However, there are significant changes to the internal language, in order to introduce a layer of concurrently executing components that communicate on channels.
6.1. Syntax. Figure 24 defines the new syntax. The types of access points are top-level declarations. Of the new values, access points n can appear in top-level programs, but channel endpoints, c + and c − , are part of the internal language. If an access point n is declared with access Σ n then we define n.protocol to mean Σ. The spawn primitive was not used in the example in Section 5, but its behaviour is to start a new thread executing the specified method on a new instance of the specified class (as in Java; other works, e.g. [27], use similar approaches). Although a parameter is required as in any method call, for simplicity the type system restricts the parameter's type to be Null in this case, so that there is only one form of inter-thread communication. The syntax of channel session types Σ is included so that the types of access points can be declared. Channels are created by the interaction of methods request and accept in different threads: one thread keeps the c − endpoint whereas the other keeps c + . The two threads then communicate on channel c by reading and writing on their channel ends. The syntax of states is extended to include parallel composition and a channel binder νc, which binds both endpoints c +
[Figure 25: structural congruence and reduction rules]
[39]. In a parallel composition, the states are exactly states from the semantics of the sequential language; in particular, each one has its own heap. This means that spawn generates a new heap as well as a new executing method body. Communication between parallel expressions is only via channels.
The syntax extensions do not include request, accept, send and receive, as they are treated as method names.
6.2. Semantics. Figure 25 defines the reduction rules for the distributed language, as well as the top-level typing rules. The reduction rules make use of a pi-calculus style structural congruence relation, again following Gay and Vasconcelos [39]. It is the smallest congruence (with respect to parallel and binding) that is also closed under the given rules.
Rule R-Init defines interaction between accept and request, which creates a fresh channel c and substitutes one endpoint into each expression.
There are two rules for communication, involving interaction between send and receive. Rule R-ComBase is for communication of non-objects and rule R-ComObj is for communication of objects. Let O be the set of all object identifiers. R-ComBase expresses a straightforward transfer of a value, while R-ComObj also transfers part of the heap corresponding to the contents of a transferred object. In R-ComObj, ϕ is an arbitrary renaming function which associates to every identifier in dom(h) an identifier not in dom(h ′ ). This rule can easily be made deterministic in practice by using a total ordering on identifiers and a mechanism to generate fresh ones.
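The heap transfer performed by R-ComObj can be sketched as follows. This is a simplified model under explicit assumptions: heaps are dictionaries from integer identifiers to field records, any integer field value is read as an object reference, and the functions `reachable` and `transfer` are hypothetical names. The renaming ϕ is made deterministic by allocating identifiers above the largest one in the receiving heap.

```python
def reachable(heap, root):
    """Identifiers of the objects reachable from `root` through fields."""
    seen, stack = set(), [root]
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        for value in heap[oid].values():
            if isinstance(value, int):   # assumption: integers are object references
                stack.append(value)
    return seen


def transfer(sender, receiver, root):
    """Move the sub-heap rooted at `root` from `sender` to `receiver`.

    Returns the identifier of the root in the receiving heap.
    """
    moved = reachable(sender, root)
    fresh = max(receiver, default=0) + 1
    phi = {}                             # the renaming: old identifier -> fresh identifier
    for oid in sorted(moved):
        phi[oid] = fresh
        fresh += 1
    for oid in moved:
        fields = sender.pop(oid)
        receiver[phi[oid]] = {
            name: phi[value] if isinstance(value, int) else value
            for name, value in fields.items()
        }
    return phi[root]


sender = {1: {"next": 2, "data": "a"}, 2: {"next": None, "data": "b"}, 3: {"data": "c"}}
receiver = {1: {"data": "x"}}
new_root = transfer(sender, receiver, 1)
```

After the call, objects 1 and 2 have left the sender's heap and reappear in the receiver's heap under fresh identifiers, with the internal reference renamed consistently.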
R-Spawn creates a new parallel state whose heap contains a single instance of the specified class. As discussed above, communication between threads is only through channels in order to keep the formal system a reasonable size; therefore, no data is transmitted to the new thread and the body of the method being spawned always has its parameter replaced by the literal null. The type system will ensure that v = null, so that this semantics makes sense. The remaining rules are standard.
Returning to R-ComObj, there is some additional notation associated with identifying the part of the heap that must be transferred; we now define it.
6.3. Type System. The type system treats send, receive, request and accept as method calls on objects whose session types are defined by the translations in Figure 26. A channel endpoint with (channel) session type Σ is treated as an object with (class) session type Σ . The type constructor & (offer) is translated into a receive method with return type variant-tag in order to capture the relationship between the received label and the subsequent type. The type constructor ⊕ (select) is translated into a collection of send methods with different parameter types, each being a singleton type for the corresponding label. In a similar but much simpler way, an access type Σ is translated into a (class) session type that allows both request and accept to be called repeatedly and at any time. These two methods need to return dual channel endpoints, which requires the following definition. In the type system, an access point with type Σ is treated as an object with type Σ . Calls of request and accept are typed as standard method calls.
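The shape of this translation can be illustrated with a short Python sketch. It is not the definition of Figure 26: the tuple encoding of channel session types, the representation of method signatures as (return type, method name, parameter type) triples, the Null parameter of receive, and the function name `translate` are all assumptions made for the illustration. Named types such as CanRead are left as references.

```python
def translate(sigma):
    """Translate a channel session type into a class-session-type-like structure."""
    if isinstance(sigma, str):           # reference to a named (possibly recursive) type
        return sigma
    tag = sigma[0]
    if tag == "end":
        return {}                        # no method available
    if tag == "send":                    # !T.S : one send method taking a T
        _, payload, cont = sigma
        return {("Null", "send", payload): translate(cont)}
    if tag == "recv":                    # ?T.S : one receive method returning a T
        _, payload, cont = sigma
        return {(payload, "receive", "Null"): translate(cont)}
    if tag == "select":                  # one overloaded send per label (singleton type)
        return {("Null", "send", f"{{{label}}}"): translate(cont)
                for label, cont in sigma[1].items()}
    if tag == "offer":                   # receive returns a tag selecting the continuation
        variant = {label: translate(cont) for label, cont in sigma[1].items()}
        return {("variant-tag", "receive", "Null"): ("variant", variant)}
    raise ValueError(f"unknown constructor: {tag}")


# Server view of the first line of the FileChannel protocol from Section 5.
file_channel = ("offer", {
    "OPEN": ("recv", "String", ("select", {"OK": "CanRead", "ERROR": "FileChannel"})),
    "QUIT": ("end",),
})
class_view = translate(file_channel)
```

The result offers a single receive method whose returned label determines which methods come next, while the select is turned into two send methods distinguished by their parameter types.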

Figure 26: Object types for channels and access points
These rules add to or replace the rules in Figure 12. where ι is the identity substitution, σ{ Σ ′ / X } denotes the extension of substitution σ by the mapping X → Σ ′ , and Σσ denotes the application of the substitution σ to the session type Σ; the auxiliary function dual (Σ, σ) is defined on session types Σ and substitutions σ by: With this definition, duality commutes with unfolding [6]; this property is essential in order to use the equi-recursive convention (Definition 3.9). By convention, request returns a channel endpoint of type Σ and accept returns an endpoint of type Σ .
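A substitution-carrying duality function of this kind can be sketched in Python. The sketch is one possible reading of the description above, not the paper's definition: the tuple encoding, the names `subst` and `dual`, and the choice to apply the current substitution to message types only are assumptions, and bound variable names are assumed to be distinct. The interesting case is a recursive type whose variable occurs as a message type, where swapping constructors naively would not commute with unfolding.

```python
FLIP = {"send": "recv", "recv": "send", "select": "offer", "offer": "select"}


def subst(t, sigma):
    """Apply substitution `sigma` (variable name -> type) to type `t`."""
    if isinstance(t, str):               # base type such as "String"
        return t
    tag = t[0]
    if tag == "end":
        return t
    if tag == "var":
        return sigma.get(t[1], t)
    if tag == "rec":                     # the bound variable shadows any outer mapping
        inner = {x: s for x, s in sigma.items() if x != t[1]}
        return ("rec", t[1], subst(t[2], inner))
    if tag in ("send", "recv"):
        return (tag, subst(t[1], sigma), subst(t[2], sigma))
    return (tag, {label: subst(c, sigma) for label, c in t[1].items()})


def dual(t, sigma=None):
    """Dual of session type `t`; `sigma` records enclosing recursive types."""
    sigma = {} if sigma is None else sigma
    tag = t[0]
    if tag in ("end", "var"):
        return t
    if tag == "rec":                     # extend sigma with X -> (rec X. body)
        return ("rec", t[1], dual(t[2], {**sigma, t[1]: subst(t, sigma)}))
    if tag in ("send", "recv"):          # message types are substituted, not dualised
        return (FLIP[tag], subst(t[1], sigma), dual(t[2], sigma))
    return (FLIP[tag], {label: dual(c, sigma) for label, c in t[1].items()})


def unfold(t):
    """One unfolding of a recursive type rec X. body."""
    return subst(t[2], {t[1]: t})


loop = ("rec", "X", ("recv", ("var", "X"), ("end",)))   # rec X. ?X.end
```

For `loop`, the dual sends a value of the original type rather than of the variable X, which is what makes `dual(unfold(loop))` and `unfold(dual(loop))` coincide in this sketch.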
Because access points n are global constants, they can be used repeatedly even though their session types are linear; there is no restriction to a single occurrence of a given name.
The only new typing rules for the top-level language are in Figure 25. T-Spawn allows a method to be used in a spawn expression if it is available in the initial session type of the specified class. T-Name obtains the type of an access point from its declaration, and assigns an object type according to the translation described above. Figure 27 contains typing rules for the internal language with the concurrency extensions. Rules with the same names as rules in Figure 12 are replacements.
Rule T-Chan takes the type of a channel endpoint from the typing environment. The remaining rules involve a new typing environment Θ, which maps channel endpoints to channel session types Σ; these are indeed channel session types, not their translations into class session types. T-HAdd, T-Hide and T-State are just the corresponding rules from Figure 12 with Θ added. In T-HEmpty the notation Θ means that the translation from channel session types to class session types is applied to the type of each channel endpoint. In combination with T-State, this means that the typing of expressions uses class session types for channel endpoints; the T in T-Chan is a class session type.
T-Thread lifts a typed state to a typed concurrent component, preserving only the channel typing Θ, which is used in T-Par and T-NewChan. In T-Par, Θ + Θ ′ means union, with the assumption that Θ and Θ ′ have disjoint domains. T-NewChan requires the complementary endpoints of each channel to have dual session types.
6.4. Subtyping. We have two subtyping relations between channel session types: Σ <: Σ ′ as defined by Gay and Hole [37], and Σ <: Σ ′ as defined in this paper. To avoid a detour into the definition of Σ <: Σ ′ , we state the following result without proof.
[37], of generalizing the subtyping relation between channel session types by considering branch/select labels as values in an enumerated type. We do not explore this idea further in the present paper.

Results
The key results concerning the distributed language supporting self-calls are, again, Subject-Reduction, Type Safety, and Conformance. Notice that we can no longer guarantee the absence of stuck states for all well-typed programs, as one endpoint of a channel may try to send when the other endpoint is not available to receive.
7.1. Properties of typing derivations. This subsection is mostly a collection of lemmas which will be used to prove the main theorems in the following subsections. They draw various useful consequences from the fact that a program state is well-typed. Their proofs can be found in Appendix 10.
We define chans(h) as the set of channel endpoints appearing in object records in h. We define chans(Γ) and objs(Γ) as the sets of, respectively, channel endpoints and object identifiers in dom(Γ). We have dom(Γ) = chans(Γ) ∪ objs(Γ).
These two lemmas show, if we apply them repeatedly, that a typing derivation for a heap can be considered as a set of separate typing derivations leading to each root of the heap. This will allow us in particular to show results for particular cases where a heap has only one root and generalize them.
(1) if v is an object identifier or a channel endpoint, then:
if v is not an object or channel and T is not a link type, then:
for some E such that $l_0 \in E$ and some set of branch session types $S_l$. Note that this implies $f = f'$.
(1) if $T'$ is a base type (i.e. neither an object type nor a link) and v is a literal value of that type, or if v is an access point name declared with type Σ and Σ <: $T'$, we have:
(2) if $T'$ is an object type and v is an object identifier or a channel endpoint, we have:
This global result is a consequence of a subject reduction theorem for a single thread, which is similar but not identical to what we stated as Theorem 3.19 (which will be a particular case). The reason it is not identical is that we need to prove that the type of an expression is preserved not only when this expression reduces on its own but also when it communicates with another thread. In order to state precisely this thread-wise type preservation theorem, we introduce a labelled transition system for threads. Transition labels can be:
• τ, indicating internal reduction;
• $c^p![v]$ or $c^p?[v]$, indicating that the non-object value v is sent or received on channel $c^p$;
• $c^p![h]$ or $c^p?[h]$, where h is a heap with a single root o, indicating that the object o (together with its content) is sent or received on channel $c^p$;
• $n[c^p]$, indicating that the channel endpoint $c^p$ is received from access point n;
• C.m(), indicating that the thread spawns another one using method m of class C.
Definition 7.14 (Labelled transition system). We define a labelled transition system for threads by the following rules:
Note that both τ and C.m() correspond to the thread being able to reduce on its own. An important feature of this transition relation is that, for all rules, the right-hand state is fully determined by the left-hand one and the transition label. Moreover, the only case where several different transitions are possible from a given state is when applying the rule receive, as the right-hand side depends on the value received.
Definition 7.15. A similar transition relation, with the same set of labels, is defined on channel environments Θ as follows:
We can now state our thread-wise type preservation theorem.
Theorem 7.16 (Thread-wise progress and type preservation). Let D be a set of well-typed declarations, that is, such that for every class declaration D in D we have ⊢ D. In a context parameterised by D, suppose we have Θ; Γ ⊲ (h * r , e) : T ⊳ Γ ′ * r ′ .
Then either e is a value or there exists a transition label λ such that we have (h * r , e) λ −→ (h ′ * r ′′ , e ′ ) for some h ′ , r ′′ and e ′ .
(Theorem). We always use typing derivations where subsumption steps only occur at the positions described in Lemma 7.2. Furthermore, it is sufficient to consider only cases where subsumption does not occur at the end: indeed, if it does occur, then we can add a similar subsumption step to the new judgement. The hypothesis in the theorem that Θ; Γ ⊲ (h * r , e) : T ⊳ Γ ′ * r ′ holds is necessarily a result of T-State and therefore is equivalent to the two hypotheses Θ ⊢ h : Γ and Γ * r ⊲ e : T ⊳ Γ ′ * r ′ , which we will sometimes refer to directly.
We prove the theorem by induction on the structure of e with respect to contexts, and present the inductive case first: If e is of the form E[e 1 ] where e 1 is not a value and E is not just [_], then Lemma 7.11 tells us that Γ * r ⊲ e 1 : U ⊳ Γ 1 * r 1 appears in the typing derivation of Γ * r ⊲ e : T ⊳ Γ ′ * r ′ for some U , r 1 and Γ 1 . From there we can apply T-State and derive Θ; Γ ⊲ (h * r , e 1 ) : U ⊳ Γ 1 * r 1 . This allows us to use the induction hypothesis and get λ, e 2 , r ′′ and h ′ such that (h * r , e 1 ) • If e is a value, there is nothing to prove.
• e cannot be a variable. Indeed, Θ ⊢ h : Γ implies that dom(Γ) contains only object identifiers and channel endpoints. Therefore, Γ * r ⊲ e : T ⊳ Γ ′ * r ′ cannot be a conclusion of T-Var or T-LinVar, thus e is not a variable. • e = v; e ′ . Then the expression reduces by R-Seq and the initial derivation is as follows: • e = new C(). Then the expression reduces by R-New and the initial reduction is as follows: Let S = C.session. From the hypothesis that D is well-typed, we have ⊢ class C {S; f ; M }. This must come from T-Class, therefore we have − − → Null f ⊢ C : S (b). We build the following derivation: • e = switch (v) {l : e l } l∈E . Then we have two cases. The slightly more complex one is if the initial derivation is as follows: v is a label • e = f ↔ v. Then the initial derivation is as follows: and we also have, as usual, Θ ⊢ h : Γ (a). The fact that Γ 1 (r.f ) is defined implies that Γ(r.f ) is also defined, indeed the effect of typing v can only remove from the environment or create a variant type, so it can only decrease the set of valid field references. Thus h(r).f is defined as well, and the expression reduces by R-Swap.
Let v = h(r).f . From (a), (b), (c) and (d), we use Lemma 7.9 to get Γ ′′ such that Θ ⊢ h{r.f → v ′ } : Γ ′′ . We then notice that in each of the three cases of the lemma we have Γ ′′ * r ⊲ v : T ⊳ Γ 1 {r.f → T ′ } * r: • e = return v. Then the expression reduces by R-Return. The initial derivation is as follows: then implies that S is of the form l : S l l∈E with v ∈ E and F ′ ⊢ C : S v . Note that because F ′ is not a variant, S v must be a branch. Now, from that judgement and (a), we use the closing lemma to get Θ ⊢ h : Since Γ only differs from Γ 1 by the type of r.f , it is also the case of Γ ′′ , and as v : From all this, we build the following derivation: The initial derivation involves T-Spawn, and v is null. The premise that the method exists implies that the state can reduce by R-Spawn, which corresponds to a C.m() transition. The new derivation is obtained replacing T-Spawn with T-Null.
• e = f.m(v). The initial derivation is as follows, with m = m j and j ∈ I: j is a part of a method signature and that only a restricted set of types is allowed there: it cannot be of the form link f ′ . Furthermore, (1) cannot be T-VarF because of (b), thus T ′ is not variant-tag either. Indeed, if Γ 1 (r) were a variant, Γ 1 (r.f ) would not be defined. Therefore (1) is either T-Null, T-Label, T-Chan, T-Name or T-Ref and in all cases we have Γ(r.f ) = Γ 1 (r.f ). As it is a session type, it implies because of (a) that h(r).f exists and is either an object identifier, an access point name or a channel endpoint. We distinguish these three cases: h(r).f is an object identifier o. We use (a) and the opening lemma (Lemma 7.7) to get a field typing C[F ] such that Θ ⊢ h : Γ{r.f → C[F ]} and F ⊢ C : S. This last judgement implies, by definition, that F is not a variant; that, among others, method m j appears in the declaration of class C; and that, if e j is its body and x its parameter, we have x : T ′ j , this : C[F ] * this ⊲ e j : T j ⊳ this : C[F j ] * this and F j ⊢ C : S j . The fact that the method is declared implies (h * r , e) −→ (h * r.f , return e j { v / x }); we now have to type this resulting state. For this, we apply the substitution lemma (Lemma 7.10) to the typing judgement for e j , using Γ 1 {r.f → C[F ]} as the Γ of the lemma and r.f as the r of the lemma. The first case of the lemma corresponds to (1) being T-Null, T-Label or T-Name; the second one corresponds to (1) being T-Ref or T-Chan. In both cases, the resulting judgement is: Indeed, the difference between Γ and Γ 1 depends on (1) in the same way as the lemma's result. From this and F j ⊢ C : S j we can now apply T-Return and get: where T is the same as in the initial derivation. We then conclude, using the heap typing that was provided by the opening lemma, with T-State.
h(r).f is an access point name n. Then Γ(r.f ) must come, in the derivation of Θ ⊢ h : Γ, from T-Name, which implies that n is declared, that m j is either accept or request, and that T j :> Σ where Σ is either the declared type or its dual depending on which one m j is. All this implies that the state does a n[c p ] transition where c is fresh and p depends, again, on m j , and that −→ Θ, c p : Σ. The resulting state is typed using Γ ′′ = Γ, c p : Σ and T-Chan.
h(r).f is a channel endpoint c p . Then Θ ⊢ h : Γ implies that c p ∈ dom(Θ) and S :> Θ(c p ) . Hence m j is either send or receive. We distinguish the two cases.
In the first case, the fact that S contains send implies that Θ(c p ) is either of the form ! T ′′ j .Σ with T ′ j <: T ′′ j or ⊕ {l : Σ l } l∈E and then T ′ j = {v} and v ∈ E. If v is not an object identifier, then the state does a c p ! [v] transition. We can see that in both cases (send and select), Θ is able to follow that transition and evolves in such a way that Θ ′ ⊢ h : Γ ′ holds: the session type of c p is advanced and if v was a channel it is removed from the environment, which corresponds to the difference between Γ and Γ ′ , thus it suffices to change the instance of T-Hempty at the root of the derivation leading to (a) to get this new typing. Then the new state is typed using T-Null and T-State. If v is an object identifier, then (1) is T-Ref and thus v ∈ dom(Γ), which implies (using (a)) that v is a root of h, so the state does a c p ! [h ↓ v] transition. We use the splitting lemma (Lemma 7.4) to see that Θ is able to follow this transition and yields a Θ ′ such that we have Θ ′ ⊢ h ↑ v : Γ ′ . We can then again conclude using T-Null and T-State.
In the case where m j is receive, the state can straightforwardly do a transition, which will be a receive on channel c p ; however, the transition label is not completely determined by the original state, as we do not know what will be received. So we have to prove type preservation in all cases where the transition label λ is such that Θ λ −→ Θ ′ for some Θ ′ . If λ is of the form c p ? [v ′ ], then this hypothesis tells us that Θ(c p ) is either of the form ? [T 0 ] .Σ, and then v ′ must be a literal value of type T 0 or a channel endpoint which gets added to the environment with a type smaller than T 0 , or of the form & {l : Σ l } l∈E , and then v ′ ∈ E. In the first case we must have T 0 <: T j , thus the resulting expression, which is v ′ , can be typed using the appropriate literal value rule, or T-Chan, and subsumption. In the second one, T j = variant-tag so that T = link f ; the resulting expression can be typed using T-Label and T-VarS. As for the new initial environment, it is obtained, as in the case of send, by replacing the instance of T-Hempty at the top of the derivation for (a) with one using Θ ′ instead of Θ, so that v ′ gets added to the initial environment if it is a channel and that the session type of r.f is correctly advanced, meaning, in the case of a branch, that it is advanced to the particular session corresponding to v ′ , the variant type being reconstituted in the final environment by T-VarS. Finally, if λ is of the form c p ? [h ′ ], then we have Θ ′ = Θ + Θ ′′ with Θ ′′ ⊢ h ′ : o : T j , where o is the only root of h ′ . The merging lemma (Lemma 7.5) gives us a typing for the new heap and, as in the other cases, advancing the session type of c p yields a session type change in r.f , corresponding to the difference between Γ and Γ ′ . We conclude using T-Ref and T-State.
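As a rough illustration of how the channel environment follows a send or receive transition in the cases above, the following Python sketch advances the session type of a single endpoint. The class names (End, Send, Recv, Select, Branch) and the function advance are hypothetical and not part of the formal development; payload subtyping checks and the transmission of heaps are deliberately omitted.

```python
from dataclasses import dataclass


@dataclass
class End:
    """Terminated channel session type."""


@dataclass
class Send:
    """![T].Sigma : send a value of type T, then continue as cont."""
    payload: str
    cont: object


@dataclass
class Recv:
    """?[T].Sigma : receive a value of type T, then continue as cont."""
    payload: str
    cont: object


@dataclass
class Select:
    """Internal choice: the endpoint sends one of the labels in options."""
    options: dict


@dataclass
class Branch:
    """External choice: the endpoint receives one of the labels in options."""
    options: dict


def advance(sigma, direction, value=None):
    """Return the continuation of sigma after one transition.

    direction is "!" for an output and "?" for an input; value is the
    transmitted label when sigma is a choice. Raises TypeError when the
    type cannot follow the transition (payload types are not checked here).
    """
    if direction == "!":
        if isinstance(sigma, Send):
            return sigma.cont
        if isinstance(sigma, Select) and value in sigma.options:
            return sigma.options[value]
    elif direction == "?":
        if isinstance(sigma, Recv):
            return sigma.cont
        if isinstance(sigma, Branch) and value in sigma.options:
            return sigma.options[value]
    raise TypeError(f"{sigma!r} cannot follow transition {direction}[{value!r}]")


# Hypothetical endpoint type: send an Int, then receive OK or RETRY.
sigma = Send("Int", Branch({"OK": End(), "RETRY": End()}))
after_send = advance(sigma, "!", 42)
after_branch = advance(after_send, "?", "OK")
```

In this simplified model, the receive case shows why preservation must be proved for every admissible label: the continuation after a branch depends on the label that actually arrives.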
The following two lemmas will allow us to deduce from this theorem the proof of subject reduction for configurations.
and (h * r , e) Proof. This is nothing more than a reformulation of the reduction rules in terms of labelled transitions: the derivation for s −→ s ′ can contain any number of instances of R-Par, R-Str or R-NewChan but must have one of the other rules at the top. It is straightforward to see that depending on that top rule we are in one of the five cases listed: (1) for any of the single-thread rules in Figure 7, (2) for R-Init, (3) for R-ComBase, (4) for R-ComObj, and (5) for R-Spawn.
We can now prove Theorem 7.13.
Proof (Theorem 7.13). Because of Lemma 7.18, we only need to look at the different cases described in Lemma 7.19.
In cases (1) and (5), the initial derivation is as follows: In case (1), Theorem 7.16 gives us Θ 1 ; Γ ′′ ⊲ (h ′ * r ′ , e ′ ) : T ⊳ Γ ′ * r ′′ ; from there the final derivation is the same. In case (5), the theorem gives us the same result, but the final derivation is more complicated as there is one more parallel component. The C.m() transition tells us that e must be of the form E[spawn C.m(v)]. From Lemma 7.11, this implies that the subexpression spawn C.m(v) is typable, which must be a consequence of T-Spawn, implying that m appears in the initial session type S of C with a Null argument type. As, by hypothesis, the declaration of class C is well-typed, this implies (from T-Class) x : Null, this : C[ − − → Null f ] * this ⊲ e ′′ : T ⊳ this : C[F ] * this. We apply the substitution lemma (Lemma 7.10) to this judgement to replace this with o and x with null, and we build the heap typing from T-Hempty and T-Hadd. This gives a typing for the new thread, with an empty Θ, using T-State and T-Thread, and we can conclude with T-Par.
In cases (2), (3), and (4), the initial derivation is: Furthermore, we can deduce from the transition labels that the expressions in the two topmost premises are of the form with h 1 (r 1 ).f 1 and h 2 (r 2 ).f 2 being, in case (2), n, and in cases (3) and (4), respectively c p and c p . These two topmost premises must come from T-State, which implies Θ 1 ⊢ h 1 : Γ 1 and Θ 2 ⊢ h 2 : Γ 2 , from which we deduce, in case (2), that n is a declared access point name and in cases (3) and (4) that Θ 1 (c p ) <: Γ 1 (r 1 .f 1 ) and Θ 2 (c p ) <: Γ 2 (r 2 .f 2 ). We use Theorem 7.16 on these two topmost premises and distinguish cases. In case (2), Θ 1 and Θ 2 make transitions which introduce two dual types for d + and d − , which are fresh so that the disjoint unions are still possible, and we just need to add an additional step of T-NewChan before the last one.
In cases (3) and (4), we first remark that because T-NewChan in the derivation leads to an empty environment, c must be one of the channels in (ν c) and we must have Θ 1 (c p ) = Σ and Θ 2 (c p ) = Σ for some Σ. Then we use Lemma 7.11 to get a typing judgement for the method call subexpression on the sending side (thread 1). This judgement has Γ 1 as an initial typing environment and comes from T-Call; as we have Σ <: (3)) or o (in case (4)) of type T , or (only in case (3)) of the form ⊕ {l : Σ l } l∈E with v ∈ E. The simplest case is (3): then this typing information, together with the duality of the two endpoint types, shows that Θ 2 follows the transition with the new type of c p still dual to the new type of c p . In the case where v is a channel endpoint, its typing goes from Θ 1 to Θ 2 but stays the same, so that it is unchanged in the sum environment yielded by T-Par. Thus we can still apply T-NewChan.
Case (4) is similar but, additionally, a renaming function is applied to the transmitted heap. We use Lemma 7.6 to see that the type of its only root, which is all we need, stays the same, so that again Θ 2 can follow the transition. We also have that a whole part of the channel environment can go from Θ 1 to Θ 2 but the effect is the same as with just one channel: it does not affect the sum environment resulting from T-Par. So again we can still apply T-NewChan.

7.3. Type safety. We now have the following safety result, ensuring not only race-freedom (no two sends or receives in parallel on the same endpoint of a channel) but also that the communication is successful.
As the statement is true in particular when s ′ is empty, it implies that communication between the two threads is possible.
Proof. This is an essentially straightforward consequence of ⊢ s. The typing derivation is similar to the one shown for cases 2/3/4 in Theorem 7.13 above; the two top premises must be consequences of T-State and the heap typing necessary to apply this rule implies, respectively, Γ 1 (r.f ) :> Θ 1 (c p ) and Γ 2 (r ′ .f ′ ) :> Θ 2 (c q ) . Because of the disjoint unions in T-Par, c p ∈ dom(Θ 1 ) and c q ∈ dom(Θ 2 ) immediately imply (1) and (2); (3) is then a consequence of the duality constraint imposed by T-NewChan: looking at the translations of dual channel types, and because the method call subexpressions must be typed by T-Call, if m is send then m ′ must be receive and vice-versa.
This theorem, together with the progress aspect of Theorem 7.16, restricts the set of blocked configurations to the following: if ⊢ s and s cannot reduce (s ↛), then all parallel components in s are either terminated (reduced to values), unmatched accepts or requests, or method calls on pairwise distinct channels; this last case corresponds to a deadlock.

7.4. Conformance. We now have the technical material necessary to prove Theorem 3.32 (conformance). Note that we do not formally extend this result to the distributed setting, as stating a similar property in that case would require more complex definitions describing, among other things, how call traces are moved around between threads. However, we can see informally that, because objects keep their content and session type when transmitted, all necessary information is kept, so that a conformance property still holds.
Proof. We first prove, by strong induction on n, a slightly different result, namely the following: We suppose that this property is true for any reduction sequence of length n or less whose initial state satisfies the hypotheses and prove that it is true also for length n + 1. The base case n = 1 is trivial.
If the nth reduction step (h n * r n , e n ) −→ (h n+1 * r n+1 , e n+1 ) does not originate from R-Return, we use the induction hypothesis on the beginning of the sequence; we refer to the cases in the proof of Theorem 7.16 to show that the Γ n+1 it allows to construct from Γ n indeed is consistent with tr n+1 . Because we are only interested in Γ n+1 and not Γ ′ , in most cases we can use Lemmas 7.11 and 7.12 to ignore any context E and proceed as if the reduction is exactly an instance of its original rule.
If the rule is R-Seq, R-Switch or R-Swap then tr n+1 = tr n .
If the rule is R-Seq or R-Switch then the proof of Theorem 7.16 shows that we can choose Γ n+1 = Γ n , so there is nothing more to prove.
If the rule is R-Swap then the proof of Theorem 7.16 indicates that Γ n+1 (called Γ ′′ in subject reduction) can be defined using Lemma 7.9 from the Γ ′′′ (called Γ 1 in subject reduction) obtained after typing v ′ , the value that gets swapped into the field. First of all note that most objects, notably all those which are not v ′ and not in a field of r, have the same type and position in the heap in Γ n+1 as they have in Γ. For all of them the result is straightforward: we only concentrate on those objects that move or change type. Depending on the nature of T and T ′ (object, link, or base type), there may be one or two of them. Recall that neither type can be variant-tag, as otherwise the expression would not be typable. We distinguish cases separately for T and T ′ , knowing that any combination is possible (except both linking to the same field). Cases for T ′ : Thus consistency is preserved for r.f ′ .
Cases for T (corresponding respectively to cases 1 and 3 of Lemma 7.9): • If T is an object type (thus h n (r.f ) is an object name o), then Γ n+1 contains a new entry for o, with type Γ n (r.f ). Consistency for this new entry comes from consistency for r.f at the previous step. • If T is link f ′′ , then Γ n (r.f ′ ) = l : S l l∈E and h n (r.f ) = l 0 is in E. Thus the actual session type of r.f ′′ in h n according to Γ n is S l 0 . Lemma 7.9 also gives us Γ n+1 (r.f ′ ) = S l 0 , hence the actual session type of r.f ′′ has not changed, and consistency is preserved.
If the rule is R-New then the proof of Theorem 7.16 shows that a suitable Γ n+1 is of the form Γ n , o : C.session where o is the fresh object name introduced by the reduction. Definition 3.27 states that tr n+1 extends tr n by assigning an empty call trace to o; clearly tr n+1 is consistent with Γ n+1 .
If the rule is R-Call then the proof of Theorem 7.16 shows that a suitable Γ n+1 is Γ n with the type of r.f replaced by a type which is not a session. So there is no consistency requirement in Γ n+1 for r.f , and every other reference is given the same call trace by tr n+1 as by tr n . Therefore tr n+1 is consistent with Γ n+1 . Now if the nth step originates from R-Return, we reason slightly differently. We know by hypothesis that r 1 is a prefix of r n+1 . Furthermore, since the nth step is R-Return, r n is of the form r n+1 .f . Reduction rules can only alter the current object by removing or adding one single field reference at once, therefore there must be a previous reduction step in the sequence, say the ith, that last went from r n+1 to r n+1 .f . That is, we chose i such that r n+1 .f is a prefix of all r j for j between i + 1 and n and that r i = r n+1 . That step must originate from R-Call as it is the only rule which adds a field specification to the current object. Thus, it is of the form where e is the method body of m with the parameter substituted. Then it is straightforward to see that the whole reduction sequence from i + 1 to n consists of reductions of e inside the context E(return [_]).
We first use the induction hypothesis on the first part of the reduction (1 to i) so as to get judgments up to Γ i ⊲ (h i * r n+1 , E(f.m(v ′ ))) : T ⊳ Γ ′ * r ′ . We then use Lemma 7.11 to get Γ i ⊲ (h i * r n+1 , f.m(v ′ )) : T ′ ⊳ Γ ′′ * r ′′ and note that this judgment must come from T-Call, which implies that r ′′ = r n+1 , that Γ i (r n+1 .f ) is of the form {T ′ m(. . .) : S, . . .} and that Γ ′′ (r n+1 .f ) = S. Furthermore, T ′ is either a base type if S is a branch or link f if it is a variant. We know that tr i is consistent with Γ i , therefore we have class(o).session We now use the induction hypothesis again on the reduction sequence from i to n for this particular call subexpression, recalling that i has been defined such that the hypothesis on the current object is indeed satisfied by this sequence. We can also use Lemma 7.12 at each step in order to lift the judgements thus obtained to the whole expression. To summarise, this means that for any j between i + 1 and n we have: e j = E(return e ′ j ) for some e ′ j , Γ j ⊲ (h j * r j , return e ′ j ) : T ′ ⊳ Γ ′′ * r n+1 and Γ j ⊲ (h j * r j , e j ) : T ⊳ Γ ′ * r ′ , and that tr j is consistent with Γ j .
For the last reduction step, R-Return, the proof of Theorem 7.16 tells us that we can choose a Γ n+1 which is identical to Γ n except for the type of r n+1 .f , and as the call trace for other references is not modified, consistency is preserved for them. For r n+1 .f we have to look back at the initial subexpression on step i. First note that R-Swap can only act on a field of the current object, therefore since r n+1 .f is a prefix of the current object during the whole subsequence, its content cannot change and is the same object o throughout. Similarly, there is no other R-Call or R-Return acting on that particular object, hence tr n (o) = tr i+1 (o) = tr i (o)m. We saw above that this call trace leads the initial session of o to S. Then the judgement for the final subexpression, at step n + 1, is of the form Γ n+1 ⊲ (h n+1 * r n+1 , v) : T ′ ⊳ Γ ′′ * r n+1 . There are two cases, as in the proof of Theorem 7.16. If T ′ is a base type then S is a branch and it is possible to decide that Γ n+1 (r n+1 .f ) is equal to S. In that case the call trace either does not change or has a label appended, but as S is a branch it can do a transition to itself with any label, therefore tr n+1 (o) is consistent with Γ n+1 (r n+1 .f ) in both cases. If T ′ is link f , then v is a label, S is a variant l : S l l∈E and Γ n+1 (r n+1 .f ) can be chosen equal to S v . We have tr n+1 (o) = tr n (o)v and S v −→ S v , so consistency is preserved.
This completes the inductive proof that for every step i in the reduction sequence there is Γ i such that Γ i ⊲ (h i * r i , e i ) : T ⊳ Γ ′ * r ′ and tr i is consistent with Γ i . This fact obviously implies that tr i is valid for all the objects which have a session type in Γ i ; we now argue that it is also the case for the other objects, namely those which either are not at all in Γ i or do not have a session type. We know by hypothesis that it is the case for tr 1 and show by a very simple induction that it cannot change from i to i + 1. The ith step can only change the call trace for an object o if it originates from R-Call or R-Return concerning that object. R-Call can only occur if the reducible part of the expression is indeed a method call on a field which contains o, and that is only typable if Γ i contains a session type for that field which is a branch containing the method, and thus allows the appropriate transition: therefore validity of the call trace for o is preserved in that case. R-Return on the other hand can only occur if the reducible part of the expression is a return and if the current object is (the address of) o, and we saw that in that case the Γ i+1 constructed in our proof contains a session type for o, so this case is covered by the consistency result.
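The notion of a call trace being valid for a session type, as used in the argument above, can be pictured with a small Python sketch. It is only an approximation of the formal definitions: the classes Branch and Variant, the tagged-pair encoding of traces, and the file-like example protocol are all assumptions made for illustration. It follows the two observations used in the proof: a branch may take a transition to itself on any label, while a variant must be resolved by one of its own labels.

```python
class Branch:
    """Session type state offering methods; methods maps name -> next state."""

    def __init__(self, methods=None):
        self.methods = methods if methods is not None else {}


class Variant:
    """Session type state awaiting a label; options maps label -> next state."""

    def __init__(self, options):
        self.options = options


def trace_is_valid(session, trace):
    """Check a call trace, given as ("call", m) and ("label", l) pairs."""
    state = session
    for kind, name in trace:
        if isinstance(state, Branch):
            if kind == "call":
                if name not in state.methods:
                    return False
                state = state.methods[name]
            # A label observed in a branch state is a self-transition.
        else:
            # In a variant state only one of its labels may follow.
            if kind != "label" or name not in state.options:
                return False
            state = state.options[name]
    return True


# Hypothetical protocol: open() returns OK or ERROR; after OK one may
# read repeatedly and then close.
init = Branch()
opened = Branch()
init.methods["open"] = Variant({"OK": opened, "ERROR": init})
opened.methods["read"] = opened
opened.methods["close"] = Branch()

good = [("call", "open"), ("label", "OK"), ("call", "read"), ("call", "close")]
bad = [("call", "open"), ("call", "read")]
```

Here trace_is_valid(init, good) holds, whereas bad is rejected because the result of open has not been inspected before read is called.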
Figure 28: Typechecking: algorithms W and A.

Type Checking Algorithm
This section introduces a type checking algorithm, sound and complete with respect to the type system in Section 6, and describes a prototype implementation of a programming language based on the ideas of the paper.
8.1. The Algorithm. Figures 28 and 29 define a type checking algorithm for the distributed language, including the sequential extensions from Section 4. The algorithm is applied to each component of a distributed system, and in order to ensure type safety of the complete system there must be some separate mechanism to check that each access point n is given the same type everywhere. A program is type checked by calling algorithm W on each class definition and checking that no call generates an error. The definition of algorithm W follows the typing rule T-Class in Figure 15. It calls algorithm A to check the relation F ⊢ C : S and algorithm B to type check the bodies of the methods that have req/ens annotations. Algorithm A also calls algorithm B to typecheck the bodies of the methods that appear in the session type.
In both A and B there are several "if" and "where" clauses; they should be interpreted as conditions which, if not satisfied, cause termination with a typing error.
Because of the coinductive definition of F ⊢ C : S, algorithm A uses a set ∆ of assumed relationships between field typings F and session types S. If there is no error then the algorithm returns ∆, but at the top level we are only interested in success or failure, not in the returned value.
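The role of the assumption set ∆ can be sketched in Python as follows. This is a heavily simplified model, not the algorithm of Figure 28: session types are a finite graph of named states, field typings are plain strings, and the table effects is a hypothetical stand-in for the result of running algorithm B on a method body. The names check_A and TypingError are likewise assumptions.

```python
class TypingError(Exception):
    """Raised when a condition of the check is not satisfied."""


def check_A(session, effects, state, fields, delta=frozenset()):
    """Check that field typing `fields` is compatible with session `state`.

    session: dict mapping a state name to {method name: next state name}.
    effects: dict mapping (method name, field typing) to the field typing
             after the method body (stand-in for algorithm B).
    delta:   set of (field typing, state) pairs already assumed to hold.
    Returns the extended assumption set, or raises TypingError.
    """
    if (fields, state) in delta:
        # Coinductive step: the pair is already assumed, so stop here.
        return delta
    delta = delta | {(fields, state)}
    for method, next_state in session[state].items():
        if (method, fields) not in effects:
            raise TypingError(f"{method} is not typable with fields {fields!r}")
        delta = check_A(session, effects, next_state,
                        effects[(method, fields)], delta)
    return delta


# Hypothetical class with one field that is either "closed" or "open".
session = {
    "Init": {"open": "Opened"},
    "Opened": {"read": "Opened", "close": "Init"},
}
effects = {
    ("open", "closed"): "open",
    ("read", "open"): "open",
    ("close", "open"): "closed",
}
result = check_A(session, effects, "Init", "closed")
```

Termination in this sketch relies on the finite number of (field typing, state) pairs; the recursive session type is traversed once per pair, and the returned set is only inspected for success or failure at the top level.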
Algorithm B checks the typing judgement for expressions, defined in Figure 10.

... S i } i∈I and j ∈ I and T ′ <: T ′ j and if T = variant-tag then F ′ = l : F l l∈E and T ′′ is an enumeration E ′ and E ⊆ E ′ and l∈E F l <: F ′′ else T <: T ′′ and F ′ <: ...

... variant field typing with the single label l. More general variant field typings are produced when typing switch expressions, as the ∨ operator is used to combine the field typings arising from the branches. This is the typical situation when typing the body of a method whose return type is variant-tag: the body contains a switch whose branches return different labels with different associated field typings. It is possible, however, that giving type variant-tag to l is incorrect. It might turn out that the expression needs to have an enumerated type E, for example in order to be passed as a method parameter or returned as a method result of type E. An expression that has been inappropriately typed with variant-tag can, in general, be associated with any variant field typing, for example if it contains a switch whose branches yield different field typings. In this case, the algorithm uses ∨ to combine the branches of the variant field typing into a single field typing; the join is always over all of the labels in the variant. This happens in several places in algorithm B, indicated by conditions of the form "if T = variant-tag", and in the final "else" branch of the third clause of algorithm A.
The algorithm for checking subtyping is not described here but is similar to the one defined for channel session types by Gay and Hole [37]. We write S ∨ S ′ for the least upper bound of S and S ′ with respect to subtyping. It is defined by taking the intersection of sets of methods and the least upper bound of their continuations. Details of a similar definition (greatest lower bound of channel session types) can be found in the work of Mezzina [52].
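For intuition, the least upper bound can be sketched in Python for a much simpler setting than the one in the paper. The sketch assumes finite (non-recursive) session types consisting only of branches, encoded as nested dictionaries from method names to continuations; parameter and return types, variants and recursion are ignored. The function names join and is_subtype are hypothetical.

```python
def is_subtype(s, t):
    """s <: t when s offers every method of t, with a subtype continuation."""
    return all(m in s and is_subtype(s[m], t[m]) for m in t)


def join(s1, s2):
    """Least upper bound: keep the common methods and join their continuations."""
    return {m: join(s1[m], s2[m]) for m in s1.keys() & s2.keys()}


# Two hypothetical session types that agree only on parts of their structure.
s1 = {"a": {"b": {}}, "c": {}}
s2 = {"a": {"d": {}}, "c": {"e": {}}, "f": {}}
upper = join(s1, s2)
```

In this example upper is {"a": {}, "c": {}}: only the methods available in both types survive, which is the loss of information that joining may cause.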
The type checking algorithm is modular in the sense that to check class C we only need to know the session types of other classes, not their method definitions.
We have not yet investigated type inference, but there are two ways in which it might be beneficial. One would be to infer the req/ens annotations. The other would be to support some form of polymorphism over field typings, along the lines that if method m does not use field f then it should be callable independently of the type of f . This might reduce the need to type check the definition of m every time it occurs in the session type.

8.2. Examples of Type Checking. Figure 30 defines classes C and D. In class C, only the outer layer of the session type is of interest; the example uses an object of class C but does not need the definition of method m. Class D, as well as the outer layer of the session type, contains a field f and one or two candidate definitions for each of the methods a, b, c and d. The definitions of a and aa are alternatives for the method a specified in the session type, and so on.
The definition of a is not typable because the type of the returned expression is link f . Allowing this would let the caller of a have access to field f. Instead, the result of f .m(x) must be analyzed with a switch, as in the definition of aa, which is typable. The linkthis type required by the signature of a is introduced by the enumeration labels FALSE and TRUE in the branches of the switch. A compiler could insert switches of this kind automatically, allowing the definition of a as syntactic sugar.
The remaining method definitions are all typable and illustrate different features of the type system and the algorithm. In the definition of b, the method even is supposed to be the obvious function for testing parity of an integer, returning TRUE or FALSE. This definition is typable even though the body of b does not introduce a linkthis type, because algorithm A constructs a variant field typing over {TRUE,FALSE} in which both options are the same. This is seen in the first else clause of A. The definition of bb achieves the same effect by using the labels FALSE and TRUE to introduce the type linkthis. Each label corresponds to a partial variant field typing, and checking the switch combines them by means of the ∨ operator. Because the field f is not involved in the method body, the field typing is the same in both options of the variant.
Method c has the same definition as a, but this time the signature in the session type specifies a simple enumeration as the return type. This is allowed, by using the ∨ operator to construct the join of the field typings, in the second else clause of A. This means that when the algorithm proceeds to type check method definitions in the session type SD1, the type of f is taken to be the join of SCf and SCt. Whether or not this loss of information causes a problem will depend on the particular definitions of those types, which we have not shown. Method cc is handled in the same way, but this time there is no loss of information because the types being joined are identical; this in turn is because f is not involved in the method body.

(1) T ′ <: T and F ′′ <: F ′ , or
(2) T = variant-tag, T ′ is an enumeration E, F ′ = l : F l l∈E ′ , E ⊆ E ′ and ∀l ∈ E. F ′′ <: F l , or
(3) T is an enumeration E, T ′ = variant-tag, F ′′ = l : F l l∈E ′ , E ′ ⊆ E and ∀l ∈ E ′ . F l <: F ′ .
Proof. By induction on the typing derivation.

Proof. Similar to the proof of Theorem 8.5, by induction on the recursive calls within a given top-level call.
Proof. Similar to the proof of Theorem 8.5, by induction on the recursive calls within a given top-level call.
Proof. By Corollary 8.9 and the fact that F ⊢ C : S is defined in terms of the unfolded structure of session types, it is sufficient to consider the case in which S is guarded.
Similarly to the proof of Theorem 8.5, consider the recursive calls in the execution of A C (S 0 , F 0 , ∅). We show that the following relation is a C-consistency relation: This is easily checked, using the three cases of Lemma 8.4 to correspond to the three cases in the third clause of the definition of A.
8.4. Implementation. The ideas introduced in this paper can be used to extend a conventional Java compiler, by including @session annotations in classes and in method parameters, as well as @req and @ens annotations for recursive methods (cf. Section 4.5). The extension only concerns type checking; there is no need to touch the back-end of the compiler.
To keep in line with the expectations of Java programmers, annotations follow the first style in Figure 1, page 4. Also, the type system is nominal (cf. Section 4.7); label sets (cf. Figure 5) are explicitly introduced via enum declarations. The concepts contained in our core language can then be extended towards the whole of Java. In particular:
• The while loop technique described in Section 4.4 can be extended to handle for and do-while loops.
• The same idea can be used to type the various goto instructions present in Java: exceptions, break, continue and return, labelled versions included.
• All control flow instructions (including if-then, not discussed in the paper) can be used with conventional or with session-related boolean/enum values.
• Classes not featuring a @session annotation are considered shared rather than linear. Their objects can be treated very much like the null value (cf. Section 4.6). We do not allow a shared class to contain a linear field, even though it is perfectly acceptable for a method of a shared class to have a linear parameter.
• The same technique used for "top-level" classes can be used for inner, nested, local (defined within methods) and anonymous classes.
• In order to mention overloaded methods in @session annotations, alias names for these methods can be introduced via extra annotations.
• Static fields are always shared.
• Class inheritance (cf. Section 4.7) can be supported.

We have used the Polyglot [61] system for an initial prototype extension of Java, but a more thorough design and implementation are left for future work.

Related Work
There is a large amount of related work, originating from several different approaches. Our discussion of related work is organised according to these approaches.
Previous work on session types for object-oriented languages. Dezani-Ciancaglini, Yoshida et al. [17,27,28,30] have taken an approach in which a class defines sessions instead of methods. Invoking a session on an object creates a channel which is used for communication between two blocks of code: the body of the session, and a co-body defined by the invoker of the session. A session is therefore a generalization of a method, in which there can be an extended dialogue between caller and callee instead of a single exchange of parameters and result. The structure of this dialogue is defined by a session type. This approach proposes a new paradigm for concurrent object-oriented programming, and as far as we know it has not yet been implemented. In contrast, our approach maintains the standard execution model of method calls.
The SJ (Session Java) language, developed by Hu [45], is a less radical extension of the object-oriented paradigm. Channels, described by session types, are essentially the same as those in the original work based on process calculus. Program code is located in methods, as usual, and can create channels, communicate on them, and pass them as messages. SJ has a well-developed implementation and has been applied to a range of situations. However, SJ has one notable restriction: a channel cannot be stored in a field of an object. This means that a channel, once created, must be either completely used, or else delegated (sent along another channel), within the same method. It is possible for a channel to be passed as a parameter to another method, but it is not possible for a session to be split into methods that can be called separately, each implementing part of the session type of a channel that is stored in a field. A distinctive feature of our work is that we can store a channel in a field of an object and allow several methods to use it. This is illustrated in Figures 18 and 22.
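As a rough illustration of this point, the following plain-Java sketch stores a channel endpoint in a field and spreads one session over three separately callable methods. The names `QueueEndpoint` and `FileReaderClient`, the message strings and the queue-based channel are assumptions made for the example; they are not the API of SJ or of our language, and plain Java performs none of the static checks discussed in this paper.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for one endpoint of a channel (not a session-typed API).
final class QueueEndpoint {
    private final BlockingQueue<String> outgoing;
    private final BlockingQueue<String> incoming;

    QueueEndpoint(BlockingQueue<String> outgoing, BlockingQueue<String> incoming) {
        this.outgoing = outgoing;
        this.incoming = incoming;
    }

    void send(String message) {
        outgoing.add(message);
    }

    String receive() {
        try {
            return incoming.take(); // blocks until the peer has sent something
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while receiving", e);
        }
    }
}

// The channel lives in a field; each method implements one step of the session.
final class FileReaderClient {
    private final QueueEndpoint channel;

    FileReaderClient(QueueEndpoint channel) {
        this.channel = channel;
    }

    void open(String name) { channel.send("OPEN " + name); } // step 1: request
    String readLine() { return channel.receive(); }          // step 2: reply
    void close() { channel.send("CLOSE"); }                  // step 3: terminate
}

final class ModularSessionDemo {
    public static void main(String[] args) {
        BlockingQueue<String> toServer = new LinkedBlockingQueue<>();
        BlockingQueue<String> toClient = new LinkedBlockingQueue<>();
        FileReaderClient client = new FileReaderClient(new QueueEndpoint(toServer, toClient));
        QueueEndpoint server = new QueueEndpoint(toClient, toServer);

        // Single-threaded walk through the protocol: every receive follows a send.
        client.open("notes.txt");
        String request = server.receive();
        server.send("first line of " + request.substring("OPEN ".length()));
        System.out.println(client.readLine());
        client.close();
        System.out.println(server.receive());
    }
}
```

Nothing in this sketch prevents a caller from invoking `readLine` before `open`; ruling out such call sequences statically is what a class session type is for.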
Hu et al. [44,59] have also extended SJ to support event-driven programming, with a session type discipline to ensure safe event handling and progress. We have not considered event-driven programming in our setting.
Campos and Vasconcelos [15,16] developed MOOL, a simple class-based object-oriented language, to study object usage and access. The novelties are that class usage types are attached to class definitions, and the communication mechanism is based on method call instead of being channel-based. The latter feature is the main difference with respect to our work.
Non-uniform concurrent objects/active objects. Another related line of research, started by Nierstrasz [60], aimed at describing the behaviour of non-uniform active objects in concurrent systems, whose behaviour (including the set of available methods) may change dynamically. He defined subtyping for active objects, but did not formally define a language semantics or a type system. The topic has been continued, in the context of process calculi, by several authors [13,14,21,22,56,57,65,66,67]. The work by Caires [13] is the most relevant work; it uses an approach based on spatial types to give very fine-grained control of resources, and Militão [53] has implemented a Java prototype based on this idea. Damiani et al. [24] define a concurrent Java-like language incorporating inheritance and subtyping and equipped with a type-and-effect system, in which method availability is made dependent on the state of objects.
The distinctive feature of our approach to non-uniform objects, in comparison with all of the above work, is that we allow an object's abstract state to depend on the result of a method call. This gives a very nice integration with the branching structure of channel session types, and with subtyping.
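The idea of a result-dependent abstract state can be sketched in plain Java as follows. This is only a dynamic emulation under stated assumptions: the class `CheckedFile`, the enumeration `OpenStatus` and the runtime check in `require` are hypothetical names introduced for the example, whereas in our system the corresponding check is performed statically by the type checker.

```java
// Hypothetical result type: the object's next abstract state depends on this value.
enum OpenStatus { OK, NOT_FOUND }

final class CheckedFile {
    private enum State { INIT, OPEN, END }

    private final boolean exists;
    private State state = State.INIT;

    CheckedFile(boolean exists) {
        this.exists = exists;
    }

    // After open(), the available methods depend on the returned value:
    // OK leads to OPEN (read and close allowed), NOT_FOUND leads to END.
    OpenStatus open() {
        require(State.INIT, "open");
        state = exists ? State.OPEN : State.END;
        return exists ? OpenStatus.OK : OpenStatus.NOT_FOUND;
    }

    String read() {
        require(State.OPEN, "read");
        return "data";
    }

    void close() {
        require(State.OPEN, "close");
        state = State.END;
    }

    // Runtime stand-in for what a session type system would verify at compile time.
    private void require(State expected, String method) {
        if (state != expected) {
            throw new IllegalStateException(method + " is not available in state " + state);
        }
    }
}

final class ResultDependentStateDemo {
    public static void main(String[] args) {
        CheckedFile file = new CheckedFile(true);
        // The case analysis on the result selects the branch of the protocol.
        switch (file.open()) {
            case OK:
                System.out.println(file.read());
                file.close();
                break;
            case NOT_FOUND:
                System.out.println("nothing to read");
                break;
        }
    }
}
```

The `switch` on the result of `open` plays the role of the branching construct of a channel session type: each label selects a different continuation of the protocol.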
Specifically related to the notion of subtyping between session types, the work of Rossie [68] is worth mentioning. He proposes a type-based approach to ensure that both component objects and their clients have compatible protocols. The typing discipline specifies not only how to use the component's methods, but also the notifications it sends to its clients. Rossie calls this enhanced specification a Logical Observable Entity (LOE), which is a finite-state machine equipped with a subtyping notion. An LOE is a high-level description of an object, specifying which transitions (method executions) change its state, and providing for each state both the available methods and the notifications to be sent to the clients. LOEs support behavioural subtyping, both in its afferent aspects (how clients may affect the LOE), where a subtype must allow at least the traces of its supertype, and in its efferent aspects (how an LOE processing a method request has effects on clients), where the subtype must not send more notifications than the supertype. This behavioural subtyping notion on finite-state machines, which is in spirit very similar to that of session types ("more offers, less requests"), is defined as a simulation relation. Rossie shows that this relation ensures safe substitutability.
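To make the "more offers" half of such a simulation concrete, here is a minimal Java sketch. It assumes deterministic machines given as transition tables, covers only the afferent aspect (available methods) and ignores notifications; the classes `MethodProtocol` and `AfferentSimulation` are hypothetical names for the example and do not reproduce Rossie's definitions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Deterministic finite-state protocol: state -> (method name -> next state).
final class MethodProtocol {
    final String initial;
    final Map<String, Map<String, String>> transitions;

    MethodProtocol(String initial, Map<String, Map<String, String>> transitions) {
        this.initial = initial;
        this.transitions = transitions;
    }

    Map<String, String> movesFrom(String state) {
        return transitions.getOrDefault(state, Map.of());
    }
}

final class AfferentSimulation {
    // True if, along every call sequence allowed by sup, sub offers at least
    // the methods that sup offers (so sub allows at least the traces of sup).
    static boolean offersAtLeast(MethodProtocol sub, MethodProtocol sup) {
        Deque<String[]> work = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        work.push(new String[] {sub.initial, sup.initial});
        while (!work.isEmpty()) {
            String[] pair = work.pop();
            if (!seen.add(pair[0] + "\u0000" + pair[1])) {
                continue; // this pair of states was already examined
            }
            Map<String, String> subMoves = sub.movesFrom(pair[0]);
            for (Map.Entry<String, String> move : sup.movesFrom(pair[1]).entrySet()) {
                String subNext = subMoves.get(move.getKey());
                if (subNext == null) {
                    return false; // sup offers a method that sub does not
                }
                work.push(new String[] {subNext, move.getValue()});
            }
        }
        return true;
    }
}

final class AfferentSimulationDemo {
    public static void main(String[] args) {
        MethodProtocol basic = new MethodProtocol("closed", Map.of(
                "closed", Map.of("open", "opened"),
                "opened", Map.of("close", "closed")));
        MethodProtocol richer = new MethodProtocol("closed", Map.of(
                "closed", Map.of("open", "opened"),
                "opened", Map.of("close", "closed", "read", "opened")));
        System.out.println(AfferentSimulation.offersAtLeast(richer, basic));
        System.out.println(AfferentSimulation.offersAtLeast(basic, richer));
    }
}
```

Because the machines are assumed deterministic, exploring the reachable pairs of states is enough to decide the relation; a non-deterministic setting would require a fixed-point computation instead.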
Typestate. Based on the fact that method availability depends on an object's internal state (the situation identified by Nierstrasz, as mentioned above), Strom and Yemini [69] proposed typestate. The concept consists of identifying the possible states of an object and defining pre- and post-conditions that specify in which state an object must be for a given method to be available, and in which state the method execution leaves the object.
Vault [25,32] follows the typestate approach. It uses linear types to control aliasing, and uses the adoption and focus mechanism [32] to re-introduce aliasing in limited situations. Fugue [26,33] extends similar ideas to an object-oriented language, and uses explicit pre- and post-conditions. Bierhoff and Aldrich [7] also work on a typestate approach in an object-oriented language, defining a sound modular automated static protocol-checking setting. They define a state and method refinement relation achieving a behavioural subtyping relation. The work is extended with access permissions, which combine typestate with aliasing information about objects [8], and with concurrency, via the atomic block synchronization primitive used in transactional memory systems [5]. Like us, they allow the typestate to depend on the result of a method call. Plural is a prototype language implementation that embodies this approach, providing automated static analysis in a concurrent object-oriented language [9]. To evaluate their approach they annotated and verified several standard Java APIs [10].
Militão et al. [54] develop a new aliasing control mechanism, finer and more expressive than previous proposals, based on defining object views according to specific access constraints. The discipline is implemented in a type system combining views and a typestate approach, checking user defined aliasing patterns.
Sing# [31] is an extension of C# which has been used to implement Singularity, an operating system based on message-passing. It incorporates session types to specify protocols for communication channels, and introduces typestate-like contracts. Bono et al. [12] have formalised a core calculus based on Sing# and proved type safety. A technical point is that Sing# uses a single construct switch receive to combine receiving an enumeration value and doing a case-analysis, whereas our system allows a switch on an enumeration value to be separated from the method call that produces it.
Aldrich et al. [1] have proposed typestate-oriented programming. The aim is to integrate typestate into language design from the beginning, instead of adding typestate constraints to an existing language. Their prototype language is called Plaid. Instead of class definitions, a program consists of state definitions; each state has methods which cause transitions to other states when they are called. Like classes, states are organised into an inheritance hierarchy. The specifications of state transitions caused by methods are similar to the pre- and post-conditions of Plural. Aliasing is managed by a system of access permissions [8]. More recent work [35,75] combines gradual typing and typestate, to integrate static and dynamic typestate checking.
Session types and typestate are related approaches, but there are stylistic and technical differences. With respect to the former, session types are like labelled transition systems or finite-state automata, capturing the behaviour of an object. When developing an application, one may start from session types and then implement the classes. Typestates take each transition of a session type and attach it to a method as pre- and post-conditions. Because typestate systems allow pre- and post-conditions to be specified arbitrarily, the possible sequences of method calls are less explicit. With respect to technical differences, the main ones are: (a) session types unify types and typestates in a single class type as a global behavioural specification; (b) our subtyping relation is structural, while the typestate refinement relation is nominal; (c) Plural uses a software transactional model as its concurrency control mechanism (thus, shared memory), which is lighter and easier than locks, but one has to mark atomic blocks in the code, whereas our communication-centric model (using channels) is simpler and allows us to use the same type abstraction (session types) instead of a new programming construct; moreover, channel-based communication also allows us to specify the client-server communication protocol as the channel session type, and to implement it modularly, in several methods which may even be in different classes; (d) typestate approaches allow flexible aliasing control, whereas our approach uses only linear objects (adding better alias/access control is simple and an orthogonal issue).
Affine types. Tov and Pucella [71] have developed Alms, a language in the style of OCaml with an affine type system as a generalisation of linear typing. Alms is a general-purpose programming language, in which the affine type system provides an infrastructure suitable for defining a variety of type-based resource control patterns including alias control, session types and typestate. It has been implemented, and type safety has been proved for a formal calculus. Representing a particular approach to typestate, such as our specifications of allowed sequences of method calls, would require an encoding; in contrast, our language aims to provide a convenient high-level programming style.
Static verification of protocols. Cyclone [40] and CQual [34] are systems based on the C programming language that allow protocols to be statically enforced by a compiler. Cyclone adds many benefits to C, but its support for protocols is limited to enforcing locking of resources. Between acquiring and releasing a lock, there are no restrictions on how a thread may use a resource. In contrast, our system uses types both to enforce locking of objects (via linearity) and to enforce the correct sequence of method calls. CQual expects users to annotate programs with type qualifiers; its type system, simpler and less expressive than the above, provides for type inference.
Unique ownership of objects. In order to demonstrate the key idea of modularizing session implementations by integrating session-typed channels and non-uniform objects, we have taken the simplest possible approach to ownership control: strict linearity of non-uniform objects. This idea goes back at least to the work of Baker [4] and has been applied many times. However, linearity causes problems of its own: linear objects cannot be stored in shared data structures, and this tends to restrict expressivity. There is a large literature on this problem ([62] among others). In future work we intend to use an off-the-shelf technique for more sophisticated alias analysis. The property we need is that when changing the type of an object (by calling a method on it or by performing a switch or a while on an enumeration constant returned from a method call) there must be a unique reference to it.
Resource usage analysis. Igarashi and Kobayashi [48] define a general resource usage analysis problem for an extended λ-calculus, including a type inference system, that statically checks the order of resource usage. Although quite expressive, their system only analyzes the sequence of method calls and does not consider branching on method results as we do.
Analysis of concurrent systems using pi-calculus. Some work on static analysis of concurrent systems expressed in pi-calculus is also relevant, in the sense that it addresses the question (among others) of whether attempted uses of a resource are consistent with its state. Igarashi and Kobayashi have developed a generic framework [47] including a verification tool [49] in which to define type systems for analyzing various behavioural properties including sequences of resource uses [50]. In some of this work, types are themselves abstract processes, and therefore in some situations resemble our session types. Chaki et al. [19] use CCS to describe properties of pi-calculus programs, and verify the validity of temporal formulae via a combination of type-checking and model-checking techniques, thereby going beyond static analysis.
All of this pi-calculus-based work follows the approach of modelling systems in a relatively low-level language which is then analyzed. In contrast, we work directly with the high-level abstractions of session types and objects.

Conclusion
We have extended existing work on session types for object-oriented languages by allowing the implementation of a session to be divided between several methods which can be called independently. This supports a modular approach which is absent from previous work. Technically, it is achieved by integrating session types for communication channels and a static type system for non-uniform objects. A session-typed channel is one kind of non-uniform object, but objects whose fields are non-uniform are also, in general, non-uniform. Typing guarantees that the sequence of messages on every channel, and the sequence of method calls on every non-uniform object, satisfy specifications expressed as session types.
We have formalized the syntax, operational semantics and static type system of a core distributed class-based object-oriented language incorporating these ideas. Soundness of the type system is expressed by type preservation, conformance and correct communication theorems. The type system includes a form of typestate and uses simple linear type theory to guarantee unique ownership of non-uniform objects. It allows the typestate of an object after a method call to depend on the result of the call, if this is of an enumerated type, and in this situation, the necessary case-analysis of the method result does not need to be done immediately after the call.
We have illustrated our ideas with an example based on a remote file server, and described a prototype implementation. By incorporating further standard ideas from the related literature, it should be straightforward to extend the implementation to a larger and more practical language.
In the future we intend to work on the following topics. (1) More flexible control of aliasing. The mechanism for controlling aliasing should be orthogonal to the theory of how operations affect uniquely-referenced objects. We intend to adapt existing work to relax our strictly linear control and obtain a more flexible language. (2) In Section 4.7 we outlined an adaptation of our structural type system to a nominal type system as found in languages such as Java. We would also like to account for Java's distinction and relationship between classes and interfaces. (3) Specifications involving several objects. Multi-party session types [11,42] and conversation types [14] specify protocols with more than two participants. It would be interesting to adapt those theories into type systems for more complex patterns of object usage.
not changed in the rest of the derivation except possibly by the subsumption step at the end; therefore T′ is a subtype of T. If they are session types, using Proposition 3.18 we can change the last occurrence of T-Hide to use T instead of T′ and get Θ₂ ⊢ (h ↓ o) : o : T. Otherwise, we can add a subsumption step to the derivation for the sequence of swaps on the right of T-Hadd to get the same result.
For the rest of the derivation, we know that o is not used, therefore it can be removed from the initial environment without affecting the derivation except by the fact that it will not be in the final environment either. Furthermore, we know from Lemma 7.1 that the initial environment minus its only object identifier o is included in Θ \ chans(h ↓ o) = Θ₁. More precisely, the lemma gives us inclusion of domains, but because subsumption is not used in the first part of the derivation we also know that the types are the same. Thus we can replace the first part of the derivation by an instance of T-Hempty using Θ₁ and the second part is still valid (with all the descendants of o removed from the heap), yielding Θ₁ ⊢ (h ↑ o) : Γ at the bottom.
Proof. Since Θ and Θ′ are disjoint, the channels in Θ′ cannot appear anywhere in the typing derivation for h. Thus, it is possible to add Θ′ to every typing environment occurring in the derivation for h without altering its validity, yielding Θ + Θ′ ⊢ h : Γ + Θ′. Looking now at the derivation for h′, since the domains of the heaps are disjoint and objs(Γ) ⊆ roots(h), none of the identifiers in Γ can appear anywhere in it. Thus we can add Γ to every typing environment and h to every heap occurring in the derivation for h′, replacing the T-Hempty at the top with the conclusion of the other derivation, which yields the result we want.
Proof. Straightforward. Changing the names does not affect the typing derivation in any way.
Lemma 7.7 (Opening). If Θ ⊢ h : Γ, if Γ(r) is a branch session type S and if h(r) is an object identifier o, then we know from Lemma 7.1 that h contains an entry for o. Let C be the class of this entry, then there exists a field typing F for C such that Θ ⊢ h : Γ{r → C[F]} and F ⊢ C : S.
Proof. We prove this by induction on the depth of r. The base case is r = o. Using Lemmas 7.4 and 7.5, we can restrict ourselves to the case where o is the only root of h. In that case we know that the last rule used in the typing derivation for Θ ⊢ h : o : S must be T-Hide. The result we want is constituted precisely by the premises of that rule.
For the inductive case, r is of the form o′.f.f. We consider the case where o′ is the only root. The typing derivation then ends with T-Hadd and f gets populated in the sequence of swaps by some object identifier o′′. Let r′ = o′′.f, and consider what Γ(r′) can be, knowing that in the conclusion r has a branch session type: the only way the type can be modified in the sequence of swaps is by subsumption. Indeed, T-VarS, the other possibility, introduces a variant type. Therefore Γ(r′) = S′ with S′ <: S. We can thus use the induction hypothesis to replace Γ with Γ{r → C[F]} on the left premise, with F ⊢ C : S′. Then just toplevel rules, the result is immediate. The only ones for which it is not are T-Var and T-LinVar. For T-Var the result is obtained using either T-Null if T′ is Null or T-Label and T-Sub if it is an enumerated type. In the case of an extension adding new base types, we assume there is a similar rule to type the corresponding literal values. For T-LinVar, if v is an access point name the result is obtained using T-Name and T-Sub. Otherwise, v is an object identifier and the result is obtained using T-Ref, noticing that because Γ(r) is defined and v is not in Γ, the path r does not start with v and the premise is satisfied.
Lemma 7.11 (Typability of Subterms). If D is a derivation of Γ * r ⊲ E(e) : T ⊳ Γ′ * r′ then there exist Γ₁, r₁ and U such that D has a subderivation D′ concluding Γ * r ⊲ e : U ⊳ Γ₁ * r₁ and the position of D′ in D corresponds to the position of the hole in E.
Proof. A straightforward induction on the structure of E; the expression e is always at the extreme left of the typing derivation for E(e).