Unification in the Description Logic EL

The Description Logic EL has recently drawn considerable attention since, on the one hand, important inference problems such as the subsumption problem are polynomial. On the other hand, EL is used to define large biomedical ontologies. Unification in Description Logics has been proposed as a novel inference service that can, for example, be used to detect redundancies in ontologies. The main result of this paper is that unification in EL is decidable. More precisely, EL-unification is NP-complete, and thus has the same complexity as EL-matching. We also show that, w.r.t. the unification type, EL is less well-behaved: it is of type zero, which in particular implies that there are unification problems that have no finite complete set of unifiers.


Introduction
Description logics (DLs) [6] are a family of logic-based knowledge representation formalisms, which can be used to represent the conceptual knowledge of an application domain in a structured and formally well-understood way. They are employed in various application domains, such as natural language processing, configuration of technical systems, databases, and biomedical ontologies, but their most notable success so far is the adoption of the DL-based language OWL [20] as standard ontology language for the semantic web.
In DLs, concepts are formally described by concept terms, i.e., expressions that are built from concept names (unary predicates) and role names (binary predicates) using concept constructors. The expressivity of a particular DL is determined by which concept constructors are available in it. From a semantic point of view, concept names and concept terms represent sets of individuals, whereas roles represent binary relations between individuals. For example, using the concept name Woman, and the role name child, the concept of women having a daughter can be represented by the concept term Woman ⊓ ∃child.Woman, and the concept of women having only daughters by Woman ⊓ ∀child.Woman.
The equivalence test can, for example, be used to find out whether a concept term representing a particular notion has already been introduced, thus avoiding multiple introduction of the same concept into the concept hierarchy. This inference capability is very important if the knowledge base containing the concept terms is very large, evolves during a long time period, and is extended and maintained by several knowledge engineers. However, testing for equivalence of concepts is not always sufficient to find out whether, for a given concept term, there already exists another concept term in the knowledge base describing the same notion. On the one hand, different knowledge engineers may use different names for concepts, like Male versus Masculine. On the other hand, they may model on different levels of granularity. For example, assume that one knowledge engineer has defined the concept of men loving fast cars by the concept term Human ⊓ Male ⊓ ∃loves.Sports car.
A second knowledge engineer might represent this notion in a somewhat different way, e.g., by using the concept term Man ⊓ ∃loves.(Car ⊓ Fast). These two concept terms are not equivalent, but they are meant to represent the same concept. The two terms can obviously be made equivalent by substituting the concept name Sports car in the first term by the concept term Car ⊓ Fast and the concept name Man in the second term by the concept term Human ⊓ Male. This leads us to unification of concept terms, i.e., the question whether two concept terms can be made equivalent by applying an appropriate substitution, where a substitution replaces (some of the) concept names by concept terms. Of course, it is not necessarily the case that unifiable concept terms are meant to represent the same notion. A unifiability test can, however, suggest to the knowledge engineer possible candidate terms. A unifier (i.e., a substitution whose application makes the two terms equivalent) then proposes appropriate definitions for the concept names. In our example, we know that, if we define Man as Human ⊓ Male and Sports car as Car ⊓ Fast, then the concept terms Human ⊓ Male ⊓ ∃loves.Sports car and Man ⊓ ∃loves.(Car ⊓ Fast) are equivalent w.r.t. these definitions.
Unification in DLs was first considered in [12] for a DL called FL0, which has the concept constructors conjunction (⊓), value restriction (∀r.C), and the top concept (⊤). It was shown that unification in FL0 is decidable and ExpTime-complete, i.e., given an FL0-unification problem, we can effectively decide whether it has a solution or not, but in the worst case, any such decision procedure needs exponential time. This result was extended in [8] to a more expressive DL, which additionally has the role constructor transitive closure. Interestingly, the unification type of FL0 had been determined almost a decade earlier in [2]. In fact, as shown in [12], unification in FL0 corresponds to unification modulo the equational theory of idempotent Abelian monoids with several homomorphisms. In [2] it was shown that, already for a single homomorphism, unification modulo this theory has unification type zero, i.e., there are unification problems for this theory that do not have a minimal complete set of unifiers. In particular, such unification problems cannot have a finite complete set of unifiers.
In this paper, we consider unification in the DL EL. The EL-family consists of inexpressive DLs whose main distinguishing feature is that they provide their users with existential restrictions (∃r.C) rather than value restrictions (∀r.C) as the main concept constructor involving roles. The core language of this family is EL, which has the top concept, conjunction, and existential restrictions as concept constructors. This family has recently drawn considerable attention since, on the one hand, the subsumption problem stays tractable (i.e., decidable in polynomial time) in situations where FL0, the corresponding DL with value restrictions, becomes intractable: subsumption between concept terms is tractable for both FL0 and EL [25,10], but allowing the use of concept definitions or even more expressive terminological formalisms makes FL0 intractable [26,3,23,5], whereas it leaves EL tractable [4,17,5]. On the other hand, although of limited expressive power, EL is nevertheless used in applications, e.g., to define biomedical ontologies. For example, both the large medical ontology Snomed ct and the Gene Ontology can be expressed in EL, and the same is true for large parts of the medical ontology Galen [27]. The importance of EL can also be seen from the fact that the new OWL 2 standard contains a sub-profile OWL 2 EL, which is based on (an extension of) EL.
Unification in EL has, to the best of our knowledge, not been investigated before, but matching (where one side of the equation(s) to be solved does not contain variables) has been considered in [7,24]. In particular, it was shown in [24] that the decision problem, i.e., the problem of deciding whether a given EL-matching problem has a matcher or not, is NP-complete. Interestingly, FL0 behaves better w.r.t. matching than EL: for FL0, the decision problem is tractable [9]. In this paper, we show that, w.r.t. the unification type, FL0 and EL behave the same: just as FL0, the DL EL has unification type zero. However, w.r.t. the decision problem, EL behaves much better than FL0: EL-unification is NP-complete, and thus has the same complexity as EL-matching.
Regarding unification in DLs that are more expressive than EL and FL0, one must look at the literature on unification in modal logics. It is well-known that there is a close connection between modal logics and DLs [6]. For example, the DL ALC, which can be obtained by adding negation to EL or FL0, corresponds to the basic (multi-)modal logic K. Decidability of unification in K is a long-standing open problem. Recently, undecidability of unification in some extensions of K (for example, by the universal modality) was shown in [29]. The undecidability results in [29] also imply undecidability of unification in some expressive DLs (e.g., SHIQ [21]). The unification types of some modal (and related) logics have been determined by Ghilardi; for example, in [19] he shows that K4 and S4 have unification type finitary. Unification in sub-Boolean modal logics (i.e., modal logics that are not closed under all Boolean operations, such as the modal logics corresponding to EL and FL0) has, to the best of our knowledge, not been considered in the modal logic literature.
In addition to unification of concept terms as introduced until now, we will also consider unification w.r.t. a so-called acyclic TBox in this article. Until now, we have only talked about concept terms, i.e., complex descriptions of concepts that are built from concept and role names using the concept constructors of the given DL. In applications of DLs, it is, of course, inconvenient to always use such complex descriptions when referring to concepts. For this reason, DLs are usually also equipped with a terminological formalism. In its simplest form, this formalism allows one to introduce abbreviations for concept terms. For example, the two concept definitions Mother ≡ Woman ⊓ ∃child.Human and Woman ≡ Human ⊓ Female introduce the abbreviation Woman for the concept term Human ⊓ Female and the abbreviation Mother for the concept term Human ⊓ Female ⊓ ∃child.Human. A finite set of such concept definitions is called an acyclic TBox if it is unambiguous (i.e., every concept name occurs at most once as left-hand side) and acyclic (i.e., there are no cyclic dependencies between concept definitions). These restrictions ensure that every defined concept (i.e., concept name occurring on the left-hand side of a definition) has a unique expansion to a concept term that it abbreviates. Inference problems like subsumption and unification can also be considered w.r.t. such acyclic TBoxes. As mentioned above, the complexity of the subsumption problem increases for the DL FL0 if acyclic TBoxes are taken into account [26]. In contrast, for EL, the complexity of the subsumption problem stays polynomial in the presence of acyclic TBoxes. We show that, for unification in EL, adding acyclic TBoxes is also harmless, i.e., unification in EL w.r.t. acyclic TBoxes is also NP-complete.
This article is structured as follows. In the next section, we define the DL EL and unification in EL more formally. In Section 3, we recall the characterization of subsumption and equivalence in EL from [24], and in Section 4 we use this to show that unification in EL has type zero. In Section 5, we show that unification in EL is NP-complete. The unification algorithm establishing the complexity upper bound is a typical "guess and then test" NP-algorithm, and thus it is unlikely that a direct implementation of this algorithm will perform well in practice. In Section 6, we introduce a more goal-oriented unification algorithm for EL, in which non-deterministic decisions are only made if they are triggered by "unsolved parts" of the unification problem. In Section 7, we point out that our results for EL-unification imply that unification modulo the equational theory of semilattices with monotone operators [28] is NP-complete and of unification type zero.
More information about Description Logics can be found in [6], and about unification theory in [16]. This article is an extended version of a paper [11] published in the proceedings of the 20th International Conference on Rewriting Techniques and Applications (RTA'09). In addition to giving more detailed proofs, we have added the goal-oriented unification algorithm (Section 6) and the treatment of unification modulo acyclic TBoxes (Subsection 2.3).

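Table 1. Syntax and semantics of EL.
Name                      Syntax    Semantics
concept name              A         A^I ⊆ D^I
role name                 r         r^I ⊆ D^I × D^I
top-concept               ⊤         ⊤^I = D^I
conjunction               C ⊓ D     (C ⊓ D)^I = C^I ∩ D^I
existential restriction   ∃r.C      (∃r.C)^I = {x | ∃y : (x, y) ∈ r^I ∧ y ∈ C^I}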

Unification in EL
In this section, we first define the syntax and semantics of EL-concept terms as well as the subsumption and the equivalence relation on these terms. Then, we introduce unification of EL-concept terms, and finally extend this notion to unification modulo an acyclic TBox.
2.1. The Description Logic EL. Starting with a set N_con of concept names and a set N_role of role names, EL-concept terms are built using the following concept constructors: the nullary constructor top-concept (⊤), the binary constructor conjunction (C ⊓ D), and for every role name r ∈ N_role, the unary constructor existential restriction (∃r.C). The semantics of EL is defined in the usual way, using the notion of an interpretation I = (D^I, ·^I), which consists of a nonempty domain D^I and an interpretation function ·^I that assigns binary relations on D^I to role names and subsets of D^I to concept terms, as shown in the semantics column of Table 1.
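The concept term C is subsumed by the concept term D (written C ⊑ D) iff C^I ⊆ D^I holds for all interpretations I. We say that C is equivalent to D (written C ≡ D) iff C ⊑ D and D ⊑ C, and that C is strictly subsumed by D (written C ⊏ D) iff C ⊑ D and C ≢ D. It is well-known that subsumption (and thus also equivalence) of EL-concept terms can be decided in polynomial time [10].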
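The semantics of the three constructors can be made concrete with a small evaluator over a finite interpretation. The following is only an illustrative sketch: the tuple encoding of concept terms, the helper names, and the toy interpretation are assumptions of this example and do not come from the paper.

```python
# Hypothetical tuple encoding of EL-concept terms (an assumption of this sketch):
#   ("top",)            top-concept ⊤
#   ("name", "A")       concept name A
#   ("and", C, D)       conjunction C ⊓ D
#   ("exists", "r", C)  existential restriction ∃r.C
TOP = ("top",)

def name(a):
    return ("name", a)

def conj(c, d):
    return ("and", c, d)

def exists(r, c):
    return ("exists", r, c)

def extension(term, domain, concepts, roles):
    """Compute the extension of `term` in a finite interpretation.

    `domain` is a set of individuals, `concepts` maps concept names to
    subsets of the domain, and `roles` maps role names to sets of pairs.
    """
    kind = term[0]
    if kind == "top":
        return set(domain)
    if kind == "name":
        return set(concepts.get(term[1], set()))
    if kind == "and":
        return (extension(term[1], domain, concepts, roles)
                & extension(term[2], domain, concepts, roles))
    if kind == "exists":
        filler = extension(term[2], domain, concepts, roles)
        return {x for (x, y) in roles.get(term[1], set()) if y in filler}
    raise ValueError(f"unknown constructor: {kind!r}")

# Toy interpretation: anna has the daughter bea, bea has the son carl.
domain = {"anna", "bea", "carl"}
concepts = {"Woman": {"anna", "bea"}}
roles = {"child": {("anna", "bea"), ("bea", "carl")}}

# Woman ⊓ ∃child.Woman: women having a daughter.
has_daughter = conj(name("Woman"), exists("child", name("Woman")))
print(extension(has_daughter, domain, concepts, roles))
```

In this toy interpretation only anna is a woman with a child who is again a woman, so the extension of the term is the singleton containing her.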

2.2. Unification of concept terms.
In order to define unification of concept terms, we first introduce the notion of a substitution operating on concept terms. To this purpose, we partition the set of concept names into a set N_v of concept variables (which may be replaced by substitutions) and a set N_c of concept constants (which must not be replaced by substitutions). Intuitively, N_v are the concept names that have possibly been given another name or been specified in more detail in another concept term describing the same notion. The elements of N_c are the ones of which it is assumed that the same name is used by all knowledge engineers (e.g., standardized names in a certain domain).
A substitution σ is a mapping from N_v into the set of all EL-concept terms. This mapping is extended to concept terms in the obvious way, i.e., σ(A) := A for all A ∈ N_c, σ(⊤) := ⊤, σ(C ⊓ D) := σ(C) ⊓ σ(D), and σ(∃r.C) := ∃r.σ(C).
Definition 2.1. An EL-unification problem is of the form Γ = {C_1 ≡? D_1, ..., C_n ≡? D_n}, where C_1, D_1, ..., C_n, D_n are EL-concept terms. The substitution σ is a unifier (or solution) of Γ iff σ(C_i) ≡ σ(D_i) for i = 1, ..., n. In this case, Γ is called solvable or unifiable.
When we say that EL-unification is decidable, then we mean that the following decision problem is decidable: given an EL-unification problem Γ, decide whether Γ is solvable or not.Accordingly, we say that EL-unification is NP-complete if this decision problem is NP-complete.
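To illustrate the definition of a unifier, the following sketch applies a substitution to the two concept terms from the introduction and checks that they become equivalent. The tuple encoding, the function names, and the structural subsumption test (the standard recursive comparison of top-level atoms) are assumptions of this example; it is not the decision procedure developed in this paper, since it only verifies a given candidate substitution.

```python
def conjuncts(term):
    """Top-level atoms of a term; the top-concept contributes none."""
    if term[0] == "top":
        return []
    if term[0] == "and":
        return conjuncts(term[1]) + conjuncts(term[2])
    return [term]

def subsumed(c, d):
    """Structural test for C ⊑ D on EL-concept terms (no TBox)."""
    left = conjuncts(c)
    for atom in conjuncts(d):
        if atom[0] == "name":
            if atom not in left:
                return False
        elif not any(a[0] == "exists" and a[1] == atom[1]
                     and subsumed(a[2], atom[2]) for a in left):
            return False
    return True

def equivalent(c, d):
    return subsumed(c, d) and subsumed(d, c)

def apply(sigma, term):
    """Extend the substitution `sigma` (variable name -> term) to terms."""
    kind = term[0]
    if kind == "name":
        return sigma.get(term[1], term)  # constants are left unchanged
    if kind == "and":
        return ("and", apply(sigma, term[1]), apply(sigma, term[2]))
    if kind == "exists":
        return ("exists", term[1], apply(sigma, term[2]))
    return term  # top-concept

def is_unifier(sigma, problem):
    """Check sigma(C) ≡ sigma(D) for every equation (C, D) of the problem."""
    return all(equivalent(apply(sigma, c), apply(sigma, d)) for c, d in problem)

def n(a):
    return ("name", a)

# Human ⊓ Male ⊓ ∃loves.Sports_car  versus  Man ⊓ ∃loves.(Car ⊓ Fast)
left = ("and", ("and", n("Human"), n("Male")), ("exists", "loves", n("Sports_car")))
right = ("and", n("Man"), ("exists", "loves", ("and", n("Car"), n("Fast"))))

# Variables are Sports_car and Man; all other names act as constants.
sigma = {"Sports_car": ("and", n("Car"), n("Fast")),
         "Man": ("and", n("Human"), n("Male"))}

print(equivalent(left, right))            # the terms themselves are not equivalent
print(is_unifier(sigma, [(left, right)]))  # but sigma makes them equivalent
```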
In the following, we introduce some standard notions from unification theory [16], but formulated for the special case of EL-unification rather than for an arbitrary equational theory. Unifiers can be compared using the instantiation preorder ≤•. Let Γ be an EL-unification problem, V the set of variables occurring in Γ, and σ, θ two unifiers of this problem. We define σ ≤• θ iff there is a substitution λ such that θ(X) ≡ λ(σ(X)) for all X ∈ V.
If σ ≤• θ, then we say that θ is an instance of σ.
Definition 2.2. Let Γ be an EL-unification problem. The set of substitutions M is called a complete set of unifiers for Γ iff it satisfies the following two properties:
(1) every element of M is a unifier of Γ;
(2) if θ is a unifier of Γ, then there exists a unifier σ ∈ M such that σ ≤• θ.
The set M is called a minimal complete set of unifiers for Γ iff it additionally satisfies
(3) if σ, θ ∈ M, then σ ≤• θ implies σ = θ.
The unification type of a given unification problem is determined by the existence and cardinality of such a minimal complete set.
Definition 2.3. Let Γ be an EL-unification problem. This problem has type
• unitary iff it has a minimal complete set of unifiers of cardinality 1;
• finitary iff it has a finite minimal complete set of unifiers;
• infinitary iff it has an infinite minimal complete set of unifiers;
• zero iff it does not have a minimal complete set of unifiers.
Note that the set of all unifiers of a given EL-unification problem is always a complete set of unifiers. However, this set is usually infinite and redundant (in the sense that some unifiers are instances of others). For a unitary or finitary EL-unification problem, all unifiers can be represented by a finite complete set of unifiers, whereas for problems of type infinitary or zero this is no longer possible. In fact, if a problem has a finite complete set of unifiers M, then it also has a finite minimal complete set of unifiers, which can be obtained by iteratively removing redundant elements from M. For an infinite complete set of unifiers, this process of removing redundant unifiers may be infinite, and the set reached in the limit need no longer be complete. This is what happens for problems of type zero. The difference between infinitary and type zero is that a unification problem of type zero cannot even have a non-redundant complete set of unifiers, i.e., every complete set of unifiers must contain different unifiers σ, θ such that σ ≤• θ. More information on unification type zero can be found in [1].
When we say that EL has unification type zero, we mean that there exists an EL-unification problem that has type zero. Before we can prove in Section 4 that this is indeed the case, we must have a closer look at equivalence in EL in Section 3. But first, we consider unification modulo acyclic TBoxes.
2.3. Unification modulo acyclic TBoxes. A concept definition is of the form A ≐ C where A is a concept name and C is a concept term. A TBox T is a finite set of concept definitions such that no concept name occurs more than once on the left-hand side of a concept definition in T. The TBox T is called acyclic if there are no cyclic dependencies between its concept definitions. To be more precise, we say that the concept name A directly depends on the concept name B in a TBox T if T contains a concept definition A ≐ C and B occurs in C. Let depends on be the transitive closure of the relation directly depends on. Then T contains a terminological cycle if there is a concept name A that depends on itself. Otherwise, T is called acyclic. Given a TBox T, we call a concept name A a defined concept if it occurs as the left-hand side of a concept definition A ≐ C in T. All other concept names are called primitive concepts.
The interpretation I is a model of the TBox T iff A^I = C^I holds for all concept definitions A ≐ C in T. Subsumption and equivalence w.r.t. a TBox are defined as follows: C ⊑_T D iff C^I ⊆ D^I holds for all models I of T, and C ≡_T D iff C^I = D^I holds for all models I of T.
Subsumption and equivalence w.r.t. an acyclic TBox can be reduced to subsumption and equivalence of concept terms (without TBox) by expanding the concept terms w.r.t. the TBox: given a concept term C, its expansion C^T w.r.t. the acyclic TBox T is obtained by exhaustively replacing all defined concept names A occurring on the left-hand side of concept definitions A ≐ C in T by their defining concept terms C. Given concept terms C, D, we have C ⊑_T D iff C^T ⊑ D^T [14]. The same is true for equivalence, i.e., C ≡_T D iff C^T ≡ D^T. This expansion process may, however, result in an exponential blow-up [26,14], and thus this reduction of subsumption and equivalence w.r.t. an acyclic TBox to subsumption and equivalence without a TBox is not polynomial. Nevertheless, in EL, subsumption (and thus also equivalence) w.r.t. acyclic TBoxes can be decided in polynomial time [4].
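The expansion of a concept term and the exponential blow-up it can cause may be sketched as follows. The tuple encoding, the function names, and the family of TBoxes built by `blowup_tbox` are assumptions made for this illustration (the family is a hypothetical example in the spirit of the standard blow-up constructions, not one taken from the cited works).

```python
def expand(term, tbox):
    """Expansion of `term` w.r.t. an acyclic TBox (defined name -> term).

    Termination relies on the TBox being acyclic.
    """
    kind = term[0]
    if kind == "name":
        # Defined concepts are replaced by their (expanded) definitions.
        return expand(tbox[term[1]], tbox) if term[1] in tbox else term
    if kind == "and":
        return ("and", expand(term[1], tbox), expand(term[2], tbox))
    if kind == "exists":
        return ("exists", term[1], expand(term[2], tbox))
    return term  # top-concept

def size(term):
    """Number of constructor and concept name occurrences in a term."""
    if term[0] == "and":
        return 1 + size(term[1]) + size(term[2])
    if term[0] == "exists":
        return 1 + size(term[2])
    return 1

def blowup_tbox(n):
    """A_i ≐ ∃r.A_(i-1) ⊓ ∃s.A_(i-1) for i = 1..n; A_0 is primitive."""
    tbox = {}
    for i in range(1, n + 1):
        prev = ("name", f"A{i - 1}")
        tbox[f"A{i}"] = ("and", ("exists", "r", prev), ("exists", "s", prev))
    return tbox

# The example definitions of Mother and Woman from the introduction.
family = {
    "Mother": ("and", ("name", "Woman"), ("exists", "child", ("name", "Human"))),
    "Woman": ("and", ("name", "Human"), ("name", "Female")),
}
mother_expanded = expand(("name", "Mother"), family)

# Ten definitions of constant size yield an expansion with thousands of symbols.
tbox = blowup_tbox(10)
print(sum(size(c) for c in tbox.values()), size(expand(("name", "A10"), tbox)))
```

Each definition in the hypothetical family refers twice to the previous defined concept, so the expansion doubles in size at every level while the TBox itself only grows linearly.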
In our definition of unification modulo acyclic TBoxes, we assume that all defined concepts are concept constants. In fact, defined concepts already have a definition in the given TBox, and thus it does not make sense to introduce new ones for them by unification. In this setting, a substitution σ is a mapping from N_v into the set of all EL-concept terms not containing any defined concepts. The extension of σ to concept terms is defined as in the previous subsection, and its application to T is defined as σ(T) := {A ≐ σ(C) | A ≐ C ∈ T}.
[...] where all the concept names occurring on the right-hand side of these definitions are primitive concepts. Then the substitution that replaces Sports car by Car ⊓ Fast and Man by Human ⊓ Male is a unifier of {Real man ≡?_T Stupid man} w.r.t. the TBox T consisting of these two definitions.
Using expansion, we can reduce unification modulo an acyclic TBox to unification without a TBox. In fact, the following lemma is an easy consequence of the fact that C ≡_T D iff C^T ≡ D^T.
Lemma 2.5. The substitution σ is a unifier of {C_1 ≡?_T D_1, ..., C_n ≡?_T D_n} w.r.t. the acyclic TBox T iff it is a unifier of {C_1^T ≡? D_1^T, ..., C_n^T ≡? D_n^T}.
Since expansion can cause an exponential blow-up, this is not a polynomial reduction. In the remainder of this subsection, we show that there actually exists a polynomial-time reduction of unification modulo an acyclic TBox to unification without a TBox.
We say that the EL-unification problem Γ is in dag-solved form if it can be written as Γ = {X_1 ≡? C_1, ..., X_n ≡? C_n}, where X_1, ..., X_n are distinct concept variables such that, for all i ≤ n, X_i does not occur in C_i, ..., C_n. For i = 1, ..., n, let σ_i be the substitution that maps X_i to C_i and leaves all other variables unchanged. We define the substitution σ_Γ as σ_Γ(X_i) := σ_n(...(σ_i(X_i))...) for i = 1, ..., n, and σ_Γ(X) := X for all other variables X. The following is an instance of a well-known fact from unification theory [22].
Lemma 2.6. If Γ is an EL-unification problem in dag-solved form, then {σ_Γ} is a complete set of unifiers for Γ.
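The substitution induced by a problem in dag-solved form can be computed by resolving the equations from the last one to the first, since C_i may only mention variables X_j with j > i. The sketch below assumes a tuple encoding of terms and hypothetical function names; it first checks the dag-solved condition and then builds the substitution.

```python
def apply(sigma, term):
    """Apply a substitution (variable name -> term) to a tuple-encoded term."""
    kind = term[0]
    if kind == "name":
        return sigma.get(term[1], term)
    if kind == "and":
        return ("and", apply(sigma, term[1]), apply(sigma, term[2]))
    if kind == "exists":
        return ("exists", term[1], apply(sigma, term[2]))
    return term  # top-concept

def names_in(term):
    """All concept names occurring in a term."""
    kind = term[0]
    if kind == "name":
        return {term[1]}
    if kind == "and":
        return names_in(term[1]) | names_in(term[2])
    if kind == "exists":
        return names_in(term[2])
    return set()

def sigma_gamma(equations):
    """Substitution induced by a problem [(X_1, C_1), ..., (X_n, C_n)].

    Raises ValueError if the list is not in dag-solved form.
    """
    variables = [x for x, _ in equations]
    if len(set(variables)) != len(variables):
        raise ValueError("variables on the left-hand sides must be distinct")
    for i, x in enumerate(variables):
        if any(x in names_in(c) for _, c in equations[i:]):
            raise ValueError(f"{x} occurs in C_{i + 1}, ..., C_n")
    sigma = {}
    # Later variables are fully resolved before earlier ones refer to them.
    for x, c in reversed(equations):
        sigma[x] = apply(sigma, c)
    return sigma

A, B = ("name", "A"), ("name", "B")
X2, X3 = ("name", "X2"), ("name", "X3")
gamma = [
    ("X1", ("and", ("exists", "r", X2), X3)),  # X1 ≡? ∃r.X2 ⊓ X3
    ("X2", ("and", A, X3)),                    # X2 ≡? A ⊓ X3
    ("X3", B),                                 # X3 ≡? B
]
solution = sigma_gamma(gamma)
print(solution["X1"])
```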
There is a close relationship between acyclic TBoxes and unification problems in dag-solved form. In fact, if T is an acyclic TBox, then there is an enumeration A_1, ..., A_n of the defined concepts in T such that the unification problem Γ(T) := {A_1 ≡? E_1, ..., A_n ≡? E_n}, where A_i ≐ E_i is the concept definition of A_i in T (and A_1, ..., A_n are now viewed as concept variables), is in dag-solved form. In addition, it is easy to see that, for any EL-concept term C, we have C^T = σ_Γ(T)(C).
Lemma 2.7. The EL-unification problem {C_1 ≡?_T D_1, ..., C_n ≡?_T D_n} is solvable w.r.t. the acyclic TBox T iff {C_1 ≡? D_1, ..., C_n ≡? D_n} ∪ Γ(T) is solvable.
Proof. [...] Consequently, if we define the substitution τ by setting τ(X) := θ(σ_Γ(T)(X)) for all concept variables and defined concepts X, then τ is a unifier of {C_1 ≡? D_1, ..., C_n ≡? D_n} ∪ Γ(T).
Conversely, assume that τ is a unifier of {C_1 ≡? D_1, ..., C_n ≡? D_n} ∪ Γ(T). In particular, this implies that τ is a unifier of Γ(T). By Lemma 2.6, {σ_Γ(T)} is a complete set of unifiers for Γ(T), and thus there is a substitution θ such that τ(X) = θ(σ_Γ(T)(X)) for all concept variables occurring in the unification problem [...]
Since the size of Γ(T) is basically the same as the size of T, the size of Γ ∪ Γ(T) is linear in the size of Γ and T. Thus, the above lemma provides us with a polynomial-time reduction of EL-unification w.r.t. acyclic TBoxes to EL-unification.
Theorem 2.8. EL-unification w.r.t. acyclic TBoxes can be reduced in polynomial time to EL-unification.

Equivalence and subsumption in EL
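In order to characterize equivalence of EL-concept terms, the notion of a reduced EL-concept term is introduced in [24]. A given EL-concept term can be transformed into an equivalent reduced term by applying the following rules modulo associativity and commutativity of conjunction:
C ⊓ ⊤ → C for all EL-concept terms C;
A ⊓ A → A for all concept names A;
∃r.C ⊓ ∃r.D → ∃r.C for all EL-concept terms C, D with C ⊑ D.
Obviously, these rules are equivalence preserving. We say that the EL-concept term D is reduced if none of the above rules is applicable to it (modulo associativity and commutativity of ⊓), and that C can be reduced to D if D can be obtained from C by applying the above rules (modulo associativity and commutativity of ⊓). The EL-concept term D is a reduced form of C if C can be reduced to D and D is reduced. The following theorem is an easy consequence of Theorem 6.3.1 on page 181 of [24].
Theorem 3.1. Let C, D be EL-concept terms, and Ĉ, D̂ reduced forms of C, D, respectively. Then C ≡ D iff Ĉ is identical to D̂ up to associativity and commutativity of ⊓.
This theorem can also be used to derive a recursive characterization of subsumption in EL. In fact, if C ⊑ D, then C ⊓ D ≡ C, and thus C and C ⊓ D have the same reduced form. Thus, during reduction, all concept names and existential restrictions of D must be "eaten up" by corresponding concept names and existential restrictions of C.
Corollary 3.2. Let C = A_1 ⊓ ... ⊓ A_k ⊓ ∃r_1.C_1 ⊓ ... ⊓ ∃r_m.C_m and D = B_1 ⊓ ... ⊓ B_ℓ ⊓ ∃s_1.D_1 ⊓ ... ⊓ ∃s_n.D_n, where A_1, ..., A_k, B_1, ..., B_ℓ are concept names. Then C ⊑ D iff {B_1, ..., B_ℓ} ⊆ {A_1, ..., A_k} and, for every j, 1 ≤ j ≤ n, there exists an i, 1 ≤ i ≤ m, such that r_i = s_j and C_i ⊑ D_j.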
Note that this corollary also covers the cases where some of the numbers k, ℓ, m, n are zero.The empty conjunction should then be read as ⊤.The following lemma, which is an immediate consequence of this corollary, will be used in our proof that EL has unification type zero.
The following lemma states several other obvious consequences of Corollary 3.2.
If, for all i, If D 1 , . . ., D n are concept names or existential restrictions, then the implication in the other direction also holds.
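The notions of this section can be sketched in code, assuming the usual reduction rules for EL (remove ⊤ from conjunctions, remove duplicate concept names, and remove ∃r.D whenever some ∃r.C with C ⊑ D is present) and the recursive, atom-wise characterization of subsumption. The tuple encoding and all function names are assumptions of this illustration. Reduced forms are represented as frozensets of atoms, so that identity up to associativity and commutativity of ⊓ becomes plain equality.

```python
def conjuncts(term):
    """Top-level atoms of a term; the top-concept contributes none."""
    if term[0] == "top":
        return []
    if term[0] == "and":
        return conjuncts(term[1]) + conjuncts(term[2])
    return [term]

def atom_subsumed(a, b):
    """a ⊑ b for two atoms whose fillers are already reduced."""
    if b[0] == "name":
        return a == b
    return a[0] == "exists" and a[1] == b[1] and subsumed(a[2], b[2])

def subsumed(c, d):
    """C ⊑ D for reduced forms, given as frozensets of atoms."""
    return all(any(atom_subsumed(a, b) for a in c) for b in d)

def reduced(term):
    """Reduced form of a tuple-encoded term, as a frozenset of atoms."""
    kept = []
    for atom in conjuncts(term):
        if atom[0] == "exists":
            atom = ("exists", atom[1], reduced(atom[2]))
        if any(atom_subsumed(k, atom) for k in kept):
            continue  # redundant: a kept atom is already at least as specific
        # Drop kept atoms that the new, more specific atom makes redundant.
        kept = [k for k in kept if not atom_subsumed(atom, k)] + [atom]
    return frozenset(kept)

A, B, TOP = ("name", "A"), ("name", "B"), ("top",)

def ex(r, c):
    return ("exists", r, c)

def conj(*terms):
    result = terms[0]
    for t in terms[1:]:
        result = ("and", result, t)
    return result

# ∃r.A ⊓ ∃r.(A ⊓ B) ⊓ A ⊓ ⊤ ⊓ A  and  A ⊓ ∃r.(B ⊓ A)
c = conj(ex("r", A), ex("r", conj(A, B)), A, TOP, A)
d = conj(A, ex("r", conj(B, A)))

print(reduced(c) == reduced(d))                    # equivalent terms
print(subsumed(reduced(d), reduced(ex("r", A))))   # d ⊑ ∃r.A
```

Comparing reduced forms for equality is one way to test equivalence in this encoding; testing subsumption in both directions would work as well.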
In the proof of decidability of EL-unification, we will make use of the fact that the inverse strict subsumption order is well-founded.
Proof. We define the role depth of an EL-concept term C as the maximal nesting of existential restrictions in C. Let n_0 be the role depth of C_0. Since C_0 ⊑ C_i for i ≥ 1, it is an easy consequence of Corollary 3.2 that the role depth of C_i is bounded by n_0, and that C_i contains only concept and role names occurring in C_0. In addition, it is known that, for a given natural number n_0 and finite sets of concept names N_con and role names N_role, there are, up to equivalence, only finitely many EL-concept terms built using concept names from N_con and role names from N_role and of a role depth bounded by n_0 [15]. Consequently, there are indices i < j such that C_i ≡ C_j. This contradicts our assumption that C_i ⊏ C_j.
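The role depth used in this proof is straightforward to compute. The following sketch assumes a tuple encoding of concept terms; the encoding and the function name are choices made for this illustration.

```python
def role_depth(term):
    """Maximal nesting of existential restrictions in an EL-concept term."""
    kind = term[0]
    if kind == "exists":
        return 1 + role_depth(term[2])
    if kind == "and":
        return max(role_depth(term[1]), role_depth(term[2]))
    return 0  # concept names and the top-concept

A = ("name", "A")
# A ⊓ ∃r.(∃s.A ⊓ ∃r.⊤)
term = ("and", A, ("exists", "r",
                   ("and", ("exists", "s", A), ("exists", "r", ("top",)))))
print(role_depth(term))
```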

An EL-unification problem of type zero
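To show that EL has unification type zero, we exhibit an EL-unification problem that has this type.
Theorem 4.1. Let X, Y be variables. The EL-unification problem Γ := {X ⊓ ∃r.Y ≡? ∃r.Y} has unification type zero.
Proof. It is enough to show that any complete set of unifiers for this problem is redundant, i.e., contains two different unifiers that are comparable w.r.t. the instantiation preorder. Thus, let M be a complete set of unifiers for Γ.
First, note that M must contain a unifier that maps X to an EL-concept term not equivalent to ⊤ or ∃r.⊤. In fact, for a concept constant A, the substitution that maps X to ∃r.A and Y to A is a unifier of Γ, and it cannot be an instance of a unifier that maps X to a term equivalent to ⊤ or ∃r.⊤. Thus, let σ ∈ M be such that σ(X) ≢ ⊤ and σ(X) ≢ ∃r.⊤. Without loss of generality, we assume that C := σ(X) and D := σ(Y) are reduced. Since σ is a unifier of Γ, we have ∃r.D ⊑ C. Consequently, Lemma 3.3 yields that C is of the form C = ∃r.C_1 ⊓ ... ⊓ ∃r.C_n where n ≥ 1, C_1, ..., C_n are reduced and pairwise incomparable w.r.t. subsumption, and D ⊑ C_1, ..., D ⊑ C_n. We use σ to construct a new unifier σ̂ as follows: σ̂(X) := ∃r.C_1 ⊓ ... ⊓ ∃r.C_n ⊓ ∃r.Z and σ̂(Y) := D ⊓ Z, where Z is a new variable (i.e., one not occurring in C, D). The second part of Lemma 3.3 implies that σ̂ is indeed a unifier of Γ.
Next, we show that σ̂ ≤• σ. To this purpose, we consider the substitution λ that maps Z to C_1, and does not change any of the other variables. Then we have λ(σ̂(X)) = ∃r.C_1 ⊓ ... ⊓ ∃r.C_n ⊓ ∃r.C_1 ≡ C = σ(X) and λ(σ̂(Y)) = D ⊓ C_1 ≡ D = σ(Y). Note that the second equivalence holds since we have D ⊑ C_1.
Since M is complete, there exists a unifier θ ∈ M such that θ ≤• σ̂. Transitivity of the relation ≤• thus yields θ ≤• σ. Since σ and θ both belong to M, we have completed the proof of the theorem once we have shown that σ ≠ θ. Assume to the contrary that σ = θ. Then we have σ ≤• σ̂, and thus there exists a substitution µ such that µ(σ(X)) ≡ σ̂(X), i.e., ∃r.µ(C_1) ⊓ ... ⊓ ∃r.µ(C_n) ≡ ∃r.C_1 ⊓ ... ⊓ ∃r.C_n ⊓ ∃r.Z. Since C ≢ ∃r.⊤ and Z is a new variable, the right-hand side is a reduced conjunction of n + 1 atoms, whereas any reduced form of the left-hand side consists of at most n atoms. This contradicts the characterization of equivalence by reduced forms.
To sum up, we have shown that M contains two distinct unifiers σ, θ such that θ ≤• σ. Since M was an arbitrary complete set of unifiers for Γ, this shows that this unification problem cannot have a minimal complete set of unifiers.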

The decision problem
Before we can describe our decision procedure for EL-unification, we must introduce some notation. An EL-concept term is called an atom iff it is a concept name (i.e., concept constant or concept variable) or an existential restriction ∃r.D. Obviously, any EL-concept term is (equivalent to) a conjunction of atoms, where the empty conjunction is ⊤. The set At(C) of atoms of an EL-concept term C is defined inductively: if C = ⊤, then At(C) := ∅; if C is a concept name, then At(C) := {C}; if C = ∃r.D, then At(C) := {C} ∪ At(D); and if C = C_1 ⊓ C_2, then At(C) := At(C_1) ∪ At(C_2).
Concept names and existential restrictions ∃r.D where D is a concept name or ⊤ are called flat atoms. An EL-concept term is flat iff it is a conjunction of flat atoms (where the empty conjunction is ⊤). The EL-unification problem Γ is flat iff it consists of equations between flat EL-concept terms. By introducing new concept variables and eliminating ⊤, any EL-unification problem Γ can be transformed in polynomial time into a flat EL-unification problem Γ′ such that Γ is solvable iff Γ′ is solvable. Thus, we may assume without loss of generality that our input EL-unification problems are flat. Given a flat EL-unification problem Γ = {C_1 ≡? D_1, ..., C_n ≡? D_n}, we call the atoms of C_1, D_1, ..., C_n, D_n the atoms of Γ. Atoms of Γ that are not variables (i.e., not elements of N_v) are called non-variable atoms of Γ.
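The notions of atoms and flat terms, and one possible way of flattening a unification problem, can be sketched as follows. This is an illustration under assumptions: the tuple encoding, the function names, and the naming scheme `_V0, _V1, ...` for fresh variables are not from the paper, atoms are collected recursively (an existential restriction contributes itself and the atoms of its filler), and the sketch treats ∃r.⊤ as a flat atom and therefore does not eliminate ⊤.

```python
import itertools

def conjuncts(term):
    """Top-level atoms of a term; the top-concept contributes none."""
    if term[0] == "top":
        return []
    if term[0] == "and":
        return conjuncts(term[1]) + conjuncts(term[2])
    return [term]

def atoms(term):
    """All atoms of a term, including those nested inside fillers."""
    result = set()
    for atom in conjuncts(term):
        result.add(atom)
        if atom[0] == "exists":
            result |= atoms(atom[2])
    return result

def is_flat_atom(atom):
    return atom[0] == "name" or (atom[0] == "exists"
                                 and atom[2][0] in ("name", "top"))

def is_flat(term):
    return all(is_flat_atom(a) for a in conjuncts(term))

def conjunction(atom_list):
    """Rebuild a term from a list of atoms; the empty list yields ⊤."""
    if not atom_list:
        return ("top",)
    term = atom_list[0]
    for atom in atom_list[1:]:
        term = ("and", term, atom)
    return term

def flatten(problem):
    """Replace non-flat fillers by fresh variables defined in new equations."""
    fresh = (f"_V{i}" for i in itertools.count())
    todo, flat, new_vars = list(problem), [], set()

    def flat_side(term):
        out = []
        for atom in conjuncts(term):
            if not is_flat_atom(atom):
                var = next(fresh)
                new_vars.add(var)
                todo.append((("name", var), atom[2]))  # var ≡? old filler
                atom = ("exists", atom[1], ("name", var))
            out.append(atom)
        return conjunction(out)

    while todo:
        c, d = todo.pop()
        flat.append((flat_side(c), flat_side(d)))
    return flat, new_vars

A, B, X = ("name", "A"), ("name", "B"), ("name", "X")
nested = ("exists", "r", ("and", A, ("exists", "s", B)))  # ∃r.(A ⊓ ∃s.B)
flat_problem, fresh_vars = flatten([(nested, X)])
print(flat_problem, fresh_vars)
```

Each non-flat filler is moved into its own equation, so the number of new variables and equations is bounded by the number of existential restrictions in the input.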
The unifier σ of Γ is called reduced iff, for all concept variables X occurring in Γ, the EL-concept term σ(X) is reduced. It is ground iff, for all concept variables X occurring in Γ, the EL-concept term σ(X) does not contain variables. Obviously, Γ is solvable iff it has a reduced ground unifier. Given a ground unifier σ of Γ, the atoms of σ are the atoms of all the concept terms σ(X), where X ranges over all variables occurring in Γ.
Remark 5.1. In the following, we consider situations where all occurrences of a given reduced atom D in a reduced concept term C are replaced by a more general concept term, i.e., by a concept term D′ with D ⊏ D′. However, when we say occurrence of D in C, we mean occurrence modulo equivalence (≡) rather than syntactic occurrence. For example, if C = ∃r.(A ⊓ B) ⊓ ∃r.(B ⊓ A), D = ∃r.(A ⊓ B), and D′ = ∃r.A, then the term obtained by replacing all occurrences of D in C by D′ should be ∃r.A ⊓ ∃r.A, and not ∃r.A ⊓ ∃r.(B ⊓ A). Since C and D are reduced, equivalence is actually the same as being identical up to associativity and commutativity of ⊓. In particular, this means that any concept term that (syntactically) occurs in C and is equivalent to the atom D is also an atom, i.e., only atoms can be replaced by D′. In order to make this meaning of occurrence explicit we will call it occurrence modulo AC in the following. We will write D [...]
[...] ⊓ C′_n is not reduced, then its reduced form is actually a conjunction of m < n atoms, which contradicts [...] However, then C_i ≡ C′_1 ⊐ C_1 contradicts the fact that the atoms C_1, ..., C_n are incomparable w.r.t. subsumption.
Proposition 3.5 says that the inverse strict subsumption order on concept terms is well-founded. We use this fact to obtain a well-founded strict order ≻ on ground unifiers.
Definition 5.3. Let σ, θ be ground unifiers of Γ. We define σ ⪰ θ iff σ(X) ⊑ θ(X) holds for all variables X occurring in Γ, and write ≻ for the strict part of ⪰.
If Γ contains n variables, then ⪰ is the n-fold product of the order ⊑ with itself. Since the strict part ⊏ of the inverse subsumption order ⊑ is well-founded by Proposition 3.5, the strict part ≻ of ⪰ is also well-founded [13]. The ground unifier σ of Γ is called is-minimal iff there is no ground unifier θ of Γ such that σ ≻ θ. The following proposition is an easy consequence of the fact that ≻ is well-founded.
Proposition 5.4. Let Γ be an EL-unification problem. Then Γ is solvable iff it has an is-minimal reduced ground unifier.
In the following, we show that is-minimal reduced ground unifiers of flat EL-unification problems satisfy properties that make it easy to check (with an NP-algorithm) whether such a unifier exists or not.
Lemma 5.5. Let Γ be a flat EL-unification problem and γ an is-minimal reduced ground unifier of Γ. If C is an atom of γ, then there is a non-variable atom D of Γ such that C ≡ γ(D).
The main idea underlying the proof of this crucial lemma is that an atom C of a unifier σ that violates the condition of the lemma (i.e., that is not of the form C ≡ σ(D) for a non-variable atom D of Γ) can be replaced by a concept term D such that C ⊏ D, which yields a unifier of Γ that is smaller than σ w.r.t. ≻.
Before proving the lemma formally, let us illustrate this idea by two examples.
Example 5.6. First, consider the unification problem Γ_1 [...] The substitution σ_1 := {X ↦ A ⊓ B} is a unifier of Γ_1 that does not satisfy the condition of Lemma 5.5. In fact, B is an atom of σ_1, but none of the non-variable atoms D of Γ_1 (which are A, ∃r.A, and ∃r.X) satisfy B ≡ σ_1(D). The unifier σ_1 is not is-minimal since γ_1 := {X ↦ A}, which can be obtained from σ_1 by replacing the offending atom B with ⊤, is a unifier of Γ_1 that is smaller than σ_1 w.r.t. ≻. The unifier γ_1 is is-minimal, and it clearly satisfies the condition of Lemma 5.5. Second, consider the unification problem [...]
Proof of Lemma 5.5. Assume that γ is an is-minimal reduced ground unifier of Γ. Since γ is reduced, all atoms of γ are reduced. In particular, this implies that C is reduced, and since γ is ground, we know that C is either a concept constant or an existential restriction.
First, assume that C is of the form A for a concept constant A, but there is no non-variable atom D of Γ such that A ≡ γ(D). This simply means that A does not appear in Γ. Let γ′ be the substitution obtained from γ by replacing every occurrence of A by ⊤. Since equivalence in EL is preserved under replacing concept names by ⊤, and since A does not appear in Γ, it is easy to see that γ′ is also a unifier of Γ. However, since γ ≻ γ′, this contradicts our assumption that γ is is-minimal.
Second, assume that C is an existential restriction of the form ∃r.C_1, but there is no non-variable atom D of Γ such that C ≡ γ(D). We assume that C is maximal (w.r.t. subsumption) with this property, i.e., for every atom C′ of γ with C ⊏ C′, there is a non-variable atom D′ of Γ such that C′ ≡ γ(D′). Let D_1, ..., D_ℓ be all the non-variable atoms of Γ with C ⊑ γ(D_i) (i = 1, ..., ℓ). By our assumptions on C, we actually have C ⊏ γ(D_i) and, by Lemma 3.3, the atom D_i is also an existential restriction D_i = ∃r.D′_i (i = 1, ..., ℓ). We consider the conjunction D̂ := γ(D_1) ⊓ ... ⊓ γ(D_ℓ), where the empty conjunction (i.e., the case ℓ = 0) is ⊤. We will show in the following that γ′ := γ^[C/D̂] is a unifier of Γ that is smaller than γ w.r.t. ≻. This will then again contradict our assumption that γ is is-minimal.
Lemma 5.8. γ ≻ γ′.
Proof. Obviously, D̂ subsumes C. We claim that this subsumption relationship is actually strict. In fact, if ℓ = 0, then D̂ = ⊤, and since C is an atom, it is not equivalent to ⊤. If ℓ > 0 and C ≡ D̂, then, since C and γ(D_1), ..., γ(D_ℓ) are atoms, D̂ ⊑ C yields an index i such that γ(D_i) ⊑ C, and thus C ≡ γ(D_i), which contradicts the fact that C ⊏ γ(D_i). Thus, we have shown that C ⊏ D̂. Lemma 5.2 implies that γ ≻ γ′.
To complete the proof of Lemma 5.5, it remains to show the next lemma.
Lemma 5.9. γ^[C/D̂] is a unifier of Γ.
(1) Since C is an atom, we obviously have L We concentrate on proving the first identity since the second one can be shown analogously. To show the first identity, it is enough to prove that Thus, we have γ By our assumption on C, we have C ≡ γ(L j ), and thus γ(L j ) [C/ D] = ∃ r j .γ(L j ) [C/ D] . In addition, we have . Since L j is a flat atom, we know that L ′ j is either a concept constant, the top-concept ⊤, or a concept variable. In the first two cases, we can show γ(L ′ j ) [C/ D] = γ [C/ D] (L ′ j ) as in (1b), and in the third case we can show this identity as in (1a).
(2) Because of (1), if we can prove that L Without loss of generality, we concentrate on showing that , it is thus sufficient to show that, for every i, 1 ≤ i ≤ ν, there exists a j, 1 ≤ j ≤ µ, such that A (see (3) of Lemma 3.4).Since L = A 1 ⊓ . . .⊓ A µ ⊑ B 1 ⊓ . . .⊓ B ν = R and A 1 , . . ., A µ , B 1 , . . ., B ν are atoms, we actually know that, for every i, 1 ≤ i ≤ ν, there exists a j, 1 ≤ j ≤ µ, such that A j ⊑ B i .Thus, it is sufficient to show that . This is an easy consequence of the next lemma since A i , B j satisfy the conditions of this lemma.Lemma 5.10.Let A, B be reduced ground atoms such that B is an atom of γ or of the form γ(D) for a non-variable atom D of Γ.
by induction on the size of A.
(1) First, assume that A = AC C, which implies that Thus, the maximality of C implies that there is a non-variable atom D of Γ such that B ≡ γ(D).Thus, we are actually in case (a), which yields Otherwise, A is of the form A = ∃ s.E and C occurs in E (modulo AC ).Obviously, A ⊑ B then implies that B is of the form B = ∃ s.F with E ⊑ F .The concept terms E, F are conjunctions of reduced ground atoms, i.e., E = E 1 ⊓ . . .⊓ E κ and F = F 1 ⊓ . . .⊓ F λ where E 1 , . . ., E κ , F 1 , . . ., F λ are reduced ground atoms.By Corollary 3.2, for every h, 1 ≤ h ≤ λ, there exists k, In order to be able to assume, by induction, that , we must show that the conditions in the statement of the lemma hold for the concept terms E k , F h , where E k plays the rôle of A and F h plays the rôle of B. Since we already know that E 1 , . . ., E κ , F 1 , . . ., F λ are reduced ground atoms, it is sufficient to show that each of the atoms F 1 , . . ., F λ is an atom of γ or of the form γ(D) for a non-variable atom D of Γ.We know that B = ∃ s.(F 1 ⊓ . . .⊓ F λ ) is an atom of γ or an instance (w.r.t.γ) of a non-variable atom of Γ.In the first case, the atoms F 1 , . . ., F λ are clearly also atoms of γ.In the second case, B = γ(D ′ ) for a non-variable atom D ′ of Γ.If D ′ is a ground atom, then F 1 , . . ., F λ are also ground atoms that are atoms of Γ, and thus they are instances (w.r.t.γ) of non-variable atoms of Γ.Otherwise, since Γ is flat, D ′ is of the form ∃ s.X for a variable X and γ(X) = F 1 ⊓ . . .⊓ F λ .In this case, F 1 , . . ., F λ are clearly atoms of γ.
Thus, we can assume by induction: It remains to show that this implies ).Since we have A ), property ( * ) yields Thus, we have shown in all cases that A [C/ D] ⊑ B [C/ D] , which completes the proof of Lemma 5.10.
Overall, we have thus completed the proof of Lemma 5.5. The next proposition is an easy consequence of this lemma.
Proposition 5.11. Let Γ be a flat EL-unification problem and γ an is-minimal reduced ground unifier of Γ. If X is a concept variable occurring in Γ, then γ(X) ≡ ⊤ or there are non-variable atoms D_1, ..., D_n (n ≥ 1) of Γ such that γ(X) ≡ γ(D_1) ⊓ ... ⊓ γ(D_n).
Proof. If γ(X) ≢ ⊤, then it is a non-empty conjunction of atoms, i.e., there are atoms C_1, ..., C_n (n ≥ 1) such that γ(X) = C_1 ⊓ ... ⊓ C_n. Then C_1, ..., C_n are atoms of γ, and thus Lemma 5.5 yields non-variable atoms D_1, ..., D_n of Γ such that C_i ≡ γ(D_i) for i = 1, ..., n. Consequently, γ(X) ≡ γ(D_1) ⊓ ... ⊓ γ(D_n).
This proposition suggests the following non-deterministic algorithm for deciding solvability of a given flat EL-unification problem.
Algorithm 5.12. Let Γ be a flat EL-unification problem.
(1) For every variable X occurring in Γ, guess a finite, possibly empty, set S_X of non-variable atoms of Γ.
(2) We say that the variable X directly depends on the variable Y if Y occurs in an atom of S_X. Let depends on be the transitive closure of directly depends on. If there is a variable that depends on itself, then the algorithm returns "fail." Otherwise, there exists a strict linear order > on the variables occurring in Γ such that X > Y if X depends on Y.
(3) We define the substitution σ along the linear order >:
• If X is the least variable w.r.t. >, then S_X does not contain any variables. We define σ(X) to be the conjunction of the elements of S_X, where the empty conjunction is ⊤.
• Assume that σ(Y) is defined for all variables Y < X. Then S_X only contains variables Y for which σ(Y) is already defined. If S_X is empty, then we define σ(X) := ⊤. Otherwise, let S_X = {D_1, ..., D_n}. We define σ(X) := σ(D_1) ⊓ ... ⊓ σ(D_n).
(4) Test whether the substitution σ computed in the previous step is a unifier of Γ. If this is the case, then return σ; otherwise, return "fail."
This algorithm is trivially sound since it only returns substitutions that are unifiers of Γ. In addition, it obviously always terminates. Thus, to show correctness of our algorithm, it is sufficient to show that it is complete.
Lemma 5.13 (Completeness). If Γ is solvable, then there is a way of guessing in Step 1 subsets S_X of the non-variable atoms of Γ such that the depends on relation determined in Step 2 is acyclic and the substitution σ computed in Step 3 is a unifier of Γ.
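Proof. Assume that Γ is solvable. Then Γ has an is-minimal reduced ground unifier γ. By Proposition 5.11, for every variable X occurring in Γ, either γ(X) ≡ ⊤, in which case we set S_X := ∅, or there are non-variable atoms D_1, ..., D_n (n ≥ 1) of Γ with γ(X) ≡ γ(D_1) ⊓ ... ⊓ γ(D_n), in which case we set S_X := {D_1, ..., D_n}.
We show that the relation depends on induced by these sets S_X is acyclic, i.e., there is no variable X such that X depends on itself. If X directly depends on Y, then Y occurs in an element of S_X. Since S_X consists of non-variable atoms of the flat unification problem Γ, this means that there is a role name r such that ∃r.Y ∈ S_X. Consequently, we have γ(X) ⊑ ∃r.γ(Y). Thus, if X depends on X, then there are k ≥ 1 role names r_1, ..., r_k such that γ(X) ⊑ ∃r_1.···∃r_k.γ(X). This is clearly not possible since γ(X) cannot be subsumed by an EL-concept term whose role depth is larger than the role depth of γ(X).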
To show that the substitution σ induced by the sets S_X is a unifier of Γ, we prove that σ is equivalent to γ, i.e., σ(X) ≡ γ(X) holds for all variables X occurring in Γ. The substitution σ is defined along the linear order >. If X is the least variable w.r.t. >, then the elements of S_X do not contain any variables. If S_X is empty, then σ(X) = ⊤ ≡ γ(X). Otherwise, let S_X = {D_1, ..., D_n}. Since the atoms D_i do not contain variables, we have D_i = γ(D_i). Thus, the definitions of S_X and of σ yield σ(X) = D_1 ⊓ ... ⊓ D_n = γ(D_1) ⊓ ... ⊓ γ(D_n) ≡ γ(X).
Assume that σ(Y) ≡ γ(Y) holds for all variables Y < X. If S_X = ∅, then we have again σ(X) = ⊤ ≡ γ(X). Otherwise, let S_X = {D_1, ..., D_n}. Since the atoms D_i contain only variables that are smaller than X, we have σ(D_i) ≡ γ(D_i) by induction. Thus, the definitions of S_X and of σ yield σ(X) = σ(D_1) ⊓ ... ⊓ σ(D_n) ≡ γ(D_1) ⊓ ... ⊓ γ(D_n) ≡ γ(X).
Note that our proof of completeness actually shows that, up to equivalence, the algorithm returns all is-minimal reduced ground unifiers of Γ.
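The guess-and-test procedure of Algorithm 5.12 can be illustrated by a small deterministic simulation that enumerates all guesses instead of making them non-deterministically. The following Python sketch is not taken from the paper and rests on several assumptions: concept terms are encoded as frozensets of atoms (so ⊤ is the empty frozenset), atoms are either strings (concept constants or variables) or tuples built by the hypothetical helper ex, and subsumption is tested with the standard structural characterization for EL without a TBox. All function names are illustrative, and the exhaustive enumeration is exponential, so the sketch is only suitable for toy inputs.

```python
from itertools import chain, combinations, product

TOP = frozenset()  # the empty conjunction is ⊤

def ex(role, concept):
    """Existential restriction ∃role.concept (concept is a frozenset of atoms)."""
    return ("ex", role, concept)

def atom_subsumed(a, b):
    """a ⊑ b for atoms: equal names, or same role and subsumed fillers."""
    if isinstance(a, str) or isinstance(b, str):
        return a == b
    return a[1] == b[1] and subsumed(a[2], b[2])

def subsumed(c, d):
    """c ⊑ d: every top-level atom of d subsumes some top-level atom of c."""
    return all(any(atom_subsumed(a, b) for a in c) for b in d)

def atoms_of(c):
    """All atoms of c, including those below existential restrictions."""
    for a in c:
        yield a
        if not isinstance(a, str):
            yield from atoms_of(a[2])

def substitute(sigma, c):
    """Apply the substitution sigma (variable -> concept term) to c."""
    out = set()
    for a in c:
        if isinstance(a, str):
            out |= sigma.get(a, frozenset([a]))  # constants stay unchanged
        else:
            out.add(ex(a[1], substitute(sigma, a[2])))
    return frozenset(out)

def induced_substitution(S, variables):
    """Steps 2 and 3: return sigma, or None if a variable depends on itself."""
    sigma, visiting = {}, set()

    def define(x):
        if x in sigma:
            return True
        if x in visiting:
            return False  # x depends on itself
        visiting.add(x)
        for y in atoms_of(S[x]):
            if y in variables and not define(y):
                return False
        sigma[x] = substitute(sigma, S[x])
        return True

    return sigma if all(define(x) for x in S) else None

def is_unifier(sigma, gamma):
    """Step 4: sigma makes both sides of every equation equivalent."""
    sides = ((substitute(sigma, c), substitute(sigma, d)) for c, d in gamma)
    return all(subsumed(l, r) and subsumed(r, l) for l, r in sides)

def solve(gamma, variables):
    """Step 1 by exhaustive enumeration, followed by Steps 2 to 4."""
    candidates = list({a for c, d in gamma
                       for a in chain(atoms_of(c), atoms_of(d))
                       if a not in variables})
    subsets = [frozenset(s) for k in range(len(candidates) + 1)
               for s in combinations(candidates, k)]
    xs = sorted(variables)
    for choice in product(subsets, repeat=len(xs)):
        sigma = induced_substitution(dict(zip(xs, choice)), variables)
        if sigma is not None and is_unifier(sigma, gamma):
            return sigma
    return None

# Hypothetical flat problem {X ≡? A ⊓ ∃r.Y, Y ⊓ B ≡? B} with variables X, Y.
gamma = [(frozenset(["X"]), frozenset(["A", ex("r", frozenset(["Y"]))])),
         (frozenset(["Y", "B"]), frozenset(["B"]))]
sigma = solve(gamma, {"X", "Y"})
```

Which unifier is returned depends on the enumeration order; the sketch only guarantees that a returned substitution passes the test of Step 4, and that None is returned when no guess succeeds.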
Theorem 5.14. EL-unification is NP-complete.
Proof. NP-hardness follows from the fact that EL-matching is NP-complete [24]. To show that the problem can be decided by a non-deterministic polynomial-time algorithm, we analyze the complexity of our algorithm. Obviously, guessing the sets S_X (Step 1) can be done within NP. Computing the depends on relation and checking it for acyclicity (Step 2) is clearly polynomial.
Steps 3 and 4 are more problematic. In fact, since a variable may occur in different atoms of Γ, the substitution σ computed in Step 3 may be of exponential size. This is actually the same reason that makes a naive algorithm for syntactic unification compute an exponentially large most general unifier [16]. As in the case of syntactic unification, the solution to this problem is basically structure sharing. Instead of computing the substitution σ explicitly, we view its definition as an acyclic TBox. To be more precise, for every concept variable X occurring in Γ, the TBox T_σ contains the concept definition X ≐ ⊤ if S_X = ∅, and X ≐ D_1 ⊓ ... ⊓ D_n if S_X = {D_1, ..., D_n} (n ≥ 1). Instead of computing σ in Step 3, we compute T_σ. Because of the acyclicity test in Step 2, we know that T_σ is an acyclic TBox. The size of T_σ is obviously polynomial in the size of Γ, and thus this modified Step 3 is polynomial.
It is easy to see that applying the substitution σ to a concept term C is the same as expanding C w.r.t. the TBox T_σ, i.e., σ(C) = C^{T_σ}. This implies that, for every equation C ≡? D in Γ, we have C ≡_{T_σ} D iff σ(C) ≡ σ(D). Thus, testing in Step 4 whether σ is a unifier of Γ can be reduced to testing whether C ≡_{T_σ} D holds for every equation C ≡? D in Γ. Since subsumption (and thus equivalence) in EL w.r.t. acyclic TBoxes can be decided in polynomial time [4], this completes the proof of the theorem.
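The size gap between the explicit substitution and its TBox representation can be made concrete with a small computation. The following Python sketch uses a hypothetical family of assignments that is not taken from the paper: S_X0 = {A} and S_Xi = {∃r.X(i-1), ∃s.X(i-1)}. Atoms are encoded as strings (constants) or as pairs (role, filler); this encoding and the function names are assumptions made only for illustration. The sketch counts atom occurrences without ever building the expanded term.

```python
def chain_assignment(n):
    """Hypothetical assignment: S_X0 = {A}, S_Xi = {∃r.X(i-1), ∃s.X(i-1)}."""
    S = {"X0": ["A"]}
    for i in range(1, n + 1):
        S[f"X{i}"] = [("r", f"X{i-1}"), ("s", f"X{i-1}")]
    return S

def tbox_size(S):
    """Atoms written down in the definitions X ≐ D_1 ⊓ ... ⊓ D_n, unexpanded."""
    return sum(len(atoms) for atoms in S.values())

def expanded_size(S, x, memo=None):
    """Number of atom occurrences in sigma(x) when written out explicitly."""
    memo = {} if memo is None else memo
    if x not in S:  # x is a concept constant, not a variable
        return 1
    if x not in memo:
        memo[x] = sum(
            1 if isinstance(a, str) else 1 + expanded_size(S, a[1], memo)
            for a in S[x]
        )
    return memo[x]

S = chain_assignment(20)
print(tbox_size(S))             # grows linearly in n
print(expanded_size(S, "X20"))  # grows like 3 * 2**n - 2
```

For this family the explicit substitution roughly doubles with every additional variable, whereas the definitions grow by a constant amount, which is the effect that structure sharing avoids.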
In Subsection 2.3, we have shown that there exists a polynomial-time reduction of unification modulo an acyclic TBox to unification without a TBox. Thus, Theorem 5.14 also yields the exact complexity for EL-unification w.r.t. acyclic TBoxes.
Proof. The problem is in NP since Theorem 2.8 states that there is a polynomial-time reduction of EL-unification w.r.t. acyclic TBoxes to EL-unification, and we have just shown that EL-unification is in NP.
NP-hardness for EL-unification w.r.t. acyclic TBoxes follows from NP-hardness of EL-unification since EL-unification can be viewed as the special case of EL-unification w.r.t. acyclic TBoxes where the TBox is empty.

A goal-oriented algorithm
The NP-algorithm introduced in the previous section is a typical "guess and then test" NP-algorithm, and thus it is unlikely that a direct implementation of this algorithm will perform well in practice.Here, we introduce a more goal-oriented unification algorithm for EL, in which non-deterministic decisions are only made if they are triggered by "unsolved parts" of the unification problem.
As in the previous section, we assume without loss of generality that our input unification problem Γ 0 is flat.For a given flat equation C ≡ ?D, the concept terms C, D are thus conjunctions of flat atoms.We will often view such an equation as consisting of four sets: the left-hand side C is given by the set of variables occurring in the top-level conjunction of C, together with the set of non-variable atoms occurring in this top-level conjunction; the right-hand side D is given by the set of variables occurring in the top-level conjunction of D, together with the set of non-variable atoms occurring in this top-level conjunction.To be more precise, let e denote the equation Obviously, the equation e : C ≡ ?D is uniquely determined (up to associativity, commutativity, and idempotency of conjunction) by the four sets LVar (e), LAto(e), RVar (e), RAto(e).Instead of viewing an equation e as being given by a pair of concept terms, we can thus also view it as being given by these four sets.In the following, it will often be convenient to employ this representation of equations.If, with this point of view, we say that we add an atom to the set LAto(e) or RAto(e), then this means, for the other point of view, that we conjoin this atom to the top-level conjunction of the left-hand side or right-hand side of the equation.In addition, if we say that the equation e contains the variable X, then we mean that X ∈ LVar (e) ∪ RVar (e).Similarly, if we say that the left-hand side of e contains X, then we mean that X ∈ LVar(e), and if we say that the right-hand side of e contains X, then we mean that X ∈ RVar (e)). 
In addition to the unification problem itself, the algorithm also maintains, for every variable X occurring in the input problem Γ_0, a set S_X of non-variable atoms of Γ_0. Initially, all the sets S_X are empty. We call the set S_X the current assignment for X, and the collection of all these sets the current assignment. Throughout the run of our goal-oriented algorithm, we will ensure that the current assignment is acyclic in the sense that no variable depends on itself w.r.t. this assignment (see (2) of Algorithm 5.12). An acyclic assignment induces a substitution σ, as defined in (3) of Algorithm 5.12. We call this substitution the current substitution. Initially, the current substitution maps all variables to ⊤.
The algorithm applies rules that can (1) change an equation of the unification problem by adding non-variable atoms of the input problem Γ 0 to one side of the equation; (2) introduce a new flat equation of the form C ⊓ B ≡ B, where C, B are atoms of the input problem Γ 0 or ⊤; (3) add non-variable atoms of the input problem Γ 0 to the sets S X .Another property that is maintained throughout the run of our algorithm is that all equations e are expanded w.r.t. the current assignment in the following sense: for all variables X we have Given a flat equation e that contains the variable X, the expansion of e w.r.t. the assignment S X for X is defined as follows: if X ∈ LVar (e) then all elements of S X are added to LAto(e), and if X ∈ RVar (e) then all elements of S X are added to RAto(e).
The L-variant of the Eager-Assignment rule applies to the equation e if there is an unfinished variable X ∈ LVar(e) such that
• all variables Z ∈ (LVar(e) \ {X}) ∪ RVar(e) are finished;
• LAto(e) = ∅.
Its application sets S_X := RAto(e).
(1) If this makes the current assignment cyclic, then return "fail."
(2) Otherwise, label X as finished and expand all equations containing X w.r.t. the new assignment for X.
Figure 1: The Eager-Assignment rule in its L-variant. The R-variant is obtained by exchanging the rôles of the two sides of the equation.
The following lemma is an immediate consequence of the definition of expanded equations and of the construction of the current substitution.We say that an equation e is solved if LAto(e) = RAto(e).An atom A ∈ LAto(e) ∩ RAto(e) is called solved in e; atoms A ∈ LAto(e) ∪ RAto(e) that are not solved in e are called unsolved in e. Obviously, an equation e is solved iff all atoms A ∈ LAto(e) ∪ RAto(e) are solved in e.
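The four-set view of an equation, its expansion w.r.t. an assignment, and the notions of solved and unsolved atoms can be sketched in a few lines of Python. The class below is only an illustration under assumptions: the class and method names are hypothetical, atoms are represented by plain strings, and the example equation A ≡? X is made up for the demonstration rather than taken from the text.

```python
from dataclasses import dataclass, field

@dataclass
class Equation:
    lvar: set = field(default_factory=set)  # LVar(e): variables on the left
    lato: set = field(default_factory=set)  # LAto(e): non-variable atoms on the left
    rvar: set = field(default_factory=set)  # RVar(e): variables on the right
    rato: set = field(default_factory=set)  # RAto(e): non-variable atoms on the right

    def expand(self, x, s_x):
        """Expand e w.r.t. the assignment s_x for the variable x."""
        if x in self.lvar:
            self.lato |= s_x
        if x in self.rvar:
            self.rato |= s_x

    def solved_atoms(self):
        return self.lato & self.rato

    def unsolved_atoms(self):
        return self.lato ^ self.rato

    def is_solved(self):
        return self.lato == self.rato

# Hypothetical equation A ≡? X, expanded w.r.t. the assignment S_X = {A}.
e = Equation(lato={"A"}, rvar={"X"})
before = e.unsolved_atoms()
e.expand("X", {"A"})
```

Using sets for the four components builds associativity, commutativity, and idempotency of the top-level conjunction into the representation, so that adding an atom twice has no effect.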
Basically, in each step, the goal-oriented algorithm considers an unsolved equation and an unsolved atom in this equation, and tries to solve it. Picking the unsolved equation and the unsolved atom in it is don't care non-deterministic, i.e., there is no need to backtrack over such a choice. Once an unsolved equation and an unsolved atom in it have been picked, don't know non-determinism comes in since there may be several possibilities for how to solve this atom in the equation, some of which may lead to overall success whereas others won't. In some cases, however, a given equation uniquely determines the assignment for a certain variable X. In this case, we make this assignment and then label the variable X as finished. This has the effect that the set S_X can no longer be extended. Initially, none of the variables occurring in the input unification problem is labeled as finished. We say that the variable X is unfinished if it is not labeled as finished.
Algorithm 6.2. Let Γ_0 be a flat EL-unification problem. We define Γ := Γ_0 and S_X := ∅ for all variables X occurring in Γ_0. None of these variables is labeled as finished.
As long as Γ contains an unsolved equation, do the following:
(1) If the Eager-Assignment rule applies to some equation e, then apply it to this equation (see Figure 1).
(2) Otherwise, let e be an unsolved equation and A an unsolved atom in e. If neither of the rules Decomposition (see Figure 2) and Extension (see Figure 3) applies to A in e, then return "fail." If one of these rules applies to A in e, then (don't know) non-deterministically choose one of these rules and apply it.
Once all equations of Γ are solved, return the substitution σ that is induced by the current assignment.
The Eager-Assignment rule is described in Figure 1.Note that, after a non-failing application of this rule, the equation it was applied to is solved since the expansion of this  Since Y, Z are finished, the Eager-Assignment rule can now be applied to the third equation.This changes the assignment for X to S X = {∃ r.⊤}, labels X as finished, and adds ∃ r.⊤ to the left-hand side of the third equation.Now all equations are solved.The current assignment induces a substitution σ with σ(X) = ∃ r.⊤ = σ(Z) and σ(Y ) = ⊤, which is a unifier of the original set of equations.The Decomposition rule is described in Figure 2.This rule solves the unsolved atom A = ∃ r.C by adding it to the other side.For this to be admissible, one needs a more specific atom ∃ r.B on that side, where the "more specific" is meant to hold after application of the unifier.Thus, to ensure that the unifier σ computed by the algorithm satisfies σ(∃ r.B) ⊑ σ(∃ r.C), the rule adds the new equation C ⊓ B ≡ ?B. Obviously, if the substitution σ solves this equation, then it satisfies σ(B) ⊑ σ(C), and thus σ(∃ r.B) ⊑ σ(∃ r.C).As an example, consider the equation ∃ r.X ⊓ ∃ r.A ≡ ?∃ r.A, and assume that S X = ∅ and that X is unfinished.An application of the L-variant of the Decomposition rule to this equation adds ∃ r.X to the right-hand side of this equation, and thus solves it.In addition, it generates the new equation X ⊓ A ≡ ?A, which is solved.The current assignment induces a substitution σ with σ(X) = ⊤, which solves the original equation.
The L-variant of the Extension rule applies to the unsolved atom A of the equation e if
• A ∈ LAto(e) \ RAto(e);
• there is at least one unfinished variable X ∈ RVar(e).
Its application chooses (don't know) non-deterministically an unfinished variable X ∈ RVar(e) and adds A to S_X.
• If this makes the current assignment cyclic, then return "fail." • Otherwise, expand all equations containing X w.r.t. the new assignment for X.The Extension rule is described in Figure 3. Basically, this rule solves the unsolved atom A by extending with this atom the assignment of an unfinished variable contained in the other side of the equation.As an example, consider the equation where A is a concept constant, S X = ∅, and X is unfinished.An application of the Extension rule to A in this equation extends the assignment for X to S X = {A}, and expands this equation by adding A to the right-hand side.The equation obtained this way is solved.The substitution σ induced by the current assignment replaces X by A, and solves the original equation.Theorem 6.3.Algorithm 6.2 is an NP-algorithm for testing solvability of flat EL-unification problems.
First, we show that the algorithm is indeed an NP-algorithm.For this, we consider all runs of the algorithm, where for every (don't care) non-deterministic choice exactly one alternative is taken.Since a single rule application can obviously be realized in polynomial time, it is sufficient to show the following lemma.Lemma 6.4 (Termination).Every run of the algorithm terminates after a polynomial number of rule applications.
Proof.Each application of the Eager-Assignment rule finishes an unfinished variable.Thus, since finished variables never become unfinished again, it can only be applied k times, where k is the number of variables occurring in the input unification problem Γ 0 .This number is clearly linearly bounded by the size of Γ 0 .
Every application of the Decomposition rule or the Extension rule turns an unsolved atom in an equation into a solved one, and a solved atom in an equation never becomes unsolved again in this equation.For a fixed equation, in the worst case every atom of Γ 0 may become an unsolved atom of the equation that needs to be solved.There is, however, only a linear number of atoms of Γ 0 .Each equation considered during the run of the algorithm is either descended from an original equation of Γ 0 , or from an equation of the form C ⊓ B ≡ ?B for atoms ∃ r.B and ∃ r.C of Γ 0 .Thus, the number of equations is also polynomially bounded by the size of Γ 0 .Overall, this shows that the Decomposition rule and the Extension rule can only be applied a polynomial number of times.
Next, we show soundness of Algorithm 6.2.We call a run of this algorithm non-failing if it terminates with a unification problem containing only solved equations.Lemma 6.5 (Soundness).Let Γ 0 be a flat EL-unification problem.The substitution σ returned after a successful run of Algorithm 6.2 on input Γ 0 is an EL-unifier of Γ 0 .
Proof.First, note that the rules employed by Algorithm 6.2 indeed preserve the two invariants mentioned before: (1) the current assignment is always acyclic; (2) all equations are expanded.In fact, whenever the current assignment is extended, the rules test acyclicity (and return "fail," if it is not satisfied).In addition, they expand all equations w.r.t. the new assignment.Now, assume that the run of the algorithm has terminated with the EL-unification problem Γ, in which all equations are solved.The first invariant ensures that the final assignment constructed by the run is acyclic, and thus indeed induces a substitution σ.Because of the second invariant, Lemma 6.1 applies, and thus we know that σ is a solution of Γ.
It remains to show that the substitution σ is also a solution of the input problem Γ 0 .To this purpose, we take all the equations that were considered during the run of the algorithm, i.e., present in Γ 0 or in any of the other unification problems generated during the run.Let E denote the set of these equations.We define the relation → on E as follows: e → e ′ if e was transformed into e ′ using one of the rules of Algorithm 6.2.To be more precise, the Eager-Assignment rule transforms equations containing X from the current unification problem Γ by expanding them w.r.t. the new assignment for X.The same is true for the Extension rule.The decomposition rule transforms an equation e containing the unsolved atom A = ∃ r.C by adding this atom to the other side, which needs to contain an atom of the form ∃ r.B.For this new equation e ′ , we have e → e ′ .The decomposition rule may also generate a new equation e ′′ of the form C ⊓ B ≡ ?B (if this equation was not generated before).However, we do not view this equation as a successor of e w.r.t.→, i.e., we do not have e → e ′′ .Equations C ⊓ B ≡ ?B that are generated by an application of the decomposition rule are called D-equations.Equations that are elements of the input problem Γ 0 are called I-equations.Any equation e ′ that is not an I-equation or a D-equation has a unique predecessor w.r.t.→, i.e., there is an equation e ∈ E such that e → e ′ .
Starting with the set F := Γ we will now step by step extend F by a predecessor of an equation in F until no new predecessors can be added.Since E is finite, this process terminates after a finite number of steps.After termination we have E = F, and thus in particular Γ 0 ⊆ F. This is due to the fact that, for every element e 0 of E, there are n ≥ 0 elements e 1 , . . ., e n ∈ E such that e 0 → e 1 → . . .→ e n and e n ∈ Γ.Thus, it is enough to show that the set F satisfies the following invariant: ( * ) the substitution σ solves every equation in F.
Since σ is a solution of Γ, this invariant is initially satisfied. To prove that it is preserved under adding predecessors of equations in F, we start with the equations of minimal role depth. To be more precise, if the equation e is of the form C ≡? D, we define the role depth of e w.r.t. σ to be the role depth of the concept term σ(C) ⊓ σ(D). The strict order ≻_σ on E is defined as follows: e ≻_σ e′ iff the role depth of e w.r.t. σ is larger than the role depth of e′ w.r.t. σ. We write e ≈_σ e′ if e and e′ have the same role depth w.r.t. σ. The following is an easy consequence of the definition of σ and of our rules: (∗∗) e_1 → e_2 → ... → e_n implies e_1 ≈_σ e_2 ≈_σ ... ≈_σ e_n.
Assume that we have already constructed a set F such that the invariant ( * ) is satisfied.Let e ′ be an equation in F such that • there is an e ∈ E \ F with e → e ′ ; • e ′ is of minimal role depth with this property, i.e., if f ′ ∈ F is such that e ′ ≻ f ′ and f ′ has a predecessor f w.r.t.→, then f ∈ F. If no such equation e ′ exists, then we are finished, and we have E = F. Otherwise, let e ′ be such an equation and e its predecessor w.r.t.→.We add e to F. In order to show that the invariant ( * ) is still satisfied, we make a case distinction according to which rule was applied to e to produce e ′ : (1) Eager-Assignment.By an application of this rule, the assignment for X is modified from S X = ∅ to S X = {A 1 , . . ., A n }, where A 1 , . . ., A n are non-variable atoms.In addition, X is labeled as finished.Since the assignment of a finished variable cannot be changed anymore, we know that we also have S X = {A 1 , . . ., A n } in the final assignment, and thus σ(X) = σ(A 1 ) ⊓ . . .⊓ σ(A n ).The rule modifies equations as follows: all equations containing X are expanded w.r.t. the assignment S X = {A 1 , . . ., A n }.Since e is transformed into e ′ using this rule, it must contain X.We assume for the sake of simplicity that X is contained in the left-hand side of e, but not in the right-hand side, i.e., e is of the form C ⊓ X ≡ ?Consequently, it is sufficient to prove σ(B) ⊑ σ(C).The Decomposition rule also generates the equation C ⊓ B ≡ ?B and expands it w.r.t. the assignments of all the variables contained in this equation, unless this equation has already been generated before.Thus, either this application or a previous one of the Decomposition rule has generated the equation C ⊓B ≡ ?B, and then expanded it (w.r.t. 
the current assignment at that time) to an equation e 1 .Since atoms are never removed from an assignment, the atoms present in the assignment at the time when the Decomposition rule generated the equation C ⊓ B ≡ ?B are also present in the final assignment used to define the substitution σ.Thus, if we can show that σ solves e 1 , then we have also shown that σ solves C ⊓ B ≡ ?B, and thus satisfies σ(B) ⊑ σ(C).
Since equations are never completely removed by our rules, but only modified, there is a sequence of equations e_1 → e_2 → ... → e_n such that e_n ∈ Γ. Property (∗∗) thus yields e_1 ≈_σ e_2 ≈_σ ... ≈_σ e_n. In addition, the role depth of C ⊓ B ≡? B w.r.t. σ is the same as the role depth of e_1 w.r.t. σ. Consequently, we have e′ ≻ e_i for all i, 1 ≤ i ≤ n. Now, assume that e_1 ∉ F. Since e_n ∈ Γ ⊆ F, there is then an i > 1 such that e_i ∈ F, but e_{i−1} ∈ E \ F. This contradicts our assumption that e′ is minimal. Thus, we have shown that e_1 ∈ F, and this implies that σ solves e_1.
Overall, this finishes the proof that σ solves e. (3) Extension.By an application of this rule, the assignment for X is modified by adding a non-variable atom A to it.Since atoms are never removed from an assignment, we know that we also have A ∈ S X in the final assignment, and thus σ(X) ⊑ σ(A).
The rule modifies equations as follows: all equations containing X are expanded w.r.t. the new assignment for X.Since e is transformed into e ′ using this rule, it must contain X.We assume for the sake of simplicity that X is contained in the left-hand side of e, but not in the right-hand side, i.e., e is of the form C ⊓ X ≡ ?D and the new equation e ′ obtained from e is C ⊓ X ⊓ A ≡ ?D. Since σ solves e ′ , we have σ(D) ≡ σ(C ⊓ X ⊓ A) ≡ σ(C) ⊓ σ(X) ⊓ σ(A) ≡ σ(C) ⊓ σ(X) ≡ σ(C ⊓ X), which shows that σ also solves e.To sum up, we have shown that the invariant ( * ) is still satisfied after adding e to F. This completes the proof of soundness of our procedure.
It remains to show completeness of Algorithm 6.2. Thus, assume that the input unification problem Γ_0 is solvable. Proposition 5.4 tells us that Γ_0 then has an is-minimal reduced ground unifier γ, and Proposition 5.11 implies that, for every variable X occurring in Γ_0, there is a set S^γ_X of non-variable atoms of Γ_0 such that γ(X) ≡ γ(⊓S^γ_X), where, for a set of non-variable atoms S of Γ_0, the expression ⊓S denotes the conjunction of the elements of S (where the empty conjunction is ⊤).
Lemma 6.6 (Completeness). Let Γ_0 be a flat EL-unification problem, and assume that γ is an is-minimal reduced ground unifier of Γ_0. Then there is a successful run of Algorithm 6.2 on input Γ_0 that returns a unifier σ that is equivalent to γ, i.e., satisfies σ(X) ≡ γ(X) for all variables X occurring in Γ_0.
Proof.The algorithm starts with Γ := Γ 0 and the initial assignment S X := ∅ for all variables X occurring in Γ 0 .It then applies rules that change Γ and the current assignment as long as the problem Γ contains an unsolved equation.
We use γ to guide the (don't know) non-deterministic choices to be made during the algorithm.We show that this ensures that the run of the algorithm generated this way does not fail and that the following invariants are satisfied throughout this run: (I 1 ) γ is a unifier of Γ; (I 2 ) for all atoms B ∈ S X there exists an atom A ∈ S γ X such that γ(A) ⊑ γ(B); (I 3 ) for all finished variables X we have γ(X) ≡ γ( S X ).Before constructing a run that satisfies these invariants, let us point out two interesting consequences that they have: (C 1 ) The current assignment is always acyclic.In fact, if X directly depends on Y , then there is an atom B ∈ S X that has the form B = ∃ r.Y for some role name r. Invariant I 2 then implies that there is an A ∈ S γ X such that γ(X) ⊑ γ(A) ⊑ γ(B) = ∃ r.γ(Y ).Thus, if X depends on X, then there are k ≥ 1 role names r 1 , . . ., r k such that γ(X) ⊑ ∃ r 1 .• • • ∃ r k .γ(X),which is impossible.already shown that γ(X) ≡ γ(A 1 ) ⊓ . . .⊓ γ(A n ), the invariant I 2 holds by Corollary 3.2.Note that this also implies that the new assignment is acyclic, and thus the application of the Eager-Assignment rule does not fail.Finally, consider the invariant I 1 .The rule application modifies equations containing X by adding the atoms A 1 , . . ., A n .Since γ(X) ≡ γ(A 1 )⊓. ..⊓γ(A n ), an equation that was solved by γ before this modification, is also solved by γ after this modification.To sum up, we have shown that the application of the Eager-Assignment rule does not fail and preserves the invariants.
(3) If there is no unsolved equation to which the Eager-Assignment rule applies, then the algorithm picks an unsolved equation e and an unsolved atom A occurring in this equation.We must show that we can apply either the Decomposition or the Extension rule to A in e such that the invariants stay satisfied.Without loss of generality, we assume that the unsolved atom A occurs on the left-hand side of the equation e.We apply the Decomposition rule to A and B i .The application of this rule modifies the equation e to an equation e ′ by adding the atom A to the righthand side.In addition, it generates the equation C ⊓ B ≡ ?B and expands it w.r.t. the assignments of all variables contained in this equation (unless this equation has been generated before).After the application of this rule, the invariants I 2 and I 3 are still satisfied since the current assignments and the set of finished variables remain unchanged.Regarding invariant I 1 , since γ solves e, it obviously also solves e ′ due to the fact that γ(B i ) ⊑ γ(A) and B i is a conjunct on the right-hand side of e.In addition, γ(B) ⊑ γ(C) implies that γ also solves the equation C ⊓B ≡ ?B. Since invariant I 2 is satisfied, this implies that γ also solves the equation obtained from C ⊓ B ≡ ?B by expanding it w.r.t. the assignments of all variables contained in it.(ii) Assume that there is no i, 1 ≤ i ≤ n, such that B i is an existential restriction satisfying γ(B i ) ⊑ γ(A).Thus, if B i is such that γ(B i ) ⊑ γ(A), then we know that B i = X is a variable.We want to apply the Extension rule to A and X.
To be able to do this, we must first show that X is not a finished variable. Thus, assume that X is finished, and let S_X = {C_1, ..., C_ℓ}. Invariant I_3 yields γ(C_1) ⊓ ... ⊓ γ(C_ℓ) ≡ γ(X) = γ(B_i) ⊑ γ(A) = ∃r.γ(C), and thus there is a j, 1 ≤ j ≤ ℓ, such that γ(C_j) ⊑ γ(A). Since A is an existential restriction, the non-variable atom C_j must also be an existential restriction, and since the equation e is expanded, C_j ∈ S_X occurs on the right-hand side of this equation. This contradicts our assumption that there is no such existential restriction on the right-hand side. Thus, we have shown that X is not finished, which means that we can apply the Extension rule to A and X. The application of this rule adds the atom A to the assignment for X, and it expands all equations containing X w.r.t. this new assignment, i.e., it adds A to the left-hand side and/or right-hand side of an equation whenever X is
The concept term C is subsumed by the concept term D (written C ⊑ D) iff C^I ⊆ D^I holds for all interpretations I. We say that C is equivalent to D (written C ≡ D) iff C ⊑ D and D ⊑ C, i.e., iff C^I = D^I holds for all interpretations I. The concept term C is strictly subsumed by the concept term D (written C ⊏ D) iff C ⊑ D and C ≢ D.

2.3.
Unification modulo acyclic TBoxes. A concept definition is of the form A .

Definition 2.4.
An EL-unification problem modulo an acyclic TBox is of the form Γ = {C_1 ≡?_T D_1, ..., C_n ≡?_T D_n}, where C_1, D_1, ..., C_n, D_n are EL-concept terms, and T is an acyclic EL-TBox. The substitution σ is a unifier (or solution) of Γ modulo T iff σ(C_i) ≡_σ(T) σ(D_i) for i = 1, ..., n. In this case, Γ is called solvable modulo T or unifiable modulo T. Coming back to our example from the introduction, assume that one knowledge engineer has written the concept definition Real_man ≐ Human ⊓ Male ⊓ ∃loves.Sports_car to the TBox, whereas a second one has written the definition Stupid_man ≐ Man ⊓ ∃loves.(Car ⊓ Fast),

Theorem 3.1.
Let C, D be EL-concept terms, and Ĉ, D̂ reduced forms of C, D, respectively. Then C ≡ D iff Ĉ is identical to D̂ up to associativity and commutativity of ⊓.
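The theorem suggests a simple equivalence test: reduce both terms and compare the results modulo AC. The sketch below is one possible reading, not the paper's procedure. It assumes a hypothetical encoding in which a concept term is a `frozenset` of atoms (the empty set standing for ⊤), so that associativity, commutativity and idempotency of ⊓ are built into the data structure. It also assumes the standard structural characterisation of subsumption in EL without a TBox, and that a reduced form is obtained by reducing fillers recursively and then dropping every atom that subsumes another atom of the same conjunction.

```python
# Assumed encoding (not from the paper): a term is a frozenset of atoms;
# an atom is a concept name (str) or a pair (role, term) for "exists role.term".

def term(*atoms):
    return frozenset(atoms)

def atom_subsumed(a, b):
    """a ⊑ b for atoms: equal names, or same role with subsumed fillers."""
    if isinstance(a, str) or isinstance(b, str):
        return a == b
    return a[0] == b[0] and subsumed(a[1], b[1])

def subsumed(c, d):
    """c ⊑ d: every atom of d subsumes some atom of c."""
    return all(any(atom_subsumed(a, b) for a in c) for b in d)

def reduce_term(c):
    """Reduce fillers first, then drop atoms that subsume another atom."""
    atoms = {a if isinstance(a, str) else (a[0], reduce_term(a[1])) for a in c}
    return frozenset(
        b for b in atoms
        if not any(a != b and atom_subsumed(a, b) for a in atoms)
    )

# C = A ⊓ ∃r.A ⊓ ∃r.(A ⊓ B)   and   D = ∃r.(B ⊓ A) ⊓ A
C = term("A", ("r", term("A")), ("r", term("A", "B")))
D = term(("r", term("B", "A")), "A")

# Frozenset equality plays the role of identity modulo AC.
equivalent = reduce_term(C) == reduce_term(D)
```

Here the atom ∃r.A is redundant in C because ∃r.(A ⊓ B) ⊑ ∃r.A, so both terms reduce to A ⊓ ∃r.(A ⊓ B).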

We write D_1 =_AC D_2 to express that the atoms D_1 and D_2 are identical up to associativity and commutativity of ⊓. Obviously, D_1 =_AC D_2 implies D_1 ≡ D_2.

Lemma 5.2. Let C, D, D′ be EL-concept terms such that D is a reduced atom, D ⊏ D′, and C is reduced and contains at least one occurrence of D modulo AC. If C′ is obtained from C by replacing all occurrences of D by D′, then C ⊏ C′.

Proof. We prove the lemma by induction on the size of C. If C =_AC D, then C′ = D′, and thus C ≡ D ⊏ D′ = C′, which yields C ⊏ C′. Thus, assume that C ≠_AC D. In this case, C cannot be a concept name since it contains the atom D. If C = ∃r.C_1, then D occurs in C_1 modulo AC. By induction, we can assume that C_1 ⊏ C′_1, where C′_1 is obtained from C_1 by replacing all occurrences of D (modulo AC) by D′. Thus, we have C = ∃r.C_1 ⊏ ∃r.C′_1 = C′ by Corollary 3.2. Finally, assume that C = C_1 ⊓ ... ⊓ C_n for n > 1 atoms C_1, ..., C_n. Since C is reduced, these atoms are incomparable w.r.t. subsumption, and since the atom D occurs in C modulo AC, we can assume without loss of generality that D occurs in C_1 modulo AC. Let C′_1, ..., C′_n be respectively obtained from C_1, ..., C_n by replacing every occurrence of D (modulo AC) by D′, and then reducing the concept term obtained this way. By induction, we have C_1 ⊏ C′_1. Assume that C ⊏ C′ does not hold. Since the concept constructors of EL are monotone w.r.t. subsumption ⊑, we have C ⊑ C′, and thus the failure of C ⊏ C′ means that C ≡ C′. Consequently, C = C_1 ⊓ ... ⊓ C_n and the reduced form of C′_1 ⊓ ... ⊓ C′_n must be equal up to associativity and commutativity of ⊓. If C′_1 ⊓.

Definition 5.7.
Given an EL-concept term F, the concept term F^[C/D] is obtained from F by replacing every occurrence of C (modulo AC) by D. The substitution γ^[C/D] is obtained from γ by replacing every occurrence of C (modulo AC) by D, i.e., γ^[C/D](X) := γ(X)^[C/D] for all variables X.
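The replacement operation of Definition 5.7 can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: terms are encoded as `frozenset`s of atoms so that equality of atoms coincides with identity modulo AC, the replaced object C is assumed to be an atom, and the replacement D is assumed to be a term (a set of atoms). The function names are hypothetical. Occurrences are matched top-down on the original term, so an atom that only becomes equal to C after an inner replacement is not replaced again.

```python
# Assumed encoding (not from the paper): a term is a frozenset of atoms;
# an atom is a concept name (str) or a pair (role, term).

def term(*atoms):
    return frozenset(atoms)

def replace_in_term(f, c, d):
    """Sketch of F^[C/D]: replace each occurrence of atom c in f by term d."""
    result = set()
    for atom in f:
        if atom == c:
            result |= d                      # occurrence found: splice in d
        elif isinstance(atom, str):
            result.add(atom)                 # other concept names are kept
        else:
            role, filler = atom
            result.add((role, replace_in_term(filler, c, d)))
    return frozenset(result)

def replace_in_substitution(gamma, c, d):
    """Sketch of gamma^[C/D]: apply the replacement to every gamma(X)."""
    return {x: replace_in_term(t, c, d) for x, t in gamma.items()}

# gamma(X) = A ⊓ ∃r.(A ⊓ B),  gamma(Y) = ∃s.∃r.(B ⊓ A)
gamma = {
    "X": term("A", ("r", term("A", "B"))),
    "Y": term(("s", term(("r", term("B", "A"))))),
}
c = ("r", term("A", "B"))        # the atom ∃r.(A ⊓ B)
d = term(("r", term("A")))       # replaced by ∃r.A, which strictly subsumes it

gamma_replaced = replace_in_substitution(gamma, c, d)
```

The result of a replacement need not be reduced; a caller that relies on reduced forms would have to reduce it again.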
(a) If B is of the form B ≡ γ(D) for a non-variable atom D of Γ, then there is an h, 1 ≤ h ≤ n, such that D = D_h, which shows that A^[C/D] ⊑ B. Since C ⊑ D and the constructors of EL are monotone w.r.t. subsumption, we also have B ⊑ B^[C/D], and thus A^[C/D] ⊑ B^[C/D].

(b) Assume that B is an atom of γ. If B =_AC C, then B^[C/D] = D, and thus A^[C/D] = B^[C/D], which implies A^[C/D] ⊑ B^[C/D]. Otherwise, since C, B are reduced atoms, B ≠_AC C implies B ≢ C. Together with C ≡ A ⊑ B, this shows that C ⊏ B.
The L-variant of the Decomposition rule applies to the unsolved atom A in the equation e if
• A ∈ LAto(e) \ RAto(e);
• A is of the form A = ∃r.C;
• there is at least one atom of the form ∃r.B ∈ RAto(e).
Its application chooses (don't know) non-deterministically an atom of the form ∃r.B ∈ RAto(e) and
• adds ∃r.C to RAto(e);
• creates a new equation C ⊓ B ≡? B and expands it w.r.t. the assignments of all variables contained in this equation, unless this equation has already been generated before. If the equation has already been generated before, it is not generated again.
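The applicability conditions and the effect of the L-variant can be sketched as below. This is a simplified illustration, not the paper's algorithm. It assumes that an equation is a pair (LAto, RAto) of `frozenset`s of atoms, that atoms are flat (a symbol, or a pair `(role, symbol)`), that `assignments` maps a variable X to a set S_X of non-variable atoms so that one expansion pass suffices, and that previously generated equations are identified by the pair (C, B). All function names are hypothetical. The don't-know non-determinism is modelled by listing every admissible choice, each of which a caller would have to explore.

```python
# Assumed encoding (not from the paper): an atom is a symbol (str, concept
# name or variable) or a flat existential restriction (role, symbol).

def choices_L(eq, atom):
    """All atoms ∃r.B in RAto(e) that the rule may choose for atom = ∃r.C."""
    left, right = eq
    if atom not in left or atom in right or isinstance(atom, str):
        return []                            # rule not applicable to this atom
    return sorted(b for b in right
                  if not isinstance(b, str) and b[0] == atom[0])

def expand(side, assignments):
    """Add S_X to a side for every variable X occurring on that side."""
    extra = set()
    for a in side:
        if isinstance(a, str):
            extra |= assignments.get(a, set())
    return frozenset(side | extra)

def apply_L(eq, atom, chosen, assignments, generated):
    """Apply the rule for one choice; return (updated equation, new equation)."""
    left, right = eq
    c, b = atom[1], chosen[1]
    updated = (left, right | {atom})         # add ∃r.C to RAto(e)
    if (c, b) in generated:                  # C ⊓ B ≡? B was generated before
        return updated, None
    generated.add((c, b))
    new_eq = (expand(frozenset({c, b}), assignments),
              expand(frozenset({b}), assignments))
    return updated, new_eq

# Equation e:  ∃r.X ⊓ A  ≡?  ∃r.Y ⊓ ∃r.B ⊓ A,  with S_X = {A}
e = (frozenset({("r", "X"), "A"}),
     frozenset({("r", "Y"), ("r", "B"), "A"}))
assignments = {"X": {"A"}}
generated = set()

options = choices_L(e, ("r", "X"))
updated, new_eq = apply_L(e, ("r", "X"), ("r", "Y"), assignments, generated)
_, repeated = apply_L(e, ("r", "X"), ("r", "Y"), assignments, generated)
```

Choosing ∃r.Y yields the new equation X ⊓ Y ≡? Y, whose left-hand side is expanded with S_X. The second call produces no new equation because the same pair was recorded in `generated`.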

Figure 2: The Decomposition rule in its L-variant. The R-variant is obtained by exchanging the rôles of the two sides of the equation.

Figure 3: The Extension rule in its L-variant. The R-variant is obtained by exchanging the rôles of the two sides of the equation.

Table 1: Syntax and semantics of EL