Formalizing Randomized Matching Algorithms

Using Jeřábek's framework for probabilistic reasoning, we formalize the correctness of two fundamental RNC^2 algorithms for bipartite perfect matching within the theory VPV for polytime reasoning. The first algorithm tests whether a bipartite graph has a perfect matching, and is based on the Schwartz-Zippel Lemma for polynomial identity testing applied to the Edmonds polynomial of the graph. The second algorithm, due to Mulmuley, Vazirani and Vazirani, finds a perfect matching; its key ingredient is the Isolating Lemma.


Introduction
There is a substantial literature on theories such as PV, S^1_2, VPV, and V^1, which capture polynomial time reasoning [8,4,17,7]. These theories prove the existence of polynomial time functions, and in many cases they can prove properties of these functions that assert the correctness of the algorithms computing them. But in general these theories cannot prove the existence of probabilistic polynomial time relations such as those in ZPP, RP, BPP, because defining the relevant probabilities involves defining cardinalities of exponentially large sets. Of course stronger theories, those which can define #P or PSPACE functions, can treat these probabilities, but such theories are too powerful to capture the spirit of feasible reasoning.
Note that we cannot hope to find theories that exactly capture probabilistic complexity classes such as ZPP, RP, BPP, because these are 'semantic' classes which we suppose are not recursively enumerable (cf. [30]). Nevertheless there has been significant progress toward developing tools in weak theories that might be used to describe some of the algorithms in these classes.
Paris, Wilkie and Woods [25] and Pudlák [26] observed that we can simulate approximate counting in bounded arithmetic by applying variants of the weak pigeonhole principle. It seems unlikely that any of these variants can be proven in the theories for polynomial time, but they can be proven in Buss's theory S_2 for the polynomial hierarchy. The first connection between the weak pigeonhole principle and randomized algorithms was noticed by Wilkie (cf. [17]), who showed that randomized polytime functions witness Σ^b_1-consequences of S^1_2 + sWPHP(PV) (i.e., Σ^B_1-consequences of V^1 + sWPHP(L_FP) in our two-sorted framework), where sWPHP(PV) denotes the surjective weak pigeonhole principle for all PV functions (i.e., polytime functions).
Building on these early results, Jeřábek [13] showed that we can "compare" the sizes of two bounded P/poly definable sets within VPV by constructing a surjective mapping from one set to another. Using this method, Jeřábek developed tools for describing algorithms in ZPP and RP. He also showed in [13,14] that the theory VPV + sWPHP(L_FP) is powerful enough to formalize proofs of very sophisticated derandomization results, e.g., the Nisan-Wigderson theorem [24] and the Impagliazzo-Wigderson theorem [12]. (Note that Jeřábek actually used the single-sorted theory PV_1 + sWPHP(PV), but these two theories are isomorphic.) In [15], Jeřábek developed an even more systematic approach by showing that for any bounded P/poly definable set, there exists a suitable pair of surjective "counting functions" which can approximate the cardinality of the set up to a polynomially small error. From this and other results he argued convincingly that VPV + sWPHP(L_FP) is the "right" theory for reasoning about probabilistic polynomial time algorithms. However, so far no one has used his framework for feasible reasoning about specific interesting randomized algorithms in classes such as RP and RNC^2.
In the present paper we analyze (in VPV) two such algorithms using Jeřábek's framework. The first one is the RNC^2 algorithm for determining whether a bipartite graph has a perfect matching, based on the Schwartz-Zippel Lemma [27,33] for polynomial identity testing applied to the Edmonds polynomial [9] associated with the graph. The second algorithm, due to Mulmuley, Vazirani and Vazirani [22], is in the function class associated with RNC^2, and uses the Isolating Lemma to find such a perfect matching when it exists. Proving correctness of these algorithms involves proving that the probability of error is bounded above by 1/2. We formulate this assertion in a way suggested by Jeřábek's framework (see Definition 2.1). This involves defining polynomial time functions from {0, 1}^n onto {0, 1} × Φ(n), where Φ(n) is the set of random bit strings of length n which cause an error in the computation. We then show that VPV proves that the function is a surjection.
Our proofs are carried out in the theory VPV for polynomial time reasoning, without the surjective weak pigeonhole principle sWPHP(L_FP). Jeřábek used the sWPHP(L_FP) principle to prove theorems justifying the above definition of error probability, but we do not need it to apply his definition.
Many proofs concerning determinants are based on the Lagrange expansion (also known as the Leibniz formula), where the sum is over exponentially many terms. Since our proofs in VPV can only use polynomial time concepts, we cannot formalize such proofs, and must use other techniques. In the same vein, the standard proof of the Schwartz-Zippel Lemma assumes that a multivariate polynomial given by an arithmetic circuit can be expanded to a sum of monomials. But this sum in general has exponentially many terms, so again we cannot directly formalize this proof in VPV.

Preliminaries
2.1. Basic bounded arithmetic. The theory VPV for polynomial time reasoning used here is a two-sorted theory described by Cook and Nguyen [7]. The two-sorted language has variables x, y, z, ... ranging over N and variables X, Y, Z, ... ranging over finite subsets of N, interpreted as bit strings. The two-sorted vocabulary L^2_A includes the usual symbols 0, 1, +, ·, =, ≤ for arithmetic over N, the length function |X| for strings, the set membership relation ∈, and string equality =_2 (the subscript 2 is usually omitted). We will use the notation X(t) for t ∈ X, and think of X(t) as the t-th bit in the string X.
The number terms in the base language L^2_A are built from the constants 0, 1, variables x, y, z, ... and length terms |X| using + and ·. The only string terms are string variables, but when we extend L^2_A by adding string-valued functions, other string terms will be built as usual. The atomic formulas are t = u, X = Y, t ≤ u, t ∈ X for any number terms u, t and string variables X, Y. Formulas are built from atomic formulas using ∧, ∨, ¬ and ∃x, ∃X, ∀x, ∀X. Bounded number quantifiers are defined as usual, and the bounded string quantifier ∃X ≤ t ϕ stands for ∃X(|X| ≤ t ∧ ϕ) and ∀X ≤ t ϕ stands for ∀X(|X| ≤ t → ϕ), where X does not appear in the term t.
Σ^B_0 is the class of all L^2_A-formulas with no string quantifiers and only bounded number quantifiers. Σ^B_1-formulas are those of the form ∃\vec{X} < \vec{t} ϕ, where ϕ ∈ Σ^B_0 and the prefix of the bounded quantifiers might be empty. These classes are extended to Σ^B_i (and Π^B_i) for all i ≥ 0 in the usual way.
Two-sorted complexity classes contain relations R(\vec{x}, \vec{X}), where \vec{x} are number arguments and \vec{X} are string arguments. In defining complexity classes using machines or circuits, the number arguments are represented in unary notation and the string arguments are represented in binary. The string arguments are the main inputs, and the number arguments are auxiliary inputs that can be used to index the bits of strings.
In the two-sorted setting, we can define AC^0 to be the class of relations R(\vec{x}, \vec{X}) such that some alternating Turing machine accepts R in time O(log n) with a constant number of alternations, where n is the sum of all the numbers in \vec{x} and the total length of all the string arguments in \vec{X}. Then from the descriptive complexity characterization of AC^0, it can be shown that a relation R(\vec{x}, \vec{X}) is in AC^0 iff it is represented by some Σ^B_0-formula ϕ(\vec{x}, \vec{X}).
Given a class of relations C, we associate a class FC of string-valued functions F(\vec{x}, \vec{X}) and number functions f(\vec{x}, \vec{X}) with C as follows. We require these functions to be p-bounded, i.e., the length of the outputs of F and f is bounded by a polynomial in \vec{x} and |\vec{X}|. Then we define FC to consist of all p-bounded number functions whose graphs are in C and all p-bounded string functions whose bit graphs are in C.
We write Σ^B_i(L) to denote the class of Σ^B_i-formulas which may have function and predicate symbols from L ∪ L^2_A. A string function is Σ^B_0(L)-definable if it is p-bounded and its bit graph is represented by a Σ^B_0(L)-formula. Similarly, a number function is Σ^B_0-definable from L if it is p-bounded and its graph is represented by a Σ^B_0(L)-formula.
The theory V^0 for AC^0 is the basis for developing theories for small complexity classes within P in [7]. The theory V^0 has the vocabulary L^2_A and is axiomatized by the set of 2-BASIC axioms given in Figure 1, which express basic properties of the symbols in L^2_A, together with the comprehension axiom schema Σ^B_0-COMP: ∃X ≤ y ∀z < y (X(z) ↔ ϕ(z)), where ϕ ∈ Σ^B_0(L^2_A) and X does not occur free in ϕ. In [7, Chapter 5], it was shown that V^0 is finitely axiomatizable and that a p-bounded function is in FAC^0 iff it is provably total in V^0. A universally axiomatized conservative extension \overline{V}^0 of V^0 was also obtained by introducing function symbols and their defining axioms for all FAC^0 functions.
In [7, Chapter 9], Cook and Nguyen showed how to associate a theory VC with each complexity class C ⊆ P, where VC extends V^0 with an additional axiom asserting the existence of a solution to a complete problem for C. General techniques are also presented for defining a universally axiomatized conservative extension \overline{VC} of VC which has function symbols and defining axioms for every function in FC, and \overline{VC} admits induction on open formulas in this enriched vocabulary. It follows from Herbrand's Theorem that the provably total functions in \overline{VC} (and hence in VC) are precisely the functions in FC. Using this framework, Cook and Nguyen explicitly defined theories for various complexity classes within P.
Since we need some basic linear algebra in this paper, we are interested in the two-sorted theory V#L and its universal conservative extension \overline{V#L} from [6]. Recall that #L is usually defined as the class of functions f such that for some nondeterministic logspace Turing machine M, f(x) is the number of accepting computations of M on input x. Since counting the number of accepting paths of nondeterministic logspace is AC^0-equivalent to matrix powering, V#L was defined to be the extension of the base theory V^0 with an additional axiom stating the existence of powers A^k for every matrix A over Z. The closure of #L under AC^0-reductions is called DET. It turns out that computing the determinant of integer matrices is complete for DET under AC^0-reductions. In fact Berkowitz's algorithm can be used to reduce the determinant to matrix powering. Moreover, the function Det, which computes the determinant of integer matrices based on Berkowitz's algorithm, is in the language of \overline{V#L}. Unfortunately it is an open question whether the theory \overline{V#L} also proves the cofactor expansion formula and other basic properties of determinants. However, from results in [29] it follows that \overline{V#L} proves that the usual properties of determinants follow from the Cayley-Hamilton Theorem (which states that a matrix satisfies its characteristic polynomial).
In this paper, we are particularly interested in the theory VPV for polytime reasoning [7, Chapter 8.2], since we will use it to formalize all of our theorems. The universal theory VPV is based on Cook's single-sorted theory PV [8], which was historically the first theory designed to capture polytime reasoning. A nice property of PV (and VPV) is that their universal theorems translate into families of propositional tautologies with polynomial size proofs in any extended Frege proof system.
The vocabulary L_FP of VPV extends that of \overline{V}^0 with additional symbols introduced based on Cobham's machine independent characterization of FP [5]. Let Z^{<y} denote the first y bits of Z. Formally the vocabulary L_FP of VPV is the smallest set satisfying
(1) L_FP contains the vocabulary of \overline{V}^0;
(2) for any two functions G(\vec{x}, \vec{X}), H(y, \vec{x}, \vec{X}, Z) over L_FP and an L^2_A-term t = t(y, \vec{x}, \vec{X}), if F is defined by limited recursion from G, H and t, i.e.,
F(0, \vec{x}, \vec{X}) = G(\vec{x}, \vec{X}),
F(y + 1, \vec{x}, \vec{X}) = H(y, \vec{x}, \vec{X}, F(y, \vec{x}, \vec{X}))^{<t(y, \vec{x}, \vec{X})},
then F ∈ L_FP.
We will often abuse the notation by letting L_FP denote the set of function symbols in L_FP.
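As an informal illustration of limited recursion (not part of the formal development), the following Python sketch models strings as lists of bits, which is an assumption made here purely for readability. The helper name `limited_recursion` and the prefix-parity example functions `G`, `H`, `t` are hypothetical.

```python
def limited_recursion(G, H, t):
    """Return F with F(0, x, X) = G(x, X) and
    F(y + 1, x, X) = H(y, x, X, F(y, x, X)) cut to its first t(y, x, X) bits."""
    def F(y, x, X):
        Z = G(x, X)
        for k in range(y):
            # Z^{<t}: keep only the first t(k, x, X) bits of the new value.
            Z = H(k, x, X, Z)[: t(k, x, X)]
        return Z
    return F


# Hypothetical example: the string of prefix parities of the bit string X.
def G(x, X):
    return []


def H(k, x, X, Z):
    previous = Z[-1] if Z else 0
    return Z + [previous ^ X[k]]


def t(k, x, X):
    return k + 1  # step k may produce at most k + 1 bits


prefix_parity = limited_recursion(G, H, t)
```

The truncation by t is what keeps every intermediate value polynomially bounded, so the loop above runs in time polynomial in y and the input length whenever G, H and t do.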
The theory VPV can then be defined to be the theory over L_FP whose axioms are those of \overline{V}^0 together with defining axioms for every function symbol in L_FP. VPV proves the scheme Σ^B_0(L_FP)-COMP and the following schemes:
Σ^B_0(L_FP)-IND: [ϕ(0) ∧ ∀x(ϕ(x) → ϕ(x + 1))] → ∀x ϕ(x),
Σ^B_0(L_FP)-MIN: ϕ(y) → ∃x(ϕ(x) ∧ ¬∃z < x ϕ(z)),
Σ^B_0(L_FP)-MAX: ϕ(0) → ∃x ≤ y (ϕ(x) ∧ ¬∃z ≤ y (z > x ∧ ϕ(z))),
where ϕ is any Σ^B_0(L_FP)-formula. It follows from Herbrand's Theorem that the provably total functions in VPV are precisely the functions in L_FP.
Observe that VPV extends \overline{V#L}, since matrix powering can easily be carried out in polytime, and thus all theorems of \overline{V#L} from [6,29] are also theorems of VPV. From results in [29] (see page 44 of [14] for a correction) it follows that VPV proves the Cayley-Hamilton Theorem, and hence the cofactor expansion formula and other usual properties of determinants of integer matrices.
In our introduction, we mentioned V^1, the two-sorted version of Buss's theory S^1_2 [4]. The theory V^1 is also associated with polytime reasoning in the sense that the provably total functions of V^1 are the FP functions, and V^1 is Σ^B_1-conservative over VPV. However, there is evidence that V^1 is stronger than VPV. For example, the theory V^1 proves the Σ^B_1-IND, Σ^B_1-MIN and Σ^B_1-MAX schemes, while VPV cannot prove these Σ^B_1 schemes, assuming the polynomial hierarchy does not collapse [18]. In this paper we do not use V^1 to formalize our theorems, since the weaker theory VPV suffices for all our needs.
We use [X, Y) to denote {Z ∈ ℤ | X ≤ Z < Y}, i.e., the interval of integers between X and Y − 1, where strings code integers using signed binary notation. We also use the standard notation [n] to denote the set {1, ..., n}.
Given a square matrix M , we write M [i | j] to denote the (i, j)-minor of M , i.e., the square matrix formed by removing the ith row and jth column from M .
We write \vec{x} to denote a number sequence x_1, ..., x_k and \vec{X} to denote a string sequence X_1, ..., X_k. We write \vec{x}_{k×k} and \vec{X}_{k×k} to denote that \vec{x} and \vec{X} have k^2 elements and are treated as two-dimensional arrays ⟨x_{i,j} | 1 ≤ i, j ≤ k⟩ and ⟨X_{i,j} | 1 ≤ i, j ≤ k⟩ respectively, where the elements of these two-dimensional arrays are listed by rows. Note that \vec{x}_{k×k} and \vec{X}_{k×k} can be simply encoded as integer matrices, and thus we will use matrix notation freely on them.
We write the notation "(T ⊢)" in front of the statement of a theorem to indicate that the statement is formulated and proved within the theory T .

2.3. The weak pigeonhole principle. The surjective weak pigeonhole principle for a function F, denoted by sWPHP(F), states that F cannot map [0, nA) onto [0, (n + 1)A). Thus, the surjective weak pigeonhole principle for the class of VPV functions, denoted by sWPHP(L_FP), is the schema {sWPHP(F) | F ∈ L_FP}. Note that this principle is believed to be weaker than the usual surjective "strong" pigeonhole principle stating that we cannot map [0, A) onto [0, A + 1). For example, sWPHP(L_FP) can be proven in the theory V^3 (the two-sorted version of Buss's theory S^3_2) for FP^{Σ^P_3} reasoning (cf. [30]), but it is not known if the usual surjective pigeonhole principle for VPV functions can be proven within the theory ⋃_{i≥1} V^i for the polynomial hierarchy (the two-sorted version of Buss's theory S_2 := ⋃_{i≥1} S^i_2 in [4]).
2.4. Jeřábek's framework for probabilistic reasoning. In this section, we give a brief and simplified overview of Jeřábek's framework [13,14,15] for probabilistic reasoning within VPV + sWPHP(L_FP). For more complete definitions and results, the reader is referred to Jeřábek's work. Let F(R) be a VPV 0-1 valued function (which may have other arguments). We think of F as defining a relation on binary numbers R. Let Φ(n) = {R < 2^n | F(R) = 1}, so that Pr_{R<2^n}[F(R) = 1] = |Φ(n)|/2^n.
Observe that bounding the probability Pr_{R<2^n}[F(R) = 1] from above by the ratio s/t is the same as showing that t · |Φ(n)| ≤ s · 2^n. More generally, many probability inequalities can be restated as inequalities between cardinalities of sets. This is problematic since even for the case of polytime definable sets, it follows from Toda's theorem [31] that we cannot express their cardinalities directly using bounded formulas (assuming that the polynomial hierarchy does not collapse). Hence we need an alternative method to compare the sizes of definable sets without exact counting.
The method proposed by Jeřábek in [13,14,15] is based on the following simple observation: if Γ(n) and Φ(n) are definable sets and there is a function F mapping Γ(n) onto Φ(n), then the cardinality of Φ(n) is at most the cardinality of Γ(n).Thus instead of counting the sets Γ(n) and Φ(n) directly, we can compare the sizes of Γ(n) and Φ(n) by showing the existence of a surjection F , which in many cases can be easily carried out within weak theories of bounded arithmetic.In this paper we will restrict our discussion to the case when the sets are bounded polytime definable sets and the surjections are polytime functions, all of which can be defined within VPV, since this is sufficient for our results.
The remaining challenge is then to formally verify that the definition of cardinality comparison through the use of surjections is a meaningful and well-behaved definition.The basic properties of surjections like "any set can be mapped onto itself" and "surjectivity is preserved through function compositions" roughly correspond to the usual reflexivity and transitivity of cardinality ordering, i.e., |Φ| ≤ |Φ| and |Φ| ≤ |Γ| ≤ |Λ| → |Φ| ≤ |Λ| for all bounded definable sets Φ, Γ and Λ. However more sophisticated properties, e.g., dichotomy |Φ| ≤ |Γ| ∨ |Γ| ≤ |Φ| or "uniqueness" of cardinality, turn out to be much harder to show.
As a result, Jeřábek proposed in [15] a systematic and sophisticated framework to justify his definition of size comparison. He observed that estimating the size of a P/poly definable set Φ ⊆ [0, 2^n) within an error 2^n/poly(n) is the same as estimating Pr_{X∈[0,2^n)}[X ∈ Φ] within an error 1/poly(n), which can be done by drawing poly(n) independent random samples X ∈ [0, 2^n) and checking whether X ∈ Φ. This gives us a polytime random sampling algorithm for approximating the size of Φ. Since a counting argument [13] can be formalized within VPV + sWPHP(L_FP) to show the existence of suitable average-case hard functions for constructing Nisan-Wigderson generators, this random sampling algorithm can be derandomized to show the existence of an approximate cardinality S of Φ for any given error E = 2^n/poly(n) in the following sense. The theory VPV + sWPHP(L_FP) proves the existence of S, y and a pair of P/poly "counting functions" (F, G) such that
F : y × (Φ ⊎ E) ↠ y × S and G : y × (S ⊎ E) ↠ y × Φ.
Intuitively the pair (F, G) witnesses that S − E ≤ |Φ| ≤ S + E. This allows him to show that his method satisfies, within VPV + sWPHP(L_FP), many properties expected from cardinality comparison (see Lemmas 2.10 and 2.11 in [15]). It is worth noting that proving the uniqueness of cardinality within some error seems to be the best we can do within bounded arithmetic, where exact counting is not available.
For the present paper, the following definition is all we need to know about Jeřábek's framework.
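Definition 2.1. Let Φ(n) = {R < 2^n | F(R) = 1}, where F(R) is a VPV function (which may have other arguments), and let s, t be VPV terms. Then
Pr_{R<2^n}[R ∈ Φ(n)] ⪯ s/t
means that either Φ(n) is empty, or there exists a VPV function G(n, ·) mapping the set [s] × 2^n onto the set [t] × Φ(n).
Since we are not concerned with justifying the above definition, our theorems can be formalized in VPV without sWPHP.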
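To make the surjection-based reading of a probability bound concrete, here is a small brute-force Python sketch on a toy instance. The set Phi, the map G and the helper `is_onto` are hypothetical choices made for illustration, and index sets are taken 0-based (so [s] is modeled as range(s)); nothing here is carried out inside VPV.

```python
def is_onto(G, domain, codomain):
    """Check by brute force that G maps `domain` onto `codomain`."""
    return set(codomain) <= {G(x) for x in domain}


n, s, t = 4, 1, 4

# Toy "error set": the strings R < 2^n (read as numbers) divisible by 4.
Phi = [R for R in range(2 ** n) if R % 4 == 0]

domain = [(a, R) for a in range(s) for R in range(2 ** n)]   # [s] x 2^n
codomain = [(k, Z) for k in range(t) for Z in Phi]           # [t] x Phi(n)


def G(pair):
    """Split R into its residue mod 4 and the multiple of 4 below it."""
    a, R = pair
    return (R % 4, R - R % 4)
```

Here `is_onto(G, domain, codomain)` holds, and the existence of such a surjection forces t · |Phi| ≤ s · 2^n, which is the cardinality form of the bound s/t = 1/4 on the probability of landing in Phi.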

Edmonds' Theorem
Let G be a bipartite graph with two disjoint sets of vertices U = {u_1, ..., u_n} and V = {v_1, ..., v_n}. We use a pair (i, j) to encode the edge {u_i, v_j} of G. Thus the edge relation of the graph G can be encoded by a boolean matrix E_{n×n}, where we define (i, j) ∈ E, i.e., E(i, j) = 1, iff {u_i, v_j} is an edge of G.
Each perfect matching in G can be encoded by an n×n permutation matrix M satisfying M (i, j) → E(i, j) for all i, j ∈ [n].Recall that a permutation matrix is a square boolean matrix that has exactly one entry of value 1 in each row and each column and 0's elsewhere.
Let A_{n×n} be the matrix obtained from G by letting A_{i,j} be an indeterminate X_{i,j} for all (i, j) ∈ E, and letting A_{i,j} = 0 for all (i, j) ∉ E. The matrix of indeterminates A(\vec{X}) is called the Edmonds matrix of G, and Det(A(\vec{X})) is called the Edmonds polynomial of G. In general this polynomial has exponentially many monomials, so for the purpose of proving its properties in VPV we consider Det(A(\vec{X})) to be a function which takes as input an integer matrix \vec{W}_{n×n} and returns an integer Det(A(\vec{W})). Thus Det(A(\vec{X})) ≡ 0 means that this function is identically zero.
The following theorem draws an important connection between determinants and matchings. The standard proof uses the Lagrange expansion, which has exponentially many terms, and hence cannot be formalized in VPV. However, we will give an alternative proof which can be so formalized.
Theorem 3.1 (Edmonds' Theorem [9]). (VPV ⊢) Let Det(A(\vec{X})) be the Edmonds polynomial of the bipartite graph G. Then G has a perfect matching iff Det(A(\vec{X})) is not identically zero.
Proof. For the direction (⇒) we need the following lemma.
Lemma. (VPV ⊢) If M is a permutation matrix, then Det(M) ∈ {−1, 1}.
Proof of the lemma. We construct a sequence of matrices N_n, N_{n−1}, ..., N_1, where N_n = M, N_1 = (1), and we construct N_{i−1} from N_i by choosing j_i satisfying N_i(i, j_i) = 1 and letting N_{i−1} = N_i[i | j_i]. From the way the matrices N_i are constructed, we can easily show by Σ^B_0(L_FP) induction on ℓ = n, ..., 1 that the matrices N_ℓ are permutation matrices. Finally, using the cofactor expansion formula, we prove by Σ^B_0(L_FP) induction on ℓ = 1, ..., n that Det(N_ℓ) ∈ {−1, 1}.
From the lemma we see that if M is the permutation matrix representing a perfect matching of G, then VPV proves Det(A(M )) = Det(M ) ∈ {1, −1}, so Det(A( X)) is not identically 0.
For the direction (⇐) it suffices to describe a polytime function F that takes as input an integer matrix B_{n×n} = A(\vec{W}), where A(\vec{X}) is the Edmonds matrix of a bipartite graph G and \vec{W}_{n×n} is an integer value assignment, and to reason in VPV that if Det(B) ≠ 0, then F outputs a perfect matching of G.
Assume Det(B) ≠ 0. Note that finding a perfect matching of G is the same as extracting a nonzero diagonal, i.e., a sequence of nonzero entries B(1, σ(1)), B(2, σ(2)), ..., B(n, σ(n)), where σ is a permutation of the set [n]. For this purpose, we construct a sequence of matrices B_n, B_{n−1}, ..., B_1 as follows. We let B_n = B. For i = n, ..., 2, we let
B_{i−1} = B_i[i | j_i],
and the index j_i is chosen using the following method. Suppose we already know B_i satisfying Det(B_i) ≠ 0.
By the cofactor expansion along the last row of B_i,
Det(B_i) = Σ_{j=1}^{i} (−1)^{i+j} B_i(i, j) · Det(B_i[i | j]).
Thus, since Det(B_i) ≠ 0, at least one of the terms in the sum on the right-hand side is nonzero, and we can choose the least index j_i such that B_i(i, j_i) · Det(B_i[i | j_i]) ≠ 0.
To extract the perfect matching, we let Q be an n × n matrix, where Q(i, j) = j. Then we construct a sequence of matrices Q_n, Q_{n−1}, ..., Q_1, where Q_n = Q and
Q_{i−1} = Q_i[i | j_i],
i.e., we delete from Q_i exactly the row and column we deleted from B_i. We define a permutation σ by letting σ(i) = Q_i(i, j_i). Then σ(i) is the column number in B which corresponds to column j_i in B_i, and the set of edges {(i, σ(i)) | 1 ≤ i ≤ n} is our desired perfect matching.
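The extraction procedure in this proof can be sketched in Python as follows. This is only an illustration under stated assumptions: indices are 0-based, the determinant is computed by exact rational Gaussian elimination rather than by Berkowitz's algorithm, and the names `det`, `minor` and `extract_matching` are hypothetical.

```python
from fractions import Fraction


def det(m):
    """Exact determinant by Gaussian elimination over the rationals.
    The empty (0 x 0) matrix has determinant 1."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result


def minor(m, i, j):
    """The matrix m[i | j]: delete row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def extract_matching(B):
    """Given an integer matrix B with Det(B) != 0, return sigma (0-based)
    such that B[i][sigma[i]] != 0 for every row i."""
    n = len(B)
    if det(B) == 0:
        raise ValueError("Det(B) must be nonzero")
    Bi = [row[:] for row in B]
    Qi = [list(range(n)) for _ in range(n)]   # Q(i, j) = j
    sigma = [None] * n
    for i in range(n - 1, -1, -1):            # i is the last row of B_i
        # Least j with B_i(i, j) * Det(B_i[i | j]) != 0; it exists by the
        # cofactor expansion, because Det(B_i) != 0.
        j = next(j for j in range(i + 1)
                 if Bi[i][j] != 0 and det(minor(Bi, i, j)) != 0)
        sigma[i] = Qi[i][j]                   # original column of column j
        Bi, Qi = minor(Bi, i, j), minor(Qi, i, j)
    return sigma
```

For example, `extract_matching([[0, 2, 0], [3, 0, 0], [0, 5, 7]])` returns `[1, 0, 2]`: the entry 5 in the last row is skipped because its minor is singular, exactly as the choice of the least admissible index j_i prescribes.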

Schwartz-Zippel Lemma
The Schwartz-Zippel Lemma [27,33] is one of the most fundamental tools in the design of randomized algorithms. The lemma provides us with a coRP algorithm for the polynomial identity testing problem (Pit): given an arithmetic circuit computing a multivariate polynomial P(\vec{X}) over a field F, we want to determine if P(\vec{X}) is identically zero. The Pit problem is important since many problems, e.g., primality testing [1], perfect matching [22], and software run-time testing [32], can be reduced to Pit. Moreover, many fundamental results in complexity theory like IP = PSPACE [28] and the PCP theorem [2,3] make heavy use of Pit in their proofs. The Schwartz-Zippel Lemma can be stated as follows.
Theorem 4.1 (Schwartz-Zippel Lemma). Let P(X_1, ..., X_n) be a non-zero polynomial of degree D ≥ 0 over a field (or integral domain) F. Let S be a finite subset of F and let \vec{R} denote the sequence R_1, ..., R_n. Then
Pr_{\vec{R} ∈ S^n}[P(\vec{R}) = 0] ≤ D/|S|.
Using this lemma, we have the following coRP algorithm for the Pit problem when F = Z. Given a polynomial P(X_1, ..., X_n) of degree at most D, we choose a sequence \vec{R} ∈ [0, 2D)^n at random. If P is given implicitly as a circuit, the degree of P might be exponential, and thus the value of P(\vec{R}) might require exponentially many bits to encode. In this case we use the method of Ibarra and Moran [10] and let Y be the result of evaluating P(\vec{R}) using arithmetic modulo a random integer from the interval [1, D^k] for some fixed k. If Y ≠ 0, then we report that P ≢ 0. Otherwise, we report that P ≡ 0. (Note that if P has small degree, then we can evaluate P(\vec{R}) directly.)
Unfortunately the Schwartz-Zippel Lemma seems hard to prove in bounded arithmetic. The main challenge is that the degree of P can be exponentially large. Even in the special case when P is given as the symbolic determinant of a matrix of indeterminates, and hence the degree of P is small, the polynomial P still has up to n! terms. Thus, we will focus on a much weaker version of the Schwartz-Zippel Lemma that involves only Edmonds' polynomials, since this will suffice for us to establish the correctness of a FRNC^2 algorithm for deciding if a bipartite graph has a perfect matching.
4.1. Edmonds' polynomials for complete bipartite graphs. In this section we will start with the simpler case when every entry of an Edmonds matrix is a variable, since it clearly demonstrates our techniques. This case corresponds to the Schwartz-Zippel Lemma for Edmonds' polynomials of complete bipartite graphs.
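Let A be the full n × n Edmonds matrix, where A_{i,j} = X_{i,j} for all 1 ≤ i, j ≤ n. We consider the case that S is the interval of integers S = [0, s) for s ∈ N, so |S| = s. Then Det(A(\vec{X})) is a nonzero polynomial of degree exactly n, and we want to show that
Pr_{\vec{r} ∈ S^{n^2}}[Det(A(\vec{r})) = 0] ⪯ n/s.
Let Z(n, s) := {\vec{r} ∈ S^{n^2} | Det(A(\vec{r})) = 0}, i.e., the set of zeros of the Edmonds polynomial Det(A(\vec{X})). Then by Definition 2.1, it suffices to exhibit a VPV function mapping [n] × S^{n^2} onto S × Z(n, s). For this it suffices to give a VPV function mapping [n] × S^{n^2−1} onto Z(n, s). We will define a VPV function F(n, s, ·) : [n] × S^{n^2−1} → Z(n, s), which takes as input a pair (i, \vec{r}) with i ∈ [n] and \vec{r} ∈ S^{n^2−1}.
For an n × n matrix B with entries from S and for i ∈ [n], let B_i denote the i × i leading principal submatrix of B, so B_n = B and B_{i−1} = B_i[i | i].
Fact 4.2. (VPV ⊢) If Det(B) = 0, then there is i ∈ [n] such that Det(B_i) = 0 and either i = 1 or Det(B_{i−1}) ≠ 0.
This follows by taking the least i with Det(B_i) = 0, using Σ^B_0(L_FP)-MIN.
We claim that given Det(B) = 0 and given i as in the fact, the element B(i, i) is uniquely determined by the other elements in B. Thus if i = 1 then B(i, i) = 0, and if i > 1 then by the cofactor expansion of Det(B_i) along row i,
0 = Det(B_i) = B(i, i) · Det(B_{i−1}) + Det(B'_i), (4.1)
where B'_i is obtained from B_i by replacing the entry in position (i, i) with 0. Since Det(B_{i−1}) ≠ 0, equation (4.1) determines B(i, i) uniquely.
The output of F(n, s, (i, \vec{r})) is defined as follows. Let B be the n × n matrix determined by the n^2 − 1 elements in \vec{r} by inserting the symbol * (for unknown) in the position for B(i, i). Try to use the method above to determine the value of * = B_i(i, i), assuming that * is chosen so that Det(B) = 0. This method could fail because Det(B_{i−1}) = 0. In this case, or if the solution to equation (4.1) gives a value for B_i(i, i) which is not in S, output the default "dummy" zero sequence \vec{0}_{n×n}. Otherwise let C be B with * replaced by the obtained value of B_i(i, i). If Det(C) = 0 then output C, otherwise output the dummy zero sequence.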
Theorem 4.3. (VPV ⊢) Let A(\vec{X}) be the Edmonds matrix of a complete bipartite graph K_{n,n}. Let S denote the set [0, s). Then the function F(n, s, ·) defined above is a polytime surjection that maps [n] × S^{n^2−1} onto Z(n, s).
Proof. It is easy to see that F(n, s, ·) is polytime (in fact it belongs to the complexity class DET). To see that F is surjective, let C be an arbitrary matrix in Z(n, s), so Det(C) = 0. Let i ∈ [n] be determined by Fact 4.2 when B = C. Let \vec{r} be the sequence of n^2 − 1 elements consisting of the rows of C with C(i, i) deleted. Then the algorithm for computing F(n, s, (i, \vec{r})) correctly computes the missing element C(i, i) and outputs C.
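The surjection F for the complete bipartite case can be checked by brute force on very small parameters. The Python sketch below is an illustration only: it assumes the convention that the empty matrix has determinant 1 (so the case i = 1 is handled by the same formula), uses exact rational elimination for determinants, and the helper names `det`, `F`, `zeros` and `image` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def det(m):
    """Exact determinant over the rationals; the empty matrix gives 1."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result


def F(n, s, i, r):
    """Sketch of F(n, s, (i, r)): i is 1-based, r has n*n - 1 entries in [0, s)."""
    dummy = [[0] * n for _ in range(n)]
    flat = list(r)
    flat.insert((i - 1) * n + (i - 1), 0)        # placeholder 0 for B(i, i)
    B = [flat[k * n:(k + 1) * n] for k in range(n)]
    d = det([row[:i - 1] for row in B[:i - 1]])  # Det(B_{i-1})
    if d == 0:
        return dummy
    e = det([row[:i] for row in B[:i]])          # Det(B'_i), entry (i, i) is 0
    value = -e / d                               # solve 0 = B(i, i) * d + e
    if value.denominator != 1 or not (0 <= value < s):
        return dummy
    C = [row[:] for row in B]
    C[i - 1][i - 1] = int(value)
    return C if det(C) == 0 else dummy


def zeros(n, s):
    """Z(n, s): all matrices over [0, s) with zero determinant, flattened."""
    return {r for r in product(range(s), repeat=n * n)
            if det([list(r[k * n:(k + 1) * n]) for k in range(n)]) == 0}


def image(n, s):
    """The set of outputs of F(n, s, .) over its whole domain, flattened."""
    return {tuple(x for row in F(n, s, i, r) for x in row)
            for i in range(1, n + 1)
            for r in product(range(s), repeat=n * n - 1)}
```

Comparing `image(n, s)` with `zeros(n, s)` for small n and s exercises exactly the surjectivity argument of the proof: every singular matrix is recovered from the pair consisting of its index i from Fact 4.2 and its remaining n^2 − 1 entries.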

4.2. Edmonds' polynomials for general bipartite graphs. For general bipartite graphs, an entry of an Edmonds matrix A might be 0, so we cannot simply use leading principal submatrices in our construction of the surjection F. However, given a sequence \vec{W}_{n×n} making Det(A(\vec{W})) ≠ 0, it follows from Theorem 3.1 that we can find a perfect matching M in polytime. Thus, the nonzero diagonal corresponding to the perfect matching M will play the role of the main diagonal in our construction. The rest of the proof will proceed similarly. Thus, we have the following theorem.
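Theorem 4.4. (VPV ⊢) There is a VPV function H(n, s, A, \vec{W}, ·) such that if A(\vec{X}) is the Edmonds matrix of a bipartite graph, S = [0, s), and \vec{W}_{n×n} is an integer sequence with Det(A(\vec{W})) ≠ 0, then H(n, s, A, \vec{W}, ·) maps [n] × S^{n^2−1} onto {\vec{r} ∈ S^{n^2} | Det(A(\vec{r})) = 0}.
In other words, it follows from Definition 2.1 that the function H(n, s, A, \vec{W}, ·) in the theorem witnesses that Pr_{\vec{r} ∈ S^{n^2}}[Det(A(\vec{r})) = 0] ⪯ n/s.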
Proof. Assume Det(A(\vec{W})) ≠ 0. Then the polytime function described in the proof of Theorem 3.1 produces an n × n permutation matrix M such that for all i, j ∈ [n], if M(i, j) = 1 then the element A(i, j) in the Edmonds matrix A is not zero. We apply the algorithm in the proof of Theorem 4.3, except that the sequence of principal submatrices of B used in Fact 4.2 is replaced by the sequence B_n, B_{n−1}, ..., B_1 determined by M as follows. We let B_n = B, and for i = n, ..., 2 we let
B_{i−1} = B_i[i | j_i],
where the indices j_i are chosen the same way as in the proof of Theorem 3.1 when constructing the perfect matching M.
We note that the mapping H(n, s, A, •) in this case may not be in DET since the construction of M depends on the sequential polytime algorithm from Theorem 3.1 for extracting a perfect matching.

4.3. Formalizing the RNC^2 algorithm for the bipartite perfect matching decision problem. An instance of the bipartite perfect matching decision problem is a bipartite graph G encoded by a matrix E_{n×n}, and we are to decide if G has a perfect matching.
Here is an RDET algorithm for the problem. The algorithm is essentially due to Lovász [20]. From E, construct the Edmonds matrix A(\vec{X}) for G and choose a random sequence \vec{r}_{n×n} ∈ [2n]^{n^2}. If Det(A(\vec{r})) ≠ 0 then we report that G has a perfect matching. Otherwise, we report that G does not have a perfect matching.
We claim that VPV proves correctness of this algorithm. The correctness assertion states that if G has a perfect matching then the algorithm reports NO with probability at most 1/2, and otherwise it certainly reports NO. Theorem 3.1 shows that VPV proves the latter. Conversely, if G has a perfect matching given by a permutation matrix M, then the function H(n, 2n, A, M, ·) of Theorem 4.4 witnesses that the probability of Det(A(\vec{r})) = 0 is at most 1/2, according to Definition 2.1, where A is the Edmonds matrix for G. Hence VPV proves the correctness of this case too.
Since RDET ⊆ FRNC^2, this algorithm (which solves a decision problem) is also an RNC^2 algorithm.
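A sequential Python sketch of the randomized decision procedure is given below. It makes no attempt to reflect the parallel (DET) complexity of the algorithm; it draws the random values from [0, 2n), which is an assumption chosen to match S = [0, s) with s = 2n, and the names `det`, `edmonds_matrix` and `reports_perfect_matching` are hypothetical.

```python
import random
from fractions import Fraction


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result


def edmonds_matrix(E, W):
    """A(W): substitute W[i][j] for the indeterminate of each edge (i, j)."""
    n = len(E)
    return [[W[i][j] if E[i][j] else 0 for j in range(n)] for i in range(n)]


def reports_perfect_matching(E, rng=random):
    """One run of the randomized test: True means 'G has a perfect matching'.
    A True answer is always correct; a False answer may be wrong with
    probability at most 1/2 when a perfect matching exists."""
    n = len(E)
    W = [[rng.randrange(2 * n) for _ in range(n)] for _ in range(n)]
    return det(edmonds_matrix(E, W)) != 0
```

Because the error is one-sided, repeating the test k times and answering YES if any run answers YES reduces the error probability to at most 2^{-k}.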

Formalizing the Hungarian algorithm
The Hungarian algorithm is a combinatorial optimization algorithm which solves the maximum-weight bipartite matching problem in polytime and anticipated the later development of the powerful primal-dual method. The algorithm was developed by Kuhn [19], who gave it the name "Hungarian method" since it was based on the earlier work of two Hungarian mathematicians, D. Kőnig and J. Egerváry. Munkres later reviewed the algorithm and showed that it is indeed polytime [23]. Although the Hungarian algorithm is interesting in itself, we formalize the algorithm since we need it in the VPV proof of the Isolating Lemma for perfect matchings in Section 6.1.
The Hungarian algorithm finds a maximum-weight matching for any weighted bipartite graph. The algorithm and its correctness proof are simpler if we make the following two changes. First, since edges with negative weights can never be in a maximum-weight matching, and thus can be safely deleted, we can assume that every edge has nonnegative weight. Second, by assigning zero weight to every edge not present, we only need to consider weighted complete bipartite graphs.
Let G = (X ⊎ Y, E) be a complete bipartite graph, where X = {x_i | 1 ≤ i ≤ n} and Y = {y_i | 1 ≤ i ≤ n}, and let \vec{w} be an integer weight assignment to the edges of G, where w_{i,j} ≥ 0 is the weight of the edge {x_i, y_j} ∈ E.
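A pair of integer sequences \vec{u} = (u_i)_{i=1}^n and \vec{v} = (v_i)_{i=1}^n is called a weight cover if
∀i, j ∈ [n], w_{i,j} ≤ u_i + v_j. (5.1)
The cost of a cover is cost(\vec{u}, \vec{v}) := Σ_{i=1}^n (u_i + v_i). We also define w(M) := Σ_{(i,j)∈M} w_{i,j}. The Hungarian algorithm is based on the following important observation.
Lemma 5.1. (VPV ⊢) For any matching M and weight cover (\vec{u}, \vec{v}), we have w(M) ≤ cost(\vec{u}, \vec{v}).
Proof. Since the edges in a matching M are disjoint, summing the constraints w_{i,j} ≤ u_i + v_j over all edges of M yields w(M) ≤ Σ_{(i,j)∈M} (u_i + v_j). Since no edge has negative weight, we have u_i + v_j ≥ 0 for all i, j ∈ [n]. Thus,
w(M) ≤ Σ_{(i,j)∈M} (u_i + v_j) ≤ cost(\vec{u}, \vec{v})
for every matching M and every weight cover (\vec{u}, \vec{v}).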
Given a weight cover $(\vec u, \vec v)$, the equality subgraph $H_{\vec u, \vec v}$ is the subgraph of G whose vertices are $X \uplus Y$ and whose edges are precisely those $\{x_i, y_j\} \in E$ satisfying $w_{i,j} = u_i + v_j$.

Theorem 5.2. (VPV ⊢) Let $H = H_{\vec u, \vec v}$ be the equality subgraph, and let M be a maximum cardinality matching of H. Then the following three statements are equivalent:
(1) $w(M) = \mathrm{cost}(\vec u, \vec v)$.
(2) M is a maximum-weight matching of G and $(\vec u, \vec v)$ is a minimum-weight cover of G.
(3) M is a perfect matching of the equality subgraph H.

(Cf. Appendix A.3 for the full proof of this theorem.) Below we give a simplified version of the Hungarian algorithm which runs in polynomial time when the edge weights are small (i.e. presented in unary notation). The correctness of the algorithm easily follows from Theorem 5.2.
Algorithm 5.3 (The Hungarian algorithm). We start with an arbitrary weight cover $(\vec u, \vec v)$ with small weights: e.g. let $u_i = \max\{w_{i,j} \mid 1 \le j \le n\}$ and $v_i = 0$ for all $i \in [n]$. If the equality subgraph $H_{\vec u, \vec v}$ has a perfect matching M, we report M as a maximum-weight matching of G. Otherwise, change the weight cover $(\vec u, \vec v)$ as follows. Since the maximum matching M is not a perfect matching of H, Hall's condition fails for H. Thus it is not hard (cf. Corollary 1 from Appendix A.2) to construct in polytime a subset $S \subseteq X$ satisfying $|N(S)| < |S|$, where $N(S)$ denotes the neighborhood of S. Hence we can calculate the quantity
\[ \delta = \min\{u_i + v_j - w_{i,j} \mid x_i \in S \wedge y_j \notin N(S)\} \]
and decrease $u_i$ by δ for all $x_i \in S$ and increase $v_j$ by δ for all $y_j \in N(S)$ without violating the weight cover property (5.1). This strictly decreases the sum $\sum_{i=1}^{n} (u_i + v_i)$. Thus this process can repeat at most as many times as the initial cost of the cover $(\vec u, \vec v)$. Assuming that all edge weights are small (i.e. presented in unary), the algorithm terminates in polynomial time. Finally we get an equality subgraph $H_{\vec u, \vec v}$ containing a perfect matching M, which by Theorem 5.2 is also a maximum-weight matching of G.
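The cover-adjustment loop of Algorithm 5.3 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the VPV formalization: the maximum matching of the equality subgraph is found with a simple augmenting-path routine, the Hall violator S is obtained by an alternating search from one unmatched vertex of X, and all function names are hypothetical.

```python
def max_matching(adj, n):
    """Maximum matching by repeated augmenting-path search; -1 means unmatched."""
    match_x, match_y = [-1] * n, [-1] * n

    def try_augment(x, seen):
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                if match_y[y] == -1 or try_augment(match_y[y], seen):
                    match_x[x], match_y[y] = y, x
                    return True
        return False

    for x in range(n):
        try_augment(x, set())
    return match_x, match_y

def hall_violator(adj, match_x, match_y):
    """Given a maximum but non-perfect matching, return (S, N(S)) with |N(S)| < |S|."""
    root = match_x.index(-1)
    s, ns, stack = {root}, set(), [root]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in ns:
                ns.add(y)
                # y must be matched, otherwise an augmenting path would exist.
                s.add(match_y[y])
                stack.append(match_y[y])
    return s, ns

def hungarian_max_weight(w):
    """Maximum-weight perfect matching of K_{n,n} with nonnegative integer weights w."""
    n = len(w)
    u = [max(row) for row in w]      # initial cover: u_i = max_j w[i][j], v_j = 0
    v = [0] * n
    while True:
        # Equality subgraph: edges with w[i][j] == u[i] + v[j].
        adj = [[j for j in range(n) if w[i][j] == u[i] + v[j]] for i in range(n)]
        match_x, match_y = max_matching(adj, n)
        if -1 not in match_x:
            return [(i, match_x[i]) for i in range(n)], u, v
        s, ns = hall_violator(adj, match_x, match_y)
        delta = min(u[i] + v[j] - w[i][j]
                    for i in s for j in range(n) if j not in ns)
        for i in s:
            u[i] -= delta
        for j in ns:
            v[j] += delta

weights = [[4, 1, 3], [2, 0, 5], [3, 2, 2]]
matching, u, v = hungarian_max_weight(weights)
print(matching, sum(weights[i][j] for i, j in matching), sum(u) + sum(v))
```

On termination the weight of the returned matching equals the cost of the final cover, which is the certificate of optimality given by Theorem 5.2.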
When formalizing the Isolating Lemma for bipartite matchings, we need a VPV function Mwpm that takes as inputs an edge relation $E_{n \times n}$ of a bipartite graph G and a nonnegative weight assignment $\vec w$ to the edges in E, and outputs a minimum-weight perfect matching if such a matching exists, or outputs ∅ to indicate that no perfect matching exists. Recall that the Hungarian algorithm returns a maximum-weight matching, and not a minimum-weight perfect matching. However, we can use the Hungarian algorithm to compute $\mathrm{Mwpm}(n, E, \vec w)$ as follows.
Algorithm 5.4 (Finding a minimum-weight perfect matching).
Construct the sequence $\vec w'$ as follows: assign $w'_{i,j} = 0$ to every edge $(i,j) \notin E$ and a very large weight $w'_{i,j} \le c$ to every edge $(i,j) \in E$, chosen so that a smaller $w_{i,j}$ yields a larger $w'_{i,j}$.
Run the Hungarian algorithm on the complete bipartite graph $K_{n,n}$ with weight assignment $\vec w'$ to get a maximum-weight matching M.
if M contains an edge that is not in E then
    return the empty matching ∅
else
    return M
end if

Note that since we assign zero weights to the edges not present and very large weights to other edges, the Hungarian algorithm will always prefer the edges that are present in the original bipartite graph. More formally, for any perfect matching M and non-perfect matching N we have $w'(M) > w'(N)$, where the last step of the comparison follows from the fact that $w'_{i,j} \le c$ for all $(i,j) \in N$. Thus, if the Hungarian algorithm returns a matching M with at least one edge not in E, then the original graph cannot have a perfect matching. Also, from the way the weight assignment $\vec w'$ was defined, every maximum-weight perfect matching of $K_{n,n}$ with weight assignment $\vec w'$ that uses only edges of E is a minimum-weight perfect matching of the original bipartite graph.
It is straightforward to check that the above argument can be formalized in VPV, so VPV proves the correctness of Algorithm 5.4 for computing the function Mwpm.
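The reduction behind Algorithm 5.4 can be tried out on small inputs. The sketch below makes two assumptions that are not taken from the text: it uses $c = n \cdot \max w_{i,j} + 1$ with $w'_{i,j} = c - w_{i,j}$ on edges of E (one choice that makes present edges dominate), and it replaces the Hungarian algorithm by a brute-force search over permutations, which is only feasible for tiny n. The names `max_weight_perfect_matching` and `mwpm` are hypothetical.

```python
from itertools import permutations

def max_weight_perfect_matching(w):
    """Brute-force stand-in for the Hungarian algorithm on K_{n,n}."""
    n = len(w)
    best = max(permutations(range(n)),
               key=lambda p: sum(w[i][p[i]] for i in range(n)))
    return [(i, best[i]) for i in range(n)]

def mwpm(n, edges, w):
    """Minimum-weight perfect matching of (edges, w), or [] if none exists.

    Assumed transformation: c = n * max(w) + 1, w'[i][j] = c - w[i][j] on
    edges and 0 elsewhere, so any matching inside E beats any matching that
    uses a missing edge.
    """
    c = n * max((w[e] for e in edges), default=0) + 1
    w2 = [[c - w[(i, j)] if (i, j) in edges else 0 for j in range(n)]
          for i in range(n)]
    m = max_weight_perfect_matching(w2)
    if any(e not in edges for e in m):
        return []          # plays the role of the empty matching
    return m

full = {(0, 0), (0, 1), (1, 0), (1, 1)}
print(mwpm(2, full, {(0, 0): 5, (0, 1): 1, (1, 0): 1, (1, 1): 5}))
print(mwpm(2, {(0, 0), (1, 0)}, {(0, 0): 1, (1, 0): 1}))
```

With this choice of c, a perfect matching inside E has transformed weight at least (n - 1)c + 1, while any perfect matching of K_{n,n} that uses a missing edge has transformed weight at most (n - 1)c.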

FRNC^2 algorithm for finding a bipartite perfect matching
Below we recall the elegant FRNC^2 (or more precisely RDET) algorithm due to Mulmuley, Vazirani and Vazirani [22] for finding a bipartite perfect matching. Although the original algorithm works for general undirected graphs, we will only focus on bipartite graphs in this paper.
Let G be a bipartite graph with two disjoint sets of vertices $U = \{u_1, \ldots, u_n\}$ and $V = \{v_1, \ldots, v_n\}$. We first consider the minimum-weight bipartite perfect matching problem, where each edge $(i,j) \in E$ is assigned an integer weight $w_{i,j} \ge 0$, and we want to find a minimum-weight perfect matching of G. It turns out there is a DET algorithm for this problem under two assumptions: the weights must be polynomial in n, and the minimum-weight perfect matching must be unique. We let $A(\vec X)$ be an Edmonds matrix of the bipartite graph. Replace $X_{i,j}$ with $W_{i,j} = 2^{w_{i,j}}$ (this is where we need the weights to be small). We then compute $\mathrm{Det}(A(\vec W))$ using Berkowitz's FNC^2 algorithm. Assume that there exists exactly one (unknown) minimum-weight perfect matching M. We will show in Theorem 6.5 that $w(M)$ is exactly the position of the least significant 1-bit, i.e., the number of trailing zeros, in the binary expansion of $\mathrm{Det}(A(\vec W))$. Once we have $w(M)$, we can test if an edge $(i,j) \in E$ belongs to the unique minimum-weight perfect matching M as follows. Let $w'$ be the position of the least significant 1-bit of $\mathrm{Det}(A[i \mid j](\vec W))$. We will show in Theorem 6.6 that the edge $(i,j)$ is in the perfect matching if and only if $w'$ is precisely $w(M) - w_{i,j}$. Thus, we can test all edges in parallel. Note that up to this point, everything can be done in DET ⊆ FNC^2 since the most expensive operation is the Det function, which is complete for DET.
What we have so far is that, assuming that the minimum-weight perfect matching exists and is unique, there is a DET algorithm for finding this minimum-weight perfect matching. But how do we guarantee that if a minimum-weight perfect matching exists, then it is unique? It turns out that we can assign every edge $(u_i, v_j) \in E$ a random weight $w_{i,j} \in [2m]$, where $m = |E|$, and use the Isolating Lemma [22] to ensure that the graph has a unique minimum-weight perfect matching with probability at least 1/2.
The RDET ⊆ FRNC^2 algorithm for finding a perfect matching is now complete: assign random weights to the edges, and run the DET algorithm for the unique minimum-weight perfect matching problem. If a perfect matching exists, with probability at least 1/2, this algorithm returns a perfect matching.
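The whole procedure fits in a short sequential sketch. It is only an illustration under stated assumptions: the determinant is computed by fraction-free elimination instead of Berkowitz's parallel algorithm, the edge tests run in a loop rather than in parallel, and the names `int_det`, `numz` and `mvv_matching` are hypothetical. The caller is expected to check that the returned edge set is a perfect matching, since the random weights may fail to be isolating.

```python
import random

def int_det(m):
    """Exact integer determinant (Bareiss); the empty matrix has determinant 1."""
    n = len(m)
    if n == 0:
        return 1
    a = [row[:] for row in m]
    sign, prev = 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            swap = next((r for r in range(k + 1, n) if a[r][k] != 0), None)
            if swap is None:
                return 0
            a[k], a[swap] = a[swap], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]

def numz(y):
    """Position of the least significant 1-bit of a nonzero integer."""
    y = abs(y)
    return (y & -y).bit_length() - 1

def mvv_matching(n, edges, weights):
    """Candidate edge set of the unique minimum-weight perfect matching, or None."""
    b = [[0] * n for _ in range(n)]
    for (i, j) in edges:
        b[i][j] = 1 << weights[(i, j)]          # W_ij = 2^{w_ij}
    d = int_det(b)
    if d == 0:
        return None
    total = numz(d)                             # equals w(M) when M is unique
    chosen = []
    for (i, j) in edges:
        minor = [row[:j] + row[j + 1:] for r, row in enumerate(b) if r != i]
        dm = int_det(minor)
        if dm != 0 and numz(dm) == total - weights[(i, j)]:
            chosen.append((i, j))
    return chosen

edges = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 0), (2, 2)]
rng = random.Random(0)
random_weights = {e: rng.randint(1, 2 * len(edges)) for e in edges}
print(mvv_matching(3, edges, random_weights))
```

For the fixed weights 1 on the diagonal edges and 2 on the others, the determinant is 72 = 8 · 9, so numz gives 3, the weight of the diagonal matching, and exactly the three diagonal edges pass the minor test.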
6.1. Isolating a perfect matching. We will recall the Isolating Lemma [22], the key ingredient of the Mulmuley-Vazirani-Vazirani FRNC^2 algorithm for finding a perfect matching. Let X be a set with m elements $\{a_1, \ldots, a_m\}$ and let F be a family of subsets of X. We assign a weight $w_i$ to each element $a_i \in X$, and define the weight of a set $Y \in F$ to be $w(Y) := \sum_{a_i \in Y} w_i$. Let minimum-weight be the minimum of the weights of all the sets in F. Note that several sets of F might achieve minimum-weight. However, if minimum-weight is achieved by a unique $Y \in F$, then we say that the weight assignment $\vec w = \langle w_i \rangle_{i=1}^{m}$ is isolating for F. (Every weight assignment is isolating if $|F| \le 1$.)

Theorem 6.1 (Isolating Lemma [22]). Let F be a family of subsets of an m-element set $X = \{a_1, \ldots, a_m\}$. Let $\vec w = \langle w_i \rangle_{i=1}^{m}$ be a random weight assignment to the elements in X, with each weight drawn from $[k]$. Then
\[ \Pr_{\vec w}\big[\vec w \text{ is not isolating for } F\big] \le m/k. \]

To formalize the Isolating Lemma in VPV it seems natural to present the family F by a polytime algorithm. This is difficult to do in general (see Remark 6.3 below), so we will formalize a special case which suffices to formalize the FRNC^2 algorithm for finding a bipartite perfect matching. Thus we are given a bipartite graph G, and the family F is the set of perfect matchings of G. We want to show that if we assign random weights to the edges, then the probability that this weight assignment does not isolate a perfect matching is small. Note that although the family F here might be exponentially large, F is polytime definable, since recognizing a perfect matching is easy.

Theorem 6.2 (Isolating a Perfect Matching). (VPV ⊢) Let F be the family of perfect matchings of a bipartite graph G with edges $E = \{e_1, \ldots, e_m\}$. Let $\vec w$ be a random weight assignment to the edges in E, with each weight drawn from $[k]$. Then the probability that $\vec w$ is not isolating for F is at most $m/k$, in the sense of Definition 2.1.

For brevity, we will call a weight assignment $\vec w$ "bad" if $\vec w$ is not isolating for F. Let
\[ \Phi := \{\vec w \in [k]^m \mid \vec w \text{ is bad for } F\}. \]
Then to prove Theorem 6.2, it suffices to construct a VPV function mapping $[m] \times [k]^{m-1}$ onto Φ. Note that the upper bound $m/k$ is independent of the size n of the two vertex sets.
The set Φ is polytime definable since
\[ \vec w \in \Phi \iff \exists i, j \in [n]\ \big( E(i,j) \wedge M(i,j) \wedge \neg M'(i,j) \wedge M, M' \text{ encode two perfect matchings with the same weight} \big), \]
where M denotes the output produced by applying the Mwpm function (Algorithm 5.4) on G, and $M'$ denotes the output produced by applying Mwpm on the graph obtained from G by deleting the edge $(i,j)$.
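For very small graphs the set of bad weight assignments can simply be enumerated, which gives a quick sanity check of the bound: a surjection from $[m] \times [k]^{m-1}$ onto Φ forces $|\Phi| \le m \cdot k^{m-1}$. The sketch below is a brute-force illustration only (exponential in m), and the function names are hypothetical.

```python
from itertools import permutations, product

def perfect_matchings(n, edges):
    """All perfect matchings of the bipartite graph, each as a tuple of edges."""
    edge_set = set(edges)
    return [tuple((i, p[i]) for i in range(n))
            for p in permutations(range(n))
            if all((i, p[i]) in edge_set for i in range(n))]

def count_bad(n, edges, k):
    """Number of weight assignments in [k]^m with two minimum-weight perfect matchings."""
    pms = perfect_matchings(n, edges)
    bad = 0
    for ws in product(range(1, k + 1), repeat=len(edges)):
        w = dict(zip(edges, ws))
        totals = sorted(sum(w[e] for e in pm) for pm in pms)
        # Bad means the minimum total weight is attained at least twice.
        if len(totals) >= 2 and totals[0] == totals[1]:
            bad += 1
    return bad

edges = [(0, 0), (0, 1), (1, 0), (1, 1)]   # K_{2,2}, so m = 4
k = 8
bad = count_bad(2, edges, k)
print(bad, len(edges) * k ** (len(edges) - 1), k ** len(edges))
```

For K_{2,2} with k = 8 an assignment is bad exactly when the two diagonal weights and the two off-diagonal weights have the same sum, which happens for 344 of the 4096 assignments, well below the bound 4 · 8^3 = 2048.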
Proof of Theorem 6.2. By Definition 2.1 we may assume that Φ is nonempty, so there is an element $\delta \in \Phi$. (We will use δ as a "dummy" element.) Otherwise ϕ outputs the dummy element δ of Φ. Note that if both $M'$ and $M_1$ exist, then (6.1) is a bad weight assignment, since $M'$ and $M = M_1 \cup \{e_i\}$ are distinct minimum-weight perfect matchings of G under this assignment.
To show that ϕ is surjective, consider an arbitrary bad weight assignment $\vec w = \langle w_i \rangle_{i=1}^{m} \in \Phi$. Since $\vec w$ is bad, there are two distinct minimum-weight perfect matchings M and $M'$ and some edge $e_i \in M \setminus M'$. Thus, from how ϕ was defined, the sequence $\langle i, w_1, \ldots, w_{i-1}, w_{i+1}, \ldots, w_m \rangle$ is an element that gets mapped to the bad weight assignment $\vec w$.

Remark 6.3. The above proof uses the fact that there is a polytime algorithm for finding a minimum-weight perfect matching (when one exists) in an edge-weighted bipartite graph. This suggests limitations on formalizing a more general version of Theorem 6.2 in VPV. For example, if F is the set of Hamiltonian cycles in a complete graph, then finding a minimum-weight member of F is NP-hard.

6.2. Extracting the unique minimum-weight perfect matching. Let G be a bipartite graph and assume that G has a perfect matching. In Section 6.1 we formalized a version of the Isolating Lemma, which with high probability gives us a weight assignment $\vec w$ for which G has a unique minimum-weight perfect matching. This is the first step of the Mulmuley-Vazirani-Vazirani algorithm. Now we proceed with the second step, where we need to output this minimum-weight perfect matching using a DET function.
Let B be the matrix we get by substituting $W_{i,j} = 2^{w_{i,j}}$ for each nonzero entry $(i,j)$ of the Edmonds matrix A of G. We want to show that if M is the unique minimum-weight perfect matching of G with respect to $\vec w$, then the weight $w(M)$ is exactly the position of the least significant 1-bit in the binary expansion of $\mathrm{Det}(B)$. The usual proof of this fact is not hard, but it uses properties of the Lagrange expansion for the determinant, which has exponentially many terms and hence cannot be formalized in VPV. Our proof avoids using the Lagrange expansion, and utilizes properties of the cofactor expansion instead.

Lemma 6.4. (VPV ⊢) There is a VPV function that takes as inputs an $n \times n$ Edmonds matrix A and a weight sequence $\vec W$ with $W_{i,j} = 2^{w_{i,j}}$, such that if $B = A(\vec W)$ satisfies $\mathrm{Det}(B) \ne 0$ and p is the position of the least significant 1-bit of $\mathrm{Det}(B)$, then the VPV function outputs a perfect matching M of weight at most p.
It is worth noting that the lemma holds regardless of whether or not the bipartite graph corresponding to A and weight assignment W has a unique minimum-weight perfect matching.
The proof of Lemma 6.4 is very similar to that of Theorem 3.1. Recall that in Theorem 3.1, given a matrix B satisfying $\mathrm{Det}(B) \ne 0$, we want to extract a nonzero diagonal of B. In this lemma, we are given the position p of the least significant 1-bit of $\mathrm{Det}(B)$, and we want to get a nonzero diagonal of B whose product has its least significant 1-bit at position at most p. For this, we can use the same method of extracting the nonzero diagonal from Theorem 3.1 with the following modification. When choosing a term of the cofactor expansion on the recursive step, we will also need to make sure the chosen term produces a nonzero sub-diagonal of B that will not contribute too much weight to the diagonal we are trying to extract. This ensures that the product of the chosen diagonal has its least significant 1-bit at position at most p.
For the rest of this section, we define $\mathrm{numz}(Y)$ to be the position of the least significant 1-bit of the binary string Y. Thus if $\mathrm{numz}(Y) = q$ then $Y = \pm 2^{q} Z$ for some positive odd integer Z.
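Proof of Lemma 6.4. We construct a sequence of matrices $B_n, B_{n-1}, \ldots, B_1$, where $B_n = B$ and $B_{i-1} = B_i[i \mid j_i]$ for $i = n, \ldots, 2$, and where the index $j_i$ is chosen as follows. Define
\[ T_i := \mathrm{Det}(B_i) \prod_{\ell=i+1}^{n} B_\ell(\ell, j_\ell). \]
Assume we are given $j_n, \ldots, j_{i+1}$ such that $\mathrm{numz}(T_i) \le p$. We want to choose $j_i$ such that $\mathrm{numz}(T_{i-1}) \le p$, where by definition
\[ T_{i-1} = \mathrm{Det}(B_i[i \mid j_i]) \prod_{\ell=i}^{n} B_\ell(\ell, j_\ell). \]
This can be done as follows. From the cofactor expansion of $\mathrm{Det}(B_i)$, we have
\[ T_i = \Big( \sum_{j=1}^{i} (-1)^{i+j} B_i(i,j)\, \mathrm{Det}(B_i[i \mid j]) \Big) \prod_{\ell=i+1}^{n} B_\ell(\ell, j_\ell). \]
Since $\mathrm{numz}(T_i) \le p$, at least one of the terms in the sum must have its least significant 1-bit at position at most p. Thus, we can choose $j_i$ such that
\[ \mathrm{numz}\Big( B_i(i, j_i)\, \mathrm{Det}(B_i[i \mid j_i]) \prod_{\ell=i+1}^{n} B_\ell(\ell, j_\ell) \Big) \]
is minimized, which guarantees that $\mathrm{numz}(T_{i-1}) \le p$.
Since by assumption $\mathrm{numz}(T_n) = \mathrm{numz}(\mathrm{Det}(B_n)) = p$, VPV proves by $\Sigma^B_0(L_{FP})$ induction on $i = n, \ldots, 1$ that $\mathrm{numz}(T_i) \le p$. If we define $j_1 = 1$, then when $i = 1$ we have
\[ T_1 = \mathrm{Det}(B_1) \prod_{\ell=2}^{n} B_\ell(\ell, j_\ell) = \prod_{\ell=1}^{n} B_\ell(\ell, j_\ell). \]
Thus it follows that $\mathrm{numz}(T_1) = \mathrm{numz}\big( \prod_{\ell=1}^{n} B_\ell(\ell, j_\ell) \big) \le p$.

Similarly to the proof of Theorem 3.1, we can extract a perfect matching with weight at most p by letting Q be a matrix, where $Q(i,j) = j$ for all $i, j \in [n]$. Then we compute another sequence of matrices $Q_n, Q_{n-1}, \ldots, Q_1$, where
\[ Q_n = Q, \qquad Q_{i-1} = Q_i[i \mid j_i] \quad \text{for } i = n, \ldots, 2, \]
i.e., we delete from $Q_i$ exactly the row and column we deleted from $B_i$.
To prove that $M = \{(\ell, Q_\ell(\ell, j_\ell)) \mid 1 \le \ell \le n\}$ is a perfect matching, we note that whenever a pair $(i, k)$ is added to the matching M, we delete the row i and column $j_i$, where $j_i$ is the index satisfying $Q_i(i, j_i) = k$. So we can never match any other vertex to k again.
It remains to show that $w(M) \le p$. The product $\prod_{\ell=1}^{n} B_\ell(\ell, j_\ell) = 2^{w(M)}$ has a binary expansion with a unique one at position $w(M)$ and zeros elsewhere. Thus it follows from how the matching M was constructed that $w(M) = \mathrm{numz}(T_1) \le p$.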
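The extraction in this proof can be sketched sequentially. The code below is an illustration under stated assumptions, not the VPV function itself: it works with 0-based indices, keeps a list of surviving column labels in place of the matrices $Q_i$, skips zero terms when minimizing, and recomputes each minor's determinant from scratch. The names `int_det`, `numz` and `extract_matching` are hypothetical.

```python
def int_det(m):
    """Exact integer determinant (Bareiss); the empty matrix has determinant 1."""
    n = len(m)
    if n == 0:
        return 1
    a = [row[:] for row in m]
    sign, prev = 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            swap = next((r for r in range(k + 1, n) if a[r][k] != 0), None)
            if swap is None:
                return 0
            a[k], a[swap] = a[swap], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]

def numz(y):
    """Position of the least significant 1-bit of a nonzero integer."""
    y = abs(y)
    return (y & -y).bit_length() - 1

def extract_matching(b):
    """Given b with entries 0 or powers of two and det(b) != 0, return a perfect
    matching (as row/column pairs) of weight at most numz(det(b))."""
    n = len(b)
    cols = list(range(n))            # surviving original column labels
    cur = [row[:] for row in b]
    matching = []
    for i in range(n - 1, 0, -1):    # expand along the last remaining row
        best = None
        for j in range(len(cols)):
            if cur[i][j] == 0:
                continue
            minor = [row[:j] + row[j + 1:] for row in cur[:i]]
            d = int_det(minor)
            if d == 0:
                continue
            key = numz(cur[i][j] * d)    # the common factor of earlier picks is omitted
            if best is None or key < best[0]:
                best = (key, j, minor)
        _, j, minor = best
        matching.append((i, cols[j]))
        cols.pop(j)
        cur = minor
    matching.append((0, cols[0]))
    return matching[::-1]

b = [[2, 4, 0], [0, 2, 4], [4, 0, 2]]
m = extract_matching(b)
weight = sum(numz(b[i][j]) for i, j in m)
print(m, weight, numz(int_det(b)))
```

Omitting the product of the previously chosen entries does not change which index minimizes the expression, because that product is the same power of two for every candidate column.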
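The next two theorems complete our description and justification of our RDET algorithm for finding a perfect matching. For these theorems we are given a bipartite graph $G = (U \uplus V, E)$, where we have $U = \{u_1, \ldots, u_n\}$ and $V = \{v_1, \ldots, v_n\}$, and each edge $(i,j) \in E$ is assigned a weight $w_{i,j}$ such that G has a unique minimum-weight perfect matching (see Theorem 6.2). Let $W_{n \times n}$ be a sequence satisfying $W_{i,j} = 2^{w_{i,j}}$ for all $(i,j) \in E$. Let A be the Edmonds matrix of G, and let $B = A(\vec W)$. Let M denote the unique minimum-weight perfect matching of G.

Theorem 6.5. (VPV ⊢) Let $p = w(M)$. Then $\mathrm{numz}(\mathrm{Det}(B)) = p$.

In Lemma 6.4 we extracted an appropriate nonzero diagonal of B using the determinant and minors of B as our guide; in the proof of this theorem we do the reverse. From a minimum-weight perfect matching M of G, we want to rebuild in polynomially many steps suitable minors of B until we fully recover the determinant of B. We can then prove by $\Sigma^B_0(L_{FP})$ induction that in every step of this process, each "partial determinant" of B has the least significant 1-bit at position p. Note that the technique we used to prove this theorem does have some similarity to that of Lemma 3.2, even though the proof of this theorem is more complicated.
Proof. Let Q be a matrix, where $Q(i,j) = j$ for all $i, j \in [n]$. For $1 \le i \le n$ let $B_i$ be the result of deleting rows $i+1, \ldots, n$ and columns $M(i+1), \ldots, M(n)$ from B and let $Q_i$ be Q with the same rows and columns deleted. We can construct these matrices inductively in the form of two matrix sequences $B_n, \ldots, B_1$ and $Q_n, \ldots, Q_1$ as follows:
• let $B_n = B$ and $Q_n = Q$;
• for $i = n, n-1, \ldots, 2$, define $j_i$ to be the unique index satisfying $Q_i(i, j_i) = M(i)$, and then let $B_{i-1} = B_i[i \mid j_i]$ and $Q_{i-1} = Q_i[i \mid j_i]$.
With $j_1 = 1$, this construction gives
\[ B_i(i, j_i) = 2^{w_{i,M(i)}} \quad \text{for all } i \in [n]. \tag{6.2} \]
Claim:
\[ \mathrm{numz}\Big( \mathrm{Det}(B_i) \prod_{\ell=i+1}^{n} B_\ell(\ell, j_\ell) \Big) = p \quad \text{for all } i \in [n]. \]
The theorem follows from this by setting $i = n$. We will prove the claim by induction on i. The base case $i = 1$ follows from (6.2).
For the induction step, it suffices to show
\[ \mathrm{numz}(\mathrm{Det}(B_{i+1})) = \mathrm{numz}\big( B_{i+1}(i+1, j_{i+1})\, \mathrm{Det}(B_i) \big). \]
From the cofactor expansion formula we have
\[ \mathrm{Det}(B_{i+1}) = \sum_{j=1}^{i+1} (-1)^{(i+1)+j} B_{i+1}(i+1, j)\, \mathrm{Det}(B_{i+1}[i+1 \mid j]). \]
Since $B_{i+1}(i+1, j_{i+1}) = 2^{w_{i+1,M(i+1)}}$ by (6.2), and $B_{i+1}[i+1 \mid j_{i+1}] = B_i$, it is enough to show that every other nonzero term of the sum has its least significant 1-bit at a strictly higher position than the term for $j_{i+1}$. Suppose for a contradiction that there is some $j' \ne j_{i+1}$ such that
\[ \mathrm{numz}\big( B_{i+1}(i+1, j')\, \mathrm{Det}(B_{i+1}[i+1 \mid j']) \big) \le \mathrm{numz}\big( B_{i+1}(i+1, j_{i+1})\, \mathrm{Det}(B_i) \big). \]
Then we can extend the set of edges $(n, M(n)), \ldots, (i+2, M(i+2)), (i+1, j')$ with i edges extracted from $B_{i+1}[i+1 \mid j']$ (using the method from Lemma 6.4) to get a perfect matching of G with weight at most p, which contradicts that M is the unique minimum-weight perfect matching of G.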
To extract the edges of M in DET, we need to decide if an edge $(i,j)$ belongs to the unique minimum-weight perfect matching M without knowledge of other edges in M. The next theorem, whose proof follows directly from Lemma 6.4 and Theorem 6.5, gives us that method.

Theorem 6.6. (VPV ⊢) For every edge $(i,j) \in E$, we have $(i,j) \in M$ if and only if
\[ w(M) - w_{i,j} = \mathrm{numz}\big( \mathrm{Det}(B[i \mid j]) \big). \]

Proof. (⇒): Assume $(i,j) \in M$. Then the bipartite graph $G' = G \setminus \{u_i, v_j\}$ must have a unique minimum-weight perfect matching of weight $w(M) - w_{i,j}$. Thus from Theorem 6.5, $\mathrm{numz}(\mathrm{Det}(B[i \mid j])) = w(M) - w_{i,j}$.
(⇐): We prove the contrapositive. Assume $(i,j) \notin M$. Suppose for a contradiction that $w(M) - w_{i,j} = \mathrm{numz}(\mathrm{Det}(B[i \mid j]))$. Then by Lemma 6.4 we can extract from the submatrix $B[i \mid j]$ a perfect matching Q of the bipartite graph $G' = G \setminus \{u_i, v_j\}$ with weight at most $w(M) - w_{i,j}$. But then $M' = Q \cup \{(i,j)\}$ is another perfect matching of G with $w(M') \le w(M)$, a contradiction.

Theorems 6.2, 6.5, and 6.6 complete the description and justification of our RDET algorithm for finding a perfect matching in a bipartite graph. Since these are theorems of VPV, it follows that VPV proves the correctness of the algorithm.

6.3. Related bipartite matching problems. The correctness of the Mulmuley-Vazirani-Vazirani algorithm can easily be used to establish the correctness of RDET algorithms for related matching problems, for example, the maximum (cardinality) bipartite matching problem and the minimum-weight bipartite perfect matching problem, where the weights assigned to the edges are small. We refer to [22] for more details on these reductions.

Conclusion and Future Work
We have only considered randomized matching algorithms for bipartite graphs. For general undirected graphs, we need Tutte's matrix (cf. [22]), a generalization of Edmonds' matrix. Since every Tutte matrix is a skew-symmetric matrix where each variable appears exactly twice, we cannot directly apply our technique for Edmonds' matrices, where each variable appears at most once. However, by using the recursive definition of the Pfaffian instead of the cofactor expansion, we believe that it is also possible to generalize our results to general undirected graphs. We also note that the Hungarian algorithm only works for weighted bipartite graphs. To find a maximum-weight matching of a weighted undirected graph, we need to formalize Edmonds' blossom algorithm (cf. [16]). Once we have the correctness of the blossom algorithm, the proof of the Isolating Lemma for undirected graph perfect matchings will be the same as that of Theorem 6.2. We leave the detailed proofs for the general undirected graph case for future work.
It is worth noticing that symbolic determinants of Edmonds' matrices result in very special polynomials, whose structures can be used to define the VPV surjections witnessing the probability bound in the Schwartz-Zippel Lemma as demonstrated in this paper. It remains an open problem whether we can prove the full version of the Schwartz-Zippel Lemma using Jeřábek's method within the theory VPV.

Note that we can compute cardinalities of the sets directly here since all the sets we are considering here are small. Now let H be the graph whose edge relation is Q and whose vertices are simply the vertices of G. We then observe the following properties of H:

• For every
• Since Q is constructed from two matchings M and $M'$, every vertex of H can only be incident with at most two edges: one from M and another from $M'$. So every vertex of H has degree at most 2.
• Any path of H must alternate between the edges of M and $M'$.

We will provide a polytime algorithm to extract from the graph H an augmenting path with respect to M, which gives us the contradiction.

Since $Q = M' \oplus M = \bigcup_i C_i$ and all $C_i$ are disjoint, we have
\[ |Q \cap M| = \sum_i |C_i \cap M| \ge \sum_i |C_i \cap M'| = |Q \cap M'|. \]
But this contradicts (A.1).
Algorithm A.2 (The augmenting-path algorithm). As a corollary of Berge's Theorem, we have the following simple algorithm for finding a maximum matching of a bipartite graph G. We start from any matching M of G, say the empty matching. Repeatedly locate an M-augmenting path P, augment M along P, and replace M by the resulting matching.
Stop when there is no M-augmenting path. Then we know that M is maximum. Thus, it remains to show how to search for an M-augmenting path given a matching M of G.
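The loop of Algorithm A.2 can be sketched as follows. The path search used here is an assumption of this sketch rather than a transcription of the matrix-based construction: it is a breadth-first search over alternating paths that leaves X along non-matching edges and returns from Y along matching edges. The names `find_augmenting_path` and `maximum_matching` are hypothetical.

```python
from collections import deque

def find_augmenting_path(n, adj, match_x, match_y):
    """Return the non-matching edges (x, y) of an M-augmenting path, or None."""
    parent = {}                                   # y -> the x from which y was reached
    queue = deque(x for x in range(n) if match_x[x] == -1)
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y in parent:
                continue
            parent[y] = x
            if match_y[y] == -1:
                # Unsaturated y reached: walk back to an unsaturated x.
                path = []
                while True:
                    px = parent[y]
                    path.append((px, y))
                    if match_x[px] == -1:
                        break
                    y = match_x[px]
                return path[::-1]
            queue.append(match_y[y])              # continue along the matching edge
    return None

def maximum_matching(n, adj):
    """Start from the empty matching and augment until no augmenting path exists."""
    match_x, match_y = [-1] * n, [-1] * n
    while True:
        path = find_augmenting_path(n, adj, match_x, match_y)
        if path is None:
            return match_x
        for x, y in path:                         # flip the path: its non-matching
            match_x[x], match_y[y] = y, x         # edges become matching edges

adj = [[0, 1], [0], [1, 2]]                       # adj[x] lists the neighbours of x in Y
print(maximum_matching(3, adj))
```

Each augmentation increases the size of the matching by one, so the loop runs at most n times.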
Algorithm A.3 (The augmenting-path search algorithm). First, from G and M we construct a directed graph H, where the vertices $V_H$ of H are exactly the vertices $X \uplus Y$ of G, and the edge relation $E_H$ of H is a $2n \times 2n$ matrix defined as follows:

The key observation is that t is reachable from s by an M-alternating path in the bipartite graph G iff t is reachable from s in the directed graph H.
After constructing the graph H, we can search for an M-augmenting path using the breadth-first search algorithm as follows. Let s be an M-unsaturated vertex in X. We construct two $2n \times 2n$ matrices S and T as follows.
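From the proof of Hall's Theorem, we have the following corollary saying that if a bipartite graph does not have a perfect matching, then we can find in polytime a subset of vertices violating Hall's condition.

(1)⇒(2): Assume that $\mathrm{cost}(\vec u, \vec v) = w(M)$. By Lemma 5.1, no matching has weight greater than $\mathrm{cost}(\vec u, \vec v)$, and no cover has cost less than $w(M)$. Hence M is a maximum-weight matching and $(\vec u, \vec v)$ is a minimum-weight cover.
(2)⇒(3): Assume M is a maximum-weight matching and $(\vec u, \vec v)$ is a minimum-weight cover of G. Suppose for a contradiction that the maximum matching M is not a perfect matching of H. We will construct a weight cover whose cost is strictly less than $\mathrm{cost}(\vec u, \vec v)$, which contradicts that $(\vec u, \vec v)$ is a minimum-weight cover.
Since the maximum matching M is not a perfect matching of H, by Corollary 1, we can construct in polytime a subset $S \subseteq X$ satisfying $|N(S)| < |S|$. Then we calculate the quantity
\[ \delta = \min\{u_i + v_j - w_{i,j} \mid x_i \in S \wedge y_j \notin N(S)\}. \]
Note that $\delta > 0$ since H is the equality subgraph. Next we construct a pair of sequences $\vec u' = \langle u'_i \rangle_{i=1}^{n}$ and $\vec v' = \langle v'_i \rangle_{i=1}^{n}$ as follows: $u'_i = u_i - \delta$ if $x_i \in S$ and $u'_i = u_i$ otherwise; $v'_j = v_j + \delta$ if $y_j \in N(S)$ and $v'_j = v_j$ otherwise. We claim that $(\vec u', \vec v')$ is again a weight cover. The condition $w_{i,j} \le u'_i + v'_j$ might only be violated for $x_i \in S$ and $y_j \notin N(S)$. But since we chose $\delta \le u_i + v_j - w_{i,j}$, it follows that $w_{i,j} \le (u_i - \delta) + v_j = u'_i + v'_j$. Since $\mathrm{cost}(\vec u', \vec v') = \mathrm{cost}(\vec u, \vec v) + \delta\,(|N(S)| - |S|)$ and $|N(S)| < |S|$, it follows that $\mathrm{cost}(\vec u', \vec v') < \mathrm{cost}(\vec u, \vec v)$.
(3)⇒(1): Suppose M is a perfect matching of H. Then $w_{i,j} = u_i + v_j$ holds for all edges in M. Summing the equalities $w_{i,j} = u_i + v_j$ over all edges of M yields the equality $\mathrm{cost}(\vec u, \vec v) = w(M)$.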

Lemma 3.2. (VPV ⊢) $\mathrm{Det}(M) \in \{-1, 1\}$ for any permutation matrix M.

Proof of Lemma 3.2. We will construct a sequence of matrices $N_n, N_{n-1}, \ldots, N_1$,

s, •) takes as input a pair $(i, \vec r)$, where $i \in [n]$ and $\vec r \in S^{n^2 - 1}$ is a sequence of $n^2 - 1$ elements. Let B be an $n \times n$ matrix with elements from S. For $i \in [n]$ let $B_i$ denote the leading principal submatrix of B that consists of the $i \times i$ upper-left part of B. In other words, $B_n = B$, and $B_{i-1} := B_i[i \mid i]$ for $i = n, \ldots, 2$. The following fact follows easily from the least number principle $\Sigma^B_0(L_{FP})$-MIN.

Fact 4.2. (VPV ⊢) If $\mathrm{Det}(B) \ne 0$, then there is $i \in [n]$ such that $\mathrm{Det}(B_j) \ne 0$ for all $i \le j \le n$, and either $i = 1$ or $i > 1$ and $\mathrm{Det}(B_{i-1}) = 0$.

Theorem 4.4. (VPV ⊢) There is a VPV function $H(n, s, A, \vec W, \bullet)$ where $A_{n \times n}$ is the Edmonds matrix for an arbitrary bipartite graph and $\vec W$ is a sequence of $n^2$ (binary) integers, such that if Det

It suffices for us to construct explicitly a VPV function ϕ mapping $[m] \times [k]^{m-1}$ onto Φ. For each $i \in [m]$ we interpret the set $\{i\} \times [k]^{m-1}$ as the set of all possible weight assignments to the $m - 1$ edges $E \setminus \{e_i\}$. Our function ϕ will map each set $\{i\} \times [k]^{m-1}$ onto the set of those bad weight assignments $\vec w$ such that the graph G contains two distinct minimum-weight perfect matchings M and $M'$ with $e_i \in M \setminus M'$. The function ϕ takes as input a sequence $\langle i, w_1, \ldots, w_{i-1}, w_{i+1}, \ldots, w_m \rangle$ from $[m] \times [k]^{m-1}$ and does the following. Use the function Mwpm (defined by Algorithm 5.4) to find a minimum-weight perfect matching $M'$ of G with the edge $e_i$ deleted. Use Mwpm to find a minimum-weight perfect matching $M_1$ of the subgraph $G \setminus \{u_j, v_\ell\}$, where $u_j$ and $v_\ell$ are the two endpoints of $e_i$. If both perfect matchings $M'$ and $M_1$ exist and satisfy $w(M') - w(M_1) \in [k]$, then ϕ outputs the sequence
\[ \langle w_1, \ldots, w_{i-1},\ w(M') - w(M_1),\ w_{i+1}, \ldots, w_m \rangle. \tag{6.1} \]

Theorem A.1 (Berge's Theorem). (VPV ⊢) Let $G = (X \uplus Y, E)$ be a bipartite graph. A matching M is maximum iff there is no M-augmenting path in G.

Proof. (⇒): Assume that all matchings N of E satisfy $|N| \le |M|$. Suppose for a contradiction that there is an M-augmenting path P. Let $M \oplus P$ denote the symmetric difference of two sets of edges M and P. Then $M' = M \oplus P$ is a matching larger than M, a contradiction.
(⇐): We will prove the contrapositive. Assume there is another matching $M'$ satisfying $|M'| > |M|$. We want to construct an M-augmenting path in G. Consider $Q = M' \oplus M$. Since $|M'| > |M|$, it follows that $|M' \setminus M| > |M \setminus M'|$, and thus
\[ |Q \cap M'| > |Q \cap M|. \tag{A.1} \]

Initialize K = H and i = 1
while K ≠ ∅ do
    Pick the least vertex v ∈ K
    Compute the connected component $C_i$ containing v
    if $C_i$ is an M-augmenting path then
        return $C_i$ and halt.
    end if
    Update $K = K \setminus C_i$ and i = i + 1.
end while

Note that since H has n vertices, the while loop can only iterate at most n times. It only remains to show the following.

Claim: The algorithm returns an M-augmenting path assuming $|M'| > |M|$.

Suppose for a contradiction that the algorithm would never produce any M-augmenting path. Since H has degree at most two, in every iteration of the while loop, we know that the connected component $C_i$ is
• either a cycle, which means $|C_i \cap M| = |C_i \cap M'|$, or
• a path but not an M-augmenting path, which implies that $|C_i \cap M| \ge |C_i \cap M'|$.

Corollary 1. (VPV ⊢) There is a VPV function that, on input a bipartite graph G that does not have a perfect matching, outputs a subset $S \subseteq X$ such that $|S| > |N(S)|$.

A.3. Proof of Theorem 5.2. Let $H = H_{\vec u, \vec v}$ be the equality subgraph for the weight cover $(\vec u, \vec v)$, and let M be a maximum cardinality matching of H. Recall Theorem 5.2 wants us to show that VPV proves equivalence of the following three statements:
(1) $w(M) = \mathrm{cost}(\vec u, \vec v)$
(2) M is a maximum-weight matching and the cover $(\vec u, \vec v)$ is a minimum-weight cover of G
(3) M is a perfect matching of H

Proof of Theorem 5.2.

$|N(S)| < |S|$. Then we calculate the quantity $\delta = \min\{u_i + v_j - w_{i,j} \mid x_i \in S \wedge y_j \notin N(S)\}$. Note that $\delta > 0$ since H is the equality subgraph. Next we construct a pair of sequences $\vec u' = \langle u'_i \rangle_{i=1}^{n}$ and $\vec v' = \langle v'_i \rangle_{i=1}^{n}$, as follows: