Path: utzoo!utgpu!news-server.csri.toronto.edu!clyde.concordia.ca!uunet!samsung!zaphod.mps.ohio-state.edu!usc!ucsd!ucbvax!homxb.att.COM!wilber From: wilber@homxb.att.COM (Bob Wilber) Newsgroups: comp.theory Subject: Re: Finding subsets of equations Message-ID: <9006042051.AA10658@irt.watson.ibm.com> Date: 4 Jun 90 20:51:23 GMT Sender: daemon@ucbvax.BERKELEY.EDU Reply-To: Bob Wilber Lines: 70 Russell J. Abbott writes: >Does anyone have or know of a solution to the following problem? >Suppose you have n equations in m unknowns, m > n and want to find a >subset of those equations in which the number of unknowns is no greater >than the number of equations. Is there a way to do that without a brute >force search through all possible combinations of equations with common >unknowns? Martin Tompa replies: >Here is a polynomial time algorithm for this problem, but I'd be interested >to know if there is a more efficient one. Construct a bipartite graph, >whose left block contains a vertex for each of the n equations, and whose >right block contains a vertex for each of the m unknowns. Now test if this >graph has a perfect matching. If it does not, then, by Hall's Theorem, >there is some set of k equations with at most k-1 unknowns among them, and >you are done. (The standard perfect matching algorithms can be made to >output the set of k equations when they fail to find a perfect matching.) > >Now to finish, add a new vertex u to the left block, and invoke the >perfect matching algorithm m more times as follows: in the ith invocation, >u is adjacent only to the ith vertex in the right block. If any of these m >graphs fails to have a perfect matching, it must be because k equations >have exactly k unknowns among them, one of them being the one to which u is >currently adjacent (and again, the algorithm can be made to output the set >nof k equations). If all m have perfect matchings, then every set of k >equations has at least k+1 unknowns among them, for all k. > >This still seems like brute force to me: it would be interesting to know >whether the problem stated can be reduced to only one call on perfect >matching. The answer is yes. Construct the bipartite graph G = (Q, V, E) where Q = the set of equations, V = the set of variables, and E = { (q, v) | equation q contains variable v }. Compute a maximal matching. As above, if the number of edges in the matching is < n, then by Hall's Theorem there is a subset of k equations with <= k-1 variables among them, and we're done. Suppose we have a matching with n edges. Call a vertex v in V exposed if it is not adjacent to any edge in the matching. Say that a path is alternating if it is a simple path that starts at an exposed vertex and alternates between unmatched and matched edges, with the first and last edges being unmatched. (So the last vertex in the path is in Q.) Let Q' be the set of vertices in Q that are contained in some alternating path. Note that Q' can be computed in O(E) time by depth first search starting at the exposed vertices. If a subset S of k equations contains an equation in Q', then those equations have at least k+1 variables among them. For there are the k variables at the endpoints of the matched edges coming from the k equations, and in addition we can find one more variable as follows. Let q_0 be an equation in S that is Q'. Let q_0, v_0, q_1, v_1, ..., q_i, v_i be an alternating path from q_0 to an exposed vertex v_i. Let q_j be the last equation in the path that is in S. Vertex v_j is either the exposed vertex, v_i, or it is matched up with an equation q_{j+1} that is not in S, and since v_j is in equation q_j it brings the total number of variables among the equations in S up to k+1. So if Q' = Q there is no subset of k equations with <= k variables. Conversely, if Q'' = Q - Q' is nonempty, then Q'' has as many variables as equations. For let V'' be the set of variables matched up with the variables in Q'' (|V''| = |Q''|). Every variable in an equation in Q'' must be in V''. Suppose to the contrary that an equation q_0 in Q'' has a variable v that is not in V''. If v is exposed then edge (v, q_0) is an alternating path so q is in Q', a contradiction. Otherwise let q_1 be the equation that v is matched with. Since v is not in V'', q_1 is in Q'. But then the alternating path to q_1 can be extended with edges (q_1, v) and (v, q_0), so q_0 is in Q', again a contradiction. The complexity of the algorithm is dominated by the time to compute one maximal matching, O(|E| sqrt(m)). Bob Wilber