3. Greedy Works!
This write up is borrowed (with minor adaptations) from Chapter 4 of Jeff Erickson’s textbook on Algorithms.
Recall that from last time, we were dealing with:
- a finite universe of elements
- a weight function that assigns a non-negative weight with every universe element
- oracle access to a family of subsets of
and we wanted to determine the maximum weight attained by any set in , where the weight of a set is defined as the sum of the weights of the elements that belong to it.
To this end, we proposed the following greedy algorithm:
- Sort the elements of in descending order of weight; i.e, WLOG .
- Initialize a solution: specifically, let .
- For : if , .
Since the greedy algorithm did not work “at large”, we agreed that it might be interesting to impose some constraints on and see if we could argue that the greedy algorithm works for all families that enjoy some additional structure. After some back and forth, here is what we came up with:
A matroid consists of a finite ground set and a collection of subsets of that satisfies three axioms:
Non-emptiness. The empty set is in . (Thus, is not itself empty.)
Heredity. If contains a subset , then contains every subset of .
Exchange. If and are two sets in where , then for some element .
All this sets the stage for the main result that we want to get to:
Theorem. For any matroid and any weight function , the greedy algorithm that takes as input returns a maximum-weight basis of .
Proof. As you might anticipate, we use an exchange argument.
Let be the independent set returned by the greedy algorithm on input .
If any other element of could be added to to obtain a larger independent set, the greedy algorithm would have added it. Thus, is a basis.
For purposes of deriving a contradiction, suppose there is an independent set such that Without loss of generality, assume that is a basis. The exchange property now implies that .
Now suppose the elements of and are indexed in order of decreasing weight. Let be the smallest index such that , and consider the independent sets
By the exchange property, there is some element such that is an independent set. We have:
Thus, the greedy algorithm considers and rejects the heavier element before it considers the lighter element .
But this is impossible, since the greedy algorithm accepts elements in decreasing order of weight.
This result also has an interesting flip side, so to speak. When dealing with set systems that do not satisfy the exchange axiom, it turns out that we can adverserially choose a weight function that misleads the greedy algorithm into producing suboptimal output! The intuition is the following: a violation of the exchange axiom in a set system amounts to saying the following:
such that but for all , .
Now our plan is the following.
We “tempt” greedy into picking all elements in before anything in , by making the elements of just a little heavier than elements in, say, . Note that once greedy chooses all of , it cannot expand further into because of the violation above combined with its own committment to navigating with . So it will skip over all elements of : which is an opportunity for us to create a better optimal solution.
In particular, our goal will be to assign weights to elements in in such a way that in spite of the fact that elements in are heavier than those in . Here, we can take advantage of the fact that .
With these ideas in place, can you come up with the adverserial weight function?
Using to denote , suppose we use the following scheme:
- Every element of has weight .
- Every element of has weight .
- Every other element of the ground set has weight zero.
With these weights, the greedy algorithm will consider and accept every element of , then consider and reject every element of , and finally consider all the other elements. The algorithm returns a set with total weight . But the total weight of is at least . Thus, the output of the greedy algorithm is not the maximum-weight set in .
This completes the argument for the following fact:
For any heriditary family that is not a matroid, there is a weight function such that the greedy algorithm fails to output the optimal answer.
Recall the Dance Classes problem considered in Chapter 1. There is a natural subset system associated with this problem: A set of classes is independent if and only if no two classes overlap. This subset system is not a matroid, because there can be maximal independent sets of different sizes, which violates the exchange property. If we consider a weighted version of the class scheduling problem, where each class is worth a different number of hours, our result above implies that the greedy algorithm will not always find the optimal schedule.
However, this result here does not contradict the correctness of the greedy algorithm for the original unweighted problem; that problem uses a particularly lucky choice of weights (all equal), whereas the claim here is simply that there exists particularly nasty choice of weights that is designed to mislead greedy into making suboptimal choices.