Neeldhara
  • About
  • Research
    • Overview
    • People
    • Publications
    • Surveys
  • Teaching
    • Courses
    • Materials
  • Lists
    • Puzzles
    • Bookmarks
  • Exposition
    • Talks
    • Videos
  • Events
  • Blog
  1. Advanced Algorithms
  2. Matroids
  3. 2. The Definition
  • Advanced Algorithms
    • Greedy Algorithms
      • 1. Storing Files on a Tape
      • 2. A Scheduling Problem
      • 3. Stable Matchings
    • Matroids
      • 1. The Motivation
      • 2. The Definition
      • 3. Greedy Works!
      • 4. Scheduling with Deadlines
    • Dynamic Programming
      • 1. Longest Increasing Subsequence
      • 2. Subset Sum
      • 3. Set Cover
      • 4. Optimal BSTs
      • 5. Maximum Independent Set on Trees
    • Flows
      • 1. Flow-Cut Duality
    • Intractability
      • 1. Independent Set
    • Approximation
      • 1. Vertex Cover
      • 2. Set Cover
    • Randomized Algorithms
      • 1. Randomized QuickSort

2. The Definition

algonotes
lecturenotes
Published

June 7, 2023

Acknowledgements

The part of the write up that introduces terminology and examples is borrowed from Chapter E of Jeff Erickson’s textbook on Algorithms.

Recall that from last time, we were dealing with:

  • a finite universe of elements U={e1,e2,⋯ ,en}\cal U = \{e_1, e_2, \cdots, e_n\}U={e1​,e2​,⋯,en​}
  • a weight function w:U⟶Q⩾0w: \cal U \longrightarrow \mathbb{Q}^{\geqslant 0}w:U⟶Q⩾0 that assigns a non-negative weight with every universe element
  • oracle access to a family F={S1,…,Sm}\cal F = \{S_1, \ldots, S_m\}F={S1​,…,Sm​} of subsets of U\cal UU

and we wanted to determine the maximum weight attained by any set in F\cal FF, where the weight of a set is defined as the sum of the weights of the elements that belong to it. To this end, we proposed the following greedy algorithm:

A Greedy Algorithm
  1. Sort the elements of UUU in descending order of weight; i.e, WLOG w(e1)⩾⋯⩾w(en)w(e_1) \geqslant \cdots \geqslant w(e_n)w(e1​)⩾⋯⩾w(en​).
  2. Initialize a solution: specifically, let X=∅X = \emptysetX=∅.
  3. For i∈[n]i \in [n]i∈[n]: if X∪{ei}∈FX \cup \{e_i\} \in \cal FX∪{ei​}∈F, X⟵X∪{ei}X \longleftarrow X \cup \{e_i\}X⟵X∪{ei​}.

and concluded that it does not quite work, but might if the family came with the following promise:

If SSS and TTT are two subsets of U\cal UU with T∈FT \in \cal FT∈F and ∣S∣>∣T∣|S| > |T|∣S∣>∣T∣, then ∃e∈S∖T\exists e \in S \setminus T∃e∈S∖T such that T∪{e}∈FT \cup \{e\} \in \cal FT∪{e}∈F.

This is exactly what we need to make our proof work out: recall that TTT corresponds to the greedy choices, which always correspond to a set in F\cal FF, while SSS corresponds to a subset of the optimal solution, which has no obligation to be in F\cal FF. Let us call a family fantastic if it satisfies the property above. It is not difficult to show that the greedy algorithm produces the correct output when the input is a fantastic family: we leave this as an exercise.

However, notice that this is not a very interesting defintion. Consider any family F\cal FF that is non-trivial, i.e, has at least one non-empty subset of UUU in it: call this XXX. Then, by triggering this property on X∪{e}X \cup \{e\}X∪{e} for any e∉Xe \notin Xe∈/X, we see that all “immediate supersets” of XXX are forced to also belong to F\cal FF. Carrying this argument forward, we see that all supersets of XXX are forced to belong to F\cal FF. In particular, if ∅∈F\emptyset \in \cal F∅∈F — which will be the case for most applications of this framework — then the only family F\cal FF that qualifies as fantastic is the family of all possible subsets of UUU. So… while our notion of a fantastic family is perfectly well-defined and even useful, it is also rather restrictive and therefore uninteresting.

Our next step is to see if we can still get to our desired outcome (i.e, the correctness of the greedy algorithm), while not being so demanding with our definition. How about we modify our constraint to the following:

If SSS and TTT are two sets with S∈FS \in \cal FS∈F and T∈FT \in \cal FT∈F, and ∣S∣>∣T∣|S| > |T|∣S∣>∣T∣, then ∃e∈S∖T\exists e \in S \setminus T∃e∈S∖T such that T∪{e}∈FT \cup \{e\} \in \cal FT∪{e}∈F.

So let us say that a family is nice if it satisfies the property above. Can we now push our proof of correctness for the greedy algorithm above? Unfortunately, the answer is a negative. Can you come up with a counterexample before reading on?

A Counterexample

Let:

  • U={p,q}\cal U = \{p,q\}U={p,q}
  • w(p)=40,w(q)=30w(p) = 40, w(q) = 30w(p)=40,w(q)=30
  • F={{q},{pq}}\cal F = \{\{q\},\{pq\}\}F={{q},{pq}}

The greedy output will be {q}\{q\}{q}, while the optimal output is {pq}\{pq\}{pq}. Notice that the family is nice!

So well, it is back to the drawing board. Let us recall how our proof was supposed to go. Let:

  • TTT be the set of greedy choices up to the (i−1)th(i-1)^{th}(i−1)th iteration, and
  • SSS be the set of optimal choices up to the ithi^{th}ith iteration,

then the hope was that the property above would rescue us and contradict the greedy choices, since ∣S∣>∣T∣|S| > |T|∣S∣>∣T∣. However, in the counterexample above, we are unable to take advantage of the fact that F\cal FF is a nice family because S∉FS \notin \cal FS∈/F. Note that the optimal solution is obliged to only output a set in F\cal FF, but the intermediate choices as we are viewing them now have no obligation to necessarily belong to F\cal FF.

One way to force this is to insist that our family be heriditary, that is, the inclusion of a set S∈FS \in \cal FS∈F implies the inclusion of all subsets of SSS in F\cal FF as well. If this were true, then S∈FS \in \cal FS∈F since SSS is a subset of the optimal solution, which we know to be in F\cal FF. Note that the counterexample above did not involve a heriditary family. It turns out that this combination of properties is both: (a) permissive, in that a non-trivial number of families enjoy the property; and (b) useful, in that it gives us enough leverage to prove the correctness of the greedy algorithm.

With all this, we are now ready to bring up the official definition of a matroid.

Matroids

A matroid (U,F)(\cal U, \cal F)(U,F) consists of a finite ground set U\cal UU and a collection F\mathcal{F}F of subsets of U\cal UU that satisfies three axioms:

  • Non-emptiness. The empty set is in F\mathcal{F}F. (Thus, F\mathcal{F}F is not itself empty.)

  • Heredity. If F\mathcal{F}F contains a subset S⊆US \subseteq \cal US⊆U, then F\mathcal{F}F contains every subset of SSS.

  • Exchange. If SSS and TTT are two sets in F\cal FF where ∣S∣>∣T∣|S|>|T|∣S∣>∣T∣, then T∪{e}∈FT \cup\{e\} \in \mathcal{F}T∪{e}∈F for some element e∈S\Te \in S \backslash Te∈S\T.

You can think of the first property as a formality that excludes the empty family from being a matorid. The other two have independent historical motivations, but we have also just seen a good reason to find these properties particularly desirable: which is the way it plays into the proof of correctness of our greedy algorithm.

Speaking of history, let us introduce some terms commonly associated with matroids. Some of these words may ring familiar if you have encountered linear algebra before: and indeed, the crossover is not unintented — a fundamental type of matroid is one that is naturally associated with a matrix, and a lot of the terms are inspired from this setting.

First, the sets of F\cal FF are called independent sets, and all subsets of U\cal UU that did not make it to F\cal FF are called — you guessed it — dependent sets. Note that the exchange axiom ensures that all maximal sets of a matroid F\cal FF have the same size. The maximal sets are called a basis, and their common size is called the rank of a matroid. A minimal dependent set is called a circuit.

Before moving on, we enlist some examples of matroids, to reaffirm the idea that the definition indeed allows for a variety of interesting set systems. We do not elaborate on the details of how these families satisfy the matroid axioms: the motivated reader is welcome to pursue this further as an exercise.

  • Uniform matroid Uk,nU_{k, n}Uk,n​. A subset X⊆{1,2,…,n}X \subseteq\{1,2, \ldots, n\}X⊆{1,2,…,n} is independent if and only if ∣X∣≤k|X| \leq k∣X∣≤k. Any subset of {1,2,…,n}\{1,2, \ldots, n\}{1,2,…,n} of size kkk is a basis; any subset of size k+1k+1k+1 is a circuit.

  • Graphic/cycle matroid M(G)\mathcal{M}(G)M(G). Let G=(V,E)G=(V, E)G=(V,E) be an arbitrary undirected graph. A subset of EEE is independent if it defines an acyclic subgraph of GGG. A basis in the graphic matroid is a spanning tree of GGG; a circuit in this matroid is a cycle in GGG.

  • Cographic/cocycle matroid M∗(G)\mathcal{M}^*(G)M∗(G). Let G=(V,E)G=(V, E)G=(V,E) be an arbitrary undirected graph. A subset I⊆EI \subseteq EI⊆E is independent if the complementary subgraph (V,E\I)(V, E \backslash I)(V,E\I) of GGG is connected. A basis in this matroid is the complement of a spanning tree; a circuit in this matroid is a cocycle-a minimal set of edges that disconnects the graph.

  • Matching matroid. Let G=(V,E)G=(V, E)G=(V,E) be an arbitrary undirected graph. A subset I⊆VI \subseteq VI⊆V is independent if there is a matching in GGG that covers III.

  • Disjoint path matroid. Let G=(V,E)G=(V, E)G=(V,E) be an arbitrary directed graph, and let sss be a fixed vertex of GGG. A subset I⊆VI \subseteq VI⊆V is independent if and only if there are edge-disjoint paths from sss to each vertex in III.

Up next, we consolidate our learnings, and present a proof that should strike the reader as at least slightly boring, given that it is completely predictable in the light of our discussion so far.


© 2022 • Neeldhara Misra • Credits •

 

Corrections? Please leave a comment here or a PR in this repository, thanks!

I’d rather be a failure at something I love than a success at something I hate.

George Burns

You live and you learn — at any rate, you live.

Douglas Adams

A problem worthy of attack proves its worth by fighting back.

Paul Erdos

×