Neeldhara
  • About
  • Research
    • Overview
    • People
    • Publications
    • Surveys
  • Teaching
    • Courses
    • Materials
  • Lists
    • Puzzles
    • Bookmarks
  • Exposition
    • Talks
    • Videos
  • Events
  • Blog

4. Scheduling with Deadlines

  • Advanced Algorithms
    • Flows
    • Intractability
    • Approximation Algorithms
    • Randomized Algorithms
    • Greedy Algorithms
      • 1. Storing Files on a Tape
      • 2. A Scheduling Problem
      • 3. Stable Matchings
    • Matroids
      • 1. The Motivation
      • 2. The Definition
      • 3. Greedy Works!
      • 4. Scheduling with Deadlines
    • Dynamic Programming
      • 1. Longest Increasing Subsequence
      • 2. Subset Sum
      • 3. Set Cover
      • 4. Optimal BSTs
      • 5. Maximum Independent Set on Trees

4. Scheduling with Deadlines

algonotes
lecturenotes
Published

June 7, 2023

Acknowledgements

This write up is borrowed (with minor adaptations) from Chapter E of Jeff Erickson’s textbook on Algorithms.

Suppose you have nnn tasks to complete in nnn days; each task requires your attention for a full day. Each task comes with a deadline, the last day by which the job should be completed, and a penalty that you must pay if you do not complete each task by its assigned deadline. What order should you perform your tasks in to minimize the total penalty you must pay?

More formally, you are given:

  • an array D[1,...,n]D[1,...,n]D[1,...,n] of deadlines and
  • an array P[1,...,n]P[1,...,n]P[1,...,n] of penalties.

Each deadline D[i]D[i]D[i] is an integer between 1 and nnn, and each penalty P[i]P[i]P[i] is a non-negative real number. A schedule is a permutation of the integers {1,2,…,n}\{1,2, \ldots, n\}{1,2,…,n}.

The scheduling problem asks you to find a schedule π\piπ that minimizes the following cost: cost⁡(π):=∑i=1nP[i]⋅[π(i)>D[i]]. \operatorname{cost}(\pi):=\sum_{i=1}^n P[i] \cdot[\pi(i)>D[i]]. cost(π):=i=1∑n​P[i]⋅[π(i)>D[i]].

This doesn’t look anything like a matroid optimization problem. For one thing, matroid optimization problems ask us to find an optimal set; this problem asks us to find an optimal permutation. Surprisingly, however, this scheduling problem is actually a matroid optimization in disguise!

For any schedule π\piπ, call tasks iii such that π(i)>D[i]\pi(i)>D[i]π(i)>D[i] late or delayed, and all other tasks on time.

To see this, consider that two schedules are equivalent from a cost perspective if they delay the same set of tasks. In particular, let S⊆[n]S \subseteq [n]S⊆[n] be some subset of tasks. Suppose we have a schedule π1\pi_1π1​ that delays all tasks in SSS and finishes all tasks that are not in SSS on time, and another schedule π2≠π1\pi_2 \neq \pi_1π2​=π1​ that also delays all tasks in SSS and finishes all tasks that are not in SSS on time. Observe that cost⁡(π1)=cost⁡(π2)\operatorname{cost}(\pi_1) = \operatorname{cost}(\pi_2)cost(π1​)=cost(π2​). Intuitively, we might as well turn our attention to subsets of completed tasks as opposed to the sequence in which we execute them. We summarize our discussion here in the following claim:

Claim Relating Permutations to Subsets

The cost of a schedule is determined by the subset of tasks that are on time.

Call a subset XXX of the tasks realistic if there is a schedule π\piπ in which every task in XXX is on time. We can precisely characterize the realistic subsets as follows. Let X(t)X(t)X(t) denote the subset of tasks in XXX whose deadline is on before ttt :

X(t):={i∈X∣D[i]≤t} X(t):=\{i \in X \mid D[i] \leq t\} X(t):={i∈X∣D[i]≤t}

In particular, X(0)=∅X(0)=\varnothingX(0)=∅ and X(n)=XX(n)=XX(n)=X.

Lemma characterizing realistic sets

Lemma. Let X⊆{1,2,…,n}X \subseteq\{1,2, \ldots, n\}X⊆{1,2,…,n} be an arbitrary subset of the nnn tasks. XXX is realistic if and only if ∣X(t)∣≤t|X(t)| \leq t∣X(t)∣≤t for every integer ttt.

Proof. Let π\piπ be a schedule in which every task in XXX is on time. Let iti_tit​ be the ttt th task in XXX to be completed. On the one hand, we have π(it)≥t\pi\left(i_t\right) \geq tπ(it​)≥t, since otherwise, we could not have completed t−1t-1t−1 other jobs in XXX before iti_{\mathrm{t}}it​. On the other hand, π(it)≤D[i]\pi\left(i_t\right) \leq D[i]π(it​)≤D[i], because iti_tit​ is on time. We conclude that D[it]≥tD\left[i_t\right] \geq tD[it​]≥t, which immediately implies that ∣X(t)∣≤t|X(t)| \leq t∣X(t)∣≤t.

Now suppose ∣X(t)∣≤t|X(t)| \leq t∣X(t)∣≤t for every integer ttt. If we perform the tasks in XXX in increasing order of deadline, then we complete all tasks in XXX with deadlines ttt or less by day ttt. In particular, for any i∈Xi \in Xi∈X, we perform task iii on or before its deadline D[i]D[i]D[i]. Thus, XXX is realistic.

We can now define a canonical schedule for any set XXX as follows: execute the tasks in XXX in increasing deadline order, and then execute the remaining tasks in any order. The previous proof implies that a set XXX is realistic if and only if every task in XXX is on time in the canonical schedule for XXX. Thus, our scheduling problem can be rephrased as follows:

Find a realistic subset XXX such that ∑i∈XP[i]\sum_{i \in X} P[i]∑i∈X​P[i] is maximized.

So we’re looking for optimal subsets after all! Now we could describe a greedy algorithm and show that it works out, or…

Realistic Sets have the Matroid Properties

Lemma. The collection of realistic sets of jobs forms a matroid.

Proof: The empty set is vacuously realistic, and any subset of a realistic set is clearly realistic. Thus, to prove the lemma, it suffices to show that the exchange property holds. Let XXX and YYY be realistic sets of jobs with ∣X∣>∣Y∣|X|>|Y|∣X∣>∣Y∣.

Consider keeping two dated notebooks, where the cover page is day 0, and the last page is day nnn. In a thought experiment, let us go over the days one by one, starting from the 0-th day, and on each day, we make a note of the number of tasks in X(t)X(t)X(t) on the first notebook, and the number of tasks in Y(t)Y(t)Y(t) on the second notebook — and we keep doing this until we observe that the number on the first notebook is bigger than the second. Notice that this has to happen at some point, since ∣X(n)∣>∣Y(n)∣|X(n)| > |Y(n)|∣X(n)∣>∣Y(n)∣ and both notebooks start at zero.

Now, let q⩽nq \leqslant nq⩽n be the smallest integer for which ∣X(q)∣>∣Y(q)∣|X(q)| > |Y(q)|∣X(q)∣>∣Y(q)∣. Notice that this means that there is at least one task in XXX with a deadline of qqq which is not in YYY. Pick any one such — say ttt — for inclusion in YYY: we claim that Y∪{t}Y \cup \{t\}Y∪{t} is realistic. Indeed, on any day ppp that is before qqq, we can appeal to the fact that YYY was realistic up to day ppp and we have introduced no new tasks in YYY that have a deadline of ppp or earlier. On the fated day qqq, the number of tasks we have in YYY is at most the number of tasks we have in XXX up to day qqq, and since that is at most qqq, we are done.

This lemma implies that our scheduling problem is actually a matroid optimization problem, so the greedy algorithm finds the optimal schedule.

GREEDYSCHEDULE(D[1 . . n], P[1 . . n]):
Sort P in increasing order, and permute D to match
j = 0
for i in 1,n:
    X[j+1] = i
    if X[1,...,j+1] is realistic
        j = j+1
return the canonical schedule for X[1...j]

We leave as an exercise to the reader to figure out an implementation of the oracle access to the family of realistic sets. In particular, this involves coming up an efficient procedure to test whether a given subset of jobs is realistic. It is possible to do this in linear time based on the properties of realistic sets established from before.


© 2022 • Neeldhara Misra • Credits •

 

Corrections? Please leave a comment here or a PR in this repository, thanks!

I’d rather be a failure at something I love than a success at something I hate.

George Burns

You live and you learn — at any rate, you live.

Douglas Adams

A problem worthy of attack proves its worth by fighting back.

Paul Erdos

×