## Saturday, July 23, 2016

### Coursera Approximation Algorithms Part II - Peer Assignment 1

Problem:

Solution:

(1) There are $m$ variables (i.e. the number of subsets) and $n$ constraints (the number of nodes) in the primal program, so there are $n$ variables and $m$ constraints in the dual. To fit the rest of the problem, we name the dual variables $y_i$.

The primal program is a minimization, so the dual program is a maximization.

The primal program's constraint vector is all ones, so the dual optimization criterion is

$\max \sum\limits_{1 \le i \le n} y_i$

The dual constraints come from the transpose of the primal constraint matrix, with the weights $w_j$ as the right-hand sides, so we have

$\forall j \in [1, m] \sum\limits_{e_i \in S_j} y_i \le w_j$.

Last but not least, we have the non-negativity constraints

$\forall i \in [1, n] \ y_i \ge 0$
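
For reference, here is the full primal/dual pair written out. The primal is assumed to be the standard weighted set cover LP relaxation (the problem statement is not reproduced above, but this matches the description of the primal used here):

Primal: $\min \sum\limits_{1 \le j \le m} w_j x_j$ subject to $\forall i \in [1, n] \ \sum\limits_{j : e_i \in S_j} x_j \ge 1$ and $\forall j \in [1, m] \ x_j \ge 0$.

Dual: $\max \sum\limits_{1 \le i \le n} y_i$ subject to $\forall j \in [1, m] \ \sum\limits_{e_i \in S_j} y_i \le w_j$ and $\forall i \in [1, n] \ y_i \ge 0$.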

(2) There are two ways to look at this and they are equally valid.

First, we only increase a dual variable $y_i$ when the associated element $e_i$ is not yet covered. Right after the increase, a subset containing $e_i$ becomes tight and is added to the cover, so $e_i$ is covered from then on and we will never attempt to increase $y_i$ again. So a dual variable is increased at most once during the algorithm.

Second, the algorithm maintains dual feasibility and raises a dual variable all the way to its limit the one time it touches it, i.e. until some constraint involving $y_i$ becomes tight. Any further increase of $y_i$ would violate that constraint, so the variable cannot be increased again; again, each dual variable is increased at most once.

(3) It is clear that if the algorithm terminates, the output is a valid cover, because the only way to get out of the while loop is for every element to be covered.

The algorithm must terminate because each iteration adds at least one new subset to the cover and there are only finitely many subsets; in the worst case we take them all.
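
The assignment's pseudocode is not reproduced above, but the steps being analyzed match the standard primal-dual set cover algorithm: repeatedly pick an uncovered element $e_i$, raise $y_i$ until some subset containing it becomes tight, and add that subset to the cover. Here is a minimal Python sketch under that assumption; the function name and instance format are mine, not from the course.

```python
def primal_dual_set_cover(n, sets, weights):
    """Primal-dual set cover sketch (assumed algorithm, not the course's code).

    n       -- number of elements e_1 .. e_n (named 1 .. n here)
    sets    -- list of the m subsets S_1 .. S_m, each a set of element indices
    weights -- list of the m non-negative weights w_1 .. w_m

    Assumes every element belongs to at least one subset (otherwise no cover
    exists).  Returns (I, y): indices of the chosen subsets and the dual vector.
    """
    m = len(sets)
    y = [0.0] * (n + 1)        # dual variable y_i per element; y[0] unused
    slack = list(weights)      # remaining slack w_j - sum_{e_i in S_j} y_i
    I = []                     # indices (0-based) of subsets in the cover
    covered = [False] * (n + 1)

    while not all(covered[1:]):
        # pick any element that is not yet covered
        i = next(e for e in range(1, n + 1) if not covered[e])
        containing = [j for j in range(m) if i in sets[j]]
        # raise y_i as far as dual feasibility allows: until the tightest
        # constraint among the subsets containing e_i becomes an equality
        delta = min(slack[j] for j in containing)
        y[i] += delta
        for j in containing:
            slack[j] -= delta
        # add one newly tight subset to the cover; it covers e_i, so y_i
        # will never be raised again (claim (2) above)
        tight = next(j for j in containing if slack[j] == 0.0)
        I.append(tight)
        for e in sets[tight]:
            covered[e] = True

    return I, y


# Tiny usage example: elements {1, 2, 3}, S_1 = {1, 2}, S_2 = {2, 3}, S_3 = {3}
sets = [{1, 2}, {2, 3}, {3}]
weights = [1.0, 1.0, 2.0]
I, y = primal_dual_set_cover(3, sets, weights)
cost = sum(weights[j] for j in I)
f = max(sum(1 for S in sets if e in S) for e in (1, 2, 3))
assert cost <= f * sum(y)   # the bound established in (9) below
```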

(4) The original primal program is a minimization, and it is the LP relaxation of the set cover problem: every integral set cover is also a feasible fractional solution, so the optimal fractional solution $x^*$ is at least as good as the optimal integral one. In other words:

$OPT \ge val(x^*)$

By strong duality, we have $val(x^*) = val(y^*)$, so we have

$OPT \ge val(x^*) = val(y^*)$

(5) Next, the algorithm makes sure the vector $y$ stays feasible by never letting any increase violate a constraint. Therefore, when the algorithm terminates, $y$ is a dual feasible vector.

(6) Since the dual program is a maximization program, $val(y)$ for any feasible vector $y$ is at most $val(y^*)$. In equations, we have:

$OPT \ge val(x^*) = val(y^*) \ge val(y)$

(7) $j \in I$ implies that, at the point in the algorithm where $S_j$ was picked, its constraint was maxed out (tight), so we have:

$\sum\limits_{e_i \in S_j} y_i = w_j$

This is of course the same as

$w_j = \sum\limits_{e_i \in S_j} y_i$

(8) We simply sum the above over all $j \in I$, which gives

$\sum\limits_{j \in I} w_j = \sum\limits_{j \in I} \sum\limits_{e_i \in S_j} y_i$

(9) Let's swap the summation order by introducing an indicator variable; we have

$\sum\limits_{j \in I} w_j = \sum\limits_{1 \le i \le n} \sum\limits_{j \in I} X(i, j) y_i$

where $X(i, j) = 1$ if $e_i \in S_j$ and 0 otherwise.

Now we can factor out $y_i$, since it does not depend on $j$.

$\sum\limits_{j \in I} w_j = \sum\limits_{1 \le i \le n} [y_i \sum\limits_{j \in I} X(i, j)]$

What about the inner sum over $j$? Remember that each node belongs to at most $f$ sets, so we can write

$\sum\limits_{j \in I} X(i,j) \le \sum\limits_{j \in [1,m]} X(i,j) \le f$

Therefore we conclude

$\sum\limits_{j \in I} w_j = \sum\limits_{1 \le i \le n} [y_i \sum\limits_{j \in I} X(i, j)] \le \sum\limits_{1 \le i \le n} y_i f = f\sum\limits_{1 \le i \le n} y_i = f \cdot val(y)$.
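
As a quick sanity check of the summation swap and the bound, consider a toy instance (my own, not from the assignment) with $n = 2$, $m = 2$, $S_1 = \{e_1\}$, $S_2 = \{e_1, e_2\}$, and $I = \{1, 2\}$. Then

$\sum\limits_{j \in I} \sum\limits_{e_i \in S_j} y_i = y_1 + (y_1 + y_2) = 2 y_1 + y_2 = \sum\limits_{1 \le i \le 2} [y_i \sum\limits_{j \in I} X(i, j)]$

and since $e_1$ appears in $f = 2$ sets while $e_2$ appears in only one, this is at most $2(y_1 + y_2) = f \cdot val(y)$.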

(10) Finally, from (6) we have

$OPT \ge val(x^*) = val(y^*) \ge val(y)$

Multiplying by $f$ and using (9), we finally arrive at:

$f \cdot OPT \ge f \cdot val(x^*) = f \cdot val(y^*) \ge f \cdot val(y) \ge \sum\limits_{j \in I} w_j$

In other words, the algorithm is an $f$-approximation for the set cover problem.