Let's see how this substructure lemma naturally leads to a recursive search algorithm.
I'm not going to bother to specify the base cases, which occur when you're given a graph with at most one vertex, or a value of k equal to 0 or 1. So let's assume that the graph has at least two vertices and that k is at least 2.
I wouldn't be surprised if this algorithm reminds you of some of the recursive
implementations we discussed for various dynamic programming algorithms.
And in the same way those algorithms exhibited exponential running time,
without memoization, we're going to see an exponential running time here.
For all of our dynamic programming algorithms, we were able to cleverly organize sub-problems from smallest to largest, thereby eliminating redundantly solving the same sub-problems over and over again, yielding polynomial running time bounds. Here, we're going to be stuck with an exponential running time bound.
That's obviously not surprising, given that it's an NP-complete problem, but still, if you really want to understand this deeply, take this recursive search algorithm and try to come up with a polynomial-time version: try to come up with a small set of sub-problems, ordered from smallest to largest, that you can solve systematically. You will inevitably be stymied in those attempts, but you'll better appreciate what's non-trivial about the recursive solution here.
In step 1 of the algorithm, we pick some arbitrary edge (u, v).
Why is it useful to focus on an edge? Well, by the definition of a vertex
cover, it has to include either u or v, so that's our initial clue.
We're going to proceed optimistically. We're looking for a vertex cover of size k, so let's assume one exists. What does the substructure lemma tell us? It says that if such a solution exists, then necessarily either G_u or G_v (or both) themselves have small vertex covers, of size only k - 1. Moreover, it's a simple matter to extend such a vertex cover to a small one for G: you just augment it with the missing vertex.
So let's first adopt as a working hypothesis that G_u has a small vertex cover of size only k - 1, and let's recursively try to find it. If our recursive search comes back with a vertex cover of size k - 1 for G_u, we're good to go. We just augment that by the vertex u, and that's our vertex cover of size k for the original graph G. If that recursive call fails to find a small vertex cover, well, you say, then it must be that v is the key to getting a small vertex cover. So we do exactly the same thing: we recursively search G_v for a vertex cover of size k - 1.
If the second recursive call also fails to find a small vertex cover, one of size only k - 1, then by the other direction of the substructure lemma, we know that the original graph G cannot have a vertex cover of size k. If it did, the substructure lemma tells us, one of the two recursive calls would have succeeded. Since they both failed, we correctly conclude that the original graph does not have a small vertex cover.
The correctness analysis is straightforward; formally, you would proceed by induction.
The induction hypothesis would then guarantee the correctness of the two recursive calls: on the smaller sub-problems G_u and G_v, the recursive calls correctly detect whether or not there are vertex covers of size k - 1. The substructure lemma guarantees that we correctly compile the results of the two recursive calls into the output of our algorithm.
So if one of the two recursive calls succeeds, we're good to go: we just output the vertex cover of size k, as desired. And, this is the crucial part, if both of the recursive calls fail, then by the substructure lemma we correctly report failure: we know there is not a vertex cover of size k or less in the original graph G.
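Before the running time analysis, here's what the whole recursive search looks like written out as a minimal Python sketch. The set-of-edges graph representation, the helper remove_vertex, and the exact handling of the base cases are my own illustrative choices, not anything fixed by the pseudocode above.

```python
# A minimal sketch of the recursive search, assuming the graph is given as a
# set of edges (pairs of hashable vertices). The representation, the helper
# remove_vertex, and the base-case handling are illustrative choices.

def remove_vertex(edges, v):
    """Edge set of G_v: the graph G with v and all edges incident to v deleted."""
    return {e for e in edges if v not in e}

def vertex_cover(edges, k):
    """Return a vertex cover of size at most k if one exists, otherwise None."""
    # Base cases: with no edges left, the empty set is a valid cover; with
    # edges still present but k exhausted, no cover of the allowed size exists.
    if not edges:
        return set()
    if k == 0:
        return None

    # Step 1: pick an arbitrary edge (u, v); any vertex cover must contain u or v.
    u, v = next(iter(edges))

    # Working hypothesis: G_u has a vertex cover of size only k - 1.
    cover = vertex_cover(remove_vertex(edges, u), k - 1)
    if cover is not None:
        return cover | {u}          # augment by the missing vertex u

    # Otherwise v must be the key: search G_v for a cover of size k - 1.
    cover = vertex_cover(remove_vertex(edges, v), k - 1)
    if cover is not None:
        return cover | {v}          # augment by the missing vertex v

    # Both recursive calls failed, so by the substructure lemma
    # G has no vertex cover of size k.
    return None

# Example: a 4-cycle has a vertex cover of size 2 but not of size 1.
cycle = {(1, 2), (2, 3), (3, 4), (4, 1)}
print(vertex_cover(cycle, 2))  # e.g. {1, 3} or {2, 4}
print(vertex_cover(cycle, 1))  # None
```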
So let's analyze the running time using an ad hoc analysis. Let's think about the number of recursive calls there can be over the entire trajectory of the algorithm, or equivalently the number of nodes in the recursion tree generated by the algorithm, and then we'll come up with a bound on how much work is done in a given recursive call, not counting the work done by its recursive subcalls.
And now, the cool part, and this is really where the role of a small value of k comes into play, is that each time we recurse, we subtract 1 from the target size of the vertex cover that we're looking for. Remember, if we are given as input the parameter k, each of our recursive calls involves the parameter value k - 1.
So what this does is limit the recursion depth, or the depth of the tree generated by this algorithm, to be at most k. Initially you start with k; each time you recurse you decrement it by 1, and you're going to hit base cases by the time k is around 1 or 0.
Putting those two observations together, branching factor bounded by two, recursion depth bounded by k, the number of recursive calls in total, over the entire lifetime of this algorithm, is going to be bounded by 2^k.
Moreover, if you inspect the pseudocode, you see that there's not a whole lot of action going on other than the recursive subcalls. So even with a very sloppy implementation, where you have to construct G_u or G_v from scratch from the input graph G, even if you do that crudely, you're going to get a linear-time bound, O(m) work per recursive call, not counting the work that gets done later by the recursive subcalls. So that means an upper bound on the total amount of work done by this algorithm is the number of recursive calls, 2^k, times the work per call, O(m), for a total of O(2^k * m).
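One way to see that bound in a single line: if T(k) denotes the worst-case running time with parameter value k on a graph with m edges, the recursion gives
$$T(k) \le 2\,T(k-1) + O(m),$$
and unrolling this charges $O(m)$ to each of the at most $1 + 2 + 4 + \cdots + 2^k = 2^{k+1} - 1$ nodes of the recursion tree, for a total of $O(2^k \cdot m)$.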
Now, this running time is, you'll of course notice, still exponential in k. But of course it has to be exponential, unless we intend on proving P = NP; don't forget, the vertex cover problem is NP-complete.
So what have we accomplished? We had a trivial exponential-time algorithm by brute-force search earlier. Why did we go through this rigmarole to come up with a second algorithm that also has an exponential running time bound? Well, this is a qualitatively better running time bound than we had before with naive brute-force search. Here are two ways to see why this running time bound is qualitatively superior.
First, let's just inspect this running time from a mathematical perspective and ask how large we can take k while still having a polynomial-time algorithm. Recall that our naive brute-force search algorithm, where you just try each subset of k vertices, runs in time Theta(n^k). So that's polynomial time if k is constant, and it's super-polynomial time if k is bigger than a constant. For this running time bound, 2^k * m, we can let k grow as large as the logarithm of the graph size and still have a polynomial-time algorithm.
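To see why, write out the calculation: if $k = c \log_2 n$ for a constant c, then
$$2^k \cdot m = 2^{c \log_2 n} \cdot m = n^{c} \cdot m,$$
which is polynomial in the graph size, whereas the brute-force bound $n^k = n^{c \log_2 n}$ is already super-polynomial for any such k.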
So now let's segue from these mathematical bounds to thinking through actually implementing these algorithms and running them on actual data.
For the naive brute-force search, you've got this running time of n^k, so even for a relatively small graph, say n is maybe fifty or a hundred, you're not going to be able to run this naive brute-force search algorithm except for extremely small values of k: 3, 4, maybe 5 if you're lucky. So naive brute-force search has a very limited range of applicability.
By contrast, our smarter search algorithm can accommodate a noticeably larger range of k's. Remember, the running time here is 2^k times the graph size, and so even for values of k up to 20, for many networks of interest you're going to be able to implement and run this algorithm in a reasonable amount of time.
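To make that concrete, here's a quick back-of-the-envelope comparison in Python; the particular values n = 100, m = 1000, k = 20 are hypothetical numbers chosen just for illustration.

```python
# Rough comparison of the two running time bounds discussed above,
# for hypothetical values of n (vertices), m (edges), and k (cover size).
n, m, k = 100, 1000, 20

brute_force = n ** k      # Theta(n^k) subsets to check: about 10^40, hopeless
smarter = 2 ** k * m      # O(2^k * m): roughly 10^9 steps, quite feasible

print(f"n^k     = {brute_force:.3e}")
print(f"2^k * m = {smarter:.3e}")
```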