Five years ago, the era of personalized genomics started, and
Nicholas Volker is a poster child of personalized genomics.
He was so sick that he went through numerous surgeries, but
doctors still were not able to figure out what was wrong with him.
However, after his genome was sequenced, it revealed a mutation
in a gene linked to a defect in the immune system.
Doctors applied immunotherapy and Nicholas Volker is now a healthy child.
However, sequencing personal genomes
from scratch still remains difficult even today.
What biologists do today, however, is
so-called reference-based human genome sequencing.
Let's start from Craig Venter's genome, assembled in 2000, and
call it the reference genome.
And then let's start sequencing my genome by generating all reads from my genome.
Some of these reads perfectly match the reference genome, but some of them don't.
And based on the reads that do not match,
we will be able to figure out what my genome is.
For example, we can find a mutation of T into C and
a deletion of T in my genome as compared to the reference genome.
It brings us to a number of computational problems.
The easiest one is exact pattern matching.
Given a string Pattern and a string Text, we want to find all positions
in Text where Pattern appears as a substring.
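
To make the problem statement concrete, here is a minimal sketch of the most direct way to solve exact pattern matching: slide Pattern along Text and compare at every offset. The function name find_occurrences and the sample strings are made up for illustration, and positions are assumed to be 0-based. This is only the brute-force approach, not necessarily the method the lecture goes on to develop.

```python
from typing import List


def find_occurrences(pattern: str, text: str) -> List[int]:
    """Return every 0-based position in text where pattern starts.

    Brute-force sketch: compares pattern against each window of text,
    so the running time is O(len(text) * len(pattern)) in the worst case.
    """
    if not pattern:
        # Assumption: an empty pattern is treated as having no occurrences.
        return []
    positions = []
    k = len(pattern)
    # Try every offset at which pattern could still fit inside text.
    for i in range(len(text) - k + 1):
        if text[i:i + k] == pattern:
            positions.append(i)
    return positions


if __name__ == "__main__":
    # Overlapping occurrences are all reported.
    print(find_occurrences("ATA", "GATATATGCATATACTT"))  # prints [1, 3, 9, 11]
```

Since every read has to be checked against the whole reference genome, this quadratic-time behavior is what makes the naive approach too slow in practice.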