About this Course
391 ratings
89 reviews

100% online

Start instantly and learn at your own schedule.

Flexible deadlines

Reset deadlines in accordance with your schedule.

Approx. 22 hours to complete


Subtitles: English

Skills you will gain

Bioinformatics Algorithms, Algorithms, Python Programming, Algorithms On Strings

Syllabus - What you will learn from this course

4 hours to complete

DNA sequencing, strings and matching

In this module, we begin our exploration of algorithms for analyzing DNA sequencing data. We'll discuss DNA sequencing technology, its past and present, and how it works. ...
19 videos (Total 112 min), 7 readings, 2 quizzes
19 videos
Lecture: Why study this? 4m
Lecture: DNA sequencing past and present 3m
Lecture: Genomes as strings, reads as substrings 5m
Lecture: String definitions and Python examples 3m
Practical: String basics 7m
Practical: Manipulating DNA strings 7m
Practical: Downloading and parsing a genome 6m
Lecture: How DNA gets copied 3m
Optional lecture: How second-generation sequencers work 7m
Optional lecture: Sequencing errors and base qualities 6m
Lecture: Sequencing reads in FASTQ format 4m
Practical: Working with sequencing reads 11m
Practical: Analyzing reads by position 6m
Lecture: Sequencers give pieces to genomic puzzles 5m
Lecture: Read alignment and why it's hard 3m
Lecture: Naive exact matching 10m
Practical: Matching artificial reads 6m
Practical: Matching real reads 7m
7 readings
Welcome to Algorithms for DNA Sequencing 10m
Pre Course Survey 10m
Setting up Python (and Jupyter) 10m
Getting slides and notebooks 10m
Using data files with Python programs 10m
Programming Homework 1 Instructions (Read First) 10m
2 practice exercises
Module 1 20m
Programming Homework 1 14m
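
As a rough illustration of the naive exact matching named in this module's lecture list, here is a minimal Python sketch. It is not taken from the course materials; the function name `naive_match` and the sample strings are assumptions made for this example.

```python
def naive_match(pattern, text):
    """Return every offset in text where pattern occurs exactly.

    Hypothetical helper: tries each possible alignment of pattern
    against text and compares the two character by character.
    """
    occurrences = []
    # Slide the pattern across every offset where it still fits.
    for i in range(len(text) - len(pattern) + 1):
        if text[i:i + len(pattern)] == pattern:
            occurrences.append(i)
    return occurrences


if __name__ == "__main__":
    print(naive_match("AG", "AGCTTAGATAGC"))
```

In the worst case this checks every offset against every pattern character, which is one motivation for the faster methods listed in the next module.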
3 hours to complete

Preprocessing, indexing and approximate matching

In this module, we learn useful and flexible new algorithms for solving the exact and approximate matching problems. We'll start by learning Boyer-Moore, a fast and very widely used algorithm for exact matching...
15 videos (Total 114 min), 1 reading, 2 quizzes
15 videos
Lecture: Boyer-Moore basics 8m
Lecture: Boyer-Moore: putting it all together 6m
Lecture: Diversion: Repetitive elements 5m
Practical: Implementing Boyer-Moore 10m
Lecture: Preprocessing 7m
Lecture: Indexing and the k-mer index 10m
Lecture: Ordered structures for indexing 8m
Lecture: Hash tables for indexing 7m
Practical: Implementing a k-mer index 7m
Lecture: Variations on k-mer indexes 9m
Lecture: Genome indexes used in research 9m
Lecture: Approximate matching, Hamming and edit distance 6m
Lecture: Pigeonhole principle 6m
Practical: Implementing the pigeonhole principle 9m
1 reading
Programming Homework 2 Instructions (Read First) 10m
2 practice exercises
Module 2 20m
Programming Homework 2 12m
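
To make the idea of approximate matching under Hamming distance (named in this module's lecture list) concrete, here is a minimal Python sketch. It is a plain brute-force version, not the indexed or pigeonhole-based approach the module covers; the names `hamming_distance` and `approximate_match` are assumptions made for this example.

```python
def hamming_distance(x, y):
    """Count the positions at which two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(x, y))


def approximate_match(pattern, text, max_mismatches):
    """Return offsets where pattern matches text with at most
    max_mismatches substitutions (hypothetical brute-force helper)."""
    occurrences = []
    for i in range(len(text) - len(pattern) + 1):
        window = text[i:i + len(pattern)]
        if hamming_distance(pattern, window) <= max_mismatches:
            occurrences.append(i)
    return occurrences


if __name__ == "__main__":
    print(approximate_match("ACGT", "ACGAACGTTCGT", 1))
```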
2 hours to complete

Edit distance, assembly, overlaps

This week we finish our discussion of read alignment by learning about algorithms that solve both the edit distance problem and related biosequence analysis problems, like global and local alignment....
13 videos (Total 92 min), 1 reading, 2 quizzes
13 videos
Lecture: Solving the edit distance problem 12m
Lecture: Using dynamic programming for edit distance 12m
Practical: Implementing dynamic programming for edit distance 6m
Lecture: A new solution to approximate matching 9m
Lecture: Meet the family: global and local alignment 10m
Practical: Implementing global alignment 8m
Lecture: Read alignment in the field 4m
Lecture: Assembly: working from scratch 2m
Lecture: First and second laws of assembly 8m
Lecture: Overlap graphs 8m
Practical: Overlaps between pairs of reads 4m
Practical: Finding and representing all overlaps 3m
1 reading
Programming Homework 3 Instructions (Read First) 10m
2 practice exercises
Module 3 20m
Programming Homework 3 8m
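
As an illustration of the dynamic-programming approach to edit distance named in this module's lecture list, here is a minimal Python sketch. It is a standard textbook formulation with unit costs for substitutions, insertions and deletions, not necessarily the course's own code; the function name `edit_distance` is an assumption made for this example.

```python
def edit_distance(x, y):
    """Return the edit (Levenshtein) distance between strings x and y."""
    rows, cols = len(x) + 1, len(y) + 1
    # dist[i][j] holds the distance between x[:i] and y[:j].
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete all i characters of x[:i]
    for j in range(cols):
        dist[0][j] = j  # insert all j characters of y[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            # Diagonal move: free if the characters match, else cost 1.
            substitute = dist[i - 1][j - 1] + (x[i - 1] != y[j - 1])
            delete = dist[i - 1][j] + 1
            insert = dist[i][j - 1] + 1
            dist[i][j] = min(substitute, delete, insert)
    return dist[-1][-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

Filling the table takes time proportional to the product of the two string lengths.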
2 hours to complete

Algorithms for assembly

In the last module, we began our discussion of the assembly problem and saw a couple of basic principles behind it. In this module, we'll learn a few ways to solve the assembly problem....
13 videos (Total 83 min), 1 reading, 2 quizzes
13 videos
Lecture: The shortest common superstring problem 8m
Practical: Implementing shortest common superstring 4m
Lecture: Greedy shortest common superstring 7m
Practical: Implementing greedy shortest common superstring 7m
Lecture: Third law of assembly: repeats are bad 5m
Lecture: De Bruijn graphs and Eulerian walks 8m
Practical: Building a De Bruijn graph 4m
Lecture: When Eulerian walks go wrong 9m
Lecture: Assemblers in practice 8m
Lecture: The future is long? 9m
Lecture: Computer science and life science 5m
Lecture: Thank yous 43s
1 reading
Post Course Survey 10m
2 practice exercises
Programming Homework 4 8m
Module 4 14m
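
To give a sense of the De Bruijn graph construction named in this module's lecture list, here is a minimal Python sketch. It assumes the common formulation in which each k-mer of the input contributes one edge from its left (k-1)-mer to its right (k-1)-mer; the function name `de_bruijn_edges` and the sample string are assumptions made for this example, not course material.

```python
def de_bruijn_edges(sequence, k):
    """Return the De Bruijn graph of sequence as a list of edges.

    Each k-mer yields one directed edge from its first k-1
    characters to its last k-1 characters. Repeated k-mers
    yield repeated edges, so the result is a multigraph.
    """
    edges = []
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        edges.append((kmer[:-1], kmer[1:]))
    return edges


if __name__ == "__main__":
    for left, right in de_bruijn_edges("ACGCGT", 3):
        print(left, "->", right)
```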
89 Reviews


started a new career after completing these courses


got a tangible career benefit from this course

Top Reviews

By VK, Aug 8th 2017

This course provided me a very quick overview of all the core concepts pertaining to DNA sequencing. It is very well organized, with crystal-clear demonstration of concepts, and I really enjoyed the course.

By AZ, Mar 11th 2016

Awesome, you will learn a lot about how DNA assemblers work, but very challenging and time-demanding, especially if your background is in life science and not computer science.



Ben Langmead, PhD

Assistant Professor
Computer Science

Jacob Pritt

Department of Computer Science

About Johns Hopkins University

The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for life-long learning, to foster independent and original research, and to bring the benefits of discovery to the world....

About the Genomic Data Science Specialization

This specialization covers the concepts and tools to understand, analyze, and interpret data from next-generation sequencing experiments. It teaches the most common tools used in genomic data science, including how to use the command line, Python, R, Bioconductor, and Galaxy. The sequence is a standalone introduction to genomic data science or a perfect complement to a primary degree or postdoc in biology, molecular biology, or genetics. To audit Genomic Data Science courses for free, visit https://www.coursera.org/jhu, click the course, click Enroll, and select Audit....
Genomic Data Science

Frequently Asked Questions

  • Once you enroll for a Certificate, you’ll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.