This syllabus and all lectures copyright (c) 1999 by Steven L. Salzberg. Students are prohibited from selling (or being paid for taking) notes during this course to or by any person or commercial firm without the express written permission of Professor Salzberg.


CS 600.439 Principles of Computational Biology
Fall 1999

Professor: Steven Salzberg, salzberg@tigr.org 
Time: Wednesdays, 4:30-7:10 p.m. 
Location: NEB B28 (ATT classroom) and JHU Montgomery County Center (simultaneously) 

This course is co-listed in the Part-time Programs in Engineering, Department of Computer Science, as  605.491 Principles of Computational Sequence Analysis.

Textbooks:

Introduction to Computational Molecular Biology by Joao Setubal and Joao Meidanis.  Publisher: PWS Publishing Company, Boston, 1997.  (The website lists errata for the text.)  Abbreviated as SM below.

Computational Methods in Molecular Biology edited by Steven Salzberg, David Searls, and Simon Kasif.  Publisher: Elsevier Science B.V., Amsterdam, 1998.  Softcover edition.  Abbreviated as SSK below.
 

Syllabus

Sept. 8. First day of class. Introduction to the course. Overview of computational biology and genomics.  Introduction to molecular biology for non-biologists: DNA basics, replication, transcription, translation, splicing.  DNA sequencing technology.  Whole-genome shotgun sequencing strategies.  Guest lecturer: Owen White, Ph.D., Deputy Director of Bioinformatics, The Institute for Genomic Research.

Reading:

Sept. 15. Strings and graphs.  Sequence alignment.  Global alignment; local alignment using the Smith-Waterman algorithm.  PAM matrices.  Sequence alignment using BLAST.  Get first homework assignment here.

Reading:

Sept. 22.  Large-scale sequence alignment using MUMmer.  Sequence assembly: shortest superstring, greedy assembly algorithms.  Algorithms for sequencing by hybridization.

Reading:

Sept. 29.  Mapping.  Restriction site mapping, hybridization mapping.  Interval graphs.  Consecutive ones property and associated algorithms.  Optical mapping.  Get second homework assignment here.
Assignment 1 due today.

Reading:

October 6.  Gene indices.  EST sequencing projects.  Assembling genes from EST databases.  Aligning ESTs to genomic sequence.  Guest lecturer: John Quackenbush, Ph.D., The Institute for Genomic Research.
Get the third homework assignment here.
Assignment 2 due today.

Reading:

October 13.  Introduction to probability.  Markov chains.  Scoring DNA sequence patterns using Markov chains.  Information theory and its relationship to probability. Selected lecture notes available: (part 1) (part 2) (figures 1 and 2)

Reading:

October 20.  Hidden Markov Models for sequence analysis.  The forward algorithm, Viterbi algorithm, and forward-backward (E-M) algorithm. Selected lecture notes available: (figure 3) (figure 5) (figure 6)

Reading:

October 27. Applications of HMMs: profile HMMs (HMMer, PFAM), gene finding.  Multiple sequence alignment and ortholog management.  Biological background on the structure of genes: exons, introns, and splicing. Description of the gene finding problem for prokaryotes and eukaryotes.

Reading:

November 3. Computational gene finding in prokaryotes.  Frameshift analysis, database search, identification of ribosome binding sites, terminators, and operon structure.
Get the fifth homework assignment here.
Assignment 4 due by midnight today.

Reading:

November 10.  Phylogenetic analysis.  Guest lecturer: Jonathan Eisen, Ph.D., The Institute for Genomic Research.

Reading:

November 17.  Gene finding in eukaryotes.  HMMs, Markov chains, neural nets, and decision trees for gene finding.
Assignment 5 due by midnight today.
Get the sixth homework assignment here.

Reading:

December 1. Protein structure prediction.  Secondary structure prediction methods.  Signal peptide recognition.  Introduction to structure threading.

Reading:

December 8. Genome databases and annotation.  Representations of sequence data and functional information.  Review for final exam.
 

December 15.  Final exam.

Assignments and grading

The grade will be based on problem sets, programming assignments, and a final exam. There will be six assignments, which count for a total of 70% of the grade. The final exam accounts for the remaining 30%.  Students may opt to do a final project instead of the exam.  Just for reference, here are the 1996 Homework assignments (the 1999 assignments will be completely different).

Web links

1996 course syllabus
Home Page for Computational Biology at Hopkins
Computer Science at Hopkins