R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.
Errata for the Durbin et al. book.
This book may also be useful but is not required:
D. Gusfield, Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997.
This is the sequence from the assignment, in case you want it in electronic form:
V,V,V,V,T,W,Z,V,Y,V,X,V,U,V,W,U,U,Y,U,U,V,V,T,V,Z,Y,Z,X,Y,T,U,U,Y,Y,Y,W,U,W,Y,T,U,X,X,Z,W,X,T,W,T,U,Z,W,Z,T,V,U,Y,Z,W,X,W,Z,T,V,V,V,V,V,X,Y,W,T,T,X,V,V,V,V,V,V,V,T,V,V,Z,U,T,V,V,V,V,V,T,W,Y,Y,T,V,V,V
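If you plan to work with the sequence programmatically, a minimal sketch like the following may save some typing. It assumes only that the symbols are single letters separated by commas, as shown above. The variable names are illustrative and not part of the assignment.

```python
from collections import Counter

# The observation sequence from the assignment, copied verbatim.
SEQUENCE = (
    "V,V,V,V,T,W,Z,V,Y,V,X,V,U,V,W,U,U,Y,U,U,V,V,T,V,Z,Y,Z,X,Y,T,"
    "U,U,Y,Y,Y,W,U,W,Y,T,U,X,X,Z,W,X,T,W,T,U,Z,W,Z,T,V,U,Y,Z,W,X,"
    "W,Z,T,V,V,V,V,V,X,Y,W,T,T,X,V,V,V,V,V,V,V,T,V,V,Z,U,T,V,V,V,"
    "V,V,T,W,Y,Y,T,V,V,V"
)

# Split on commas to get one symbol per list element.
symbols = SEQUENCE.split(",")

# Tally how often each symbol occurs.
counts = Counter(symbols)

print("length:", len(symbols))
for symbol, n in sorted(counts.items()):
    print(symbol, n)
```

The resulting `symbols` list can be passed to whatever model code you write for the assignment.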
The executable for problem 1 is here:
Cygwin (cygwin1.dll needs to be in the same directory as the executable if you are not already running Cygwin.)
Contact us if you have problems running it.
You will also need either hw4.linux.tar.gz or hw4.sunos.tar.gz and the data in hw4.data.tar.gz.
You will need dataset A. You will also need dataset B; to get that, come to class on November 1 with your partner (or wait until after that class and email us begging forgiveness!).
Updated pbw and pvit (not completely tested, but they appear to work). These might run faster than the old versions: tools.v2.linux.tar.gz, tools.v2.sun.tar.gz