Homology search, the task of finding similar regions between two sequences, is the most fundamental and popular task in bioinformatics. Traditional homology search tools are either too slow or too insensitive, and when they do return something, the results are merely non-specific fragments of alignments. We introduce new ideas, including a new mathematical theory of optimized spaced seeds, that allow modern homology search to achieve high sensitivity, high specificity, and high speed simultaneously. The spaced seed methodology is implemented in our PatternHunter software, as well as in most other modern homology search software, serving thousands of queries daily. We also introduce ZOOM, which maps short reads at 5x coverage of a human genome in a CPU-day.
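As a rough illustration of the spaced seed idea, the sketch below indexes a target sequence by the characters at the "1" positions of a seed pattern and reports query positions whose characters agree at those positions; "0" positions are free to mismatch. The function name `spaced_seed_hits` and its parameters are hypothetical, chosen only for this example. The default pattern is the weight-11, length-18 seed published with PatternHunter. This is a minimal sketch of seed matching only, not the PatternHunter or ZOOM implementation, and it omits the extension and scoring steps a real tool would perform.

```python
from collections import defaultdict


def spaced_seed_hits(query, target, seed="111010010100110111"):
    """Return (query_pos, target_pos) pairs that match at every '1' of the seed.

    Illustrative helper only; name and signature are assumptions.
    """
    # Offsets within the seed window that must match ("care" positions).
    care = [k for k, c in enumerate(seed) if c == "1"]
    span = len(seed)

    # Index every window of the target by its characters at the care positions.
    index = defaultdict(list)
    for j in range(len(target) - span + 1):
        key = "".join(target[j + k] for k in care)
        index[key].append(j)

    # Look up every window of the query in that index.
    hits = []
    for i in range(len(query) - span + 1):
        key = "".join(query[i + k] for k in care)
        for j in index.get(key, ()):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    # Toy seed "1101": the third position is a "don't care", so the single
    # mismatch G/T there does not prevent a hit.
    print(spaced_seed_hits("ACGT", "ACTT", seed="1101"))
    # A contiguous seed of the same span requires all four positions to match.
    print(spaced_seed_hits("ACGT", "ACTT", seed="1111"))
```

In the toy run, the spaced pattern reports a hit across the mismatch while the contiguous pattern of the same span does not, which is the intuition behind the sensitivity gain; choosing which pattern is optimal is the subject of the theory mentioned above.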
Joint work with Bin Ma, John Tromp, X.F. Cui, B. Brejova, T. Vinar, and D. Shasha.
Ming Li is a Canada Research Chair in Bioinformatics and a professor of Computer Science at the University of Waterloo. He is a fellow of the Royal Society of Canada, ACM, and IEEE. He received Canada’s E.W.R. Steacie Fellowship Award in 1996 and the Killam Fellowship in 2001. Together with Paul Vitanyi, he pioneered the applications of Kolmogorov complexity and co-authored the book “An Introduction to Kolmogorov Complexity and Its Applications”. His main research focus recently has been protein structure prediction.