When: May 12, 2026 @ 10:30 AM
Where: 228 Malone Hall
Categories: Computer Science Seminar Series

Refreshments are available starting at 10:30 a.m. The seminar will begin at 10:45 a.m.

Abstract

Genome assembly is the process of reconstructing a complete sequence from the relatively short-range data generated by sequencing instruments, much like solving a giant jigsaw puzzle with billions of pieces. While short-read sequencing made draft assemblies routine, repetitive regions fragmented the results and prevented complete reconstruction. Sergey Koren’s research has focused on advancing genome assembly from fragmented drafts to complete, haplotype-resolved, telomere-to-telomere genomes using noisy long-read sequences. Existing tools for identifying similarity between these sequences were slow; Koren’s team applied MinHash to dramatically accelerate similarity detection while preserving sensitivity, enabling efficient assembly of mammalian-scale genomes. To resolve complex repeats where the assembly problem is under-constrained, they first estimated edge multiplicities and transformed the graph to account for repeats. They then used a heuristic algorithm to identify the highest-scoring Eulerian path. These innovations contributed directly to large-scale efforts to produce complete genome assemblies, culminating in the Telomere-to-Telomere consortium and the first truly complete human genome sequence. Across these projects, Koren’s work has integrated algorithm adaptation, sequencing technology advances, and collaborative genome science.
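
For readers unfamiliar with MinHash, the idea mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the implementation used by Koren's team: the function names (`kmers`, `hash64`, `sketch`, `estimate_jaccard`), the k-mer length, and the sketch size are all assumptions chosen for the example. Each sequence is reduced to the smallest hash values of its k-mers, and the overlap between two such sketches estimates the Jaccard similarity of the full k-mer sets without comparing the sequences directly.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer):
    """Deterministic 64-bit hash of a k-mer (hypothetical choice of hash)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=5, size=64):
    """Bottom-`size` MinHash sketch: the smallest distinct k-mer hashes."""
    return sorted({hash64(km) for km in kmers(seq, k)})[:size]


def estimate_jaccard(sketch_a, sketch_b, size=64):
    """Estimate Jaccard similarity from two bottom-`size` sketches."""
    # Smallest hashes of the combined sketches approximate a random
    # sample of the union of the two k-mer sets.
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    # Fraction of that sample present in both sets.
    return sum(1 for h in merged if h in shared) / len(merged)


if __name__ == "__main__":
    read_a = "ACGTTGCATGCCGATAGCTAGGCTAACGT"
    read_b = "TTGCATGCCGATAGCTAGGCTAACGTAGC"
    print(estimate_jaccard(sketch(read_a), sketch(read_b)))
```

Because each sketch has a fixed size regardless of sequence length, comparing two reads costs roughly the same whether they are hundreds or tens of thousands of bases long, which is where the speedup for long-read overlap detection comes from.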

Speaker Biography

Sergey Koren received his PhD in computer science in 2012 under the supervision of Mihai Pop at the University of Maryland. He joined the National Bioforensics Analysis Center in 2011 and was appointed as an associate principal investigator in 2014. During this time, he pioneered the use of single-molecule sequencing for the reconstruction of complete genomes. In 2015, Koren joined the National Human Genome Research Institute as a founding member of the Genome Informatics Section. His research focuses on the efficient analysis of large-scale genomic datasets and on new methods for the analysis and assembly of high-noise single-molecule sequencing data. He is a key contributor to the Telomere-to-Telomere Consortium, the Human Pangenome Reference Consortium, and the Vertebrate Genomes Project, all collaborative efforts aimed at generating complete, gapless references for vertebrate genomes. Koren is a recipient of the National Institutes of Health Director’s Award and is a Samuel J. Heyman Service to America Medal Finalist.

Zoom link »