Genetic Mapping

Daniel Brown, Cornell University

Genetic mapping experiments discover the location of many sites, or “markers,” on the genome of a species. Some of these experiments are extremely expensive, computationally intensive, and time-consuming, involving millions of dollars, tens of thousands of markers, and years of experimental time.

We propose two changes to these experiments: first, to perform much of the experimentation on only a sample of the experimental population, and second, to use a different method for determining the location of new markers. For the first change, we model the problem of selecting a good mapping sample as a discrete optimization problem based on the k-center problem. We discuss several heuristic methods for choosing good samples, including linear programming with randomized rounding and simpler greedy methods, and show results for several existing experimental populations.
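To make the k-center formulation concrete, here is a minimal sketch of one well-known greedy heuristic, farthest-first traversal. It is an illustration only and is not claimed to be the greedy method used in this work. The function name `greedy_k_center`, the one-dimensional example points, and the absolute-difference distance are all assumptions made for the example; a real mapping population would need a distance defined on its own data.

```python
def greedy_k_center(points, k, dist):
    """Farthest-first traversal (a standard greedy k-center heuristic).

    Returns (indices of chosen centers, covering radius), where the
    radius is the largest distance from any point to its nearest center.
    Names and inputs here are hypothetical, chosen for illustration.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centers = [0]  # arbitrary starting center: the first point
    # nearest[i] = distance from point i to its closest chosen center
    nearest = [dist(points[0], p) for p in points]
    while len(centers) < k:
        # Pick the point that is currently worst covered.
        far = max(range(len(points)), key=nearest.__getitem__)
        centers.append(far)
        # Update each point's distance to its closest center.
        nearest = [min(d, dist(points[far], p))
                   for d, p in zip(nearest, points)]
    return centers, max(nearest)


# Hypothetical 1-D positions, purely for demonstration.
points = [0, 1, 2, 10, 11, 20]
centers, radius = greedy_k_center(points, 3, lambda a, b: abs(a - b))
```

With these example inputs, the heuristic spreads the three chosen centers across the three clusters of points, so every point lies within a small distance of some center.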

For the second change, we model the placement of new markers as the problem of assigning them to “bins” of the genome, which are delimited by biological breakpoints. This approach differs markedly from existing methods, which instead try to determine the correct order of all the new markers from among all possible permutations. Our approach is very accurate and is both computationally and experimentally feasible. Preliminary results demonstrate that our methods are of high quality and highly compatible with the sample selection approaches we present.
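The contrast with permutation search can be sketched in a few lines. The example below assumes, purely for illustration, that each bin is summarized by one genotype per sampled individual (a "signature"), and that a new marker is placed in the bin whose signature disagrees with the marker's scores in the fewest individuals. The function `place_marker`, the `A`/`B` genotype coding, and the mismatch-count rule are assumptions for this sketch, not the placement procedure of this work.

```python
def place_marker(marker, bin_signatures):
    """Assign a marker to the best-matching bin (illustrative sketch).

    marker: one genotype per sampled individual; None means not scored.
    bin_signatures: one genotype list per bin, in genome order.
    Returns (index of best bin, number of mismatches for that bin).
    """
    def mismatches(signature):
        # Count scored individuals whose genotype disagrees with the bin.
        return sum(1 for m, s in zip(marker, signature)
                   if m is not None and m != s)

    # Ties go to the first (leftmost) bin with the minimum count.
    best = min(range(len(bin_signatures)),
               key=lambda i: mismatches(bin_signatures[i]))
    return best, mismatches(bin_signatures[best])


# Hypothetical data: 3 bins scored on 4 individuals. Adjacent bins
# differ in exactly one individual, i.e. one breakpoint between them.
bins = [list("AAAA"), list("AABA"), list("BABA")]
new_marker = ["A", "A", "B", None]  # fourth individual not scored
bin_index, errors = place_marker(new_marker, bins)
```

The work here is one comparison per bin for each marker, rather than a search over orderings of all new markers, which is the feasibility point made above.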

Our two changes offer the possibility of more accurate, less expensive, and faster genetic mapping experiments.

Joint work with Todd Vision, David Shmoys, Steve Tanksley, and Rick Durrett.