Traditional Approximate Pattern Matching (e.g. Hamming Distance errors, edit distance errors) assumes that various types of errors may occur in the data, but an implicit assumption is that the order of the data remains unchanged. Over the years some applications identified types of “errors” where the data remains correct but its order is compromised. The earliest example is the “swap” error motivated by a common typing error. Other widely known examples such as transpositions, reversals, and interchanges, are motivated by Biology. We propose a model that formally separates the concepts of “errors in the data” and “errors in the address” since they present different algorithmic challenges solved by different techniques. The “error in address” model, which we call asynchronous pattern matching, since the data is not assumed to arrive in a synchronous sequential manner, is rich in problems not addressed hitherto. We will consider some reasonable metrics for asynchronous pattern matching and show some efficient algorithms for these problems. As expected, the techniques needed to solve these problems are not taken from the standard pattern matching “toolkit”.
Amihood Amir received his Ph.D. in 1983 at Bar-Ilan University. He did his post-doc at MIT, and was on the faculty at Tufts, University of Maryland at College Park, and Georgia Tech, before joining Bar-Ilan University in 1994. Today he is a professor of Computer Science at Bar-Ilan University and a Research Professor at Johns Hopkins University. Amir had been head of the Bar-Ilan Responsa Project, and Chairman of the Department of Computer Science. Today he is dean of the College of Exact Sciences at Bar-Ilan University. Amihood Amir’s Ph.D. thesis was in logic of programs, particularly temporal logics. He later did some work in Complexity theory -sub-recursive classes of functions and the concept of “cheatability” in hard sets. Since the late 1980’s Amir’s research has been in Algorithms design and analysis, in particular Pattern Matching Algorithms. In the later area he has been instrumental in developing the multidimensional pattern matching area, compressed matching, and, recently, asynchronous matching. In this context he has also done work in algorithms for Computational Biology. Amir has also worked in the past in Scheduling algorithms, VLSI algorithms, and Data Mining algorithms.