Variable Tree Algorithms and the Efficient Discovery of Spatial Associations and Structure

Jeremy Kubica

Finding sets of points that conform to an underlying spatial model is a conceptual simple, but potentially expensive, task. In this talk I consider the computational issues inherent in extracting structure from large, noisy data sets. I discuss how a trade-off exists between traditional search algorithms based on spatial data structures and more recent multiple tree algorithms, leaving both approaches with potential drawbacks. I describe a new type of tree-based search algorithm that uses a dynamic, variable number of tree nodes to adapt to both structure in the data and search state itself. I show that this new approach addresses potential drawbacks in both the constructive and multiple tree approaches. Further, I show that this algorithm leads to significant benefits on a variety of problem domains. While the problem of finding spatial structure arises in a wide range of domains, the primary motivating problem throughout this talk is the task of efficiently linking faint asteroid detections. Here the goal is to link together individual point observations in order to find and extract new asteroid trajectories in the data. Future astronomical surveys will provide a wealth of observational data to this end, greatly increasing both the ability to find new objects and the scale of the problem. Thus efficient algorithms are vital to effectively handle the massive amount of data that is expected.