Analyzing the Privacy of Anonymized Network Data

Scott Coull

The scientific method is founded on the principles of validation and comparison of experimental results. For network and information security research, this means that researchers must have access to freely available network data to validate and improve on published results. Releasing network data, however, can be problematic for the data publisher because it may contain sensitive information about the network and its users. Recently, network data anonymization methods have been proposed in an effort to provide generally useful network data to researchers while removing sensitive information. Unfortunately, these state-of-the-art anonymization systems rely on intuition and expert knowledge, rather than rigorous privacy guarantees, to prevent the leakage of sensitive information. As a result, unforeseen and dangerous channels of information leakage may exist despite the use of these anonymization techniques. In this talk, we focus on developing more rigorous foundations for the analysis of anonymization methods and the privacy they provide to data publishers. To do so, we begin by highlighting several areas of information leakage within anonymized network data, discovered through the application of novel data mining and machine learning techniques. Specifically, we are able to reveal the security posture of the networks, identify hosts and the services they offer, and even infer user behaviors (e.g., web browsing activity). From these attacks, we derive an adversarial model and analytic framework that captures the risk posed by these and other inference attacks on anonymized network data. This analysis framework provides the data publisher with a method for quantifying the privacy risks of anonymized network data. Furthermore, we discuss the parallels between our proposed security notions and those of microdata anonymization in an effort to bridge the gap between the two fields.

Speaker Biography

Scott Coull graduated from Rensselaer Polytechnic Institute with a B.Sc. in Computer Science in December 2003 and an M.Sc. in Computer Science in May 2005. He is the recipient of several awards, including the Best Student Paper Award at the Annual Computer Security Applications Conference and the Stanley I. Landgraf Prize for Outstanding Academic Achievement. His research focuses on computer security and applied cryptography, with a particular emphasis on the use of machine learning techniques in identifying and mitigating threats to privacy and security.