Who is similar to my patient: Large-scale Patient Similarity Learning for Healthcare Analytics

Jimeng Sun, IBM TJ Watson Research Center
Host: Suchi Saria

Large volumes of heterogeneous Electronic Health Record (EHR) data are becoming available at many healthcare institutions. EHR data from millions of patients serve as a huge collective memory of doctors and patients over time. How can we leverage these data to help caregivers and patients make better decisions? How can we use them efficiently to support clinical and pharmaceutical research?

My research focuses on developing large-scale algorithms and systems for healthcare analytics. First, I will describe our healthcare analytic research framework, which provides an intuitive collaboration mechanism across interdisciplinary teams and an efficient computation framework for handling heterogeneous patient data. Second, I will present a core component of this framework, patient similarity learning, which addresses the following questions:

  • How to incorporate physician feedback into the similarity computation?
  • How to integrate multiple patient similarity measures into a single consistent similarity measure?
  • How to present the similarity results and obtain user feedback in an intuitive and interactive way?
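
As a rough illustration of the first two questions, the sketch below combines two per-source similarity matrices into one measure, weighting each source by how well it agrees with a few physician-labeled patient pairs. This is only a toy example under stated assumptions, not the method presented in the talk: the feature matrices, the feedback pairs, and the function names (cosine_similarity_matrix, weights_from_feedback, combine_similarities) are all hypothetical, and the weighting heuristic is a simple stand-in for a real metric learning algorithm.

```python
import numpy as np


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between the rows (patients) of `features`."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # guard against all-zero rows
    return unit @ unit.T


def weights_from_feedback(similarities, feedback):
    """Heuristic weights (an assumption, not a published algorithm).

    Each measure is scored by the gap between its mean similarity on pairs
    labeled similar and its mean similarity on pairs labeled dissimilar.
    Negative gaps are clipped to zero and the scores are normalized.
    """
    scores = []
    for sim in similarities:
        similar = [sim[i, j] for i, j, is_similar in feedback if is_similar]
        dissimilar = [sim[i, j] for i, j, is_similar in feedback if not is_similar]
        gap = (np.mean(similar) if similar else 0.0) - (
            np.mean(dissimilar) if dissimilar else 0.0
        )
        scores.append(max(gap, 0.0))
    scores = np.array(scores)
    if scores.sum() == 0:
        # No measure agrees with the feedback: fall back to uniform weights.
        return np.full(len(similarities), 1.0 / len(similarities))
    return scores / scores.sum()


def combine_similarities(similarities, weights):
    """Weighted sum of similarity matrices defined over the same patients."""
    return sum(w * s for w, s in zip(weights, similarities))


# Hypothetical toy data: 4 patients, two feature sources.
diagnoses = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]]  # binary diagnosis flags
labs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.0, 1.0]]   # scaled lab values

similarities = [cosine_similarity_matrix(diagnoses), cosine_similarity_matrix(labs)]

# Hypothetical physician feedback: (patient i, patient j, judged similar?)
feedback = [(0, 1, True), (0, 2, False)]

weights = weights_from_feedback(similarities, feedback)
combined = combine_similarities(similarities, weights)
print("weights:", weights)
print("combined similarity:\n", combined)
```

In this toy setup the diagnosis-based measure agrees with the feedback and the lab-based measure contradicts it, so the heuristic puts all of the weight on the diagnosis-based measure. A real system would learn the metric from many more labeled pairs and would likely regularize the weights.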

I will illustrate the effectiveness of our proposed algorithms for patient similarity learning in several different healthcare scenarios. I will demonstrate an interactive visual analytic system that allows users to cluster patients and to refine the underlying patient similarity metric. Finally, I will highlight future work that I am pursuing.

Speaker Biography

Jimeng Sun is a research staff member in the Healthcare Analytic Department of the IBM TJ Watson Research Center. He leads research projects in medical informatics, especially in developing large-scale predictive and similarity analytics for healthcare applications. He has an extensive research track record in data mining, specializing in healthcare analytics, big data analytics, similarity metric learning, social network analysis, predictive modeling, and visual analytics. He has published over 70 papers and filed over 20 patents (4 granted). He received the ICDM best research paper award in 2007, the SDM best research paper award in 2007, and the KDD Dissertation runner-up award in 2008.

Sun received his B.S. and M.Phil. in Computer Science from Hong Kong University of Science and Technology in 2002 and 2003, and his PhD in Computer Science from Carnegie Mellon University in 2007, specializing in data mining on streams, graphs, and tensor data.