Summer 2017


June 14, 2017

In this talk I detail the design and implementation of Zcash, the first completely anonymous crypto-currency, and associated work. Crypto-currencies, first introduced by Bitcoin, make use of an append-only ledger, termed a blockchain, that is maintained by an ad-hoc peer-to-peer network. With blockchains, unlike older approaches to digital cash, we need not trust any individual provider in order to trust the currency. Instead, we need only assume that a majority of the network’s computational power is honest. The downside to this approach, however, is that blockchains are completely public. Payments made with crypto-currencies can be analyzed and tracked by anyone, including businesses seeking information on their competitors or stalkers seeking personal data. This talk covers my work resolving these privacy issues.
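The append-only property described above comes from each block committing to the hash of its predecessor, so altering any past entry invalidates every later link. A minimal illustrative sketch of that idea (a toy hash chain, not Zcash's actual data structures):

```python
import hashlib
import json

def block_hash(block):
    """Deterministic SHA-256 hash of a block's contents."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

class Ledger:
    """A toy append-only ledger: each block records the hash of the
    previous block, so tampering with history breaks the chain."""
    def __init__(self):
        self.chain = [{"prev": "0" * 64, "data": "genesis"}]

    def append(self, data):
        self.chain.append({"prev": block_hash(self.chain[-1]), "data": data})

    def verify(self):
        # Every block's "prev" field must equal the hash of its predecessor.
        return all(self.chain[i]["prev"] == block_hash(self.chain[i - 1])
                   for i in range(1, len(self.chain)))

ledger = Ledger()
ledger.append("Alice pays Bob 5")
ledger.append("Bob pays Carol 2")
print(ledger.verify())   # True
ledger.chain[1]["data"] = "Alice pays Bob 500"  # tamper with history
print(ledger.verify())   # False
```

Note that this sketch only shows integrity, not the network consensus or the zero-knowledge privacy layer the talk addresses: every entry here is fully public, which is exactly the problem the abstract describes.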

Speaker Biography: Ian Miers is a Ph.D. student at Johns Hopkins University working on applied cryptography and privacy-enhancing technologies, advised by Prof. Matthew Green. His work includes anonymous crypto-currencies (Zerocoin and Zerocash), decentralized anonymous credentials, and secure messaging, including attacks on Apple iMessage. His work has been featured in The Washington Post, The New York Times, Wired, and The Economist, among others. He is a co-founder of Zcash, a company that has commercially deployed Zerocash.


July 6, 2017

Image-guided therapy is a central part of modern medicine. By incorporating medical imaging into the planning, surgical, and evaluation process, image-guided therapy has helped surgeons perform less invasive and more precise procedures. Of the most commonly used medical imaging modalities, ultrasound imaging offers a unique combination of cost-effectiveness, safety, and mobility. Advanced ultrasound-guided interventional systems will often require calibration and tracking technologies to enable all of their capabilities. Many of these technologies rely on localizing point-based fiducials to accomplish their task.

In this talk, I introduce various methods for localizing active acoustic and photoacoustic point sources. The goals of these methods are (1) to improve localization and visualization for point targets that are not easily distinguished under conventional ultrasound and (2) to track and register ultrasound sensors with the use of active point sources as non-physical fiducials or markers.

We applied these methods to three main research topics. The first is an ultrasound calibration framework that utilizes an active acoustic source as the phantom to aid in in-plane segmentation as well as out-of-plane estimation. The second is an interventional photoacoustic surgical system that utilizes the photoacoustic effect to create markers for tracking ultrasound transducers. We demonstrate variations of this idea to track a wide range of ultrasound transducers (three-dimensional, two-dimensional, bi-planar). The third is a set of interventional tool tracking methods combining the use of acoustic elements embedded onto the tool with the use of photoacoustic markers.

Speaker Biography: Alexis Cheng was raised in Vancouver, British Columbia. He received a Bachelor’s degree in Electrical and Computer Engineering from the University of British Columbia in 2011, and a Master’s degree in Computer Science from Johns Hopkins University in 2013. During his Ph.D. studies in the Department of Computer Science at Johns Hopkins University, he contributed to 6 journal articles, 19 conference publications, 8 abstracts, and 4 patents. He won the MUUSS fellowship in 2012, the best poster award at CARS in 2014, and the Professor Joel Dean Excellence in Teaching Award in 2016. His research interests include ultrasound-guided interventions, photoacoustic tracking, and surgical robotics.


July 7, 2017

High degrees-of-freedom systems that need high-performance control and haptic rendering require a scalable, real-time-capable infrastructure. The specific platform that motivates this thesis work is the da Vinci Research Kit (dVRK), an open research platform. For the system architecture, we propose a specialized IEEE-1394 (FireWire) broadcast protocol that takes advantage of broadcast and peer-to-peer transfers to minimize the number of transactions, and thus the software overhead, on the control PC, thereby enabling fast real-time control. The protocol has also been extended to Ethernet via a novel Ethernet-to-FireWire bridge protocol. The software architecture consists of a distributed hardware interface layer, a real-time component-based software framework, and integration with the Robot Operating System (ROS). The architecture is scalable to support multiple active manipulators, reconfigurable so that researchers can partition a full system into multiple independent subsystems, and extensible at all levels of control. This architecture has been applied to two semi-autonomous teleoperation applications. The first is a suturing task in Robotic Minimally Invasive Surgery (RMIS), which includes the development of virtual fixtures for the needle-passing and knot-tying sub-tasks, with a multi-user study to verify their effectiveness. The second concerns time-delayed teleoperation of a robotic arm for satellite servicing; the research contributions include a line virtual fixture with augmented reality, tests of different time-delay configurations, and a multi-user study evaluating the effectiveness of the system.

Speaker Biography: Zihan Chen received the Bachelor of Science degree in Control Science and Engineering and the Bachelor of Arts degree in English Language and Literature in 2010, and the Master of Science in Engineering degree in Mechanical Engineering from Johns Hopkins University in 2012. He enrolled in the Computer Science Ph.D. program in 2012. His research focuses on scalable, high-performance control systems and semi-autonomous teleoperation.


July 25, 2017

This thesis explores low-resource information extraction (IE), where a sufficient quantity of high-quality human annotations is unavailable for fitting statistical machine learning models. This setting is increasingly common in domains where annotations are expensive to obtain, such as biomedicine, and in rapidly changing domains where annotations quickly become out-of-date, such as social media. It is crucial to leverage as many learning signals and as much human knowledge as possible to mitigate the problem of inadequate supervision.

In this thesis, we explore two directions to help information extraction with limited supervision: (1) learning representations and knowledge from heterogeneous sources with deep neural networks and transferring that knowledge, and (2) incorporating structural knowledge into the design of the models to learn robust representations and make holistic decisions. Specifically, for the application of named entity recognition (NER), we explore transfer learning, including multi-task learning, domain adaptation, and multi-task domain adaptation, in the context of neural representation learning, to transfer knowledge learned from related tasks and domains to the problem of interest. For the applications of entity relation extraction and joint entity recognition and relation extraction, we incorporate linguistic structure and domain knowledge into the design of the models, conducting joint inference and learning to make holistic decisions, and thus yield more robust systems with less supervision.

Speaker Biography: Nanyun Peng is a PhD candidate in the Department of Computer Science, affiliated with the Center for Language and Speech Processing. She is broadly interested in Natural Language Processing, Machine Learning, and Information Extraction. Her research focuses on using deep learning and joint models for low-resource information extraction. Nanyun is the recipient of the 2016 Fred Jelinek Fellowship. She was fortunate to work with great researchers at IBM T.J. Watson Research Center and Microsoft Research Redmond in the summers of 2014 and 2016. She holds a master’s degree in Computer Science and BAs in Computational Linguistics and Economics, all from Peking University.


July 25, 2017

Human language artifacts represent a plentiful source of rich, unstructured information created by reporters, scientists, and analysts. In this thesis we provide approaches for adding structure: extracting and linking entities, events, and relationships from a collection of documents about a common topic. We pursue this linking at two levels of abstraction. At the document level we propose models for aligning the entities and events described in coherent and related discourses: these models are useful for deduplicating repeated claims, finding implicit arguments to events, and measuring semantic overlap between documents. Then at a higher level of abstraction, we construct and employ knowledge graphs containing salient entities and relations linked to supporting documents: these graphs can be augmented with facts and summaries to give users a structured understanding of the information in a large collection.

Speaker Biography: Travis Wolfe is a Ph.D. candidate in Computer Science at Johns Hopkins University advised by Mark Dredze and Benjamin Van Durme. He obtained a B.S. in Statistics and Information Systems from Carnegie Mellon University in 2011 and an M.S. in Computer Science from Johns Hopkins University in 2014. His work focuses on information extraction and machine learning for natural language processing.


July 27, 2017

Advances in machine learning and streaming systems provide a backbone to transform vast arrays of raw data into valuable information. Leveraging distributed execution, analysis engines can process this information effectively within a data exploration workflow to solve problems at unprecedented rates. However, with increased input dimensionality, a desire to simultaneously share and isolate information, and overlapping, dependent tasks, this process is becoming increasingly difficult to maintain. These flexible and scalable systems require more robust coordination of distributed execution at a lower level in order to synchronize their efforts as part of a common goal. We argue that an abstraction layer providing adaptive asynchronous control and consistency management over a series of individual tasks coordinated to achieve a global objective can significantly improve data exploration effectiveness and efficiency. We demonstrate this through serverless simulation ensemble management and multi-model machine learning with improved performance and reduced resource utilization. We focus on the specific areas of molecular dynamics and personalized healthcare; however, the contributions are applicable to a wide variety of domains.

In the first part of this talk, we present a novel approach to data exploration centered around a lattice structure which organizes data features to drive input sampling as part of the exploratory process. Integrated with a serverless framework we developed as a data driven in-situ simulation ensemble management system, we couple analysis and simulation inside an HPC cluster to improve rare event detection in molecular dynamics.

In the second part of this talk, we show how to improve machine learning with increased data representation through many, semi-isolated sub-domains. Specifically, we implement asynchronous control over multi-model training in distributed machine learning as applied to healthcare. We address the challenge of integrating both data sharing and data isolation and show how synchronization and adaptive controls are necessary to improve machine learning outcomes.

Speaker Biography: Benjamin A. Ring, Lieutenant Colonel, US Army, was commissioned as an Armor Officer in 1996 with a Bachelor’s degree in Computer Science from the U.S. Military Academy at West Point, NY. He earned his Master’s in Computer Science from Boston University in 2006 and taught at West Point from 2006 to 2009, being promoted to Assistant Professor in 2008. From 2010 to 2011, he served as Senior Systems Manager for Regional Coalition Forces East in Afghanistan, and from 2011 to 2014 he was the Academic Systems Manager for the US Army Command and General Staff College, Ft. Leavenworth, KS. A member of the Upsilon Pi Epsilon and Phi Kappa Phi Honor Societies, Lt. Colonel Ring will be assigned as Chief Operations Officer, U.S. Army Cyber Protection Brigade, Ft. Gordon, GA, starting in September.


August 3, 2017

In 2015 more than 150 million records and $400 billion were lost due to publicly reported criminal and nation-state cyberattacks in the United States alone. The failure of our existing security infrastructure motivates the need for improved technologies, and cryptography provides a powerful tool for doing this. There is a misperception that the cryptography we use today is a “solved problem” and the real security weaknesses are in software or other areas of the system. This is, in fact, not true at all, and over the past several years we have seen a number of serious vulnerabilities in the cryptographic pieces of systems, some with large consequences.

This talk will discuss three aspects of securing deployed cryptographic systems. We will first explore the evaluation of systems in the wild, using the example of how to efficiently and effectively recover user passwords submitted over TLS encrypted with RC4, with applications to many methods of web authentication as well as the popular IMAP protocol for email. We will then address my work on developing tools to design and create cryptographic systems and bridge the often large gap between theory and practice by introducing AutoGroup+, a tool that automatically translates cryptographic schemes from the mathematical setting used in the literature to that typically used in practice, giving both a secure and optimal output. We will conclude with an exploration of how to actually build real world deployable systems by discussing my work on developing decentralized anonymous credentials in order to increase the security and deployability of existing anonymous credentials systems.

Speaker Biography: Christina Garman received the Bachelor of Science in Computer Science and Engineering and the Bachelor of Arts in Mathematics from Bucknell University in 2011, and completed the Master of Engineering in Computer Science at Johns Hopkins in 2013. Her research interests focus largely on practical and applied cryptography. More specifically, her work has focused on the security of deployed cryptographic systems from all aspects, including the evaluation of real systems, improving the tools that we have to design and create them, and actually creating real, deployable systems. Some of her recent work has been on the weaknesses of RC4 in TLS, cryptographic automation, decentralized anonymous e-cash, and decentralized anonymous credentials. She is also one of the co-founders of Zcash, a startup building a cryptocurrency based on Zerocash. Her work has been featured in The Washington Post, Wired, and The Economist, and she received a 2016 ACM CCS Best Paper Award. After graduation, Christina will join the faculty at Purdue University as an Assistant Professor.


August 9, 2017

Pre-clinical irradiation systems have been developed and commercialized to minimize the gap between human clinical systems and small animal research systems. One example is the small animal radiation research platform (SARRP) developed at Johns Hopkins University. Along with the development of pre-clinical irradiation systems, the need to understand the amount of dose deposited in different tissue types and to visualize the resulting dose distribution has motivated the development of treatment planning systems (TPS). Treatment planning in radiation therapy is often performed with a forward strategy to reuse an old treatment setup. However, this forward strategy cannot produce satisfactory results for complicated anatomical situations. This has led to inverse strategies, in which defining and solving the optimization problem is mathematically challenging and computationally demanding.

This thesis provides an overview of the development of the SARRP TPS followed by a validation of the dose engine integrated into the SARRP TPS. Then, various inverse planning solutions are proposed to support complex treatment planning and delivery in small animal radiation research, which is the main topic and contribution of this thesis.

The first inverse planning contribution is an efficient two-dimensional (2D) uniform dose painting method using a motorized variable collimator (MVC), which is a compromise between the multileaf collimator (MLC) available in most human clinical systems and the size limitations of small animal radiotherapy systems. The second contribution is a fast 3D inverse planning framework that takes advantage of an existing GPU-accelerated superposition-convolution dose computation algorithm. It optimizes both beam directions and beam weights from a large number of initial beams in less than 5 minutes with a typical cone beam computed tomography (CBCT) image. The third contribution is an improvement on previous work for dose shell delivery, which includes a hollow cylinder as a planning and motion primitive. Specifically, this thesis improves the robot motion primitive for delivering a cylindrical beam, commissions the new primitive, and integrates the method into the SARRP TPS. The final contribution is the use of a statistical shape model (SSM) to enable further reductions in inverse planning time by providing a good guess for the initial beam arrangement.
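At its core, the beam-weight portion of inverse planning can be posed as a constrained least-squares problem: given a dose-influence matrix A, whose entry A[i, j] is the dose delivered to voxel i by beam j at unit weight, find non-negative beam weights w so that A w approximates the prescribed dose d. A minimal sketch of that formulation (with a random toy matrix standing in for the dose engine's output; this is an illustration of the general technique, not the SARRP TPS implementation):

```python
import numpy as np
from scipy.optimize import nnls

# Toy dose-influence matrix A: A[i, j] = dose to voxel i from beam j
# at unit weight. In a real planner this comes from a dose computation
# engine such as a GPU-accelerated superposition-convolution algorithm.
rng = np.random.default_rng(0)
n_voxels, n_beams = 50, 8
A = rng.random((n_voxels, n_beams))

# Prescribed dose d for each voxel (a uniform target here).
d = np.full(n_voxels, 2.0)

# Solve min ||A w - d||^2 subject to w >= 0, since beam weights
# (delivery times) cannot be negative.
w, residual = nnls(A, d)

print(w.shape)         # (8,) -- one weight per candidate beam
print(bool((w >= 0).all()))  # True -- the constraint is enforced
```

Beam-direction selection, as in the contributions above, can then be layered on top, for example by starting from a large pool of candidate beams and pruning those whose optimized weights fall to zero.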

Speaker Biography: Nathan (Bong Joon) Cho was raised in South Korea and received the Bachelor of Science degree from the School of Computing at Soongsil University in 2006. He came to the United States for graduate studies in 2006 and received the Master of Science in Engineering in Computer Science from Johns Hopkins University in 2008. From that point on, he worked on several MRI-guided interventions as an assistant research engineer in the Laboratory for Computational Sensing and Robotics (LCSR). In Spring 2012, he began his Ph.D. program in Computer Science at Johns Hopkins University under the guidance of Dr. Peter Kazanzides. His Ph.D. research is on the development and validation of algorithms for complex dose planning and treatment delivery solutions based on the treatment planning system (TPS) of the Small Animal Radiation Research Platform (SARRP).


August 21, 2017

I present work on using bulk, structured linguistic annotations to perform unsupervised induction of meaning for three kinds of linguistic forms: words, sentences, and documents. The primary linguistic annotation I consider throughout is frames, which encode the core linguistic, background, or societal knowledge necessary to understand abstract concepts and real-world situations. I will discuss how these bulk annotations can be used to better encode semantic expectations backed by linguistics and cognitive science within word forms, to learn large lexicalized and refined syntactic fragments, and to build scalable methods for learning high-level representations for document and discourse understanding.

Speaker Biography: Frank Ferraro is a Ph.D. candidate in Computer Science at Johns Hopkins University. His research focuses on computational event semantics, and unlabeled, structured probabilistic modeling over very large corpora. He has published basic and applied research on a number of cross-disciplinary projects, and has papers in areas such as multimodal processing and information extraction, latent-variable syntactic methods and applications, and the induction and evaluation of frames and scripts. He worked as a research intern at Microsoft Research (2015), and he was a National Science Foundation Graduate Research Fellow. He will be joining the UMBC CS department as an assistant professor.


August 29, 2017

We have arrived in an era where we face a deluge of data streaming in from countless sources and across virtually all disciplines. This holds especially true for data-intensive sciences such as astronomy, where upcoming surveys such as the LSST are expected to collect tens of terabytes per night, upwards of 100 petabytes over 10 years. The challenge is keeping up with these data rates and extracting meaningful information from them. We present a number of methods for combining and distilling vast astronomy datasets using GPUs. In particular, we focus on cross-matching catalogs containing close to 0.5 billion sources, optimally combining multi-epoch imagery, and computationally extracting color from monochrome telescope images.
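Catalog cross-matching at this scale amounts to nearest-neighbor search on the sphere: for each source in one catalog, find the source in the other catalog within a small angular radius. A minimal CPU sketch of the idea (a k-d tree over unit vectors with the angular radius converted to a chord length; toy coordinates, not the GPU implementation from the talk):

```python
import numpy as np
from scipy.spatial import cKDTree

def radec_to_unit(ra_deg, dec_deg):
    """Convert RA/Dec (degrees) to 3D unit vectors on the sphere."""
    ra, dec = np.radians(ra_deg), np.radians(dec_deg)
    return np.column_stack((np.cos(dec) * np.cos(ra),
                            np.cos(dec) * np.sin(ra),
                            np.sin(dec)))

def crossmatch(cat_a, cat_b, radius_arcsec):
    """For each source in cat_a, find the nearest cat_b source within
    radius_arcsec. Returns (indices into cat_b, matched mask)."""
    xyz_a = radec_to_unit(*cat_a)
    xyz_b = radec_to_unit(*cat_b)
    # Convert the angular radius to a chord length on the unit sphere,
    # so Euclidean k-d tree distances respect the angular cut.
    chord = 2.0 * np.sin(np.radians(radius_arcsec / 3600.0) / 2.0)
    tree = cKDTree(xyz_b)
    dist, idx = tree.query(xyz_a, distance_upper_bound=chord)
    return idx, np.isfinite(dist)

# Toy catalogs (RA, Dec in degrees): B's first source sits ~0.5 arcsec
# from A's first source; B's second source is on the far side of the sky.
cat_a = (np.array([10.0, 45.0]), np.array([20.0, -30.0]))
cat_b = (np.array([10.0 + 0.5 / 3600.0, 180.0]), np.array([20.0, 0.0]))
idx, matched = crossmatch(cat_a, cat_b, radius_arcsec=1.0)
print(matched)  # [ True False] -- only the first A source has a counterpart
```

Scaling this to half a billion sources is where GPUs enter: the same neighbor queries can be batched and parallelized, which is the subject of the talk rather than this sketch.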

Speaker Biography: Matthias A. Lee received his Bachelor of Science in Computer Science from Wentworth Institute of Technology (Boston, MA) in 2011 and completed his Master of Engineering in Computer Science at Johns Hopkins University (Baltimore, MD) in 2014. He has spent the past 7 years as a Performance Engineering Co-Op at IBM Rational and IBM Cloudant, developing software performance testing and analysis frameworks. He is also the Technical Lead for the Corrie Health Platform, which aims to reduce hospital readmissions and improve patient outcomes. His research has focused on GPU acceleration, image processing, NoSQL databases, low-power computing, and performance testing. After graduation, Matthias will join Appian as the Lead Performance Engineer.