I am an Assistant Professor of Computer Science at Johns
Hopkins University, where I lead the Order Lab. My research
spans systems broadly, including operating systems and distributed systems. I am particularly
interested in building reliable, efficient, and defensible systems
for data centers, mobile phones, and IoT devices.
I’m looking for motivated graduate students and undergraduate interns who are
interested in systems building and research. Prospective
students, please read this.
News
Dec. 2021
Awarded an NSF SMALL grant on distributed system fault injection
Dec. 2021
Gave a keynote talk at HotDC 2021
Aug. 2021
Received a Facebook Research Award on performance diagnosis.
Jul. 2021
Argus received the best paper award at ATC '21!
Apr. 2021
Argus is accepted to appear at USENIX ATC '21. Congrats Lingmei!
Mar. 2021
Arthas (paper) is accepted to appear at EuroSys '21. Congrats Brian!
Jan. 2021
Teaching a new course, CS 624: Reliable Software Systems, in the spring
Dec. 2020
Co-organizing (with Rebecca Isaacs) a new workshop, HAOC, on availability and observability at EuroSys '21.
A tentative CFP is out; send us your work!
Dec. 2020
Gave a short talk to PhD students on working effectively with advisors.
Aug. 2020
Violet (paper) is accepted to appear at OSDI '20. Congrats Yigong, Gongqi!
Aug. 2020
Narya (preprint) is accepted to appear at OSDI '20.
June 2020
Awarded an NSF CAREER award on gray-failure-tolerant cloud!
Feb. 2020
OmegaGen received the best paper award at NSDI '20!
Dec. 2019
OmegaGen (preprint) is accepted to appear at NSDI '20. Congrats Chang!
June 2019
Gandalf (preprint) is accepted to appear at NSDI '20.
May 2019
Received the Professor Joel Dean Excellence in Teaching Award. Grateful to the students in my OS class.
Apr. 2019
LeaseOS received the best paper award at ASPLOS '19!
Apr. 2019
Brian received the NSF Graduate Research Fellowship. Congrats Brian!
Mar. 2019
Our position paper on the watchdog abstraction is accepted to appear at HotOS XVII. Congrats Chang!
Mar. 2019
Lightning talk video and paper preprint for the LeaseOS project are released
Feb. 2019
Technical Briefing on AIOps accepted to ICSE
Feb. 2019
Yigong will intern at Microsoft in the summer
Dec. 2018
URSA is accepted to appear at EuroSys '19
Nov. 2018
LeaseOS is accepted to appear at ASPLOS '19. Congrats Yigong, Suyi!
Aug. 2018
Coppelia (MICRO 51) is nominated as a best paper candidate!
Research
A major focus of my recent research is to push for higher availability and
observability of next-generation cloud systems. This includes a series of
projects in multiple thrusts:
- Understanding failures beyond the fail-stop model
  - Gray failure: We argue for the importance of the gray failure problem
    in cloud systems and discuss its differential observability traits.
  - Partial failure: We study and analyze real-world
    partial failures in popular distributed systems.
- Principled detection and localization of complex failures
  - Panorama: We design a solution to capture and enhance
    inherent observability in cloud systems for the detection of gray failures.
  - Watchdog: We propose the intrinsic watchdog abstraction
    for comprehensive runtime checking in system software.
  - OmegaGen: We design a program analysis and
    instrumentation tool to generate custom watchdogs to localize partial failures. (Best Paper Award)
- Data-driven approaches to transform traditional reliability activities
  - Narya: a holistic system to predict failures and adaptively mitigate them through online experimentation.
  - Gandalf: an analytics service for safe deployments in the cloud.
  - AIOps: a short position paper on the real-world challenges and research
    opportunities in AIOps.
I also work on energy-efficient mobile systems (e.g., LeaseOS, DefDroid,
eDoctor) and on preventing system misconfigurations (e.g., Violet,
ConfValley).
Recent Publications
(Full publication list)
- Argus: Debugging Performance Issues in Modern Desktop Applications with Annotated Causal Tracing [Best Paper Award]
Lingmei Weng, Peng Huang, Jason Nieh, Junfeng Yang
USENIX ATC 2021
[BibTeX]
[Slides]
[Software]
- Understanding and Dealing with Hard Faults in Persistent Memory Systems
Brian Choi, Randal Burns, Peng Huang
EuroSys 2021
[BibTeX]
[Slides]
[Software]
[TechReport]
- Automated Reasoning and Detection of Specious Configuration in Large Systems with Symbolic Execution
Yigong Hu, Gongqi Huang, Peng Huang
OSDI 2020
[BibTeX]
[Slides]
[Software]
[TechReport]
- Predictive and Adaptive Failure Mitigation to Avert Production Cloud VM Interruptions
Sebastien Levy, Randolph Yao, Youjiang Wu, Yingnong Dang, Peng Huang, Zheng Mu, Pu Zhao, Tarun Ramani, Naga Govindaraju, Xukun Li, Qingwei Lin, Gil Lapid Shafriri, Murali Chintalapati
OSDI 2020
[BibTeX]
[TechReport]
- Understanding, Detecting and Localizing Partial Failures in Large System Software [Best Paper Award]
Chang Lou, Peng Huang, Scott Smith
NSDI 2020
[BibTeX]
[Slides]
- Gandalf: An Intelligent, End-To-End Analytics Service for Safe Deployment in Large-Scale Cloud Infrastructure
Ze Li, Qian Cheng, Ken Hsieh, Yingnong Dang, Peng Huang, Pankaj Singh, Xinsheng Yang, Qingwei Lin, Youjiang Wu, Sebastien Levy, Murali Chintalapati
NSDI 2020
[BibTeX]
- Comprehensive and Efficient Runtime Checking in System Software through Watchdogs
Chang Lou, Peng Huang, Scott Smith
HotOS 2019
[BibTeX]
[Slides]
- URSA: Hybrid Block Storage for Cloud-Scale Virtual Disks
Huiba Li, Yiming Zhang, Dongsheng Li, Zhiming Zhang, Shengyun Liu, Peng Huang, Zheng Qin, Kai Chen, Yongqiang Xiong
EuroSys 2019
[BibTeX]
- A Case for Lease-Based, Utilitarian Resource Management on Mobile Devices [Best Paper Award]
Yigong Hu, Suyi Liu, Peng Huang
ASPLOS 2019
[BibTeX]
[Slides]
[Software]
- Capturing and Enhancing In Situ System Observability for Failure Detection
Peng Huang, Chuanxiong Guo, Jacob R. Lorch, Lidong Zhou, Yingnong Dang
OSDI 2018
[BibTeX]
[Slides]
[Software]
- End-to-End Automated Exploit Generation for Validating the Security of Processor Designs [Best Paper Candidate]
Rui Zhang, Calvin Deutschbein, Peng Huang, Cynthia Sturton
MICRO 2018
[BibTeX]
- TerseCades: Efficient Data Compression in Stream Processing
Gennady Pekhimenko, Chuanxiong Guo, Myeongjae Jeon, Peng Huang, Lidong Zhou
USENIX ATC 2018
[BibTeX]
- Gray Failure: The Achilles’ Heel of Cloud-Scale Systems
Peng Huang, Chuanxiong Guo, Lidong Zhou, Jacob R. Lorch, Yingnong Dang, Murali Chintalapati, Randolph Yao
HotOS 2017
[BibTeX]
[Slides]
- Early Detection of Configuration Errors to Reduce Failure Damage [Best Paper Award]
Tianyin Xu, Xinxin Jin, Peng Huang, Yuanyuan Zhou, Shan Lu, Long Jin, Shankar Pasupathy
OSDI 2016
[BibTeX]
- DefDroid: Towards a More Defensive Mobile OS Against Disruptive App Behavior
Peng Huang, Tianyin Xu, Xinxin Jin, Yuanyuan Zhou
MobiSys 2016
[BibTeX]
[Slides]
[Video]
[Website]
Students
I am very fortunate to work with the following people:
- PhD students
- Undergraduate students
- Alumni
  - Ding Ding (Visiting intern → NYU PhD)
  - Varun Radhakrishnan (BS → Amazon)
  - Suyi Liu (BS → Amazon → Netflix)
  - Xu Meng (BS → Amazon → Cruise)
  - Zach Silver (BS → Bloomberg)
  - Parv Saxena (MS → Microsoft)
  - Justin Shafer (MS → Westpoint)
  - Ziyan Wang (MS → Xmind)
  - Shreyas Aiyar (MS → Otter.ai)
Professional Service
- Program Committee:
  - 2021: OSDI, ASPLOS '22, HAOC '21 (co-chair), APSys
  - 2020: OSDI, NSDI, APSys, RTAS, ICDCS
  - 2019: SOSP, HotOS, APSys, ASPLOS SRC
  - 2018: USENIX ATC
  - 2017: USENIX ATC, SOSP SRC, SIGCOMM HotConNet
  - 2016: MobiSys PhD forum
- Journal Reviewer: TPDS 2016, SCICO 2019, TOS 2020
- Shadow PC: EuroSys 2017
- Assistant for PC chair: ASPLOS 2016
Teaching
- 601.318/418/618 Principles of Operating Systems
- 601.624 Reliable Software Systems
- 601.718 Advanced Operating Systems
- 601.817 Selected Topics in Systems Research
Bio
I received my Ph.D. from UCSD, advised by
Prof. Yuanyuan Zhou. Before joining Hopkins,
I spent a year in the Systems Group at MSR Redmond
to gain exposure to real-world system challenges in a state-of-the-art cloud service, Microsoft
Azure. I received a B.S. (Computer Science) and a B.A. (Economics) from Peking University.
Plug: if you are organizing virtual conferences during COVID-19, please
consider the Whova virtual conference platform (resources)
founded by my advisor YY.
Note: Ryan is my English name. For legal documents and publications, Peng Huang is used.