I am an Assistant Professor at Johns Hopkins University CS, where I lead the Order Lab. My research
spans systems broadly, including operating systems and distributed systems. I am
particularly interested in designing principled techniques to enable reliable,
efficient, and defensible systems from large data centers to small mobile
devices. I will join the University of Michigan CSE
as an associate professor in January 2023!
I will be recruiting PhD
students to join my lab at U-M. Prospective students should apply through the U-M admission
My lab also has openings for postdocs, graduate and undergraduate
interns. I’m looking for students who are self-motivated and have
strong interests in systems building and research. Prospective students please read
Gave a talk at Strange Loop on distributed systems runtime checking
Orbit is accepted to OSDI '22. Congrats Yuzhuo!
Oathkeeper is accepted to OSDI '22. Congrats Chang, Yuzhuo!
RESIN is accepted to OSDI '22. Congrats Chang!
Awarded an NSF SMALL grant on distributed system fault injection
Gave a keynote talk at HotDC 2021
Received a Facebook Research Award on performance diagnosis.
Argus received the best paper award at ATC '21!
Argus is accepted to appear at USENIX ATC '21. Congrats Lingmei!
) is accepted to appear at EuroSys '21. Congrats Brian!
Teaching a new course CS 624: Reliable Software Systems in the Spring
Co-organizing (with Rebecca Isaacs) a new workshop, HAOC, on availability and observability at EuroSys '21.
A tentative CFP is out; send your work!
Gave a short talk to PhD students on effectively working with advisors.
) is accepted to appear at OSDI '20. Congrats Yigong, Gongqi!
) is accepted to appear at OSDI '20.
Awarded an NSF CAREER award on gray-failure-tolerant cloud!
OmegaGen received the best paper award at NSDI '20!
) is accepted to appear at NSDI '20. Congrats Chang!
) is accepted to appear at NSDI '20.
Received the Professor Joel Dean Excellence in Teaching Award. Grateful to the students in my OS class.
LeaseOS received the best paper award at ASPLOS '19!
Brian received the NSF Graduate Research Fellowship. Congrats Brian!
Our position paper on watchdog abstraction is accepted to appear at HotOS XVII. Congrats Chang!
The lightning talk video and paper preprint for the LeaseOS project are released
Technical Briefing on AIOps accepted to ICSE
Yigong will intern at Microsoft in the summer
URSA is accepted to appear at EuroSys '19
LeaseOS is accepted to appear at ASPLOS '19. Congrats Yigong, Suyi!
Coppelia (MICRO 51) is nominated as a best paper candidate!
A major focus of my recent research is to push for higher availability and
observability of next-generation cloud systems. This includes a series of
projects in multiple thrusts:
- Understanding of failures beyond the fail-stop model
  - Gray failure: We advocate the importance of the gray failure problem
    in cloud systems and discuss its differential observability traits.
  - Partial failure: We study and analyze real-world
    partial failures in popular distributed systems.
- Principled detection and localization of complex failures
  - Panorama: We design a solution to capture and enhance
    inherent observability in cloud systems for the detection of gray failures.
  - Watchdog: We propose the intrinsic watchdog abstraction
    for comprehensive runtime checking in system software.
  - OmegaGen: We design a program analysis and
    instrumentation tool to generate custom watchdogs to localize partial failures. (Best Paper Award)
- Data-driven approach to transform traditional reliability activities
  - Narya: a holistic system to predict failures and adaptively mitigate them through online experimentation.
  - Gandalf: an analytics service for safe deployments in the cloud.
  - AIOps: a short position paper on the real-world challenges and research
    opportunities in AIOps.
I also work on energy-efficient mobile systems (e.g., LeaseOS, DefDroid,
eDoctor) and preventing system misconfigurations (e.g., Violet,
(Full publication list)
I am very fortunate to work with the following people:
- PhD students
- Undergraduate students
- Gongqi Huang (BS/MS → Princeton PhD)
- Ding Ding (Visiting intern → NYU PhD)
- Varun Radhakrishnan (BS → Amazon)
- Suyi Liu (BS → Amazon → Netflix)
- Xu Meng (BS → Amazon → Cruise)
- Zach Silver (BS → Bloomberg)
- Parv Saxena (MS → Microsoft)
- Justin Shafer (MS → Westpoint)
- Ziyan Wang (MS → Xmind)
- Shreyas Aiyar (MS → Otter.ai)
- Program Committee:
  - OSDI '23, SOSP '23
  - OSDI '21, ASPLOS '22, HAOC '21 (co-chair), APSys '21
  - OSDI '20, NSDI '21, APSys '20, RTAS '20, ICDCS '20
  - SOSP '19, HotOS '19, APSys '19, ASPLOS SRC
  - USENIX ATC '18
  - USENIX ATC '17, SOSP SRC, SIGCOMM HotConNet
  - MobiSys PhD forum
- Journal Reviewer: TPDS 2016, SCICO 2019, TOS 2020
- Shadow PC: EuroSys 2017
- Assistant for PC chair: ASPLOS 2016
- 601.318/418/618 Principles of Operating Systems
- 601.624 Reliable Software Systems
- 601.718 Advanced Operating Systems
- 601.817 Selected Topics in Systems Research
I received my Ph.D. from UCSD, advised by
Prof. Yuanyuan Zhou. Before joining Hopkins,
I took one year off at MSR Redmond Systems Group
to gain exposure to real-world system challenges in a state-of-the-art cloud service, Microsoft
Azure. I received a B.S. (Computer Science) and a B.A. (Economics) from Peking University.
Plug: if you are organizing virtual conferences during COVID-19, please
consider the Whova virtual conference platform (resources)
founded by my advisor YY.
Note: Ryan is my English name. For legal documents and publications, Peng Huang is used.