When: Apr 30 2026 @ 1:00 PM
Where: 228 Malone Hall
2026 CS Senior Thesis Presentations

Please RSVP here.

Zoom link »

This year’s theses include:

Combinatorics on Counterpoint

By Alex Ma, advised by Mike Dinitz.

This work deals with questions of counting and construction in music theory. We open with problems of harmony, melody, and form before describing a central contribution for first species counterpoint. In first species counterpoint, one is presented with a challenge melody (cantus firmus) and must respond with a valid countermelody (counterpoint) that adheres to certain rules. Many approaches to counterpoint generation have been studied. However, each of them has at least one of the following limitations: (a) it is fast but inexact, (b) it is exact but takes exponential time, (c) it does not account for variation in rulesets between authors, or (d) it cannot efficiently count the number of solution melodies. We resolve these issues by demonstrating a ruleset-agnostic encoding of first species counterpoint as a regular language. This encoding has several useful properties. Given a length-n cantus firmus, one can count the number of valid counterpoints in O(n) time. Given a ruleset for cantus firmi generation, one can count the number of valid cantus-counterpoint pairs of length n in O(log n) time. On the generative side, it also suggests a cost-function-agnostic method for finding an optimal response in O(n^2) time. We close with suggestions for applying and generalizing these results.
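To make the linear-time counting claim concrete, here is a minimal sketch of how such a count could work when every rule only looks at adjacent note pairs. It is not the thesis's encoding: the ruleset (`CONSONANT`, `PERFECT`, `MAX_LEAP`), the MIDI-style integer pitches, and all function names are assumptions chosen for illustration. The idea is to track, for each possible last counterpoint pitch, how many valid prefixes end there; with a fixed pitch range, each cantus note costs constant work, so the total is O(n).

```python
from typing import Dict, Iterable, Sequence

# Toy ruleset (an assumption for illustration, not the rules used in the thesis).
CONSONANT = {0, 3, 4, 7, 8, 9}  # allowed vertical intervals, in semitones mod 12
PERFECT = {0, 7}                # unison/octave and perfect fifth
MAX_LEAP = 5                    # largest melodic move allowed in the counterpoint


def vertical_ok(cf: int, cp: int) -> bool:
    """The counterpoint note sits on or above the cantus note at a consonance."""
    return cp >= cf and (cp - cf) % 12 in CONSONANT


def step_ok(cf_prev: int, cp_prev: int, cf: int, cp: int) -> bool:
    """Rule on two adjacent positions: bounded leap, no repeated perfect interval."""
    if abs(cp - cp_prev) > MAX_LEAP:
        return False
    prev_iv, iv = (cp_prev - cf_prev) % 12, (cp - cf) % 12
    return not (prev_iv == iv and iv in PERFECT)


def count_counterpoints(cantus: Sequence[int], pitches: Iterable[int]) -> int:
    """Count valid counterpoints against `cantus`, drawing notes from `pitches`."""
    pitches = list(pitches)
    if not cantus:
        return 0
    # counts[p] = number of valid counterpoint prefixes that end on pitch p.
    counts: Dict[int, int] = {p: 1 for p in pitches if vertical_ok(cantus[0], p)}
    for cf_prev, cf in zip(cantus, cantus[1:]):
        nxt: Dict[int, int] = {}
        for p in pitches:
            if vertical_ok(cf, p):
                nxt[p] = sum(c for q, c in counts.items()
                             if step_ok(cf_prev, q, cf, p))
        counts = nxt
    return sum(counts.values())


if __name__ == "__main__":
    cantus = [60, 62, 64, 62, 60]  # a hypothetical five-note cantus firmus
    print(count_counterpoints(cantus, range(60, 73)))
```

Each step is a fixed-size transition over the pitch alphabet, which is the same structure a finite automaton for a regular language provides. The abstract does not say how the O(log n) pair count is obtained; one standard route for counting fixed-length words in a regular language is repeated squaring of a transition matrix, but whether the thesis uses it is an assumption.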

SAW: Toward a Surgical Action World Model via Controllable and Scalable Video Generation

By Sampath Rampuri, advised by Mathias Unberath.

A surgical world model capable of generating realistic surgical action videos with precise control over tool-tissue interactions can address fundamental challenges in surgical AI and simulation, from data scarcity and rare event synthesis to bridging the sim-to-real gap for surgical automation. However, current video generation methods, the very core of such surgical world models, require expensive annotations or complex structured intermediates as conditioning signals at inference, limiting their scalability. Other approaches exhibit limited temporal consistency across complex laparoscopic scenes and do not possess sufficient realism. We propose Surgical Action World (SAW), a step toward surgical action world modeling through video diffusion conditioned on four lightweight signals: language prompts encoding tool-action context, a reference surgical scene, a tissue affordance mask, and 2D tool-tip trajectories. We design a conditional video diffusion approach that reformulates video-to-video diffusion into trajectory-conditioned surgical action synthesis. The backbone diffusion model is fine-tuned on a custom-curated dataset of 12,044 laparoscopic clips with lightweight spatiotemporal conditioning signals, leveraging a depth consistency loss to enforce geometric plausibility without requiring depth at inference. SAW achieves state-of-the-art temporal consistency (CD-FVD: 199.19 vs. 546.82) and strong visual quality on held-out test data. Furthermore, we demonstrate its downstream utility for (a) surgical AI, where augmenting rare actions with SAW-generated videos improves action recognition (clipping F1-score: 20.93% to 43.14%; cutting: 0.00% to 8.33%) on real test data, and (b) surgical simulation, where rendering tool-tissue interaction videos from simulator-derived trajectories points toward a visually faithful simulation engine.

Vascular Atlas based on Neural Template Aligned Graph Encodings

By Edmund Sumpena, advised by Craig Jones.

Numerous systemic and ocular diseases affect the vasculature in the retina, a piece of neural tissue at the back of the eye that is essential for human vision and accessible through non-invasive fundus imaging. In neuroscience, templates of the healthy brain, known as atlases, have been constructed for standardization, anatomical mapping, disease identification, and a variety of other applications. Despite the importance of retinal vascular architecture in diagnosing and monitoring disease, no comprehensive spatial atlas of retinal vasculature currently exists due to high intersubject variability, making standardization across a population exceptionally challenging. To address this gap, we propose VANTAGE (Vascular Atlas based on Neural Template Aligned Graph Encodings), an end-to-end deep learning framework for constructing the first global vascular atlas of the superficial retina, presented as an initial proof of concept. Inspired by the success of AlphaFold, VANTAGE leverages known healthy vasculature templates to generate graph encodings of the vasculature and constructs an atlas by minimizing the deformation energy to each sample using a spectrally regularized multilayer perceptron. Experimental results demonstrate that the VANTAGE atlas identifies subjects with diabetic retinopathy, a disease that commonly disrupts retinal vasculature, significantly above chance by detecting deviations from the healthy atlas. We further demonstrate population-level standardization and generalizability of the atlas by transferring anatomical labels of the four major vascular arcades onto arbitrary subjects with moderate accuracy. Additional validation confirms that the atlas is spatially centered and faithfully reproduces the vessel tortuosity, radius, and length distributions of the training dataset used in its construction.