Research


See Google Scholar or SemanticScholar for an up-to-date list of publications.

I have attached links to the papers, code, and data where applicable. If anything you want is missing, please email me at aadelucia @ jhu.edu or message me on Twitter.

Large Language Models

DeLucia, Alexandra; Wu, Shijie; Mueller, Aaron; Aguirre, Carlos; Dredze, Mark; Resnik, Philip. 2022. Bernice: A Multilingual Pre-trained Encoder for Twitter. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 6191–6205, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics. [PDF] [Code] [Model] [Data] [Poster] [Talk]

Portillo Wightman, Gwenyth; DeLucia, Alexandra; Dredze, Mark. 2023. Strength in Numbers: Estimating Confidence of Large Language Models by Prompt Agreement. In Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing (TrustNLP 2023), pages X-X, Toronto, Canada. Association for Computational Linguistics. [PDF] [Code]


Civil Unrest on Twitter

DeLucia, Alexandra; Dredze, Mark; Buczak, Anna L. A Multi-instance Learning Approach to Civil Unrest Event Detection using Twitter. The 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text at RANLP. September 2023. [PDF] [Slides] [Code]

Zhang, Jingyu; DeLucia, Alexandra; Zhang, Chenyu; Dredze, Mark. 2023. Geo-seq2seq: Twitter user geolocation on noisy data through sequence to sequence learning. In Findings of the Association for Computational Linguistics: ACL 2023, Toronto, July 2023. Association for Computational Linguistics. [PDF] [Code]

Zhang, Jingyu; DeLucia, Alexandra; Dredze, Mark. 2022. Changes in Tweet Geolocation over Time: A Study with Carmen 2.0. In Proceedings of the Eighth Workshop on Noisy User-generated Text (W-NUT 2022), pages 1–14, Gyeongju, Republic of Korea. Association for Computational Linguistics. [PDF] [Talk] [Slides] [Code] [Tool]

Chinta, Abhinav; Zhang, Jingyu; DeLucia, Alexandra; Buczak, Anna L; Dredze, Mark. Study of Manifestation of Civil Unrest on Twitter. The 7th Workshop on Noisy User-generated Text (W-NUT) at EMNLP. November 2021. [PDF] [Poster] [Code/Data]

Sech, Justin; DeLucia, Alexandra; Buczak, Anna L; Dredze, Mark. Civil Unrest on Twitter (CUT): A Dataset of Tweets to Support Research on Civil Unrest. The 6th Workshop on Noisy User-generated Text (W-NUT) at EMNLP. November 2020. [PDF] [Poster] [Code/Data]


Text Generation and Decoding

Sia, Suzanna; DeLucia, Alexandra; Duh, Kevin. Anti-LM Decoding for Zero-shot In-context Machine Translation. arXiv Preprint. November 2023. [PDF]

DeLucia, Alexandra; Mueller, Aaron; Li, Xiang “Lisa”; Sedoc, João. Decoding Methods for Neural Narrative Generation. 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM). August 2021. [PDF] [Poster] [Preprint] [Code]

DeLucia, Alexandra; Mueller, Aaron; Li, Xiang; Sedoc, João. Decoding Strategies for Interactive Narrative Generation. Animal Crossing Artificial Intelligence Workshop (ACAI). Presentation. 23 July 2020. [Talk] [Slides]


Social Media and Public Health

Evan L. Eschliman; Karen Choe; Alexandra DeLucia; Elizabeth Addison; Valerie W. Jackson; Sarah M. Murray; Danielle German; Becky L. Genberg; Michelle R. Kaufman. First-hand accounts of structural stigma toward people who use opioids on Reddit. Social Science & Medicine (Volume 347). 2024. [Paper] [PDF]

Savannah Brenneke; Meredith Meacham; Amanda Bunting; Alexandra DeLucia; Nicholas Proferes. r/AskAcademia: Special Considerations in the Practice of Using Reddit for Substance Use Research. College on Problems of Drug Dependence (CPDD). 2023. [Slides]

Alexandra DeLucia, Adam Poliak, Zechariah Zhu, Stephanie R Pitts, Mario Navarro, Sharareh Shojaie, John W Ayers, Mark Dredze. Automated Discovery of Perceived Health-related Concerns about E-cigarettes from Reddit. Annual Meeting of the Society for Research on Nicotine and Tobacco. 2023. Page 189. [PDF] [Poster]


Human Annotation Studies

Lee, Seunggun; DeLucia, Alexandra; Guan, Ryan; Li, Rubing; Nangia, Nikita; Vaidya, Shalaka; Zhang, Lining; Yuan, Zijun; Ganedi, Praneeth; Ngaw, Britney; Singhal, Aditya; Sedoc, João. Common Law Annotations: Investigating the Stability of Dialog Annotations. In Findings of the Association for Computational Linguistics: ACL 2023, Toronto, July 2023. Association for Computational Linguistics. [PDF] [Code]

Lee, Seunggun; DeLucia, Alexandra; Guan, Ryan; Li, Rubing; Nangia, Nikita; Vaidya, Shalaka; Zhang, Lining; Yuan, Zijun; Ganedi, Praneeth; Ngaw, Britney; Singhal, Aditya; Sedoc, João. Common Law Annotations: Investigating the Stability of Dialog Annotations. Work in Progress. Human Computation 2022.


Characterization of Online Behavior

DeLucia, Alexandra; Drobina, Emma; Fairchild, Geoffrey; Daughton, Ashlynn; Moore, Elisabeth. Automated Detection and Characterization of Pathological Online Behavior. Beyond Misinformation: Towards a Research Agenda for Information Ecosystems, Network Dynamics and Emergent Epistemologies (INDE). 5 August 2021. Presented by Elisabeth Moore. [Slides]

DeLucia, Alexandra; Drobina, Emma; Fairchild, Geoffrey; Daughton, Ashlynn; Moore, Elisabeth. Automated Detection and Characterization of Pathological Online Behavior. Los Alamos National Laboratory Applied Machine Learning Summer Research Final Presentation. 12 August 2021. [Slides]


System Log Analysis

DeLucia, Alexandra, and Baseman, Elisabeth. Early Prediction of High Performance Computing Job Outcomes via Modeling System Text Logs. Machine Learning for Computing Systems (MLCS) Workshop, Supercomputing. November 2020. [PDF]

DeLucia, Alexandra, and Moore, Elisabeth. Modeling High Performance Computing System Log Messages for Early Prediction of Job Outcome. NeurIPS co-located workshop Women in Machine Learning (WiML). Poster. December 2018. [PDF]

DeLucia, Alexandra, and Moore, Elisabeth. HPC Job Outcome Prediction: System Log Feature Extraction and Importance. Chesapeake Large Scale Analytics Conference (CLSAC). Invited Poster. November 2018. [PDF]

DeLucia, Alexandra. High Performance Computing Job Outcome Prediction by Mining System Logs. Los Alamos National Laboratory Ultrascale Systems Research Center 3rd Annual Symposium. Presentation. August 2018. [PDF]

DeLucia, Alexandra, and Baseman, Elisabeth. Work in Progress: Topic Modeling for HPC Job State Prediction. Machine Learning for Computing Systems Workshop, ACM High-Performance Parallel and Distributed Computing (HPDC). June 2018. [PDF]

DeLucia, Alexandra, and Baseman, Elisabeth. High Performance Computing Job Outcome Prediction by Mining System Logs. Southern Data Science Conference (SDSC). Poster. April 2018. [PDF]

Haque, Abida, DeLucia, Alexandra, and Baseman, Elisabeth. Markov Chain Modeling for Anomaly Detection in High Performance Computing System Logs. HUST’17: Proceedings of the Fourth International Workshop on HPC User Support Tools. November 2017. [PDF]

DeLucia, Alexandra, and Baseman, Elisabeth. Intelligent Anomaly Detection in High Performance Computing Logs via Machine Learning. Los Alamos National Laboratory Ultrascale Systems Research Center 2nd Annual Symposium. Poster. August 2017. [PDF]


Senior Honors Thesis | Rollins College 2018

Alexandra DeLucia, advised by Julie Carrington and Elisabeth Moore (Los Alamos National Lab). High Performance Computing Job Outcome Prediction By Mining System Logs. [PDF]