Svitlana Volkova, PhD

Email: svitlana dot volkova at pnnl dot gov
LinkedIn, Facebook

Pacific Northwest National Laboratory
Data Sciences and Analytics Group
Computing and Analytics Division
National Security Directorate
902 Battelle Boulevard
Richland, WA 99354

Visualizing Spatiotemporal Embeddings in Social Media

Contrasting Public Opinion Dynamics in VKontakte during Crisis

Inferring User Psycho-Demographics from Texts on Twitter

Inferring Gender and Age Attributes from Tweets

Correlating User Demographics and Interests on Twitter

Crowdsourcing Multilingual Gender and Age Rationales

I am a senior research scientist at the Data Sciences and Analytics Group, National Security Directorate, Pacific Northwest National Laboratory.

I graduated with a PhD in Computer Science, with a focus on Natural Language Processing, Social Media Analytics and Machine Learning, from Johns Hopkins University, Center for Language and Speech Processing. My adviser was Benjamin Van Durme.

I am a PI on the DARPA SocialSim program (TA3: Test and Measurement) and on other projects, including Deception Detection and Tracking in News and Social Media and Graph-Based Deep Learning on Real-World Data. Previous projects focused on Deep Learning of Multilingual Distributed Representations from Large Streaming Text Data (co-PI) and Forecasting the Future using Diverse Social Media Sources (co-PI).

My research interests include natural language processing; machine learning; computational social science; social media analytics; deep learning for NLP applications, including forecasting social media dynamics (real-world events, opinions and emotions, disease outbreaks, language change, entity and event-driven connotations); deception detection and tracking, information biases and factuality assessment in news and social media; and multilingual social media analysis.

During my PhD I developed novel approaches and practical techniques for streaming personal analytics in social media focusing on the task of user demographic prediction. I worked on methods and practical applications for inferring latent user demographics from streaming communications in social media. I proposed approaches for constrained-resource classification, streaming prediction, iterative learning and inference with interactive annotation and weighting techniques. Predictive models developed in my dissertation allow the analysis of the relationships between user language, perceived psycho-demographic traits, emotions, opinions and interests in social media at scale.

I was a research intern (summers of 2011, 2012 and 2014) on the Microsoft Research Natural Language Processing and Microsoft Research Cambridge Machine Learning and Perception teams. At Microsoft Research Cambridge in 2014, I worked with Yoram Bachrach on inferring latent user demographics, personality, emotions and sentiments in social media (demo @AAAI2015). At Microsoft Research Redmond in 2011 and 2012, I worked with Bill Dolan, Pallavi Choudhury, Chris Quirk and Luke Zettlemoyer on learning to build procedural dialog systems with light supervision (published @ACL2013) and on methods to relate literal and sentimental descriptions of visual properties (@NAACL2013, with Mark Yatskar and Asli Celikyilmaz).

I was awarded a Google Anita Borg Memorial Scholarship in 2010 and a Fulbright Scholarship in 2008.

I serve on program committees for top NLP and AI venues: NAACL, TACL, ACL, EACL, EMNLP, AAAI, WWW and GHC.

Dissertation: Predicting Demographics and Affect in Social Networks [slides]
Committee: Benjamin Van Durme, David Yarowsky and Philip Resnik

ICWSM Hands-On Tutorial on Measuring Information Spread Within and Across Social Platforms, Munich, Germany, June 11 2019
Organizers: Emily Saldanha, Maria Glenski, and Svitlana Volkova

NAACL Tutorial on Social Media Predictive Analytics, Denver, CO May 31 2015 [slides] [references] [video]
Code and data (email to get access): [querying_twitter] [attribute] [psycho-demographics]
By using data, models or code, you agree to be bound by the terms of its license. Read the license.


February 28, 2019: Talk on I Can't Believe It's Not Better: Detecting and Quantifying Misinformation and Disinformation at the NLP Seminar Series, Stanford, CA
October 17, 2017: Talk on Deep Learning for Predictive and Anticipatory Social Media Analytics at CLSAC: Chesapeake Large-Scale Analytics Conference, Annapolis, MD
October 13, 2017: Talk on Models for Detecting Deceptive News on Twitter at Text as Data Conference 2017, Princeton University
September 17, 2017: Computer Science Colloquium on Deep Learning for Social Media Analytics at the University of Idaho
August 4, 2017: Invited talk at the ACL NLP + CSS Workshop on Predicting the Future with Deep Learning and Signals from Social Media
June 30, 2017: I have been appointed as a Vice-Chair for the ACM Future of Computing Academy [more]
June 22 - 26, 2017: Attending the inaugural meeting of the ACM Future of Computing Academy and ACM celebration of 50 years of Turing Awards, San Francisco CA
April 25 - 26, 2017: Attending Machine Learning Open House at Facebook and visiting Allen Institute for Artificial Intelligence, Seattle WA
April 21, 2017: I have been selected as a member of the ACM Future of Computing Academy (FCA)
March 27 - 28, 2017: Talking about Predicting the Future with Deep Learning and Signals from Social Media at Google, Amazon and Facebook, Seattle WA
Feb 2017: Serving as an Area Chair for the Social Media Track at ACL 2017
Nov 14 - 17, 2016: Presenting two papers on Students' Emotional Wellbeing and Opinion Dynamics during Crisis at SocInfo 2016, Seattle WA
Oct 19 - 21, 2016: Presenting my research on Predicting User Demographics, Emotions and Opinions in Social Networks at the Grace Hopper Conference (GHC2016), Houston TX
May 17, 2016: Presenting my recent work on Account Deletion Prediction on RuNet at the NAACL Workshop on Computational Approaches to Deception Detection
May 13, 2016: Talk at the University of Maryland Baltimore County, Baltimore MD
Feb 28, 2016: Co-chair of the International Workshops on NLP and Computational Social Science at EMNLP-2016 and at WebSci-2016 with David Jurgens, Dirk Hovy, David Bamman, A. Seza Dogruoz, Jacob Eisenstein, Brendan O'Connor, Alice Oh, and Oren Tsur
Feb 25, 2016: Talk at the University of Washington and a CS seminar at Northwestern University, Seattle WA
Dec 12, 2015: Serving on the Women in Machine Learning Board
Dec 5 - 12, 2015: Attending NIPS 2015 and 10th Women in Machine Learning Workshop, Montreal, Canada
Oct 05, 2015: Joined the Data Sciences and Analytics Group at Pacific Northwest National Laboratory
May 31, 2015: NAACL Tutorial on Social Media Predictive Analytics [video]
May 01, 2015: Co-organizing 10th Women in Machine Learning Workshop (WiML'2015)
Apr 15, 2015: Talk at Microsoft Research, Natural Language Processing Group, Redmond WA
Apr 12, 2015: Talk at Pacific Northwest National Laboratory, Richland WA
Mar 17, 2015: Microsoft Research, Machine Learning Group, Cambridge UK [video]
Mar 16, 2015: Talk at Psychometrics Center, University of Cambridge, Cambridge UK
Feb 18, 2015: CLIP Seminar and Guest Lecture, University of Maryland
Jan 30, 2015: Talk at People Pattern, Austin TX
Jan 26 - 30, 2015: Presenting a demo and giving a talk at AAAI 2015
Dec 8 - 12, 2014: Presenting a paper at NIPS Personalization Workshop 2014
Nov 13, 2014: Talk at Penn, Positive Psychology Center, World Well-Being Project
Nov 12, 2014: Talk at Penn, Computational Linguistics Group, CLunch [slides]
Nov 7, 2014: Amazon's Fall 2014 Graduate Research Symposium in Seattle [poster]
Sept 12, 2014: Invited speaker, University of Cambridge, Computer Laboratory, NLIP Seminar Series
Jun 29, 2014: I am doing research at Microsoft Research Cambridge UK, Machine Learning and Perception Group with Yoram Bachrach this summer!
Feb 2014: I am co-organizing a Joint Workshop on Social Dynamics and Personal Attributes in Social Media at ACL 2014. Proceedings can be found here.
Dec 2013: I am a student organizer for the Student Research Workshop at ACL 2014.

Recent Publications (see also Google Scholar)

By using data, models or code released with any of the papers below, you agree to be bound by the terms of its license. Read the license.


Non-Refereed Publications and Presentations

My earlier publications from 2008 to 2010 can be found here. More of my talks and posters can be found here.

Besides NLP and ML research, I enjoy skiing, cooking, and opera performances (especially MET productions), as well as the ballet (Mariinsky and Bolshoi).

Overview of social media analytics capabilities developed by Volkova et al.

Communication network (@mention) among verified (blue), propaganda (pink), and clickbait (orange) Twitter accounts.

Psycholinguistic differences across suspicious news types on Twitter.

Disinformation graph of agents, actions and themes extracted from @EUvsDisinfo disinformation reviews.

Representation shift between each word's current representation and its original representation.

Semantic trajectory of the word "war" over time, projected in 2D, with two most similar words at each timestamp.

User demographic attribute classification performance (ROC AUC).

Streaming social media analytics using iterative Bayesian updates.

Active learning setup for political preference prediction over a stream of user and neighbor communications.

Political preference classification accuracy: the batch model is learned from tweets in the friend neighborhood and evaluated on our political preference dataset.