Svitlana Volkova, PhD

Email: svitlana dot volkova at pnnl dot gov
LinkedIn, Facebook

Pacific Northwest National Laboratory
Data Sciences and Analytics Group
Computational & Statistical Analytics Division
National Security Directorate
902 Battelle Boulevard
Richland, WA 99354

Inferring Gender, Age and Political Orientation from Tweets

Correlating Perceived Psycho-Demographics and Interests on Twitter

Crowdsourcing Gender, Age and Political Rationales across Languages

I am a senior research scientist in the Data Sciences and Analytics Group, National Security Directorate, Pacific Northwest National Laboratory.

I graduated with a PhD in Computer Science, with a focus on Natural Language Processing, Social Media Analytics and Machine Learning, from the Center for Language and Speech Processing at Johns Hopkins University. My adviser was Benjamin Van Durme.

My research interests include deep learning for NLP applications; forecasting from social media signals (e.g., events, opinion and emotion dynamics, and infectious disease outbreaks); deception detection, information biases and factuality assessment in social and news media; and multilingual social media analytics.

During my PhD I developed novel approaches and practical techniques for streaming personal analytics in social media, focusing on user demographic prediction: inferring latent user demographics from streaming communications. I proposed approaches for resource-constrained classification, streaming prediction, and iterative learning and inference with interactive annotation and weighting techniques. The predictive models developed in my dissertation allow analysis, at scale, of the relationships between user language, perceived psycho-demographic traits, emotions, opinions and interests in social media.

I was an intern (summers of 2011, 2012 and 2014) in the MSR Natural Language Processing and MSRC Machine Learning and Perception teams. At MSR Cambridge in 2014, I worked with Yoram Bachrach on inferring latent user demographics, personality, emotions and sentiments in social media (demo @AAAI2015). At MSR Redmond in 2011 and 2012, I worked with Bill Dolan, Pallavi Choudhury, Chris Quirk and Luke Zettlemoyer on learning to build procedural dialog systems with light supervision (published @ACL2013), and on methods to relate literal and sentimental descriptions of visual properties (@NAACL2013, with Mark Yatskar and Asli Celikyilmaz).

I was awarded a Google Anita Borg Memorial Scholarship in 2010 and a Fulbright Scholarship in 2008.

I am a reviewer for NAACL, TACL, ACL, EMNLP, AAAI, WWW and GHC.

Dissertation: Predicting Demographics and Affect in Social Networks [slides]
Committee: Benjamin Van Durme, David Yarowsky and Philip Resnik

NAACL Tutorial on Social Media Predictive Analytics, Denver, CO May 31 2015 [slides] [references] [video]
Code and data (email to get access): [querying_twitter] [attribute] [psycho-demographics]
By using data, models or code, you agree to be bound by the terms of its license. Read the license.


Nov 14 - 17, 2016: Presenting two papers on Students' Emotional Wellbeing and Opinion Dynamics during Crisis at SocInfo 2016, Seattle WA
Oct 19 - 21, 2016: Presenting my research on Predicting User Demographics, Emotions and Opinions in Social Networks at the Grace Hopper Conference (GHC2016), Houston TX
May 17, 2016: Presenting my recent work on Account Deletion Prediction on RuNet at the NAACL Workshop on Computational Approaches to Deception Detection
May 13, 2016: Giving a talk at the University of Maryland Baltimore County, Baltimore MD
Feb 28, 2016: Co-chair of the International Workshops on NLP and Computational Social Science at EMNLP-2016 and at WebSci-2016 with David Jurgens, Dirk Hovy, David Bamman, A. Seza Dogruoz, Jacob Eisenstein, Brendan O'Connor, Alice Oh, and Oren Tsur
Feb 25, 2016: Giving a talk at the University of Washington and a CS seminar at Northwestern University, Seattle WA
Dec 12, 2015: Serving on the Women in Machine Learning Board
Dec 5 - 12, 2015: Attending NIPS 2015 and the 10th Women in Machine Learning Workshop, Montreal, Canada
Oct 05, 2015: Joined the Data Sciences and Analytics Group at Pacific Northwest National Laboratory
May 31, 2015: NAACL Tutorial on Social Media Predictive Analytics [video]
May 01, 2015: Co-organizing the 10th Women in Machine Learning Workshop (WiML'2015)
Apr 15, 2015: Giving a talk at Microsoft Research, Natural Language Processing Group, Redmond WA
Apr 12, 2015: Giving a talk at Pacific Northwest National Laboratory, Richland WA
Mar 17, 2015: Microsoft Research, Machine Learning Group, Cambridge UK [video]
Mar 16, 2015: Giving a talk at the Psychometrics Center, University of Cambridge, Cambridge UK
Feb 18, 2015: CLIP Seminar and Guest Lecture, University of Maryland
Jan 30, 2015: Giving a talk at People Pattern, Austin TX
Jan 26 - 30, 2015: Presenting a DEMO (email to get access) and giving a talk at AAAI 2015
Dec 8 - 12, 2014: Presenting a paper at NIPS Personalization Workshop 2014
Nov 13, 2014: Giving a talk at Penn, Positive Psychology Center, World Well-Being Project
Nov 12, 2014: Giving a talk at Penn, Computational Linguistics Group, CLunch [slides]
Nov 7, 2014: Attending Amazon's Fall 2014 Graduate Research Symposium in Seattle [poster]
Sep 12, 2014: Invited speaker, University of Cambridge, Computer Laboratory, NLIP Seminar Series
Jun 29, 2014: I am doing research at MSR Cambridge UK, Machine Learning and Perception Group with Yoram Bachrach this summer!
Feb 2014: I am co-organizing a Joint Workshop on Social Dynamics and Personal Attributes in Social Media at ACL 2014. Proceedings can be found here.
Dec 2013: I am a student organizer for the Student Research Workshop at ACL 2014.

Recent Publications (see also Google Scholar)

By using data, models or code released with any of the papers below, you agree to be bound by the terms of its license. Read the license.


Non-Refereed Publications and Presentations

The rest of my publications from 2008 to 2010 can be found here. More of my talks and posters can be found here.

Besides NLP and ML research, I enjoy skiing, cooking, and opera performances (especially MET productions), as well as the ballet (Mariinsky and Bolshoi).

User demographic attribute classification performance (ROC AUC).

Streaming social media analytics using iterative Bayesian updates.

Active learning setup for political preference prediction over a stream of user and neighbor communications.

Political preference classification accuracy: a batch model learned from tweets in the friend neighborhood, evaluated on our political preference dataset.
Age classification accuracy: a batch model learned from tweets in the follower neighborhood, evaluated on our age dataset.
The number of users and their daily tweets estimated over time for a set of randomly sampled profiles from the 1% Twitter feed (~150K points).
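
As a rough illustration of the iterative Bayesian updates named in the captions above, the sketch below keeps a posterior over a user attribute and refines it as each new tweet arrives. It is a minimal example under stated assumptions, not the models from the dissertation: the labels, the prior, the per-tweet likelihoods and the function names update_posterior and stream_posteriors are all hypothetical, and tweets are assumed to be conditionally independent given the attribute.

```python
import math


def update_posterior(log_post, log_lik):
    """Apply one Bayesian update in log space and renormalize.

    log_post: dict mapping label -> current log posterior (or log prior)
    log_lik:  dict mapping label -> log likelihood of the new tweet
    """
    unnorm = {label: log_post[label] + log_lik[label] for label in log_post}
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(unnorm.values())
    log_z = top + math.log(sum(math.exp(v - top) for v in unnorm.values()))
    return {label: v - log_z for label, v in unnorm.items()}


def stream_posteriors(prior, tweet_likelihoods):
    """Return the posterior after each tweet in the stream.

    prior:             dict label -> prior probability (hypothetical values)
    tweet_likelihoods: iterable of dicts label -> P(tweet | label), assumed
                       to come from some per-tweet model not shown here
    """
    log_post = {label: math.log(p) for label, p in prior.items()}
    history = []
    for lik in tweet_likelihoods:
        log_lik = {label: math.log(p) for label, p in lik.items()}
        log_post = update_posterior(log_post, log_lik)
        history.append({label: math.exp(v) for label, v in log_post.items()})
    return history


if __name__ == "__main__":
    # Made-up numbers: two tweets, each mildly favoring label_a.
    prior = {"label_a": 0.5, "label_b": 0.5}
    stream = [
        {"label_a": 0.8, "label_b": 0.2},
        {"label_a": 0.8, "label_b": 0.2},
    ]
    for step, posterior in enumerate(stream_posteriors(prior, stream), start=1):
        print(step, posterior)
```

Because each step reuses the previous posterior as the new prior, a prediction is available after every tweet without reprocessing the earlier part of the stream.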