Text-Driven Demographic Prediction in Social Media

Social media predictive analytics bring unique opportunities to study people and their behaviors in real time and at an unprecedented scale: who they are, what they like, and what they think and feel. Analytics at this scale and speed also provide a novel set of conditions for constructing predictive models.

This work focuses on approaches to handling this dynamic data for predicting latent user demographics, ranging from constrained-resource batch classification, to incremental bootstrapping, to iterative learning via interactive rationale (feature) crowdsourcing. In addition, we study the relationships between a variety of perceived user properties (e.g., income and education) and opinions, emotions, and interests in a social network. Finally, we demonstrate how user demographics can be useful for downstream prediction tasks such as gender-informed sentiment analysis.

Speaker Biography

Svitlana Volkova is a PhD candidate in Computer Science at the Center for Language and Speech Processing, Johns Hopkins University. Her PhD research focuses on building text-driven predictive models for socio-linguistic content analysis in social media. She has mainly worked on online models for streaming social media analytics, fine-grained emotion detection and multilingual sentiment analysis, and effective annotation techniques via crowdsourcing incorporated into the active learning framework. She interned at Microsoft Research in 2011, 2012 and 2014 with the Natural Language Processing and the Machine Learning and Perception teams. She was awarded the Google Anita Borg Memorial Scholarship in 2010 and the Fulbright Scholarship in 2008.