Sifting through social media messages has become a popular way to track illness and other public health trends, but a key hurdle remains: how to separate valuable information from mindless chatter.
To address this, Johns Hopkins computer scientists and researchers in the School of Medicine have developed a new tweet-screening method that not only delivers real-time data on flu cases but also filters out online chatter not linked to actual infections. The method is based on analysis of 5,000 publicly available tweets per minute. After comparing it with other Twitter-based tracking tools, the Johns Hopkins team says its real-time results closely track government disease data, which takes much longer to compile.
“When you look at Twitter posts, you can see people talking about being afraid of catching the flu or asking friends if they should get a flu shot or mentioning a public figure who seems to be ill,” said Mark Dredze, an assistant research professor in the Department of Computer Science, who uses tweets to monitor public health trends. “But posts like this don’t measure how many people have actually contracted the flu. We wanted to separate hype about the flu from messages from people who truly become ill.”
Dredze, also a research scientist at the Johns Hopkins Human Language Technology Center of Excellence, led a team that in mid-2011 released one of the first and most comprehensive studies showing that Twitter data can yield useful public health information. Since then, this strategy has become so popular that the U.S. Department of Health and Human Services last summer sponsored a contest challenging researchers to design an online application that could track major disease outbreaks.
To improve their accuracy when using tweets to track the flu, the Johns Hopkins team developed sophisticated statistical methods based on human language processing technologies. The system can distinguish, for example, between “I have the flu” and “I’m worried about getting the flu.”
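The distinction between reporting an infection and merely expressing concern can be illustrated with a toy example. The sketch below is not the Johns Hopkins method, which relies on statistical language processing models; it substitutes a few hand-written keyword rules purely to show the kind of separation involved. The function name `classify_tweet`, the cue patterns, and the label names are all assumptions made for illustration.

```python
import re

# Hypothetical cue patterns, chosen for illustration only. The actual system
# described in the article uses statistical models, not hand-written rules.
FLU_WORD = re.compile(r"\bflu\b", re.IGNORECASE)
CONCERN_CUES = re.compile(
    r"\b(worried|afraid|scared|should i get|flu shot|vaccine)\b",
    re.IGNORECASE,
)
INFECTION_CUES = re.compile(
    r"\b(i have|i've got|i got|i caught|down with|sick with)\b.*\bflu\b",
    re.IGNORECASE,
)

def classify_tweet(text: str) -> str:
    """Label a tweet as 'infection', 'concern', 'other', or 'unrelated'."""
    # Normalize curly apostrophes so contractions match the patterns.
    normalized = text.replace("\u2019", "'")
    if not FLU_WORD.search(normalized):
        return "unrelated"
    # Concern cues are checked first so that a message such as
    # "I'm worried I have the flu" is not counted as a confirmed case.
    if CONCERN_CUES.search(normalized):
        return "concern"
    if INFECTION_CUES.search(normalized):
        return "infection"
    return "other"

if __name__ == "__main__":
    for tweet in ("I have the flu", "I'm worried about getting the flu."):
        print(tweet, "->", classify_tweet(tweet))
```

Rules like these break down quickly on real tweets, with their sarcasm, misspellings, and secondhand reports, which is one reason a statistical approach trained on labeled examples is the more plausible choice for a production system.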
TRACKING THE FLU: A U.S. map accompanying this article shows Twitter-based flu forecasts in each state for the first week of January 2012 and for the same week in 2013. Higher flu rates are marked in darker red, and the data show a dramatically higher flu rate across the country in 2013.
- NEWS RELEASE: Using Twitter to Track the Flu: Researchers Find a Better Way to Screen the Tweets
- Twitter Stories: The Future of Public Health
- Human Language Technology Center of Excellence at Johns Hopkins
- Mark Dredze’s website
- Johns Hopkins Department of Computer Science
- Center for Advanced Modeling in The Social, Behavioral, and Health Sciences