Department of Computer Science, Johns Hopkins University

April 20, 2010 - Mark Dredze

Title: Building Confidence in Online Learning

Abstract:
The information revolution has produced huge quantities of digitized knowledge. Information users, such as web searchers, business analysts, and medical professionals, are overwhelmed by vast quantities of information. As new information sources move online, information overload will worsen and the need for intelligent information systems will grow. The recent focus on information processing in statistical methods has produced numerous high-quality tools for processing language, including knowledge extraction, organization, and analysis. With more data and better statistical methods, the state of the art advances. However, these statistical methods can have difficulty scaling up to huge quantities of diverse data.

This talk will present techniques designed for processing large data collections, with a particular focus on sparse representations common to many domains with a large number of features. I will present Confidence Weighted Learning, an online (streaming) machine learning algorithm designed for these types of data distributions. Confidence Weighted Learning maintains a distribution over linear classifiers and updates the distribution after each example. I'll show how this framework can be extended to multi-class and structured prediction problems, and describe extensions for modeling second-order feature interactions and noisy data.
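
To make the idea of "a distribution over linear classifiers, updated after each example" concrete, here is a minimal Python sketch. It is not taken from the talk. It assumes a Gaussian distribution over weights with a diagonal covariance, and it uses an AROW-style update (a closely related member of the same algorithm family) because that update has a simple closed form. The class name, the regularization parameter `r`, the sparse dict feature format, and the toy data are all illustrative assumptions.

```python
class DiagonalGaussianClassifier:
    """Online binary classifier that keeps a mean and a variance per weight.

    Assumption: AROW-style update with a diagonal covariance, used here as a
    stand-in for the confidence-weighted family; not the talk's exact method.
    """

    def __init__(self, r=1.0):
        self.r = r        # regularization strength (hypothetical default)
        self.mu = {}      # mean weight per feature, implicitly 0.0
        self.sigma = {}   # variance per feature, implicitly 1.0

    def score(self, x):
        # x is a sparse example: {feature_name: value}
        return sum(self.mu.get(f, 0.0) * v for f, v in x.items())

    def predict(self, x):
        return 1 if self.score(x) >= 0 else -1

    def update(self, x, y):
        """Process one example; y must be +1 or -1."""
        margin = y * self.score(x)
        if margin >= 1.0:
            return  # margin is already large enough, leave the model alone
        # Variance of the score under the current weight distribution.
        variance = sum(self.sigma.get(f, 1.0) * v * v for f, v in x.items())
        beta = 1.0 / (variance + self.r)
        alpha = (1.0 - margin) * beta
        for f, v in x.items():
            s = self.sigma.get(f, 1.0)
            # Low-confidence (high-variance) weights move the most.
            self.mu[f] = self.mu.get(f, 0.0) + alpha * y * s * v
            # Seeing a feature shrinks its variance, i.e. raises confidence.
            self.sigma[f] = s - beta * s * s * v * v


# Toy stream of sparse examples (made-up data for illustration only).
stream = [
    ({"good": 1.0, "great": 1.0}, +1),
    ({"bad": 1.0, "awful": 1.0}, -1),
    ({"good": 1.0}, +1),
    ({"bad": 1.0}, -1),
]

model = DiagonalGaussianClassifier(r=1.0)
for features, label in stream:
    model.update(features, label)
```

Only the features present in an example are touched, so the cost of each update scales with the number of non-zero features rather than the full feature count, which is what makes this kind of learner a natural fit for sparse, high-dimensional data.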












































