# Mark Dredze

Research Scientist, Human Language Technology Center of Excellence (HLTCOE)
Assistant Research Professor, Department of Computer Science
Center for Language and Speech Processing (CLSP)
Machine Learning Group
Center for Population Health Information Technology (CPHIT), Bloomberg Health Sciences Informatics, School of Medicine

Contact: www.cs.jhu.edu/~mdredze, www.dredze.com
Office: Stieff 181, (410) 516-6786

## Publications

Masters Thesis

For my master's degree in Jewish Studies at Yeshiva University, I completed a thesis titled "The Values of Traditional Judaism in Chicago." Please email me if you'd like a copy.

## Students

Current Students

Nicholas Andrews (Co-advised with Jason Eisner)
Matt Gormley [www] (Co-advised with Jason Eisner)
Michael Paul [www] (Co-advised with Jason Eisner)
Travis Wolfe [www]
Violet (Nanyun) Peng

Former Students
Ariya Rastrow [www] (Co-advised with Sanjeev Khudanpur). ECE PhD, 2012. First Job: Amazon.
Carolina Parada [www] (Co-advised with Hynek Hermansky). ECE PhD, 2011. First Job: Google Research.

| Project | Student | Year |
| --- | --- | --- |
| Information Extraction from Biomedical Text | Leah Hanson | 2011 |

| Project | Student |
| --- | --- |
| Email keyword summarization | Danny Puller (UPenn Summer Provost Fellowship) |
| Sentiment classification | Ian Cohen |
| Email Attachment Prediction | Josh Magarick |
| Prototype Driven Learning and Graphical Models | Neal Parikh |
| Machine Learning in Prediction Markets | Ari Gilder, Kevin Lerman (Winner Best CS Senior Design Project, Honorary Mention Best Engineering Design Project) |
| User Adaptation in Email Reply Prediction | Tova Brooks, Josh Carroll |
| Formal and Informal Meeting Extraction from Email | Lauren Paone |

## Teaching

Fall 2012: CS 600.475 Machine Learning [Class site]
Spring 2012: CS 600.775 Current Topics in Machine Learning [Class site]
Fall 2011: CS 600.475 Machine Learning [Class site]
Spring 2011: CS 600.775 Current Topics in Machine Learning [Class site]
Fall 2010: CS 600.475 Machine Learning [Class site]
Fall 2009: CS 600.475 Machine Learning [Class site]

## Data/Code

I get a lot of emails asking for data or code from one of my papers. If you are wondering, the answer is yes! I try to provide both data and code so that others can reproduce or compare against my results. Sadly, I don't post all of it online for lack of time, but I usually make data and code available if you email me.

Datasets
TAC 2009 Entity Linking (Email for data)
A collection of manually linked training examples to supplement those provided in the TAC 2009 KBP task. These are described in my Coling 2010 paper on entity linking.

A collection of ham and spam images taken from real user email.

Product reviews from several different product types taken from Amazon.com.

Attachment Prediction Email (Email for data)
Enron emails annotated with attachment information and cleaned of numerous artifacts inserted by email programs.

Code
This is a collection of software developed by me and others in Fernando Pereira's research group at UPenn. It is designed for a range of machine learning tasks, such as dependency parsing, structured learning, gene prediction and gene mention finding.

Confidence Weighted Learning Library (Email for code)
We have collected most of the core algorithms in the confidence weighted learning framework for release as a software library. Please email me for the code.

Carmen is a library for geolocating tweets. Given a tweet, Carmen will return Location objects that represent a physical location. Carmen uses both coordinates and other information in a tweet to make geolocation decisions. It's not perfect, but it greatly increases the number of geolocated tweets over what Twitter provides.

## Colleagues

I have worked with a lot of amazing people on a wide variety of projects. Here are a few of them:

Kedar Bellare, Axel Bernal, Larry Birnbaum, John Blitzer, Koby Crammer, Krzysztof Czuba, Kris Hammond, Ryan Gabbard, Kuzman Ganchev, João Graça, David Johnson, Rie Johnson (Ando), Alex Kulesza, Nicholas Kushmerick, Tessa Lau, Kevin Lerman, Qian Liu, Ryan McDonald, David Mimno, Peter Norvig, Fernando Pereira, Jeff Reynar, Doug Riecken, Sam Roweis, Bill Schilit, Partha Pratim Talukdar, Hanna M. Wallach, Joel Wallenberg, Casey Whitelaw, Tong Zhang