CS 661 - Machine Learning (Spring 1996)
Steven Salzberg will run the seminar, with participation by other
faculty members including Simon Kasif, Eric Brill, and David Yarowsky.
The class will meet from 1:00 - 3:00 on Thursdays in
NEB 36 (the ground floor of NEB).
Prerequisite: CS 435 Artificial Intelligence, or the
equivalent.
This page will be continually under construction throughout the
semester.
Suggestions about papers to read should be sent
to salzberg@cs.jhu.edu.
Overview
This is an advanced course that will focus on the recent literature on
the application of machine learning to problems from a range of
different areas, including biology, astronomy, and information
retrieval. After 1-2 introductory lectures, subsequent classes will
focus on research papers, usually two papers per class. The last few
meetings may involve presentations of class projects.
Requirements
The course has four requirements:
- Students will be expected to present 2-3 papers.
- Students will write short commentaries on several additional papers.
- Students will be expected to read each week's papers and
participate in the discussions.
- Students will do substantial class projects of their choosing.
Project proposals will be due in mid-semester.
Note: graduate students are welcome to audit the class.
Relevant Local Pointers
Artificial Intelligence at Hopkins
Computational Biology at Hopkins
Johns Hopkins University
Selected On-line Research Papers (excluding Hopkins)
(note: this list will grow and evolve throughout the semester)
- M. Burset and R. Guigo.
Evaluation of Gene Structure Prediction
Programs, draft manuscript, 1995.
- D. Jensen and P. Cohen (draft manuscript).
Overfitting in Inductive Learning Algorithms: Why it occurs and
how to correct it.
- L. Prechelt (1996).
A Quantitative Study of Experimental Evaluations of Neural Network
Learning Algorithms: Current Research Practice.
To appear in Neural Networks, vol. 9.
- R. Parsons, S. Forrest, and C. Burks (in press).
Genetic operators for the DNA fragment-assembly problem.
To appear in Machine Learning.
- M. Mitchell and S. Forrest (1994).
Genetic algorithms and artificial life.
Artificial Life, Vol. 1, No. 3 (1994), pp. 267-289.
- Atkeson, C. G., Moore, A. W., & Schaal, S. (submitted).
Locally weighted learning, Artificial Intelligence Review.
- J. Ross Quinlan (1995).
MDL and Categorical Theories, ML '95.
- J. Ross Quinlan (1995).
Oversearching and Layered Search in Empirical Learning, IJCAI '95.
- J. Ross Quinlan (1996).
Improved Use of Continuous Attributes in C4.5, submitted to JAIR.
- R. Crites and A. Barto (1995).
Improving Elevator Performance Using Reinforcement Learning
(scroll down to Pubs to get paper).
Advances in Neural Information Processing Systems 8 (NIPS8).
(nips8.ps.Z: 58525 bytes)
- Robert C. Holte (1993).
Very Simple Classification Rules Perform Well on Most Commonly Used
Datasets. Machine Learning, vol. 11, pp. 63-91.
- Peter Auer, Robert C. Holte, and Wolfgang Maass (1995).
Theory and Applications of Agnostic PAC-Learning with Small Decision
Trees. Proceedings of the 12th International Conference on Machine
Learning (ML'95), A. Prieditis and S. Russell (editors), pages 21-29.
- K. J. Cherkauer (1996).
Stuffing Mind into Computer: Knowledge and Learning for Intelligent
Systems. Informatica, 19.
- R. Maclin & J. W. Shavlik (1996).
Creating Advice-Taking Reinforcement Learners. Machine
Learning, 22, pp. 251--281.
- D. W. Opitz & J. W. Shavlik (1995).
Dynamically Adding Symbolically Meaningful Nodes to Knowledge-Based Neural Networks.
Knowledge-Based Systems, 8, pp. 301--311.
- G. G. Towell & J. W. Shavlik (1994).
Knowledge-Based Artificial Neural Networks.
Artificial Intelligence, 70, pp. 119--165.
- R.S. Sutton (1988).
Learning to predict by the methods of temporal differences,
Machine Learning, 3, 1988, No. 1, pp. 9--44. (121K)
- R.S. Sutton (1991).
Planning by incremental dynamic programming, Proceedings of the
Eighth International Workshop on Machine Learning, pp. 353-357,
Morgan Kaufmann. (55K)
- P. Riddle, R. Segal, and O. Etzioni (1994).
Representation design and brute-force induction in a Boeing manufacturing
domain. Applied Artificial Intelligence, 8:125-147.
- R. Segal and O. Etzioni (1994).
Learning decision lists using homogeneous rules. Proceedings
of the Twelfth National Conference on Artificial Intelligence,
July, 1994.
Selected On-line Research Papers by Hopkins researchers
- Steven Salzberg (1995).
Locating Protein Coding Regions in Human DNA using a Decision Tree
Algorithm. Journal of Computational Biology,
2:3 (1995), 473-485. (590K, PostScript compressed with gzip)
- S. Salzberg, R. Chandar, H. Ford, S. Murthy, and R. White (1995).
Decision Trees for Automated Identification of Cosmic Ray Hits in
Hubble Space Telescope Images. Publications of the Astronomical
Society of the Pacific 107, May 1995, 1-10. (460K, compressed
PostScript)
- Steven Salzberg (1995).
On Comparing Classifiers: A Critique of
Current Research and Methods. Technical Report JHU-95/06, Department of
Computer Science, Johns Hopkins University, May 1995. Draft manuscript.
(139K, PostScript)
- S.K. Murthy and S. Salzberg (1995).
Lookahead and Pathology in Decision Tree Induction.
Proceedings of IJCAI-95, Montreal, pp. 1025--1031.
- S.K. Murthy, S. Kasif, and S. Salzberg (1994).
A System for Induction of Oblique Decision Trees.
Journal of Artificial Intelligence Research 2:1 (1994), 1-32.
(475K, PostScript. This is the main OC1 paper.)
- Sreerama K. Murthy (1995).
On Growing Better Decision Trees from Data.
Ph.D. thesis, in HTML format, October 1995.
- J. Rachlin, S. Kasif, S. Salzberg, and D. Aha (1994).
Towards a Better Understanding of Memory-Based and Bayesian
Classifiers. Proc. 1994 Internatl. Conf. on Machine Learning
(pp. 242--250). New Brunswick, NJ, July 1994.
- S. Salzberg, A. Delcher, D. Heath, and S. Kasif (1995).
Best-Case Results for Nearest-Neighbor Learning.
IEEE Transactions on Pattern Analysis and Machine
Intelligence 17:6, June 1995, 599-608. (333K, PostScript)
- Steven Salzberg (1994).
Book Review of C4.5: Programs for
Machine Learning (by J.R. Quinlan). Machine Learning 16 (1994),
235-240. (89K, PostScript)
- D. Heath, S. Kasif, and S. Salzberg (1993).
Learning Oblique Decision Trees. Proc. 13th
Internatl. Joint Conf. on Artificial Intelligence (IJCAI-93)
(pp. 1002--1007). Chambery, France, 1993. (147K, PostScript.
Describes a simulated annealing system for building decision trees.)
- S. Cost and S. Salzberg (1993).
A Weighted Nearest Neighbor Algorithm for Learning with
Symbolic Features. Machine Learning 10:1, 1993, 57--78.
Some figures missing. (103K, DVI format)
- D. Heath, S. Kasif, and S. Salzberg (1996).
Committees of Decision Trees. In B. Gorayska and J. Mey (Eds.),
Cognitive Technology: In Search of a Humane Interface
(pp. 305--317). Amsterdam: Elsevier Science B.V., 1996. (191K, PostScript)
- D. Heath, S. Kasif, R. Kosaraju, S. Salzberg, and
G. Sullivan (1996).
Learning Nested Concept Classes with Limited Storage. To appear
in Journal of Experimental and Theoretical Artificial Intelligence.
(228K, PostScript)
On-Line Resources in Machine Learning
- Pointers to Machine Learning courses around the world
- Machine Learning resources (bibliographies, companies, conferences,
data, and more) maintained by David Aha
- Machine Learning journal
- Journal of Artificial Intelligence Research
- Neural Information Processing Systems 1995 (NIPS-95) papers
(contains abstracts and full text of many of the papers from the
most recent conference)
- Reinforcement Learning at CMU
- Knowledge Discovery in Databases (data mining)
- Neural Networks Home Page, a good starting point to explore
neural networks research around the Net
- Society for AI and Statistics home page
- Pattern recognition information from The Netherlands.
Contains some good bibliographies and reviews of pattern
recognition papers, books, conferences, and research groups.
- NIPS workshop on benchmarking learning algorithms
- Links to Researchers in Machine Learning, pointers
to hundreds of individual researchers maintained by David Aha at NRL.
- The OC1 decision tree system, including source code and
documentation.
- TOOLDIAG is a collection of methods for statistical pattern
recognition, especially classification, from Thomas Rauber in Portugal.
- MLC++ learning software library (includes both OC1 and PEBLS from
Hopkins)
- MLnet Machine Learning Archive at GMD, in Germany
- StatLib, a system for distributing statistical software, datasets,
and information.
- Genetic Algorithms Archive, a good starting point to learn about GAs
University Research Groups in Machine Learning
Data Sets and Repositories of Data
This page is maintained by Steven Salzberg, Department of Computer
Science, Johns Hopkins University.
salzberg@cs.jhu.edu