Machine Learning: Theory and Applications to NLP

Dan Roth
Section: Language and Computation
Level: Advanced


The central role of statistical and machine learning techniques in natural language processing is by now well established. These methods have brought significant advances in language processing and have allowed researchers to start tackling difficult problems of realistic size, such as disambiguation, context sensitivity, and knowledge acquisition. Their principal role in understanding human language acquisition, as well as in enabling robust, broad-coverage language processing, is widely accepted.

The course will present a unified view of the most widely used learning and statistical techniques in NLP. We will present the theoretical basis for these methods and focus on explaining why and when different methods work. We will cover probabilistic models and their use in natural language predictions, as well as "symbolic" discrimination methods (such as rule-based methods), and provide a common explanation for them as a step towards studying the role of learning in higher-level natural language inferences.

In the second part of the course we will sample important recent research. In particular, we will address issues such as learning with non-local features and relational learning methods, learning from unlabeled data, and combining learning with inference. The applications described will be drawn from disambiguation tasks and context-sensitive text correction, the acquisition and recognition of structure and knowledge in text, shallow parsing, and information extraction.


Prerequisites: Basic knowledge of the theory of computation and algorithmic maturity.


PDF lecture notes


C. D. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.

D. Jurafsky and J. H. Martin, Speech and Language Processing, Prentice Hall, 2000.


Dan Roth
Department of Computer Science
University of Illinois at Urbana-Champaign
1304 W. Springfield Ave.
Urbana IL 61801
Phone: (217) 244-7068
Fax: (217) 244-6500