Computational morphosyntax

Code
570519
Credits
5cr

Prior recommendations

You must have a good working knowledge of the Python programming language (Python 3) before the course starts. If you are new to programming or to Python, you should take the course Natural Language Processing in the first semester.

If you are not able to do so, here is a selection of Python tutorials that may help you prepare:
Gauld, Alan (2011) Learning to program: http://www.alan-g.me.uk/l2p2/index.htm

This is a basic introduction that also covers some general programming concepts (in addition to Python 3, some examples are given in JavaScript and VBScript). Suitable for absolute beginners in programming. At least the sections under “Concepts” and “The basics”, plus the first two under “Advanced Topics”, should be worked through.
Shaw, Zed A. (2017) Learn Python 3 the Hard Way: A Very Simple Introduction to the Terrifyingly Beautiful World of Computers and Code (Third Edition): https://www.pdfdrive.com/learn-python-3-the-hard-way-e52089947.html

This is a thorough, step-by-step introduction to the basics of Python 3. Suitable for people who want to acquire programming skills. The first 39 sections should be worked through.
Severance, Charles R. (2016) Python for Everybody: Exploring Data Using Python 3: https://www.py4e.com/

This is a complete introduction to programming in Python 3. Suitable for getting an overview of what Python 3 can do for data management. The first 11 topics should be worked through.
In all cases, the sections or chapters should be worked through in a Python 3 interpreter. Just reading them is not enough!
 

Goals

Referring to knowledge

The main goal is for students to become acquainted with state-of-the-art techniques used in industry and academia to structure language data and extract information from it. This goal has two parts. First, from a theoretical perspective, the aim is for students to understand how linguistic data can be processed and analyzed with different computational methods, and to recognize the advantages and drawbacks of each choice. Second, on the practical side, the aim is to enable students to process natural language data on their own, and to build on the knowledge acquired in this class to tackle problems not covered in it.

 

Referring to abilities, skills

This class gives an introduction to central aspects of natural language processing. It emphasizes hands-on experience with the acquisition, manipulation, curation, and processing of linguistic data. It covers both symbolic and statistical methods, from a theoretical and a practical angle.

The main goal of this class is for students to become acquainted with state-of-the-art techniques used in industry and academia to structure language data and extract information from it, and to empower them to apply this knowledge to new problems outside the scope of the class.

Associated skills

  •  Programming (Python)
  •  Linguistic data acquisition, manipulation, curation, and processing
  •  Machine learning
  •  Quantitative reasoning applied to language sciences

Contents

1. Handling text

2. Language models

3. Tagging

4. Parsing

5. Information extraction

6. Other topics of interest to students (e.g., human-in-the-loop machine learning and summarization)

Teaching methods

This course is largely based on a flipped-classroom format. Students are expected to prepare weekly readings and to lead a portion of a weekly session (see "Evaluation"); the remaining time is devoted to theoretical discussions and practical applications of the concepts introduced.
 

Evaluation

  • 20% participation in class discussions/presentations
  • 80% practical exercises (exercise 1: 25%, exercise 2: 25%, exercise 3: 30%)

Bibliography

Other basic bibliography items:

  • Bird, Steven; Klein, Ewan & Loper, Edward (2009), Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit. O’Reilly Media. (New version for Python 3: http://www.nltk.org/book/)
  • Jurafsky, Daniel & Manning, Christopher D. (2015), Natural Language Processing, https://class.coursera.org/nlp/lecture
 

 

Other recommended readings:

  • Allen, James (1994), Natural Language Understanding. 2nd edition. Addison Wesley.
  • Coleman, John (2005), Introducing Speech and Language Processing. Cambridge University Press.
  • Manning, Christopher D. & Schütze, Hinrich (1999), Foundations of Statistical Natural Language Processing. The MIT Press.