Computational morphosyntax
Prior recommendations
You are required to have a good working knowledge of the Python programming language (Python 3) before the start of the course. If you are new to programming or to Python, you should take the course Natural Language Processing in the first semester.
If you are not able to do so, here is a selection of Python tutorials that may help you prepare:
Alan Gauld (2011) Learning to program: http://www.alan-g.me.uk/l2p2/index.htm
This is a basic introduction that also covers some general programming concepts (in addition to Python 3, some examples are given in JavaScript and VBScript). Suitable for absolute beginners in programming. At least the sections under “Concepts” and “The basics”, plus the first two under “Advanced Topics”, should be worked through.
Zed A. Shaw (2017) Learn Python 3 the Hard Way. A Very Simple Introduction to the Terrifyingly Beautiful World of Computers and Code (Third Edition): https://www.pdfdrive.com/learn-python-3-the-hard-way-e52089947.html
This is a thorough, step-by-step introduction to the basics of Python 3. Suitable for people who want to acquire programming skills. The first 39 sections should be worked through.
Charles R. Severance (2016) Python for Everybody. Exploring Data Using Python 3: https://www.py4e.com/
This is a complete introduction to programming in Python 3. Suitable for getting an overview of what Python 3 can do for data management. The first 11 topics should be worked through.
In all cases, the sections/chapters should be worked through in a Python 3 interpreter. Just reading them is not enough!
Goals
Referring to knowledge
The main goal is for students to acquaint themselves with state-of-the-art techniques used in industry and academia to structure language data and extract information from it. This goal can be further subdivided into two. First, from a theoretical perspective, the aim is for students to understand how linguistic data can be processed and analyzed with different computational methods, and to recognize the advantages and drawbacks of each choice. Second, on the practical side, the aim is to enable students to process natural language data on their own, and to build on the knowledge acquired in this class to tackle problems not covered in it.
Referring to abilities, skills
This class gives an introduction to central aspects of natural language processing. It emphasizes hands-on experience with the acquisition, manipulation, curation, and processing of linguistic data. It covers both symbolic and statistical methods, from a theoretical and a practical angle.
The main goal of this class is for students to acquaint themselves with state-of-the-art techniques used in industry and academia to structure language data and extract information from it, as well as to empower them to apply this knowledge to new problems outside the scope of the class.
Associated skills
- Programming (Python)
- Linguistic data acquisition, manipulation, curation, and processing
- Machine learning
- Quantitative reasoning applied to language sciences
Contents
1. Handling text
2. Language models
3. Tagging
4. Parsing
5. Information extraction
6. Other topics of interest to students (e.g., human-in-the-loop machine learning and summarization)
Teaching methods
This course is largely based on a flipped-classroom format. Students are expected to prepare weekly readings and to lead a portion of a weekly session (see "Evaluation"). The remaining time is devoted to theoretical discussions and practical applications of the concepts introduced.
Evaluation
- 20% participation in class discussions/presentations
- 80% practical exercises (exercise 1: 25%, exercise 2: 25%, exercise 3: 30%)
Bibliography
Other basic bibliography items: