This class was first developed for the 2018 YSU – ISTC Joint Summer School on Machine Learning. View or edit the source for this website at github.com/deeplanguageclass.
Recent progress in machine learning for natural language is significant; however, language poses some unique challenges.
In this class we will start with natural language processing fundamentals and survey current state-of-the-art results across tasks. We will focus on deep learning – the motivations behind word vectors and sequence output – and on applying effective approaches to real tasks with industrial-strength libraries and datasets, which we will practice in the lab.
We will also keep in view how natural language tasks relate to tasks in other areas of machine learning.
Instructors: Erik Arakelyan, Teamable and Adam Bittlingmayer, Signal N
Prerequisites: solid coding skills, strong analytical ability, basic machine learning concepts, fluency in multiple human languages, a Unix system with Python
Announcements and questions: the private #nlp Slack channel and the public deeplanguageclass Telegram group
Slides: Fundamentals 1, Fundamentals 2
Slides: Deep Learning
Demos: projector.tensorflow.org, anvaka.github.io/pm/#/galaxy/word2vec-wiki, word2vec-gensim-gameofthrones.ipynb
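The demos above all visualize word vectors by their pairwise similarity. A minimal sketch of the underlying idea, using made-up toy vectors (the words, dimensions, and values are illustrative, not taken from any real embedding model):

```python
import math

# Toy 3-dimensional "word vectors" (illustrative values, not real embeddings).
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.2, 0.8],
    "apple": [0.1, 0.9, 0.7],
}

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# In a trained embedding space, related words end up with higher
# cosine similarity than unrelated ones.
print(cosine(vectors["king"], vectors["queen"]))
print(cosine(vectors["king"], vectors["apple"]))
```

Tools like the TensorFlow Projector apply the same similarity measure to real, high-dimensional embeddings and then project them down to 2D or 3D for display.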
universaldependencies.org
cloud.google.com/natural-language/
spacy.io/api/entityrecognizer
fastent.github.io
SQuAD
leaderboard: rajpurkar.github.io/SQuAD-explorer/
example: Nikola Tesla
ParlAI
parl.ai / github.com/facebookresearch/ParlAI
cs.stanford.edu/people/jcjohns/clevr/
twitter.com/picdescbot
seq2seq for DNA
Lab: Classifying Amazon reviews with fastText
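fastText's supervised mode expects one training example per line, with labels marked by the `__label__` prefix. A minimal sketch of preparing review data in that format (the sample reviews and the star-to-label mapping are made up for illustration):

```python
# Map each (stars, text) review to a fastText supervised training line.
# fastText supervised input format: "__label__<label> <text>"
reviews = [
    (5, "Great product, works exactly as described."),
    (1, "Broke after two days, very disappointed."),
]

def to_fasttext_line(stars, text):
    # Illustrative mapping: 4-5 stars positive, 1-2 negative, else neutral.
    if stars >= 4:
        label = "positive"
    elif stars <= 2:
        label = "negative"
    else:
        label = "neutral"
    # fastText splits tokens on whitespace; lowercasing is a common
    # (optional) normalization step.
    return f"__label__{label} {text.lower()}"

lines = [to_fasttext_line(s, t) for s, t in reviews]
for line in lines:
    print(line)
```

Written to a file, such lines can be used directly for training with `fasttext supervised -input train.txt -output model`.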
Lab: Transliteration with Fairseq
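Fairseq's preprocessing expects parallel source and target files, one example per line with space-separated tokens; for transliteration, a common trick is to treat each character as a token. A sketch of that formatting step (the Latin–Armenian pairs below are illustrative, not taken from the lab's dataset):

```python
# Format parallel transliteration pairs for fairseq preprocessing:
# one example per line, characters separated by spaces.
pairs = [
    ("barev", "բարեվ"),  # illustrative Latin -> Armenian pair
    ("es", "ես"),
]

def to_char_tokens(word):
    """Split a word into space-separated character tokens."""
    return " ".join(word)

src_lines = [to_char_tokens(src) for src, _ in pairs]
tgt_lines = [to_char_tokens(tgt) for _, tgt in pairs]

print(src_lines[0])  # b a r e v
print(tgt_lines[0])  # բ ա ր ե վ
```

The source and target lines would then be written to parallel files (e.g. `train.src` / `train.tgt`) and binarized with `fairseq-preprocess` before training a seq2seq model on them.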
See the NLP Guide.