Peeling Natural Language Processing layer by layer — An Introductory Overview
Natural Language Processing (NLP) is probably one of the most turbulent fields under the Computer Science umbrella today.
While it is not something new, technological advances, new algorithms, and data abundance have made it almost mundane to get computers to read/write and listen/speak (not to mention the attempts to make computers truly understand what is written, which is the business of Natural Language Understanding, or NLU).
This story is the starting point for a series that aims to present the field of Natural Language Processing without being either too much of a math-tech-bot or too much of a language-theory-worm. The idea is to offer an agnostic view of both NLP practice and theory that linguists and computer scientists alike can enjoy.
There are some very good NLP presentations out there, but so far I could not find one that attempts to pick at every layer (or at least the most important ones) in the NLP stack (yes, there is a stack of technologies, techniques and theories). Not to dismiss other authors' excellent articles: at each layer analyzed, I'll append a list of good reference articles so anyone who wants to go deeper in either direction (math or linguistics) can do so.
Now, who am I and what credentials do I have? To be short, I'm no more than a learner like you. However, if you want a few "official" details: I'm a History Teacher, Systems Analyst, Computer Scientist, and hold a Master's in Computer Intelligence (you guessed it: in Natural Language Processing; actually, more precisely, in Natural Language Understanding, but NLP was there all the time).
Do I currently work with NLP? No. However, I'm a constant presence on Stack Overflow's NLP tag and seize every opportunity to play with NLP and its different technologies.
Therefore, this series could also be considered an organized summary of my experiences with NLP, where I try to be very clear on each topic. I'll also use illustrations (not just the hard-to-parse box-and-arrow diagrams people tend to use out there) and references to make it easy to digest even for the earliest learners.
As this is also proposed to be a practical series, I have to establish some base points: I'll use Python. Why? Simple: because it is the best language out there for NLP. Want to debate that? Here's a short list of some Python NLP packages:
- spaCy (NLP pipeline)
- NLTK (NLP pipeline and corpora)
- AllenNLP (NLP pipeline)
- Hugging Face Transformers (a Machine Learning toolkit, but tuned towards NLP)
- Gensim (embeddings)
- Rasa (NLU and chatbots)
Not to mention the many NLP preprocessing tools in most Machine Learning toolkits and the easy-to-use APIs for tools written in other languages (such as Stanford CoreNLP). Now, can any other language's packages beat that? Also, Python is easy and fun to learn. If you need a basic introduction, see the amazing free interactive Python course available at Kaggle.
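To give a flavor of the kind of work these toolkits automate, here is a naive, hand-rolled tokenizer in plain Python. This is my own toy sketch, not the API of any of the libraries above; spaCy or NLTK do the same job far more robustly.

```python
import re

def naive_tokenize(text):
    # Grab either a run of word characters or a single
    # punctuation mark; whitespace is silently skipped.
    return re.findall(r"\w+|[^\w\s]", text)

print(naive_tokenize("Computers can't really read... yet!"))
# ['Computers', 'can', "'", 't', 'really', 'read', '.', '.', '.', 'yet', '!']
```

Even this tiny example hints at the hard questions real tokenizers face: should "can't" be one token, two, or three? Should "..." stay together? That is exactly the kind of layer this series will peel.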
Finally, since this is an overview article, I'll propose an index on which I'm going to build this series. As it is a long list of topics, some may take a while to be released, since I'm not a specialist in many of them (so I'll have to study them to do a good presentation). So, here it goes (please notice that Theory topics are not only theory and Practice topics are not only practical):
- [Theory] An Introduction to NLP — explanation and examples.
- [Theory] NLP Preprocessing Pipeline — what, when, why?
- [Practice] Tokenization (Building a Tokenizer).
- [Practice] Stemming (Building a Stemmer).
- [Practice] Part Of Speech Tagging — what, when, why, how?
- [Practice] Lemmatization (Building a Lemmatizer).
- [Theory] Machine Learning for NLP — what, when, why?
- [Theory] NLP Model Based Approaches — what, when, why?
- [Practice] Pattern Extraction with POS Tagging.
- [Theory] Named Entity Recognition (NER) — what, when, why?
- [Practice] Training a simple NER algorithm.
- [Practice] Training a simple Sentiment Analysis algorithm.
- [Theory and Practice] From Bag of Words to Word Embeddings.
- [Practice] Training a simple Sentiment Analysis algorithm with Word Embeddings.
- [Theory] Seq2Seq Models — what, when, why?
- [Practice] Building a simple Seq2Seq translator.
- [Theory] Attention Mechanism — what, when, why?
- [Practice] Building a simple Transformer translator with Attention.
- [Theory] Natural Language Understanding — what, when, why?
- [Theory] Model Based Approaches — semantics, pragmatics and graphs.
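As a tiny preview of the Practice entries above (such as "Building a Stemmer"), here is a naive suffix-stripping stemmer in plain Python. It is a toy illustration of the idea only, nowhere near a real Porter or Snowball stemmer, and every rule in it is my own simplification.

```python
def naive_stem(word):
    # Try to strip a few common English suffixes, longest first.
    # The length check avoids mangling very short words.
    for suffix in ("ation", "ing", "ness", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

print([naive_stem(w) for w in ["tokenization", "running", "parsed", "cats"]])
# ['tokeniz', 'runn', 'pars', 'cat']
```

Notice that the stems are not real words ("runn", "tokeniz"); that is fine for a stemmer, whose job is grouping related word forms, and it is exactly what distinguishes stemming from the lemmatization entry later in the index.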