The Hugging Face NLP Course
Written by Nikos Vaggalis   
Tuesday, 04 July 2023

A free, self-paced and comprehensive course that will take you from beginner to expert in the topic of Natural Language Processing comes from Hugging Face, a data science platform with a community of data scientists, researchers, and ML engineers who contribute to open source projects.

Natural Language Processing, or NLP, is a subfield of Artificial Intelligence that aims to make computers understand natural languages like English. So what? Why invest in learning NLP in the first place?

NLP tries to make sense of textual data, which is much more difficult than doing the same with numerical data. Applications of NLP are everywhere because people communicate almost everything in language: web search, advertising, emails, customer service, language translation, virtual agents, medical reports, etc. Many organizations are looking to integrate NLP into their workflows and into the products they provide, such as translation, speech recognition and chatbots. Sounds like a good career move.

Specifically, NLP is used today in products like:

  • voice-driven assistants
  • natural-language search
  • question answering
  • sentiment analysis for automated trading
  • business intelligence
  • social media analytics
  • content summarization

Well-known implementations include Amazon Alexa, Google Assistant, Cortana and Siri, to name just a few.

The course, then, offers a first-class opportunity to learn about natural language processing using the libraries from the Hugging Face ecosystem.

The class begins by introducing the Transformers library in Chapters 1 to 4, explaining how Transformer models work and how to use a model from the Hugging Face Hub. This includes a look at Encoder, Decoder and Sequence-to-sequence models, fine-tuning those models with the Trainer API or Keras, as well as an introduction to the Hugging Face Hub's pretrained models.

Chapters 5 to 7 teach the basics of Datasets and Tokenizers before diving into classic NLP tasks. This includes creating your own dataset, semantic search with FAISS, training a new tokenizer from an old one, as well as building a tokenizer block by block.
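To give a flavor of what "building a tokenizer block by block" means, here is a minimal sketch in plain Python. It is not the Hugging Face Tokenizers API and not code from the course; the function names and the tiny vocabulary are made up for illustration. It simply chains the four stages a tokenizer of this kind typically has: normalization, pre-tokenization, a subword model (here a greedy WordPiece-style split) and post-processing.

```python
import re
import unicodedata

# Hypothetical toy vocabulary; a real tokenizer learns its vocabulary from a corpus.
VOCAB = {"[CLS]", "[SEP]", "[UNK]", "hug", "##ging", "face", "course", "!"}


def normalize(text):
    """Normalization block: lowercase the text and strip accents."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def pre_tokenize(text):
    """Pre-tokenization block: split into words and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def wordpiece(word, vocab):
    """Model block: greedy longest-match-first subword split (WordPiece-style)."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            # Pieces that continue a word carry the "##" prefix.
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in vocab:
                break
            end -= 1
        if end == start:
            # No known piece matches here, so the whole word is unknown.
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


def post_process(tokens):
    """Post-processing block: wrap the sequence in special tokens."""
    return ["[CLS]"] + tokens + ["[SEP]"]


def tokenize(text, vocab=VOCAB):
    """Run the four blocks in order."""
    words = pre_tokenize(normalize(text))
    tokens = [piece for word in words for piece in wordpiece(word, vocab)]
    return post_process(tokens)


print(tokenize("Hugging Fàce course!"))
```

The course itself works with the real libraries, which add trained vocabularies, offset tracking and fast implementations on top of this basic shape.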

Chapter 8 includes instructions on how to debug errors and how to ask for help when you need it.
The final chapter, Chapter 9, shows how to build interactive demos for your machine learning models.

The barrier to entry is low; you should be comfortable with Python and a bit of high school math. No previous knowledge of NLP or machine learning is assumed, but some familiarity with either PyTorch or TensorFlow is desirable.

Time-wise, although the course is self-paced, each chapter is designed to be completed in one week, assuming 6-8 hours of work per week.

The course can be taken from its official site, but it can also be consumed as a 79-item YouTube playlist. My recommendation is to follow the official site, as it's better organized and each section links to its code, which can be run in either Google Colab or Amazon SageMaker Studio Lab.


More Information

The Hugging Face NLP Course - Official

YouTube playlist

Forum to ask for help

Related Articles

Take Stanford's Natural Language Processing with Deep Learning For Free

IBM Releases Deep Search For Scientific Discovery

