Pipeline

M.Sc course, University of Debrecen, Department of Data Science and Visualization, 2025

What is NLP?

  • NLP (Natural Language Processing) is a field of linguistics and machine learning that focuses on understanding human language.
  • NLP tasks aim to understand not only individual words but also the context in which they appear.
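
To see why context matters, here is a minimal sketch that scores the sentiment of a sentence twice: once word by word, and once taking the preceding word into account. The lexicon, the negation list and both function names are hypothetical and chosen only for illustration; real NLP models learn context rather than relying on hand-written rules like these.

```python
# Toy sentiment lexicon and negation list (hypothetical, for illustration only).
LEXICON = {"good": 1, "great": 1, "bad": -1, "boring": -1}
NEGATIONS = {"not", "never", "no"}


def score_words_only(sentence: str) -> int:
    """Sum per-word scores, ignoring word order and context."""
    return sum(LEXICON.get(word, 0) for word in sentence.lower().split())


def score_with_context(sentence: str) -> int:
    """Same as above, but flip a word's score when the previous word negates it."""
    words = sentence.lower().split()
    total = 0
    for i, word in enumerate(words):
        value = LEXICON.get(word, 0)
        if i > 0 and words[i - 1] in NEGATIONS:
            value = -value
        total += value
    return total


sentence = "The movie was not good"
print(score_words_only(sentence))    # 1: "good" is counted as positive
print(score_with_context(sentence))  # -1: "not good" is counted as negative
```

Looking at words individually labels the sentence as positive, while even this very crude notion of context gets the intended negative reading.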

Most common NLP tasks

  • Classifying whole sentences:
    • Sentiment (mood) analysis
    • Categorization, for example deciding whether an email is spam
    • Deciding whether a sentence is grammatically correct
    • Deciding whether two sentences are logically related
  • Classifying each word in a sentence: identifying its part of speech (noun, verb, adjective) or the named entities (person, place, organization).
  • Generating textual content:
    • Completing a prompt with automatically generated text
    • Filling in blanks (masked words) in a text
  • Extracting an answer from a text: given a question and a context, find the answer to the question in the information the context provides.
  • Generating a new sentence from an input text: translating text into another language, summarising text.
  • Multimodal and complex solutions
    • Prompt engineering
    • Image generation
    • Chat
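
Most of the text tasks above correspond to a task identifier in the Hugging Face transformers library. The sketch below assumes the course works with that library's pipeline function, which the list itself does not state; the dictionary keys and the helper classify_sentence are hypothetical names introduced for illustration. The helper is defined but not called, because calling it requires transformers to be installed and downloads a default model on first use.

```python
# Assumption: task identifiers from the Hugging Face `transformers` pipeline API.
# The keys are informal labels for the tasks in the list above (hypothetical names).
TASK_TO_PIPELINE = {
    "sentence classification": "text-classification",
    "word classification": "token-classification",
    "prompt completion": "text-generation",
    "masked word filling": "fill-mask",
    "question answering": "question-answering",
    "translation (English to French)": "translation_en_to_fr",
    "summarisation": "summarization",
}


def classify_sentence(text: str):
    """Classify one sentence with the library's default text-classification model."""
    # Imported lazily so the mapping above can be inspected without the library.
    from transformers import pipeline

    classifier = pipeline(TASK_TO_PIPELINE["sentence classification"])
    return classifier(text)


for task, identifier in TASK_TO_PIPELINE.items():
    print(f"{task:32s} -> {identifier}")
```

With transformers installed, classify_sentence("I really enjoyed this course.") would return a list of label and score dictionaries; the exact labels depend on the default model the library selects.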

NLP is not limited to written text, however. It also tackles complex challenges in speech recognition and computer vision, such as generating a transcript of an audio sample or a description of an image.

Most important NLP and AI packages!