NLP 101 - glossary for recruiters or sourcers

Natural language processing (NLP) is a subfield of artificial intelligence (AI) that involves the development of algorithms and systems that can understand and generate human language. As a recruiter, it is important to understand the key terms and concepts in NLP in order to identify the right candidates for NLP roles.

Even after you identify a candidate and have a phone screen with them, it is often difficult to ask intelligent questions about the work they did in the field. Have you ever sat down to write a submission to the hiring manager, looked over your notes, and realized you don't have a clue what to say? 🤦🏻 If so, read on so you can have a fluent conversation with NLP engineers in the future 😊

Here is a list of the top 15 terms that you may encounter when trying to recruit NLP engineers, along with brief descriptions that would be understandable by someone in recruiting who is not an expert in NLP:

  1. Tokenization: the process of dividing a text into smaller units called tokens, which can be words, pieces of words, or punctuation marks.
  2. Lemmatization: the process of reducing a word to its base form, known as the lemma, which is typically the form of the word found in a dictionary.
  3. Stemming: the process of trimming a word down to a rough root, known as the stem, by stripping endings such as "-ing" or "-ed" with simple rules. Unlike a lemma, the stem is not always a real dictionary word.
  4. Part-of-speech tagging: the process of assigning a part of speech (e.g. noun, verb, adjective) to each token in a text.
  5. Named entity recognition: the process of identifying and classifying named entities (e.g. people, organizations, locations) in a text.
  6. Sentiment analysis: the process of determining the sentiment (e.g. positive, negative, neutral) of a text.
  7. Topic modeling: the process of identifying the main topics in a text or collection of texts.
  8. Text classification: the process of assigning a label or category to a text based on its content.
  9. Language modeling: the process of building a statistical model of a language in order to predict the likelihood of a sequence of words.
  10. Machine translation: the process of automatically translating text from one language to another using software, today mostly with machine learning models.
  11. Information extraction: the process of extracting structured information from unstructured text.
  12. Text summarization: the process of generating a concise summary of a text or collection of texts.
  13. Text generation: the process of generating text that is coherent and human-like.
  14. Language identification: the process of identifying the language of a text.
  15. Text normalization: the process of transforming text into a standard form, for example by lowercasing it and stemming its words.
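
If you are curious what a few of these terms look like in practice, here is a minimal Python sketch of tokenization, text normalization, stemming, and a very simple language model. It is only an illustration: the function names (tokenize, normalize, toy_stem, bigram_probability) and the sample sentence are made up for this post, and the suffix rule is a toy, not a real stemming algorithm. Engineers would normally reach for a library such as NLTK or spaCy instead.

```python
import re
from collections import Counter


def tokenize(text):
    """Tokenization: split text into word tokens and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def normalize(tokens):
    """Text normalization: lowercase every token and drop punctuation."""
    return [t.lower() for t in tokens if t.isalnum()]


def toy_stem(word):
    """Stemming (toy rule, an assumption for illustration): strip one
    common English suffix as long as at least 3 letters remain."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def bigram_probability(tokens, first, second):
    """Language modeling (simplest form): estimate how likely `second`
    is to follow `first`, using counts of adjacent word pairs."""
    pairs = Counter(zip(tokens, tokens[1:]))
    firsts = Counter(tokens[:-1])
    if firsts[first] == 0:
        return 0.0
    return pairs[(first, second)] / firsts[first]


text = "The recruiter screened candidates. The recruiter called the candidates!"
tokens = tokenize(text)       # words and punctuation, original casing
words = normalize(tokens)     # lowercase words only
stems = [toy_stem(w) for w in words]

print(tokens)
print(words)
print(stems)
print(bigram_probability(words, "the", "recruiter"))
```

Notice that the toy stemmer turns "screened" into "screen" but "candidates" into "candidat", which is not a real word. That is the practical difference between stemming and lemmatization, and a good topic to ask a candidate about.
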

In the future, we might have systems such as ChatGPT talking directly to engineers. Until then, we hope you find this glossary useful!

About Rocket

Rocket pairs talented recruiters with advanced AI to help companies hit their hiring goals and knows technology recruiting inside out. Rocket is headquartered in the heart of Silicon Valley but has recruiters all over the US & Canada serving the needs of our growing client base across engineering, product management, data science and more.
