4 - Text Classification

Qianqian included in Hands-on Large Language Models

2025-09-17 About 400 words 2 minutes

Contents

Classifying text is used across a wide range of applications, from sentiment analysis and intent detection to extract entities and detect language. This chapter discusses different ways to use language models for classifying text such as using representation models or generative models.

Text Classification with Representation Models

Two types of classification with representation models based on fine-tuned foundation model:

Task-specific model: a representation model trained for a specific task such as sentiment analysis.
Embedding model: a representation model that generates general-purpose embeddings that can be used for a variety of tasks such as classification and semantic search.

Model Selection

Factors that should take into consideration when selecting proper model:

Choose the model that fits your use case
Consider the language compatibility
The underlying architecture: encoder-only (e.g., BERT) or decoder-only (e.g, GPT)
Model size: encoder-only models tend to be smaller in size
Performance
Inference speed

Classification Tasks That Leverage Embeddings

Steps:

Use embedding model to generate features
Feed the features into a classifier for classification

What If We Do Not Have Labeled Data?

For cases when we have no labeled data, we can use zero-shot classification to predict the labels of the input text, and this will also help us to test if it is worthwhile to collect the labels, which is usually resource-intensive.

Describe the labels based on what they should represent (e.g., a negative label for movie reviews can be described as “this is a negative movie review”)
Create label embeddings based on the description (encoding)
Assign labels to documents (input) by comparing the similarity between the document and the labed encodings

Text Classification with Generative Models

How is it different with the classification of the representation models? As shown in the diagram, a task-specific model (with representations) generate numerical values from tokens, while the generative model generates sequence of tokens that contain the classification results via prompt (or instruction) to guide the process.

compare representation and generative models for classfication — Fig. Compare representation (sequence-to-value) and generative models (sequence-to-sequence) for classfication.