$\color{black}\rule{365px}{3px}$
“Language Models are Unsupervised Multitask Learners” - 2019
Link to Paper: cdn.openai.com
Table of Contents
1. Introduction
$\color{black}\rule{365px}{3px}$
Motivation
- From “Narrow Experts” to “Competent Generalists”
- A single model that can generalize to multiple tasks without task-specific tuning.
(GPT-1, by contrast, still required some task-specific fine-tuning.)
- Zero-Shot Capabilities
- Unlike most earlier approaches, which required fine-tuning or specialized architectures for each task, GPT-2 emphasizes zero-shot performance, i.e., applying the same model directly to new tasks without fine-tuning.
Contributions
Language models can perform downstream tasks in a zero-shot setting, without any parameter or architecture modification.
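
As a concrete illustration of this zero-shot setup, the sketch below prompts a pretrained GPT-2 with the paper's "TL;DR:" summarization cue. It is a minimal sketch using the Hugging Face `transformers` library (not the paper's original code); the model checkpoint, sample article, and generation settings are illustrative assumptions, though the k=2 sampling follows the paper's summarization experiments.

```python
# Minimal sketch of zero-shot prompting with GPT-2, assuming the
# Hugging Face `transformers` package (not the paper's original code).
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# The paper induces summarization by appending "TL;DR:" to the article;
# no parameters are updated and no task-specific head is added.
article = "The tower is 324 metres tall, about the same height as an 81-storey building. ..."  # placeholder text
prompt = article + "\nTL;DR:"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    top_k=2,  # the paper samples with k=2 for TL;DR summarization
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens after the prompt.
summary = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:])
print(summary)
```

The point mirrors the contribution above: the summarization behavior is induced entirely by the prompt, with the model's weights and architecture left untouched.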