15: InstructGPT

Published on Mar 28
Argmax
<p>In this episode we discuss the paper &quot;Training language models to follow instructions with human feedback&quot; by Ouyang et al. (2022). We cover the reinforcement learning from human feedback (RLHF) paradigm and how important RL is to fine-tuning GPT.</p>