Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, & Ryan Lowe (2022)
Advances in Neural Information Processing Systems, 35, 27730-27744.
Abstract. Ouyang et al.'s OpenAI paper on reinforcement learning from human feedback (RLHF) is the foundational technical reference for how modern LLMs are aligned to human preferences. It is increasingly cited in HCI work addressing what RLHF-based alignment means for UX.
Tags: ai-usability rlhf alignment llm foundational