Concept

Reinforcement learning from human feedback