← Library · Definition

Reinforcement Learning from Human Feedback (RLHF)

RLHF is a training method where a model learns to perform tasks better by getting feedback from humans. Instead of just learning from predefined data, the model tries different actions, and humans rate which outcomes are best, guiding the model to improve its behavior over time through a reward system.

Learn one new AI thing every day.

Daily Deck sends you seven plain-English cards like this every morning. Free.

Start free