Hacker News
Reinforcement Learning from Human Feedback (rlhfbook.com)
95 points by onurkanbkrc 9 hours ago | 5 comments
https://arxiv.org/abs/2504.12501