Hacker News
Lost in Backpropagation: The LM Head Is a Gradient Bottleneck (arxiv.org)
4 points by famouswaffles 1 day ago