Hacker News
DSpark: Speculative decoding accelerates LLM inference [pdf] (github.com)
646 points by aurenvale 8 hours ago | 243 comments