Hacker News
KVarN: Native vLLM backend for KV-cache quantization by Huawei
(github.com)
51 points by theanonymousone 2 hours ago | 7 comments