Quick Review: QUIK: Towards End-to-end 4-Bit Inference on Generative Large Language Models
Author: XD / Published: December 7, 2023, 00:06 / Research & Study / Views: 1838
QUIK: Towards End-to-end 4-Bit Inference on Generative Large Language Models
Paper: https://arxiv.org/abs/2310.09259
Code: https://github.com/IST-DASLab/QUIK
Organization: ETH Zurich, IST Austria, and collaborators