Quick Review: Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Author: XD / Published: December 6, 2023, 23:51 / Updated: December 7, 2023, 00:55 / Research & Study / Views: 1832

Optimize Weight Rounding via Signed Gradient Descent for the Quantization of Large Language Models

Paper: Optimize Weight Rounding on arXiv
Code: Intel Neural Compressor on GitHub
Organization: Intel

Key Feature:

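Adaptive Weight Rounding: Instead of always rounding each weight to the nearest integer, the method uses gradient-based (backward) optimization, specifically signed gradient descent as the title indicates, to decide whether each quantized integer value is rounded up or down, with the aim of reducing the accuracy lost to quantization.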
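To make the idea concrete, here is a minimal NumPy sketch of learning a per-weight rounding offset with signed gradient descent. It is an illustration only, not the authors' implementation or the Intel Neural Compressor API. The function names (`fake_quantize`, `learn_rounding`), the symmetric per-row scale, the squared output-error loss on a single linear layer, the straight-through gradient, and the linearly decaying step size are all assumptions chosen to keep the example small.

```python
import numpy as np

def fake_quantize(W, scale, V, qmin, qmax):
    """Return (integer codes, dequantized weights) for a rounding offset V."""
    q = np.clip(np.round(W / scale + V), qmin, qmax)
    return q, q * scale

def learn_rounding(W, X, bits=4, steps=200):
    """Hypothetical helper: tune V in [-0.5, 0.5] so that X @ Wq.T tracks X @ W.T."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    # Assumed scheme: symmetric scale per output row.
    scale = np.maximum(np.abs(W).max(axis=1, keepdims=True), 1e-12) / qmax
    H = X.T @ X                      # reused in every gradient computation
    V = np.zeros_like(W)             # V = 0 is plain round-to-nearest
    best_V, best_loss, rtn_loss = V.copy(), np.inf, None
    for step in range(steps):
        _, Wq = fake_quantize(W, scale, V, qmin, qmax)
        err = Wq - W
        loss = float(np.sum((X @ err.T) ** 2))   # layer output error
        if step == 0:
            rtn_loss = loss                      # round-to-nearest baseline
        if loss < best_loss:
            best_loss, best_V = loss, V.copy()   # keep the best offset seen
        # Straight-through estimator: treat round/clip as identity, so
        # dLoss/dV = dLoss/dWq * scale.
        grad_V = 2.0 * (err @ H) * scale
        lr = (1.0 / steps) * (1.0 - step / steps)  # assumed linear decay
        # Signed update: only the sign of the gradient is used.
        V = np.clip(V - lr * np.sign(grad_V), -0.5, 0.5)
    return best_V, scale, rtn_loss, best_loss

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))         # toy weight matrix (out_features, in_features)
X = rng.normal(size=(64, 16))        # toy calibration inputs
V, scale, rtn_loss, best_loss = learn_rounding(W, X)
print(f"round-to-nearest loss: {rtn_loss:.4f}, learned-rounding loss: {best_loss:.4f}")
```

Because the offset is confined to [-0.5, 0.5], each integer code can move by at most one step from its round-to-nearest value, which matches the "round up or round down" description above. Keeping the best offset seen means the sketch never returns something worse than plain rounding on the calibration data.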

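Author: XD. Please credit the source when reposting: http://www.eadst.com/blog/224

This site is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.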
