EADST

Quick Review: ZeroQuant-FP

ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats

Highlights:

  • FP4 Weight Quantization: Implements 4-bit floating-point (FP4) quantization for model weights.
  • FP8 Activation Quantization: Uses 8-bit floating-point (FP8) quantization for activations, aiming to balance inference efficiency against accuracy.
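
To make the W4A8 idea concrete, here is a minimal NumPy sketch of simulated ("fake") floating-point quantization. It is not the paper's implementation. The bit layouts (E2M1 for FP4, E4M3 with bias 7 and a maximum of 448 for FP8), the per-row absmax scaling, and the helper names `e4m3_grid` and `fake_quant` are all assumptions made for illustration.

```python
import numpy as np

# Assumed FP4 layout: 1 sign, 2 exponent, 1 mantissa bit (E2M1).
# These are the non-negative magnitudes that layout can represent.
FP4_E2M1 = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])


def e4m3_grid():
    """Non-negative magnitudes of an assumed E4M3 layout (bias 7, max 448)."""
    subnormal = [m / 8 * 2.0 ** -6 for m in range(8)]
    normal = [(1 + m / 8) * 2.0 ** (e - 7) for e in range(1, 16) for m in range(8)]
    grid = np.array(subnormal + normal)
    # The largest code (480) is dropped, since that bit pattern is assumed to encode NaN.
    return grid[grid <= 448.0]


def fake_quant(x, grid, axis=-1):
    """Round x to the nearest grid value after per-slice absmax scaling.

    Each slice along `axis` is scaled so its largest magnitude maps to the
    largest grid value, rounded to the nearest representable magnitude,
    then scaled back. The result stays in float64 (simulation only).
    """
    x = np.asarray(x, dtype=np.float64)
    amax = np.max(np.abs(x), axis=axis, keepdims=True)
    scale = np.where(amax > 0, amax / grid[-1], 1.0)  # avoid dividing by zero
    mag = np.abs(x) / scale
    idx = np.abs(mag[..., None] - grid).argmin(axis=-1)  # nearest grid index
    return np.sign(x) * grid[idx] * scale


rng = np.random.default_rng(0)
W = rng.normal(size=(4, 16))  # toy weights: 4 output rows, 16 inputs
X = rng.normal(size=(2, 16))  # toy activations: 2 tokens

W_q = fake_quant(W, FP4_E2M1)     # FP4 weights, one scale per output row
X_q = fake_quant(X, e4m3_grid())  # FP8 activations, one scale per token

err = np.abs(X @ W.T - X_q @ W_q.T).max()
print("max abs error of the W4A8 matmul:", err)
```

Because the grid is denser near zero, small values are rounded more finely than large ones, which is the usual motivation for floating-point over integer formats when values cluster near zero with a few outliers.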