EADST

Quick Review: ZeroQuant-FP

ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats

Highlights:

  • FP4 Weight Quantization: Implements 4-bit floating-point (FP4) quantization for model weights.
  • FP8 Activation Quantization: Uses 8-bit floating-point (FP8) quantization for activations, balancing inference efficiency against numerical precision.
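
To make the two highlights above concrete, here is a minimal NumPy sketch of simulated ("fake") W4A8 floating-point quantization. It is not the paper's implementation: it uses plain round-to-nearest, and the specific formats (E2M1 for FP4, E4M3 for FP8), the per-row weight scaling, and the function names `quantize_fp4` and `quantize_fp8_e4m3` are assumptions chosen for illustration.

```python
import numpy as np

# Non-negative magnitudes representable in FP4 with 2 exponent bits and
# 1 mantissa bit (E2M1). Choosing this FP4 layout is an assumption.
FP4_E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])


def quantize_fp4(weights: np.ndarray) -> np.ndarray:
    """Simulate per-row FP4 quantization of a 2-D weight matrix.

    Returns the dequantized weights (same shape and dtype family as the input).
    """
    # One scale per row so the largest magnitude maps to the grid maximum (6.0).
    max_abs = np.abs(weights).max(axis=1, keepdims=True)
    scale = np.where(max_abs > 0, max_abs / FP4_E2M1_GRID[-1], 1.0)
    scaled = weights / scale
    # Snap each magnitude to the nearest grid point, then restore sign and scale.
    idx = np.abs(np.abs(scaled)[..., None] - FP4_E2M1_GRID).argmin(axis=-1)
    return np.sign(scaled) * FP4_E2M1_GRID[idx] * scale


def quantize_fp8_e4m3(x: np.ndarray) -> np.ndarray:
    """Simulate rounding to FP8 E4M3 (assumed format, max finite value 448)."""
    clipped = np.clip(x, -448.0, 448.0)
    # frexp writes each value as m * 2**e with 0.5 <= |m| < 1.
    _, e = np.frexp(clipped)
    # Below the smallest normal (2**-6, i.e. e == -5) the spacing stays fixed,
    # which models the subnormal range.
    e = np.maximum(e, -5)
    # 3 stored mantissa bits plus the implicit leading bit give 4 significant
    # bits, so the spacing inside a binade is 2**(e - 4).
    step = np.ldexp(1.0, e - 4)
    return np.round(clipped / step) * step


# W4A8 in miniature: FP4 weights multiplied by FP8 activations.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
x = rng.normal(size=(8,))
y_ref = W @ x
y_w4a8 = quantize_fp4(W) @ quantize_fp8_e4m3(x)
print("max abs error:", np.abs(y_ref - y_w4a8).max())
```

The sketch keeps everything in ordinary floating point and only restricts values to the FP4 or FP8 grid, which is a common way to study quantization error before committing to real low-bit kernels.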