EADST

Quick Review: ZeroQuant-FP

ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats

Highlights:

  • FP4 Weight Quantization: Quantizes model weights to a 4-bit floating-point (FP4) format.
  • FP8 Activation Quantization: Quantizes activations to an 8-bit floating-point (FP8) format, balancing inference efficiency against numerical precision.
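
To make the W4A8 idea concrete, here is a minimal NumPy sketch of simulated ("fake") floating-point quantization. It is not the paper's implementation. The bit layouts (E2M1 for FP4, E4M3 for FP8), the per-row absmax scaling, and all names (`round_to_minifloat`, `fake_quant`, `FP4_E2M1`, `FP8_E4M3`) are assumptions chosen for illustration.

```python
import numpy as np

# Assumed formats (illustrative, not confirmed by the post):
# FP4 as E2M1 (largest value 6), FP8 as E4M3 (largest value 448).
FP4_E2M1 = dict(man_bits=1, min_exp=0, max_val=6.0)
FP8_E4M3 = dict(man_bits=3, min_exp=-6, max_val=448.0)


def round_to_minifloat(x, man_bits, min_exp, max_val):
    """Round x to the nearest value representable in a small float format.

    The result is still stored as float64; only the set of values is restricted.
    """
    x = np.asarray(x, dtype=np.float64)
    mag = np.minimum(np.abs(x), max_val)      # saturate instead of overflowing
    _, e = np.frexp(mag)                      # mag = m * 2**e with m in [0.5, 1)
    exp = np.maximum(e - 1, min_exp)          # subnormals share the smallest exponent
    step = 2.0 ** (exp - man_bits)            # spacing between neighbours in this binade
    q = np.round(mag / step) * step           # np.round rounds ties to even
    return np.sign(x) * np.minimum(q, max_val)


def fake_quant(x, fmt, axis=-1):
    """Scale each slice along `axis` to the format's range, round, and rescale."""
    absmax = np.max(np.abs(x), axis=axis, keepdims=True)
    scale = np.where(absmax > 0, absmax / fmt["max_val"], 1.0)
    return round_to_minifloat(x / scale, **fmt) * scale


# Simulated W4A8 linear layer: FP4 weights, FP8 activations.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16))                  # (out_features, in_features)
a = rng.normal(size=(4, 16))                  # (tokens, in_features)

w_q = fake_quant(w, FP4_E2M1)                 # one scale per output row
a_q = fake_quant(a, FP8_E4M3)                 # one scale per token

y_ref = a @ w.T
y_q = a_q @ w_q.T
print("relative error:", np.linalg.norm(y_q - y_ref) / np.linalg.norm(y_ref))
```

The sketch only restricts values to the low-bit grid and keeps the arithmetic in float64, so it shows the accuracy effect of quantization, not any speed or memory benefit.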