EADST

QWEN7B to LLAMA GPTQ model structure

GPTQ Model Structure

The GPTQ-quantized model consists of the following layers and components. Each entry gives a tensor name and its shape:

Embedding Layer

  • model.embed_tokens.weight: torch.Size([151851, 4096])

Layers

Layer 0 to Layer 31

Each of the 32 decoder layers (model.layers.[0-31]) includes:

  • input_layernorm.weight: torch.Size([4096])

  • Self-Attention Sublayer:

    • k_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • o_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • q_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

    • v_proj:

      • qweight: torch.Size([512, 4096])

      • qzeros: torch.Size([32, 512])

      • scales: torch.Size([32, 4096])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([4096])

  • MLP (Multi-Layer Perceptron) Sublayer:

    • down_proj:

      • qweight: torch.Size([1376, 4096])

      • qzeros: torch.Size([86, 512])

      • scales: torch.Size([86, 4096])

      • g_idx: torch.Size([11008])

      • bias: torch.Size([4096])

    • gate_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

    • up_proj:

      • qweight: torch.Size([512, 11008])

      • qzeros: torch.Size([32, 1376])

      • scales: torch.Size([32, 11008])

      • g_idx: torch.Size([4096])

      • bias: torch.Size([11008])

  • post_attention_layernorm.weight: torch.Size([4096])
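The packed shapes listed for each projection follow a regular pattern. The sketch below reproduces them from the input and output dimensions. It assumes 4-bit quantization with a group size of 128, packed into 32-bit integers; these settings are inferred from the shapes above rather than stated in the post, and `gptq_shapes` is a hypothetical helper, not part of any library.

```python
import math


def gptq_shapes(in_features, out_features, bits=4, group_size=128, pack_bits=32):
    """Return the expected tensor shapes of one GPTQ-quantized linear layer.

    bits=4 and group_size=128 are assumptions inferred from the listed shapes.
    """
    per_word = pack_bits // bits                    # quantized values per int32 word
    groups = math.ceil(in_features / group_size)    # one scale/zero row per group
    return {
        "qweight": (in_features // per_word, out_features),  # packed along the input dim
        "qzeros": (groups, out_features // per_word),        # packed along the output dim
        "scales": (groups, out_features),
        "g_idx": (in_features,),                             # group index of each input row
        "bias": (out_features,),
    }


# Attention projections map 4096 -> 4096; the MLP uses 4096 <-> 11008.
for name, (fan_in, fan_out) in {
    "q_proj": (4096, 4096),
    "gate_proj": (4096, 11008),
    "down_proj": (11008, 4096),
}.items():
    print(name, gptq_shapes(fan_in, fan_out))
```

Under these assumptions, `down_proj` has 11008 / 128 = 86 groups and 11008 / 8 = 1376 packed rows, which matches the listing.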

Final Layer Normalization and Output

  • model.norm.weight: torch.Size([4096])

  • lm_head.weight: torch.Size([151851, 4096])
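The structure above can also be written out as a full list of tensor names, which is handy when checking a converted checkpoint. This is a minimal sketch: the submodule names `self_attn` and `mlp` are assumed from the usual LLaMA naming and do not appear in the listing above, and `expected_keys` is a hypothetical helper.

```python
# Projection names grouped by sublayer; "self_attn" and "mlp" are assumed
# from the usual LLaMA layout, not taken from the listing.
PROJECTIONS = {
    "self_attn": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "mlp": ["gate_proj", "up_proj", "down_proj"],
}
# Tensors stored for every quantized projection.
QUANT_TENSORS = ["qweight", "qzeros", "scales", "g_idx", "bias"]


def expected_keys(num_layers=32):
    """Build the tensor names implied by the structure listed above."""
    keys = ["model.embed_tokens.weight"]
    for i in range(num_layers):
        prefix = f"model.layers.{i}"
        keys.append(f"{prefix}.input_layernorm.weight")
        for block, projections in PROJECTIONS.items():
            for proj in projections:
                for tensor in QUANT_TENSORS:
                    keys.append(f"{prefix}.{block}.{proj}.{tensor}")
        keys.append(f"{prefix}.post_attention_layernorm.weight")
    keys += ["model.norm.weight", "lm_head.weight"]
    return keys


keys = expected_keys()
print(len(keys))  # 3 + 32 * (2 + 7 * 5) = 1187
```

The resulting list can be compared with the key set of a loaded checkpoint to spot missing or unexpected tensors.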