EADST

Qwen-7B to LLaMA GPTQ Model Structure

The following outlines the structure of the Qwen-7B model after conversion to the LLaMA layout and GPTQ quantization, listing the shape of every tensor in each layer and component:


## GPTQ Model Structure

The GPTQ model consists of the following layers and components.

### Embedding Layer

- `model.embed_tokens.weight`: torch.Size([151851, 4096])

### Layers 0 to 31

Each layer (`model.layers.[0-31]`) includes:

- `input_layernorm.weight`: torch.Size([4096])
- Self-attention sublayer:
  - `k_proj`:
    - `qweight`: torch.Size([512, 4096])
    - `qzeros`: torch.Size([32, 512])
    - `scales`: torch.Size([32, 4096])
    - `g_idx`: torch.Size([4096])
    - `bias`: torch.Size([4096])
  - `o_proj`:
    - `qweight`: torch.Size([512, 4096])
    - `qzeros`: torch.Size([32, 512])
    - `scales`: torch.Size([32, 4096])
    - `g_idx`: torch.Size([4096])
    - `bias`: torch.Size([4096])
  - `q_proj`:
    - `qweight`: torch.Size([512, 4096])
    - `qzeros`: torch.Size([32, 512])
    - `scales`: torch.Size([32, 4096])
    - `g_idx`: torch.Size([4096])
    - `bias`: torch.Size([4096])
  - `v_proj`:
    - `qweight`: torch.Size([512, 4096])
    - `qzeros`: torch.Size([32, 512])
    - `scales`: torch.Size([32, 4096])
    - `g_idx`: torch.Size([4096])
    - `bias`: torch.Size([4096])
- MLP (multi-layer perceptron) sublayer:
  - `down_proj`:
    - `qweight`: torch.Size([1376, 4096])
    - `qzeros`: torch.Size([86, 512])
    - `scales`: torch.Size([86, 4096])
    - `g_idx`: torch.Size([11008])
    - `bias`: torch.Size([4096])
  - `gate_proj`:
    - `qweight`: torch.Size([512, 11008])
    - `qzeros`: torch.Size([32, 1376])
    - `scales`: torch.Size([32, 11008])
    - `g_idx`: torch.Size([4096])
    - `bias`: torch.Size([11008])
  - `up_proj`:
    - `qweight`: torch.Size([512, 11008])
    - `qzeros`: torch.Size([32, 1376])
    - `scales`: torch.Size([32, 11008])
    - `g_idx`: torch.Size([4096])
    - `bias`: torch.Size([11008])
- `post_attention_layernorm.weight`: torch.Size([4096])
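The shapes above follow directly from 4-bit GPTQ packing: `qweight` packs eight 4-bit values into each int32 along the input dimension, `qzeros` packs them along the output dimension, and `qzeros`/`scales` have one row per quantization group. A small sketch (assuming 4-bit quantization and group size 128, as the listed sizes imply) that reproduces every shape in the table:

```python
def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    """Expected GPTQ tensor shapes for a quantized Linear(in_features, out_features)."""
    pack = 32 // bits                  # 4-bit values packed into each int32
    groups = in_features // group_size # one scale/zero row per group
    return {
        "qweight": (in_features // pack, out_features),
        "qzeros":  (groups, out_features // pack),
        "scales":  (groups, out_features),
        "g_idx":   (in_features,),     # group index for each input channel
        "bias":    (out_features,),
    }

# Attention projections: 4096 -> 4096
print(gptq_shapes(4096, 4096))
# down_proj: 11008 -> 4096
print(gptq_shapes(11008, 4096))
# gate_proj / up_proj: 4096 -> 11008
print(gptq_shapes(4096, 11008))
```

For example, `down_proj` maps 11008 inputs to 4096 outputs, so `qweight` has 11008/8 = 1376 rows and `qzeros` has 11008/128 = 86 rows, matching the listing.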

### Final Layer Normalization and Output

- `model.norm.weight`: torch.Size([4096])
- `lm_head.weight`: torch.Size([151851, 4096])
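To see why one int32 in `qweight` stands for eight weights, here is a minimal pure-Python sketch of 4-bit nibble packing and the corresponding per-group dequantization (schematically `w = scale * (q - zero)`). This illustrates the storage scheme implied by the shapes above, not the kernel a GPTQ runtime actually executes:

```python
def pack4(vals):
    """Pack eight 4-bit integers (0..15) into one int32, lowest nibble first."""
    assert len(vals) == 8 and all(0 <= v < 16 for v in vals)
    packed = 0
    for i, v in enumerate(vals):
        packed |= v << (4 * i)
    return packed

def unpack4(packed):
    """Recover the eight 4-bit integers from a packed int32."""
    return [(packed >> (4 * i)) & 0xF for i in range(8)]

def dequant(q, zero, scale):
    """Dequantize a single 4-bit value with its group's zero point and scale."""
    return scale * (q - zero)

qs = [3, 15, 0, 7, 8, 1, 12, 5]
packed = pack4(qs)
assert unpack4(packed) == qs  # round-trips exactly
print(dequant(12, 8, 0.01))
```

This packing is why the `qweight` row count is `in_features / 8` and the `qzeros` column count is `out_features / 8`, while `scales` stays unpacked in FP16 with one row per group of 128 input channels.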