
# Qwen-7B to LLaMA GPTQ Model Structure

Here is the GPTQ model structure in Markdown, detailing each layer and component:


## GPTQ Model Structure

The GPTQ model consists of the following layers and components:

### Embedding Layer

- `model.embed_tokens.weight`: torch.Size([151851, 4096])
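For scale: the embedding table alone is vocab_size × hidden_size parameters. A quick back-of-the-envelope check (assuming the embedding is stored in FP16, as is typical for GPTQ checkpoints, which quantize only the linear projections):

```python
# Shapes taken from model.embed_tokens.weight above.
vocab_size, hidden_size = 151851, 4096

params = vocab_size * hidden_size    # number of embedding parameters
fp16_bytes = params * 2              # 2 bytes per FP16 value
print(params)                        # 621981696
print(round(fp16_bytes / 2**20))     # about 1186 MiB
```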

### Layers

Each of the 32 layers has the same components.

#### Layer 0 to Layer 31

Each layer (`model.layers.[0-31]`) includes:

- `input_layernorm.weight`: torch.Size([4096])
- Self-Attention Sublayer:
  - `k_proj`:
    - `qweight`: torch.Size([512, 4096])
    - `qzeros`: torch.Size([32, 512])
    - `scales`: torch.Size([32, 4096])
    - `g_idx`: torch.Size([4096])
    - `bias`: torch.Size([4096])
  - `o_proj`:
    - `qweight`: torch.Size([512, 4096])
    - `qzeros`: torch.Size([32, 512])
    - `scales`: torch.Size([32, 4096])
    - `g_idx`: torch.Size([4096])
    - `bias`: torch.Size([4096])
  - `q_proj`:
    - `qweight`: torch.Size([512, 4096])
    - `qzeros`: torch.Size([32, 512])
    - `scales`: torch.Size([32, 4096])
    - `g_idx`: torch.Size([4096])
    - `bias`: torch.Size([4096])
  - `v_proj`:
    - `qweight`: torch.Size([512, 4096])
    - `qzeros`: torch.Size([32, 512])
    - `scales`: torch.Size([32, 4096])
    - `g_idx`: torch.Size([4096])
    - `bias`: torch.Size([4096])
- MLP (Multi-Layer Perceptron) Sublayer:
  - `down_proj`:
    - `qweight`: torch.Size([1376, 4096])
    - `qzeros`: torch.Size([86, 512])
    - `scales`: torch.Size([86, 4096])
    - `g_idx`: torch.Size([11008])
    - `bias`: torch.Size([4096])
  - `gate_proj`:
    - `qweight`: torch.Size([512, 11008])
    - `qzeros`: torch.Size([32, 1376])
    - `scales`: torch.Size([32, 11008])
    - `g_idx`: torch.Size([4096])
    - `bias`: torch.Size([11008])
  - `up_proj`:
    - `qweight`: torch.Size([512, 11008])
    - `qzeros`: torch.Size([32, 1376])
    - `scales`: torch.Size([32, 11008])
    - `g_idx`: torch.Size([4096])
    - `bias`: torch.Size([11008])
- `post_attention_layernorm.weight`: torch.Size([4096])
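The shapes above follow directly from 4-bit GPTQ packing with group size 128 (inferred from the listed dimensions, not stated in the checkpoint): for a linear layer of `in_features × out_features`, `qweight` is `[in_features/8, out_features]` (eight 4-bit values packed per int32), `qzeros` is `[in_features/128, out_features/8]`, `scales` is `[in_features/128, out_features]`, and `g_idx` is `[in_features]`. A small sketch that reproduces the listed shapes:

```python
def gptq_shapes(in_features, out_features, bits=4, group_size=128):
    """Derive the packed GPTQ tensor shapes for one quantized linear layer."""
    pack = 32 // bits                 # 4-bit values per int32 word -> 8
    groups = in_features // group_size
    return {
        "qweight": (in_features // pack, out_features),
        "qzeros": (groups, out_features // pack),
        "scales": (groups, out_features),
        "g_idx": (in_features,),
        "bias": (out_features,),
    }

# Attention projections: 4096 -> 4096
print(gptq_shapes(4096, 4096)["qweight"])   # (512, 4096)
# MLP down_proj: 11008 -> 4096
print(gptq_shapes(11008, 4096)["qzeros"])   # (86, 512)
# MLP gate_proj / up_proj: 4096 -> 11008
print(gptq_shapes(4096, 11008)["scales"])   # (32, 11008)
```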

### Final Layer Normalization and Output

- `model.norm.weight`: torch.Size([4096])
- `lm_head.weight`: torch.Size([151851, 4096])
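Why `qweight`'s first dimension is `in_features / 8`: GPTQ stores eight 4-bit quantized weights in every int32 word. A minimal pure-Python sketch of the pack/unpack step (illustrative only; real kernels do this over whole tensors at once):

```python
def pack_int4(values):
    """Pack eight 4-bit integers (0..15) into one 32-bit word, low nibble first."""
    assert len(values) == 8 and all(0 <= v < 16 for v in values)
    word = 0
    for i, v in enumerate(values):
        word |= v << (4 * i)
    return word

def unpack_int4(word):
    """Recover the eight 4-bit integers from a packed 32-bit word."""
    return [(word >> (4 * i)) & 0xF for i in range(8)]

nibbles = [3, 7, 0, 15, 1, 9, 2, 4]
assert unpack_int4(pack_int4(nibbles)) == nibbles
# 4096 input features / 8 values per word = 512 rows, matching qweight above.
```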