EADST

Save a Hugging Face Model as a Single Bin File

max_shard_size (int or str, optional, defaults to "10GB") — Only applicable for models. The maximum size for a checkpoint before being sharded. Checkpoints shard will then be each of size lower than this size. If expressed as a string, needs to be digits followed by a unit (like "5MB").

According to this parameter description, a model can be saved as a single bin file by setting "max_shard_size" to a value larger than the model's total size.
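To see why a larger limit yields a single file, here is a minimal sketch of size-based sharding. It is not the Transformers implementation: `parse_size`, `count_shards`, the unit table, and the greedy packing rule are all assumptions made for illustration, and the tensor sizes are hypothetical.

```python
import re

# Assumed unit table, modeled on the "5MB" style of string shown in the docs.
_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9,
          "KIB": 2**10, "MIB": 2**20, "GIB": 2**30}


def parse_size(size):
    """Convert an int or a string such as "5MB" into a number of bytes."""
    if isinstance(size, int):
        return size
    match = re.fullmatch(r"(\d+)\s*([A-Za-z]+)", size.strip())
    if match is None or match.group(2).upper() not in _UNITS:
        raise ValueError(f"unrecognized size: {size!r}")
    return int(match.group(1)) * _UNITS[match.group(2).upper()]


def count_shards(tensor_bytes, max_shard_size):
    """Pack tensors in order, opening a new shard when the limit would be exceeded."""
    limit = parse_size(max_shard_size)
    shards, current = 0, 0
    for nbytes in tensor_bytes:
        if current > 0 and current + nbytes > limit:
            shards += 1      # close the shard that is already full enough
            current = 0
        current += nbytes
    # Count the last, partially filled shard if it holds anything.
    return shards + (1 if current > 0 else 0)


# Hypothetical model: 8 tensors of 3 GB each (24 GB in total).
tensors = [3 * 10**9] * 8
print(count_shards(tensors, "10GB"))   # 3
print(count_shards(tensors, "100GB"))  # 1
```

With the limit above the total size, every tensor fits into the first shard, so only one file would be written.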

base_model.save_pretrained(output_dir, max_shard_size="100GB")  # base_model is a loaded LlamaForCausalLM; writes a single weights file if the model is smaller than 100GB
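After saving, it can be useful to confirm that the output directory really holds a single weights file. The sketch below is one way to check, using only the standard library. The helper names are hypothetical, and it assumes the directory contains nothing but the output of the save call (other ".bin" files, such as saved training arguments, would be counted too). Depending on the Transformers version and the `safe_serialization` argument, the weights file may be `pytorch_model.bin` or `model.safetensors`, so both extensions are accepted; sharded checkpoints additionally come with an index file whose name ends in ".index.json".

```python
import tempfile
from pathlib import Path


def list_weight_files(output_dir):
    """Return the sorted names of .bin and .safetensors files in output_dir."""
    folder = Path(output_dir)
    return sorted(p.name for p in folder.iterdir()
                  if p.suffix in {".bin", ".safetensors"})


def is_single_file_checkpoint(output_dir):
    """True when exactly one weights file exists and no shard index is present."""
    has_index = any(Path(output_dir).glob("*.index.json"))
    return len(list_weight_files(output_dir)) == 1 and not has_index


# Demo on a temporary directory with empty placeholder files (no real weights).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "config.json").touch()
    Path(tmp, "pytorch_model.bin").touch()
    print(list_weight_files(tmp))          # ['pytorch_model.bin']
    print(is_single_file_checkpoint(tmp))  # True
```

On a real checkpoint, pass the same `output_dir` that was given to `save_pretrained`.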

Reference

PreTrainedModel

About Me
XD
Goals determine what you are going to be.