Mixtral ggemm to hf format #230

Open · wants to merge 11 commits into base: main
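Judging by its title, this PR adds a path for converting Mixtral checkpoints that use Megatron-Core's grouped-GEMM expert layout back into the Hugging Face format. As a rough orientation only (not the code in this PR), such a conversion boils down to loading a merged Megatron state dict, renaming parameters to the HF Mixtral layout, and saving with `save_pretrained`. The sketch below is a minimal illustration under those assumptions; the Megatron key prefix, the checkpoint structure, and the skipped QKV/expert regrouping are placeholders rather than the converter actually added here.

```python
# Illustrative sketch only; the converter added by this PR may work differently.
# The Megatron-side key prefix ("decoder.layers.") is an assumption.
import torch
from transformers import MixtralConfig, MixtralForCausalLM


def convert_megatron_to_hf(megatron_ckpt: str, hf_out_dir: str) -> None:
    # 1. Load an already merged (non-sharded) Megatron-Core state dict.
    mg_state = torch.load(megatron_ckpt, map_location="cpu")["model"]

    # 2. Rename parameters to the HF Mixtral layout. A real converter must also
    #    split fused QKV weights and regroup grouped-GEMM expert weights into
    #    per-expert tensors; that logic is omitted here.
    hf_state = {
        name.replace("decoder.layers.", "model.layers."): tensor
        for name, tensor in mg_state.items()
    }

    # 3. Materialize an HF Mixtral model with a matching config and save it.
    config = MixtralConfig()  # in practice this must mirror the trained model
    model = MixtralForCausalLM(config)
    model.load_state_dict(hf_state, strict=False)  # strict=False: sketch-level mapping
    model.save_pretrained(hf_out_dir)
```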
16 changes: 8 additions & 8 deletions README.md
@@ -48,14 +48,14 @@ In the reinforcement learning section, the patch offers PPO training workflows,


## Technical Reports
-- [基于Megatron-Core的稀疏大模型训练工具:阿里云MoE大模型最佳实践](https://mp.weixin.qq.com/s/DkrWEEJ7IxirwWd3qB9Bng)
-- [Mixtral-8x7B在PAI灵骏的训练指南](https://help.aliyun.com/zh/pai/use-cases/train-fine-tune-and-deploy-mixtral-by-using-intelligent-computing-lingjun)
-- [通义千问开源模型在PAI灵骏的最佳实践](https://mp.weixin.qq.com/s?__biz=Mzg4MzgxNDk2OA==&mid=2247491796&idx=1&sn=dc1d719313d794ae1aacdb07669a9545&chksm=cf430783f8348e950218bfcff861a2e6d2d92705807bf5b04f6e9268cc510ffa6e6aa2c87327#rd)
-- [阿里云机器学习PAI开源AI大模型训练工具Pai-Megatron-Patch, 助力大模型技术落地](https://zhuanlan.zhihu.com/p/655942437)
-- [基于单机最高能效270亿参数GPT模型的文本生成与理解](https://zhuanlan.zhihu.com/p/597652820)
-- [中文稀疏GPT大模型落地 — 通往低成本&高性能多任务通用自然语言理解的关键里程碑](https://zhuanlan.zhihu.com/p/561320982)
-- [预训练知识度量比赛夺冠!阿里云PAI发布知识预训练工具](https://zhuanlan.zhihu.com/p/449487792)
-- [阿里云PAI获得FewCLUE基于大模型的小样本学习双料冠军](https://developer.aliyun.com/article/788081?spm=a2c6h.12873639.article-detail.17.11c5383cHpFZks&tlog=yuekan_8)
+- [Sparse Large Model Training Tool Based on Megatron-Core: Best Practices for Alibaba Cloud MoE Large Models](https://mp.weixin.qq.com/s/DkrWEEJ7IxirwWd3qB9Bng)
+- [Mixtral-8x7B Training Guide on PAI Lingjun](https://help.aliyun.com/zh/pai/use-cases/train-fine-tune-and-deploy-mixtral-by-using-intelligent-computing-lingjun)
+- [Best Practices for the Open Source Model Tongyi Qianwen on PAI Lingjun](https://mp.weixin.qq.com/s?__biz=Mzg4MzgxNDk2OA==&mid=2247491796&idx=1&sn=dc1d719313d794ae1aacdb07669a9545&chksm=cf430783f8348e950218bfcff861a2e6d2d92705807bf5b04f6e9268cc510ffa6e6aa2c87327#rd)
+- [Alibaba Cloud Machine Learning PAI Open Source AI Large Model Training Tool Pai-Megatron-Patch, Facilitating the Implementation of Large Model Technologies](https://zhuanlan.zhihu.com/p/655942437)
+- [Text Generation and Understanding with the Most Energy-Efficient 27-Billion-Parameter GPT Model on a Single Machine](https://zhuanlan.zhihu.com/p/597652820)
+- [Chinese Sparse GPT Large Model Implementation: A Key Milestone Towards Low-Cost, High-Performance Multitask General Natural Language Understanding](https://zhuanlan.zhihu.com/p/561320982)
+- [Alibaba Cloud PAI Releases Knowledge Pre-training Tool After Winning the Pre-training Knowledge Measurement Competition](https://zhuanlan.zhihu.com/p/449487792)
+- [Alibaba Cloud PAI Wins Dual Championships in FewCLUE Small-Sample Learning Based on Large Models](https://developer.aliyun.com/article/788081?spm=a2c6h.12873639.article-detail.17.11c5383cHpFZks&tlog=yuekan_8)


## Contact
318 changes: 163 additions & 155 deletions examples/llama3/README.md

Large diffs are not rendered by default.

341 changes: 175 additions & 166 deletions examples/mistral/README.md

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions toolkits/model_checkpoints_convertor/README.md
@@ -1,5 +1,5 @@
## hf-to-megatron
-hf-to-megatron是一款模型ckpt转换工具,方便用户低门槛的将huggingface版的ckpt转换到megatron格式,以使用megatron-lm的分布式能力训练LLM大模型。转换后的模型需配合PAI-Megatron-Patch代码库使用。目前已经支持下列模型:
+hf-to-megatron is a model checkpoint conversion tool designed to easily convert Hugging Face checkpoints to the Megatron format. This conversion allows users to leverage the distributed capabilities of Megatron-LM for training large language models (LLMs). The converted models must be used in conjunction with the PAI-Megatron-Patch codebase. The tool currently supports the following models:

+ bloom
+ llama/alpaca
@@ -10,4 +10,4 @@
+ falcon
+ starcoder

-相关转换后的模型存放在:oss://atp-modelzoo/release/models/pai-megatron-patch/
+The converted models are stored at: oss://atp-modelzoo/release/models/pai-megatron-patch/
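For context on what the hf-to-megatron direction described above involves: the heart of such a converter is splitting each Hugging Face weight across tensor-parallel ranks (column-parallel layers along the output dimension, row-parallel layers along the input dimension) and writing one Megatron-style checkpoint per rank. The snippet below is a generic sketch of that idea, not the repository's actual converter; the split rules, the `mp_rank_XX` directory layout, and the file name are assumptions for illustration.

```python
# Generic sketch of tensor-parallel sharding in an hf-to-megatron style converter.
# Split axes, directory layout, and file name below are assumptions, not the
# repository's actual code.
import os

import torch
from transformers import AutoModelForCausalLM


def shard_hf_checkpoint(hf_model: str, out_dir: str, tp_size: int = 2) -> None:
    state = AutoModelForCausalLM.from_pretrained(
        hf_model, torch_dtype=torch.float16
    ).state_dict()

    for rank in range(tp_size):
        shard = {}
        for name, tensor in state.items():
            if name.endswith("q_proj.weight"):      # column-parallel: split output dim
                shard[name] = torch.chunk(tensor, tp_size, dim=0)[rank].clone()
            elif name.endswith("o_proj.weight"):    # row-parallel: split input dim
                shard[name] = torch.chunk(tensor, tp_size, dim=1)[rank].clone()
            else:                                   # everything else kept replicated
                shard[name] = tensor
        rank_dir = os.path.join(out_dir, f"mp_rank_{rank:02d}")
        os.makedirs(rank_dir, exist_ok=True)
        torch.save({"model": shard}, os.path.join(rank_dir, "model_optim_rng.pt"))
```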