update readme

gongjy 2024-09-09 20:09:47 +08:00
parent d01e1c6a5f
commit b0a50e087c
2 changed files with 24 additions and 22 deletions


@@ -268,7 +268,8 @@ streamlit run fast_inference.py
 ### 数据集下载地址
 | MiniMind训练数据集 | 下载地址 |
-|------------------|---------------------------------------------------------------------------------------------------------------|
+|--------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
+| **【tokenizer训练集】** | [HuggingFace](https://huggingface.co/datasets/jingyaogong/minimind_dataset/tree/main) / [百度网盘](https://pan.baidu.com/s/1yAw1LVTftuhQGAC1Y9RdYQ?pwd=6666) |
 | **【Pretrain数据】** | [seq-monkey通用文本数据集](http://share.mobvoi.com:5000/sharing/O91blwPkY) |
 | **【SFT数据】** | [匠数大模型SFT数据集](https://www.modelscope.cn/datasets/deepctrl/deepctrl-sft-data/resolve/master/sft_data_zh.jsonl) |
 | **【DPO数据】** | [活字数据集1](https://huggingface.co/datasets/Skepsun/huozi_rlhf_data_json) |


@@ -305,7 +305,8 @@ git clone https://github.com/jingyaogong/minimind.git
 ### Dataset Download Links
 | MiniMind Training Dataset | Download Link |
-|---------------------------|------------------------------------------------------------------------------------------------------------------------------------|
+|---------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------|
+| **[tokenizer Data]** | [HuggingFace](https://huggingface.co/datasets/jingyaogong/minimind_dataset/tree/main) / [Baidu](https://pan.baidu.com/s/1yAw1LVTftuhQGAC1Y9RdYQ?pwd=6666) |
 | **[Pretrain Data]** | [Seq-Monkey General Text Dataset](http://share.mobvoi.com:5000/sharing/O91blwPkY) |
 | **[SFT Data]** | [Jiangshu Large Model SFT Dataset](https://www.modelscope.cn/datasets/deepctrl/deepctrl-sft-data/resolve/master/sft_data_zh.jsonl) |
 | **[DPO Data]** | [Huozi Dataset 1](https://huggingface.co/datasets/Skepsun/huozi_rlhf_data_json) |
@@ -431,11 +432,11 @@ Environment: python 3.9 + Torch 2.1.2 + DDP multi-GPU training
 🔗 **Trained Model Weights**:
 | Model Name | params | Config | pretrain_model | single_sft_model | multi_sft_model |
-|------------------|--------|-------------------------------------------------|----------------------------------------------------------------|----------------------------------------------------------------|----------------------------------------------------------------|
-| minimind-small-T | 26M | d_model=512<br/>n_layers=8 | - | [链接](https://pan.baidu.com/s/1_COe0FQRDmeapSsvArahCA?pwd=6666) | [链接](https://pan.baidu.com/s/1GsGsWSL0Dckl0YPRXiBIFQ?pwd=6666) |
-| minimind-small | 56M | d_model=640<br/>n_layers=8 | [链接](https://pan.baidu.com/s/1nJuOpnu5115FDuz6Ewbeqg?pwd=6666) | [链接](https://pan.baidu.com/s/1lRX0IcpjNFSySioeCfifRQ?pwd=6666) | [链接](https://pan.baidu.com/s/1LzVxBpL0phtGUH267Undqw?pwd=6666) |
-| minimind | 218M | d_model=1024<br/>n_layers=16 | [链接](https://pan.baidu.com/s/1jzA7uLEi-Jen2fW5olCmEg?pwd=6666) | [链接](https://pan.baidu.com/s/1Hvt0Q_UB_uW2sWTw6w1zRQ?pwd=6666) | [链接](https://pan.baidu.com/s/1fau9eat3lXilnrG3XNhG5Q?pwd=6666) |
-| minimind-MoE | 166M | d_model=1024<br/>n_layers=8<br/>share+route=2+4 | [链接](https://pan.baidu.com/s/11CneDVTkw2Y6lNilQX5bWw?pwd=6666) | [链接](https://pan.baidu.com/s/1fRq4MHZec3z-oLK6sCzj_A?pwd=6666) | [链接](https://pan.baidu.com/s/1HC2KSM_-RHRtgv7ZDkKI9Q?pwd=6666) |
+|------------------|--------|-------------------------------------------------|-----------------------------------------------------------------|-----------------------------------------------------------------|-----------------------------------------------------------------|
+| minimind-small-T | 26M | d_model=512<br/>n_layers=8 | - | [URL](https://pan.baidu.com/s/1_COe0FQRDmeapSsvArahCA?pwd=6666) | [URL](https://pan.baidu.com/s/1GsGsWSL0Dckl0YPRXiBIFQ?pwd=6666) |
+| minimind-small | 56M | d_model=640<br/>n_layers=8 | [URL](https://pan.baidu.com/s/1nJuOpnu5115FDuz6Ewbeqg?pwd=6666) | [URL](https://pan.baidu.com/s/1lRX0IcpjNFSySioeCfifRQ?pwd=6666) | [URL](https://pan.baidu.com/s/1LzVxBpL0phtGUH267Undqw?pwd=6666) |
+| minimind | 218M | d_model=1024<br/>n_layers=16 | [URL](https://pan.baidu.com/s/1jzA7uLEi-Jen2fW5olCmEg?pwd=6666) | [URL](https://pan.baidu.com/s/1Hvt0Q_UB_uW2sWTw6w1zRQ?pwd=6666) | [URL](https://pan.baidu.com/s/1fau9eat3lXilnrG3XNhG5Q?pwd=6666) |
+| minimind-MoE | 166M | d_model=1024<br/>n_layers=8<br/>share+route=2+4 | [URL](https://pan.baidu.com/s/11CneDVTkw2Y6lNilQX5bWw?pwd=6666) | [URL](https://pan.baidu.com/s/1fRq4MHZec3z-oLK6sCzj_A?pwd=6666) | [URL](https://pan.baidu.com/s/1HC2KSM_-RHRtgv7ZDkKI9Q?pwd=6666) |
 ---
@@ -603,9 +604,9 @@ and provide ratings and rankings.
 ### Ranking (from highest to lowest):
-| 模型 | D模型 | A模型 | B模型 | F模型 | C模型 | E模型 |
-|----|-----|-----|-----|-----|-----|-----|
-| 分数 | 85 | 80 | 75 | 60 | 55 | 50 |
+| Model | D Model | A Model | B Model | F Model | C Model | E Model |
+|-------|---------|---------|---------|---------|---------|---------|
+| Score | 85 | 80 | 75 | 60 | 55 | 50 |
 These scores and rankings are based on each model's overall performance in accuracy, clarity, and completeness.