update readme

parent b0a50e087c
commit 5908503449

README.md (19 lines changed)
@@ -70,7 +70,7 @@ https://github.com/user-attachments/assets/88b98128-636e-43bc-a419-b1b1403c2055

 - Open-source MiniMind model code (Dense and MoE models), plus the complete code for Pretrain, SFT instruction fine-tuning, LoRA fine-tuning, and DPO preference optimization, along with the datasets and their sources.
 - Compatible with popular frameworks such as `transformers`, `accelerate`, `trl`, and `peft`.
-- Training supports single-machine single-GPU and single-machine multi-GPU setups. Training can be stopped at any point and resumed from any point.
+- Training supports single-machine single-GPU and single-machine multi-GPU (DDP, DeepSpeed) setups. Training can be stopped at any point and resumed from any point.
 - Code for testing the model on the Ceval dataset.
 - Implements a basic OpenAI-API-compatible chat interface for easy integration with third-party chat UIs (FastGPT, Open-WebUI, etc.).

@@ -191,17 +191,20 @@ streamlit run fast_inference.py
 * `python 2-eval.py` to test the model's conversational performance


-🍭 [Tip] Pretraining and full-parameter fine-tuning (`pretrain` and `full_sft`) both support DDP multi-GPU acceleration
+🍭 [Tip] Pretraining and full-parameter fine-tuning (`pretrain` and `full_sft`) both support multi-GPU acceleration

-* Launch training on a single machine with N GPUs
-
-```text
+* Launch training on a single machine with N GPUs (ddp)
+```bash
 torchrun --nproc_per_node N 1-pretrain.py
-```
-
-```text
+# and
 torchrun --nproc_per_node N 3-full_sft.py
 ```
+* Launch training on a single machine with N GPUs (deepspeed)
+```bash
+deepspeed --master_port 29500 --num_gpus=N 1-pretrain.py
+# and
+deepspeed --master_port 29500 --num_gpus=N 3-full_sft.py
+```

 # 📌 Data sources

README_en.md (18 lines changed)
@@ -75,7 +75,7 @@ The project includes:
 - Public MiniMind model code (including Dense and MoE models), code for Pretrain, SFT instruction fine-tuning, LoRA
 fine-tuning, and DPO preference optimization, along with datasets and sources.
 - Compatibility with popular frameworks such as `transformers`, `accelerate`, `trl`, and `peft`.
-- Training support for single-GPU and multi-GPU setups. The training process allows for stopping and resuming at any
+- Training support for single-GPU and multi-GPU setups (DDP, DeepSpeed). The training process allows for stopping and resuming at any
 point.
 - Code for testing the model on the Ceval dataset.
 - Implementation of a basic chat interface compatible with OpenAI's API, facilitating integration into third-party Chat
@@ -214,15 +214,19 @@ git clone https://github.com/jingyaogong/minimind.git

 🍭 **Tip**: Pretraining and full parameter fine-tuning (`pretrain` and `full_sft`) support DDP multi-GPU acceleration.

-* Start training on a single machine with N GPUs
-```text
+* Start training on a single machine with N GPUs (DDP)
+```bash
 torchrun --nproc_per_node N 1-pretrain.py
-```
-
-```text
+# and
 torchrun --nproc_per_node N 3-full_sft.py
 ```
+* Start training on a single machine with N GPUs (DeepSpeed)
+```bash
+deepspeed --master_port 29500 --num_gpus=N 1-pretrain.py
+# and
+deepspeed --master_port 29500 --num_gpus=N 3-full_sft.py
+```


 # 📌 Data sources

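The two launch styles documented in this commit differ only in the launcher binary and its flags. The following is a minimal sketch of that difference, assuming a hypothetical helper named `build_launch_command` that is not part of the repository. The flag spellings (`--nproc_per_node`, `--master_port`, `--num_gpus=`) and the default port 29500 are copied from the README lines in the diff; everything else is illustrative. The helper only assembles the command line and does not start any training.

```python
from __future__ import annotations

import shlex


def build_launch_command(launcher: str, script: str, num_gpus: int,
                         master_port: int = 29500) -> list[str]:
    """Assemble the README's multi-GPU launch command as an argument list.

    Hypothetical helper for illustration only; not part of MiniMind.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if launcher == "ddp":
        # README form: torchrun --nproc_per_node N <script>
        return ["torchrun", "--nproc_per_node", str(num_gpus), script]
    if launcher == "deepspeed":
        # README form: deepspeed --master_port 29500 --num_gpus=N <script>
        return ["deepspeed", "--master_port", str(master_port),
                f"--num_gpus={num_gpus}", script]
    raise ValueError(f"unknown launcher: {launcher!r}")


if __name__ == "__main__":
    # Print the four commands from the README for a 2-GPU machine.
    for launcher in ("ddp", "deepspeed"):
        for script in ("1-pretrain.py", "3-full_sft.py"):
            print(shlex.join(build_launch_command(launcher, script, num_gpus=2)))
```

On a machine where `torchrun` or `deepspeed` is installed, the returned list could be handed to `subprocess.run` unchanged; keeping it as a list rather than a single string avoids shell quoting issues.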