diff --git a/README.md b/README.md
index 6104c04..3a19501 100644
--- a/README.md
+++ b/README.md
@@ -50,31 +50,22 @@
 ---

-[removed: HTML table of logo images "MiniMind Logo", "Multi Icon", "Hugging Face Logo", "Multi Icon", "ModelScope Logo"]
+[added: HTML table of logo images "Hugging Face Logo" and "ModelScope Logo"]
 ---
diff --git a/README_en.md b/README_en.md
index 0428ed8..7783aaf 100644
--- a/README_en.md
+++ b/README_en.md
@@ -54,31 +54,22 @@
 ---

-[removed: HTML table of logo images "MiniMind Logo", "Multi Icon", "Hugging Face Logo", "Multi Icon", "ModelScope Logo"]
+[added: HTML table of logo images "Hugging Face Logo" and "ModelScope Logo"]
 ---
@@ -213,7 +204,6 @@ We hope this open-source project can help LLM beginners quickly get started!

 # 📌 Quick Start

-[removed: one HTML line, markup not recoverable]
 Sharing My Hardware and Software Configuration (For Reference Only)
@@ -306,7 +296,8 @@ needs and GPU resources.
 python train_pretrain.py
 ```

-> Execute pretraining to get `pretrain_*.pth` as the output weights for pretraining (where * represents the model dimension, default is 512).
+> Execute pretraining to get `pretrain_*.pth` as the output weights for pretraining (where * represents the model
+> dimension, default is 512).

 **3.2 Supervised Fine-Tuning (Learning Dialogue Style)**

@@ -315,7 +306,8 @@ python train_pretrain.py
 python train_full_sft.py
 ```

-> Execute supervised fine-tuning to get `full_sft_*.pth` as the output weights for instruction fine-tuning (where `full` represents full parameter fine-tuning).
+> Execute supervised fine-tuning to get `full_sft_*.pth` as the output weights for instruction fine-tuning (where `full`
+> represents full parameter fine-tuning).

 ---

@@ -692,8 +684,10 @@ original purpose behind the creation of the MiniMind series!
 🤖️: You mentioned "Introok's the believeations of theument." This name originates from the ancient Chinese "groty of of the change."
 ```

-Fast and effective, it is still possible to further compress the training process by obtaining smaller and higher-quality datasets.
-The Zero model weights are saved as `full_sft_512_zero.pth` (see the MiniMind model file link below). Feel free to download and test the model's performance.
+Fast and effective, it is still possible to further compress the training process by obtaining smaller and
+higher-quality datasets.
+The Zero model weights are saved as `full_sft_512_zero.pth` (see the MiniMind model file link below). Feel free to
+download and test the model's performance.

 ## Ⅱ Main Training Steps

@@ -715,8 +709,7 @@ python train_pretrain.py
 ```

 > The trained model weights are saved every `100 steps` by default as: `pretrain_*.pth` (the * represents the specific
-model dimension, and each new save will overwrite the previous one).
-
+> model dimension, and each new save will overwrite the previous one).

 ### **2. Supervised Fine-Tuning (SFT)**:

@@ -742,7 +735,7 @@ python train_full_sft.py
 ```

 > The trained model weights are saved every `100 steps` by default as: `full_sft_*.pth` (the * represents the specific
-model dimension, and each new save will overwrite the previous one).
+> model dimension, and each new save will overwrite the previous one).

 ## Ⅲ Other Training Steps

@@ -771,7 +764,7 @@ python train_dpo.py
 ```

 > The trained model weights are saved every `100 steps` by default as: `rlhf_*.pth` (the * represents the specific model
-dimension, and each new save will overwrite the previous one).
+> dimension, and each new save will overwrite the previous one).

 ### **4. Knowledge Distillation (KD)**

@@ -807,7 +800,7 @@ python train_full_sft.py
 ```

 > The trained model weights are saved every `100 steps` by default as: `full_sft_*.pth` (the * represents the specific
-model dimension, and each new save will overwrite the previous one).
+> model dimension, and each new save will overwrite the previous one).

 This section emphasizes MiniMind’s white-box distillation code `train_distillation.py`. Since MiniMind doesn’t have a powerful teacher model within the same series, the white-box distillation code serves as a learning reference.

@@ -835,7 +828,7 @@ python train_lora.py
 ```

 > The trained model weights are saved every `100 steps` by default as: `lora_xxx_*.pth` (the * represents the specific
-model dimension, and each new save will overwrite the previous one).
+> model dimension, and each new save will overwrite the previous one).

 Many people are puzzled: how can a model learn private domain knowledge? How should datasets be prepared? How to transfer general models into specialized domain models?

@@ -957,7 +950,7 @@ python train_distill_reason.py
 ```

 > The trained model weights are saved every `100 steps` by default as: `reason_*.pth` (* being the specific dimension of
-the model; each time a new file is saved, it will overwrite the old one).
+> the model; each time a new file is saved, it will overwrite the old one).

 Test it:

@@ -1033,7 +1026,8 @@ For reference, the parameter settings for GPT-3 are shown in the table below:

 ### Training Completed - Model Collection

-> Considering that many people have reported slow speeds with Baidu Cloud, all MiniMind2 models and beyond will be hosted on ModelScope/HuggingFace.
+> Considering that many people have reported slow speeds with Baidu Cloud, all MiniMind2 models and beyond will be
+> hosted on ModelScope/HuggingFace.

 #### Native PyTorch Models

@@ -1129,7 +1123,8 @@ rather than using the PPO method where the reward model acts as a "coach" to cor

 ## Ⅱ Subjective Sample Evaluation

-🏃The following tests were completed on February 9, 2025. New models released after this date will not be included in the tests unless there is a special need.
+🏃The following tests were completed on February 9, 2025. New models released after this date will not be included in the
+tests unless there is a special need.

 [A] [MiniMind2 (0.1B)](https://www.modelscope.cn/models/gongjy/MiniMind2-PyTorch)
 [B] [MiniMind2-MoE (0.15B)](https://www.modelscope.cn/models/gongjy/MiniMind2-PyTorch)

@@ -1214,7 +1209,8 @@ rather than using the PPO method where the reward model acts as a "coach" to cor

 ---

-🙋‍Directly give all the questions and the model's answers above to DeepSeek-R1, let it help comment and rank with scores:
+🙋‍Directly give all the questions and the model's answers above to DeepSeek-R1, let it help comment and rank with
+scores:
diff --git a/images/and_huggingface.png b/images/and_huggingface.png
new file mode 100644
index 0000000..c234f8a
Binary files /dev/null and b/images/and_huggingface.png differ
diff --git a/images/and_modelscope.png b/images/and_modelscope.png
new file mode 100644
index 0000000..1e46da4
Binary files /dev/null and b/images/and_modelscope.png differ
diff --git a/images/multi.png b/images/multi.png
deleted file mode 100644
index 0334c93..0000000
Binary files a/images/multi.png and /dev/null differ