update readme

gongjy 2024-10-09 09:12:55 +08:00
parent 4542ecf858
commit 5dbd6174b3
2 changed files with 13 additions and 14 deletions


@@ -25,9 +25,9 @@
 </div>
-* This open-source project aims to train the tiny language model **MiniMind**, only 26M in size, completely from scratch in as little as 3 hours.
+* This open-source project aims to train the tiny language model **MiniMind**, only 26.88M in size, completely from scratch in as little as 3 hours.
-* **MiniMind** is extremely lightweight, about $\frac{1}{7000}$ the size of GPT-3, striving to let even the most ordinary personal GPU perform fast inference and even training.
+* **MiniMind** is extremely lightweight, with its smallest version about $\frac{1}{7000}$ the size of GPT-3, striving to let even the most ordinary personal GPU perform fast inference and even training.
-* **MiniMind** improves on the DeepSeek-V2 and Llama3 architectures; the project covers every stage of data processing, pretraining, SFT, and DPO, and includes a Mixture-of-Experts (MoE) model.
+* **MiniMind** releases the full-stage code for a minimal large-model architecture: dataset cleaning and preprocessing, supervised pretraining (Pretrain), supervised instruction fine-tuning (SFT), low-rank adaptation (LoRA) fine-tuning, and reward-free direct preference optimization (DPO); it also covers extension to a sparse shared Mixture-of-Experts (MoE) model and a multimodal vision extension (VLM): [MiniMind-V](https://github.com/jingyaogong/minimind-v).
 * This is not only an implementation of an open-source model but also a tutorial for getting started with large language models (LLMs).
 * We hope this project can offer researchers an introductory example that helps them quickly get started and inspires more exploration and innovation in the LLM field.


@@ -26,18 +26,17 @@
 </div>
-* This open-source project aims to train a miniature language model **MiniMind** from scratch, with a size of just 26MB.
+* This open-source project aims to train a tiny language model called **MiniMind** from scratch in just 3 hours, with a model size of only 26.88M.
-* **MiniMind** is extremely lightweight, approximately $\frac{1}{7000}$ the size of GPT-3, designed to enable fast inference and even training on CPUs.
+* **MiniMind** is extremely lightweight, with the smallest version being approximately $\frac{1}{7000}$ the size of GPT-3, making it possible for even an ordinary personal GPU to perform quick inference and even training.
-* **MiniMind** is an improvement on the DeepSeek-V2 and Llama3 architectures. The project includes all stages of data processing, pretraining, SFT, and DPO, and features a Mixture-of-Experts (MoE) model.
+* **MiniMind** provides the full-stage code for a simplified large-model structure: dataset cleaning and preprocessing, supervised pretraining (Pretrain), supervised instruction fine-tuning (SFT), low-rank adaptation (LoRA) fine-tuning, and reward-free direct preference optimization (DPO). It also includes code for extending to a sparse Mixture-of-Experts (MoE) model and to a multimodal vision-language model (VLM): [MiniMind-V](https://github.com/jingyaogong/minimind-v).
 * This is not just an implementation of an open-source model but also a tutorial for getting started with large language models (LLMs).
 * We hope this project will serve as an introductory example for researchers, helping them quickly get started and inspiring more exploration and innovation in the LLM field.
-> To avoid any misunderstanding, "fastest 3 hours" refers to the requirement of using hardware with higher specifications than the author's setup. Detailed specifications will be provided below.
+> To avoid misinterpretation, "fastest 3 hours" means you need a machine with a hardware configuration superior to mine. Detailed specifications will be provided below.
 ---
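
The updated third bullet names DPO ("reward-free direct preference optimization") as one of the training stages. As context for that term, below is a minimal sketch of the standard DPO loss in PyTorch. It illustrates the general technique under stated assumptions, not MiniMind's actual implementation: the function name, argument names, and the `beta` default are hypothetical, and the inputs are assumed to be summed per-response log-probabilities under the policy and a frozen reference model.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Hypothetical sketch of the standard DPO objective, not MiniMind's code.
    # Implicit "rewards" are log-probability ratios against a frozen reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards;
    # no separate reward model is ever trained, hence "reward-free".
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

Because the reference model is frozen, only the policy receives gradients; `beta` controls how strongly the policy may drift away from the reference while fitting the preference pairs.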