update readme format
parent 8be42693f6
commit 4d1d4fae0a
13 README.md
@@ -9,6 +9,10 @@
 
 </div>
 
+<div align="center">
+<h3>"大道至简"</h3>
+</div>
+
 <div align="center">
 
 中文 | [English](./README_en.md)
@@ -16,12 +20,6 @@
 </div>
 
 
-<p align="center">
-<span style="font-size: 2em; font-weight: bold;">
-“大道至简”<br/>
-</span>
-</p>
-
 * 本开源项目旨在完全从0开始,训练出仅为26M大小的微型语言模型**MiniMind**。
 * **MiniMind**极其轻量,体积约是 GPT3 的 $\frac{1}{7000}$,力求做到CPU也可快速推理甚至训练。
 * **MiniMind**改进自DeepSeek-V2、Llama3结构,项目包含整个数据处理、pretrain、sft、dpo的全部阶段,包含混合专家(MoE)模型。
@@ -182,8 +180,7 @@ python 2-eval.py
 
 ---
 
--
-📙【Pretrain数据】:[seq-monkey通用文本数据集](https://github.com/mobvoi/seq-monkey-data/blob/main/docs/pretrain_open_corpus.md)
+- 📙【Pretrain数据】:[seq-monkey通用文本数据集](https://github.com/mobvoi/seq-monkey-data/blob/main/docs/pretrain_open_corpus.md)
 是由多种公开来源的数据(如网页、百科、博客、开源代码、书籍等)汇总清洗而成。
 整理成统一的JSONL格式,并经过了严格的筛选和去重,确保数据的全面性、规模、可信性和高质量。
 总量大约在10B token,适合中文大语言模型的预训练。
10 README_en.md
@@ -8,17 +8,17 @@
 [](https://huggingface.co/collections/jingyaogong/minimind-66caf8d999f5c7fa64f399e5)
 
 </div>
 
+<div align="center">
+<h3>"The Greatest Path is the Simplest"</h3>
+</div>
+
 <div align="center">
 
 [中文](./README.md) | English
 
 </div>
 
-<p align="center">
-<span style="font-size: 1.5em; font-weight: bold;">
-"The Greatest Path is the Simplest"<br/>
-</span>
-</p>
+
 * This open-source project aims to train a miniature language model **MiniMind** from scratch, with a size of just 26MB.
 * **MiniMind** is extremely lightweight, approximately $\frac{1}{7000}$ the size of GPT-3, designed to enable fast