update readme
parent c28664dac8
commit 48ea6a4cbf
@@ -419,7 +419,7 @@ MobileLLM argues that depth matters more than width in the architecture; a "deep and narrow", "slender
|-------------------|--------|-----------------------------|----------------------------------------------------------------|----------------------------------------------------------------|----------------------------------------------------------------|
| minimind-v1-small | 26M | d_model=512<br/>n_layers=8 | [Link](https://pan.baidu.com/s/1wP_cAIc8cgaJ6CxUmR9ECQ?pwd=6666) | [Link](https://pan.baidu.com/s/1_COe0FQRDmeapSsvArahCA?pwd=6666) | [Link](https://pan.baidu.com/s/1GsGsWSL0Dckl0YPRXiBIFQ?pwd=6666) |
| minimind-v1-moe | 4×26M | d_model=512<br/>n_layers=8 | [Link](https://pan.baidu.com/s/1IZdkzPRhbZ_bSsRL8vInjg?pwd=6666) | [Link](https://pan.baidu.com/s/1tqB-GMvuiGQBvEl-yZ-oBw?pwd=6666) | [Link](https://pan.baidu.com/s/1GHJ2T4904EcT1u8l1rVqtg?pwd=6666) |
-| minimind-v1 | 108M | d_model=768<br/>n_layers=16 | - | [Link](https://pan.baidu.com/s/1p713loS7EfwHQf3G9eYI3Q?pwd=6666) | [Link](https://pan.baidu.com/s/12iHGpAs6R0kqsOnGtgK6vQ?pwd=6666) |
+| minimind-v1 | 108M | d_model=768<br/>n_layers=16 | [Link](https://pan.baidu.com/s/1B60jYo4T8OmJI0ooqsixaA?pwd=6666) | [Link](https://pan.baidu.com/s/1p713loS7EfwHQf3G9eYI3Q?pwd=6666) | [Link](https://pan.baidu.com/s/12iHGpAs6R0kqsOnGtgK6vQ?pwd=6666) |
---
@@ -688,6 +688,9 @@ The minimind model itself was not trained on a large dataset, nor specifically for answering
<a href="https://github.com/ipfgao"><b>@ipfgao</b></a>:
<a href="https://github.com/jingyaogong/minimind/issues/26">🔗Training step log</a>
+<a href="https://github.com/chuanzhubin"><b>@chuanzhubin</b></a>:
+<a href="https://github.com/jingyaogong/minimind/pull/34">🔗Line-by-line code comments</a>
## 🫶Supporters
<a href="https://github.com/jingyaogong/minimind/stargazers">