From 4c2b283b93c97c16b10966d0f8a4af9f362c8da7 Mon Sep 17 00:00:00 2001
From: gongjy <2474590974@qq.com>
Date: Mon, 10 Feb 2025 00:16:54 +0800
Subject: [PATCH] update readme

---
 README.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index f4256c6..eb86aa7 100644
--- a/README.md
+++ b/README.md
@@ -527,14 +527,14 @@ MiniMind's overall structure is the same, differing only in the RoPE computation, inference function, and FFN layer's
 To modify the model configuration, see [./model/LMConfig.py](./model/LMConfig.py).
 The reference model parameter versions are listed in the table below:
 
-| Model Name        | params | len_vocab | n_layers | d_model | kv_heads | q_heads | share+route |
-|-------------------|--------|-----------|----------|---------|----------|---------|-------------|
-| MiniMind2-Small   | 26M    | 6400      | 8        | 512     | 2        | 8       | -           |
-| MiniMind2-MoE     | 145M   | 6400      | 8        | 640     | 2        | 8       | 1+4         |
-| MiniMind2         | 104M   | 6400      | 16       | 768     | 2        | 8       | -           |
-| minimind-v1-small | 26M    | 6400      | 8        | 512     | 8        | 16      | -           |
-| minimind-v1-moe   | 4×26M  | 6400      | 8        | 512     | 8        | 16      | 1+4         |
-| minimind-v1       | 108M   | 6400      | 16       | 768     | 8        | 16      | -           |
+| Model Name        | params | len_vocab | rope_theta | n_layers | d_model | kv_heads | q_heads | share+route |
+|-------------------|--------|-----------|------------|----------|---------|----------|---------|-------------|
+| MiniMind2-Small   | 26M    | 6400      | 1e6        | 8        | 512     | 2        | 8       | -           |
+| MiniMind2-MoE     | 145M   | 6400      | 1e6        | 8        | 640     | 2        | 8       | 1+4         |
+| MiniMind2         | 104M   | 6400      | 1e6        | 16       | 768     | 2        | 8       | -           |
+| minimind-v1-small | 26M    | 6400      | 1e4        | 8        | 512     | 8        | 16      | -           |
+| minimind-v1-moe   | 4×26M  | 6400      | 1e4        | 8        | 512     | 8        | 16      | 1+4         |
+| minimind-v1       | 108M   | 6400      | 1e4        | 16       | 768     | 8        | 16      | -           |
 
 # 📌 Experiment
 