38 Commits

Author SHA1 Message Date
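5351ae8a6a Normal size 2025-05-11 11:58:13 +08:00
0c8c6e5d1a Added a mode that ignores the database 2025-05-09 15:19:41 +08:00
b6bd97aaaa Extract the common part of self.downsample_v and self.downsample_q, and use separable convolution to reduce the parameter count 2025-05-09 15:01:06 +08:00
10f15724b4 Added train_embedding for pretraining the embedding model 2025-05-08 15:41:04 +00:00
0859f54a88 DynamicKV-LLM 1.0.0: core architecture complete, the model trains normally 2025-04-25 16:49:05 +08:00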
e3120f5e62 fix 2025-04-25 16:29:28 +08:00
Jax922
1ddfd310ec Incorporate the idea of Million MoE 2025-04-24 21:29:33 +08:00
jingyaogong
d9453ed9a3 update moe note 2025-04-09 17:38:31 +08:00
jingyaogong
4a7c1c49e8 update rlaif 2025-04-05 16:06:08 +08:00
jingyaogong
9e67798397 update generate 2025-04-05 15:53:55 +08:00
jingyaogong
399d526fbd add hidden state 2025-04-05 14:39:56 +08:00
jingyaogong
ed01c5d84a update inference 2025-04-05 12:03:04 +08:00
jingyaogong
bf81fd5f5e rmsnorm float convert 2025-04-01 16:03:44 +08:00
jingyaogong
e369b33265 fix chat mask bug 2025-04-01 13:44:55 +08:00
jingyaogong
258507ff89 delete __pycache__ 2025-04-01 11:51:54 +08:00
gongjy
844e79148c update generate args 2025-02-15 23:56:09 +08:00
gongjy
19b388cd87 update generate args 2025-02-15 23:55:10 +08:00
gongjy
5b65bc767e update cis init 2025-02-15 20:26:34 +08:00
gongjy
58e3af0359 add minimind2 2025-02-09 23:49:47 +08:00
gongjy
3ff66f7221 update model 2024-10-20 15:13:58 +08:00
gongjy
772834148e update readme 2024-10-08 23:40:29 +08:00
gongjy
a87f628400 update model (fix loss bug) 2024-09-29 16:58:48 +08:00
gongjy
75753ea765 Update data preprocessing methods 2024-09-27 17:19:03 +08:00
gongjy
a8ae342775 Update data preprocessing methods 2024-09-27 16:19:30 +08:00
gongjy
6759da45c1 update model mask 2024-09-21 20:00:25 +08:00
gongjy
02297df3c1 Efficient implementation of Inference KV cache 2024-09-21 00:01:05 +08:00
gongjy
9093519c37 Updated some explanations 2024-09-20 17:07:51 +08:00
gongjy
ee218402cd update some explain of the code 2024-09-20 17:04:16 +08:00
Ben
2dceaf4a92 Add comments to help learners understand quickly 2024-09-18 21:53:39 +08:00
gongjy
61cb61a46a update minimind-v1-moe 2024-09-17 11:33:31 +08:00
gongjy
8c18b324d0 update model 2024-09-16 16:59:52 +08:00
gongjy
e4ad822c40 update model 2024-09-16 15:29:57 +08:00
gongjy
16928c1231 update some config 2024-09-15 15:12:47 +08:00
gongjy
aa5d70321f update config 2024-09-15 15:09:21 +08:00
gongjy
f3f1cc5fac update config 2024-09-15 15:08:04 +08:00
gongjy
3068e5efcc update model/dataset.py 2024-09-14 16:09:42 +08:00
gongjy
ecf6d44133 update model/dataset.py 2024-09-14 14:05:41 +08:00
gongjy
8be42693f6 MiniMind first open source 2024-08-28 16:41:44 +08:00