7 Commits

Author SHA1 Message Date
b6bd97aaaa Extract the shared part of self.downsample_v and self.downsample_q, and use separable convolutions to reduce the parameter count 2025-05-09 15:01:06 +08:00
10f15724b4 Add train_embedding for pretraining the embedding model 2025-05-08 15:41:04 +00:00
e3120f5e62 fix 2025-04-25 16:29:28 +08:00
Jax922
c55dfc0b46 Add comments 2025-04-24 15:58:39 +08:00
jingyaogong
7fcc46b39a update seed set 2025-04-04 11:39:41 +08:00
gongjy
d2f5ef4355 update lr 2025-02-11 23:52:40 +08:00
gongjy
58e3af0359 add minimind2 2025-02-09 23:49:47 +08:00