16 Commits

Author SHA1 Message Date
Jax922
83f5cfe6ca update 2025-05-12 19:11:04 +08:00
Jax922
803d1f1b72 Investigate the cause of slow speed 2025-05-12 17:46:18 +08:00
Jax922
48f0018432 update 2025-05-12 14:16:42 +08:00
Jax922
d93889194d update 2025-05-12 11:53:10 +08:00
a3ea93597c DynamicKV-LLM Pretrain v1.1.0 2025-05-12 00:21:07 +08:00
da5ac6a5c0 Merge branch 'SLM' into HPC 2025-05-12 00:05:45 +08:00
cb286d26d1 Include config info in wandb 2025-05-10 20:23:52 +08:00
0c8c6e5d1a Added a mode that ignores the database 2025-05-09 15:19:41 +08:00
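b6bd97aaaa Extract the shared part of self.downsample_v and self.downsample_q, and use separable convolution to reduce the parameter count 2025-05-09 15:01:06 +08:00
bed6faa379 DynamicKV-LLM 1.0.1: add multi-head to cross-attention; use bf16 instead of fp16 2025-05-08 15:47:00 +00:00
10f15724b4 Added train_embedding for pretraining the embedding model 2025-05-08 15:41:04 +00:00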
e3120f5e62 fix 2025-04-25 16:29:28 +08:00
Jax922
c55dfc0b46 Added comments 2025-04-24 15:58:39 +08:00
jingyaogong
7fcc46b39a update seed set 2025-04-04 11:39:41 +08:00
gongjy
d2f5ef4355 update lr 2025-02-11 23:52:40 +08:00
gongjy
58e3af0359 add minimind2 2025-02-09 23:49:47 +08:00