Jax922
|
83f5cfe6ca
|
update
|
2025-05-12 19:11:04 +08:00 |
|
Jax922
|
803d1f1b72
|
Investigate the cause of slow speed
|
2025-05-12 17:46:18 +08:00 |
|
Jax922
|
48f0018432
|
update
|
2025-05-12 14:16:42 +08:00 |
|
Jax922
|
d93889194d
|
update
|
2025-05-12 11:53:10 +08:00 |
|
|
a3ea93597c
|
DynamicKV-LLM Pretrain v1.1.0
|
2025-05-12 00:21:07 +08:00 |
|
|
da5ac6a5c0
|
Merge branch 'SLM' into HPC
|
2025-05-12 00:05:45 +08:00 |
|
|
cb286d26d1
|
wandb now includes config info
|
2025-05-10 20:23:52 +08:00 |
|
|
0c8c6e5d1a
|
Added an ignore-database mode
|
2025-05-09 15:19:41 +08:00 |
|
|
b6bd97aaaa
|
Extract the shared part of self.downsample_v and self.downsample_q, and use separable convolution to reduce the parameter count
|
2025-05-09 15:01:06 +08:00 |
|
|
bed6faa379
|
DynamicKV-LLM 1.0.1: add multi-head to cross-attention; use bf16 instead of fp16
|
2025-05-08 15:47:00 +00:00 |
|
|
10f15724b4
|
Added train_embedding for pretraining the embedding model
|
2025-05-08 15:41:04 +00:00 |
|
|
e3120f5e62
|
fix
|
2025-04-25 16:29:28 +08:00 |
|
Jax922
|
c55dfc0b46
|
Added comments
|
2025-04-24 15:58:39 +08:00 |
|
jingyaogong
|
7fcc46b39a
|
update seed set
|
2025-04-04 11:39:41 +08:00 |
|
gongjy
|
d2f5ef4355
|
update lr
|
2025-02-11 23:52:40 +08:00 |
|
gongjy
|
58e3af0359
|
add minimind2
|
2025-02-09 23:49:47 +08:00 |
|