213 Commits

Author SHA1 Message Date
0c8c6e5d1a Added a mode that ignores the database 2025-05-09 15:19:41 +08:00
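b6bd97aaaa Extracted the common part of self.downsample_v and self.downsample_q, and used separable convolution to reduce the parameter count 2025-05-09 15:01:06 +08:00
10f15724b4 Added train_embedding for pretraining the embedding model 2025-05-08 15:41:04 +00:00
0859f54a88 DynamicKV-LLM 1.0.0: completed the core architecture; the model can be trained normally 2025-04-25 16:49:05 +08:00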
e3120f5e62 fix 2025-04-25 16:29:28 +08:00
Jax922
1ddfd310ec Incorporated the idea of Million MoE 2025-04-24 21:29:33 +08:00
Jax922
c55dfc0b46 Added comments 2025-04-24 15:58:39 +08:00
Jax922
21fdaaa59e Updated the ignore list 2025-04-24 15:58:33 +08:00
jingyaogong
7da201a944 update chat-openai-api 2025-04-18 12:43:57 +08:00
jingyaogong
d9453ed9a3 update moe note 2025-04-09 17:38:31 +08:00
jingyaogong
d503093ec4 update eval 2025-04-09 16:56:57 +08:00
jingyaogong
4a758564e4 fix top_p float bug 2025-04-09 16:52:20 +08:00
jingyaogong
4a7c1c49e8 update rlaif 2025-04-05 16:06:08 +08:00
jingyaogong
9e67798397 update generate 2025-04-05 15:53:55 +08:00
jingyaogong
399d526fbd add hidden state 2025-04-05 14:39:56 +08:00
jingyaogong
885661f47d update inference 2025-04-05 12:04:38 +08:00
jingyaogong
ed01c5d84a update inference 2025-04-05 12:03:04 +08:00
jingyaogong
7fcc46b39a update seed set 2025-04-04 11:39:41 +08:00
jingyaogong
08e9a22a25 update web_demo 2025-04-04 11:25:40 +08:00
jingyaogong
278ec760a1 update dpo_loss 2025-04-01 17:32:50 +08:00
jingyaogong
4f95e23a98 update structure image 2025-04-01 16:15:26 +08:00
jingyaogong
edc8d26189 update structure image 2025-04-01 16:11:54 +08:00
jingyaogong
bf81fd5f5e rmsnorm float convert 2025-04-01 16:03:44 +08:00
jingyaogong
e369b33265 fix chat mask bug 2025-04-01 13:44:55 +08:00
jingyaogong
258507ff89 delete __pycache__ 2025-04-01 11:51:54 +08:00
gongjy
04b56ea86c update readme 2025-02-23 20:07:26 +08:00
gongjy
e34d4e9371 update tokenizer load 2025-02-19 23:24:29 +08:00
gongjy
45c0d12049 update images 2025-02-19 22:59:42 +08:00
gongjy
f475e4e407 update images 2025-02-19 22:54:57 +08:00
gongjy
ef7dff9fd4 update structure figure 2025-02-18 23:35:16 +08:00
gongjy
dcf5fcdb08 update structure figure 2025-02-18 23:24:51 +08:00
gongjy
844e79148c update generate args 2025-02-15 23:56:09 +08:00
gongjy
19b388cd87 update generate args 2025-02-15 23:55:10 +08:00
gongjy
5b65bc767e update cis init 2025-02-15 20:26:34 +08:00
gongjy
c1a77f5c0f update web_demo 2025-02-14 19:38:55 +08:00
gongjy
d519d2a233 update web_demo 2025-02-14 19:37:27 +08:00
gongjy
e7ed05834b fix bug 2025-02-13 21:07:43 +08:00
gongjy
b5d10d9a7d fix bugs 2025-02-13 20:56:14 +08:00
gongjy
416cc90b58 update ckp-path 2025-02-12 20:34:47 +08:00
gongjy
bab480073e update lr 2025-02-11 23:53:48 +08:00
gongjy
d2f5ef4355 update lr 2025-02-11 23:52:40 +08:00
gongjy
fea5b0eafc update readme 2025-02-10 23:34:52 +08:00
gongjy
291c19dc5e update markdown 2025-02-10 23:22:53 +08:00
gongjy
6f383e29fb update markdown 2025-02-10 23:19:23 +08:00
gongjy
97886fef2e update readme 2025-02-10 22:25:35 +08:00
gongjy
be9b434379 update readme 2025-02-10 22:22:35 +08:00
gongjy
ddb7e666a3 update readme 2025-02-10 13:32:01 +08:00
gongjy
0c5104885a update readme 2025-02-10 13:30:55 +08:00
gongjy
dd7a7ef730 update readme 2025-02-10 11:55:09 +08:00
gongjy
fe2f1199ac update readme 2025-02-10 11:44:43 +08:00