|
|
cf9acb2064
|
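Experiment 1.4.6: Token-based Memory architecture implementation
Completes the Token-based Memory architecture for Experiment 1.4.6, with the following improvements:
- Memory bank now stores discrete token IDs instead of continuous feature vectors
- Implemented a bidirectional encode/decode mechanism (embedding→feature→output→token)
- Tuned EMA update parameters: ema_decay=0.9, ema_update_freq=5
- Significantly reduced GPU memory usage: from 23GB to 13GB (-43%)
- Inference loss dropped from 2.6382 to 2.6142 (0.9% improvement)
Technical highlights:
- Effective representation dimension raised from 128 to 4096 (32x)
- Sparse caching mechanism avoids memory blow-up
- Immediate compression strategy balances GPU memory usage and performance
- Human-interpretable memory contents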
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
2025-08-14 23:04:52 +08:00 |
|
|
|
a7fe947a35
|
Experiment 1.4.5: update the database using VQ-VAE-style EMA
|
2025-08-09 10:47:35 +08:00 |
|
|
|
e61d92c4bc
|
Experiment 1.4.4: load balancing is effective
|
2025-08-07 11:43:23 +08:00 |
|
|
|
bba325ef7e
|
Experiment 1_4_1
|
2025-08-03 14:25:26 +08:00 |
|
|
|
c0424644f5
|
Experiment_1_4_0
|
2025-08-01 15:54:21 +08:00 |
|
|
|
d9d281967e
|
Fixed some bugs
|
2025-07-17 12:06:28 +08:00 |
|
|
|
d701003f8a
|
Print 10 tokens during pretraining for easier inspection
|
2025-07-17 00:05:34 +08:00 |
|
|
|
2797b76939
|
experiment_1.3.0-1.3.2
|
2025-07-13 21:28:46 +08:00 |
|
|
|
5e464bbd3f
|
Added support for multiple models
|
2025-07-12 18:00:53 +08:00 |
|
|
|
d6617702a5
|
DynamicKV-LLM Pretrain v1.2.2: new dataset; use uv; eliminate memory leaks
|
2025-06-25 20:27:28 +08:00 |
|
|
|
770c34f0e3
|
DynamicKV-LLM Pretrain v1.2.1
|
2025-06-08 02:20:36 +00:00 |
|
|
|
1678e739b6
|
DynamicKV-LLM Pretrain v1.2.0
|
2025-06-07 02:41:45 +00:00 |
|
|
|
000e17a93f
|
Fixed errors in key decomposition, load balancing, and related code
|
2025-06-06 11:25:59 +08:00 |
|
|
|
64e92473c3
|
Data initialization now uses a cache
|
2025-05-29 20:29:45 +08:00 |
|
|
|
67c632d010
|
update
|
2025-05-27 11:46:18 +08:00 |
|
|
|
c96a9c35d5
|
Initialized the database
|
2025-05-26 23:09:03 +08:00 |
|
Gary
|
d7fe504e1e
|
update
|
2025-05-16 08:38:59 +00:00 |
|
Jax922
|
5841f8b4e5
|
DynamicKV-LLM Pretrain v1.1.0
|
2025-05-14 00:42:50 +08:00 |
|
|
|
089afd6728
|
DynamicKV-LLM Pretrain v1.1.0
|
2025-05-14 00:01:40 +08:00 |
|