From 6d6510eefc010c8f7b8e1c27b0fc776506e23595 Mon Sep 17 00:00:00 2001
From: gongjy <2474590974@qq.com>
Date: Wed, 28 Aug 2024 18:05:42 +0800
Subject: [PATCH] update readme's error

---
 README.md    | 4 ++--
 README_en.md | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index b11d38e..9d61ddc 100644
--- a/README.md
+++ b/README.md
@@ -162,8 +162,8 @@ python 2-eval.py
 因为LLM体积非常小,为了避免模型头重脚轻(词嵌入embedding层参数占整个LLM比太高),所以词表长度需要选择比较小。
 强大的开源模型例如01万物、千问、chatglm、mistral、Llama3等,它们的tokenizer词表长度如下:
 
-  | Tokenizer 模型      | 词表大小 | 来源       |
-  |--------------------|---------|------------|
+  | Tokenizer 模型      | 词表大小 | 来源       |
+  |--------------------|---------|------------|
   | yi tokenizer       | 64,000  | 01万物(中国) |
   | qwen2 tokenizer    | 151,643 | 阿里云(中国) |
   | glm tokenizer      | 151,329 | 智谱AI(中国) |
diff --git a/README_en.md b/README_en.md
index 5371255..4c32a2f 100644
--- a/README_en.md
+++ b/README_en.md
@@ -192,7 +192,7 @@ git clone https://github.com/jingyaogong/minimind.git
 sizes:
 
   | Tokenizer Model      | Vocabulary Size  | Source                |
-  |----------------------|------------------|-----------------------|
+  |----------------------|------------------|-----------------------|
   | yi tokenizer         | 64,000           | 01-AI (China)         |
   | qwen2 tokenizer      | 151,643          | Alibaba Cloud (China) |
   | glm tokenizer        | 151,329          | Zhipu AI (China)      |