Fix data_process bug: open clean_seq_monkey.bin in append mode ('ab') instead of write mode ('wb')

gongjy 2024-09-23 20:11:19 +08:00
parent 5f8279f661
commit b4359b3335
3 changed files with 5 additions and 1 deletions


@@ -687,6 +687,8 @@ The minimind model itself was not trained on a large dataset, nor for answering
 
<a href="https://github.com/chuanzhubin"><img src="https://avatars.githubusercontent.com/u/2813798" width="70px" height="70px"/></a>
&nbsp;
<a href="https://github.com/iomgaa-ycz"><img src="https://avatars.githubusercontent.com/u/124225682" width="70px" height="70px"/></a>
&nbsp;
## 😊Acknowledgments


@@ -756,6 +756,8 @@ your model with third-party UIs, such as fastgpt, OpenWebUI, etc.
&nbsp;
<a href="https://github.com/chuanzhubin"><img src="https://avatars.githubusercontent.com/u/2813798" width="70px" height="70px"/></a>
&nbsp;
<a href="https://github.com/iomgaa-ycz"><img src="https://avatars.githubusercontent.com/u/124225682" width="70px" height="70px"/></a>
&nbsp;
## 😊Thanks for


@@ -95,7 +95,7 @@ def process_seq_monkey(chunk_size=50000):
     if doc_ids:
         arr = np.array(doc_ids, dtype=np.uint16)
-        with open(f'./dataset/clean_seq_monkey.bin', 'wb') as f:
+        with open(f'./dataset/clean_seq_monkey.bin', 'ab') as f:
             f.write(arr.tobytes())
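
The one-line change above swaps the file mode from `'wb'` to `'ab'`. The following is a minimal sketch of why that matters when token ids are written in several passes. It is not the project's code: the helper names `write_chunks` and `read_tokens`, the temporary file paths, and the token values are all hypothetical and chosen only for illustration.

```python
import os
import tempfile

import numpy as np


def write_chunks(path, chunks, mode):
    """Write each chunk of token ids as uint16, reopening `path` with `mode` every time."""
    for chunk in chunks:
        arr = np.array(chunk, dtype=np.uint16)
        with open(path, mode) as f:
            f.write(arr.tobytes())


def read_tokens(path):
    """Read the whole file back as a flat uint16 array."""
    return np.fromfile(path, dtype=np.uint16)


chunks = [[1, 2, 3], [4, 5]]  # made-up token ids, written in two passes

with tempfile.TemporaryDirectory() as tmp:
    wb_path = os.path.join(tmp, "wb.bin")
    ab_path = os.path.join(tmp, "ab.bin")
    write_chunks(wb_path, chunks, "wb")  # each open truncates the file
    write_chunks(ab_path, chunks, "ab")  # each open appends to the end
    wb_tokens = read_tokens(wb_path).tolist()
    ab_tokens = read_tokens(ab_path).tolist()

print(wb_tokens)  # [4, 5]
print(ab_tokens)  # [1, 2, 3, 4, 5]
```

With `'wb'`, only the last write survives, because every open truncates the file. With `'ab'`, all writes accumulate. One consequence of append mode is that rerunning the script against an existing output file adds to it rather than replacing it, so it may be worth deleting the old `.bin` file before a fresh run.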