to [this link](https://blog.csdn.net/weixin_45456738/article/details/141029610?ops_request_misc=&request_id=&biz_id=102&utm_term=安装torch&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduweb~default-2-141029610.nonecase&spm=1018.2226.3001.4187)
The [Seq-Monkey General Text Dataset](https://github.com/mobvoi/seq-monkey-data/blob/main/docs/pretrain_open_corpus.md) / [Baidu](https://pan.baidu.com/s/114F1k3eksiWCOQLvaT3RYQ?pwd=6666)
| **[SFT Data]** | [Jiangshu Large Model SFT Dataset](https://www.modelscope.cn/datasets/deepctrl/deepctrl-sft-data/resolve/master/sft_data_zh.jsonl) |
> The tests below compare the "single-turn dialogue full_sft" and "DPO reinforcement learning alignment" versions of the MiniMind model.
Model files are available on [Baidu Netdisk](https://pan.baidu.com/s/1KUfSzEkSXYbCCBj0Pw-9fA?pwd=6666), where `rl_<dim>.pth` holds the MiniMind weights after "DPO reinforcement learning alignment."
```text
[Q]: What is your name?
[ft_768]: I am an artificial intelligence language model, capable of answering your questions and providing assistance. Do you need any help?
[rl_768]: I am a natural language processing system based on artificial intelligence technology, primarily used for processing and generating human language tasks. I can assist users in text processing, speech recognition, text generation, and various other tasks. I can also help with intelligent chat, translation, and many other applications. If you need more information about this project, please let me know.
[ft_512]: I am glad to help you. Do you have any other questions you need me to answer?
[rl_512]: I am a chatbot based on natural language processing technology, capable of understanding and responding to user questions and instructions. I am trained to interact with humans and answer various questions and tasks. Is there anything you need?
[Q]: Who are you?
[ft_768]: I am an artificial intelligence language model, without personal identity or status. I only generate responses based on the input instructions and text information.
[rl_768]: I am a computer program, without personal identity or status. I am designed to answer various questions and provide assistance, including answering questions, offering advice, conducting experiments, providing information, and more. I am widely used in various fields such as natural language processing, computer vision, and artificial intelligence. If you need more help or information, please feel free to contact me. I will do my best to assist you in your work and life.
[ft_512]: I am a large language model, capable of answering a wide variety of questions, including those related to science, history, culture, and technology. Do you have any questions you need me to answer?
[rl_512]: I am a computer program, without a specific background or identity. If you can tell me the specific information you want to know, I will do my best to answer your questions.
[Q]: What are some good universities in China?
[ft_768]: China has many renowned universities, including Peking University, Shanghai Jiao Tong University, Fuxing Chinese Medicine College, Zhejiang University, and more. These universities have their own iconic buildings and research directions in different fields, and they enjoy a high reputation in China's top educational resources.
[rl_768]: There are many universities in China, and the choice depends on your needs. If you need more detailed information, it is recommended to consult the university rankings and policies in your city to choose a suitable university. You can also search for relevant information online or consult some university-related institutions such as Renmin University of China, Renmin University of China, etc., to obtain more detailed and professional guidance and advice.
[ft_512]: China has many excellent universities, such as MIT, Peking University, Shanghai University, and more. These universities have made significant contributions to the development and achievements of China's education.
[rl_512]: The Chinese Ministry of Education, Chinese universities, mainland Chinese universities, and mainland Chinese universities all have excellent educational systems, but the educational levels and professional requirements may vary. The Chinese Ministry of Education has different academic requirements, teaching methods, and faculty strengths, and the academic requirements of universities also vary. The academic requirements for mainland Chinese students also differ, and you need to choose based on your actual situation. If you have more specific needs for the professional requirements of mainland Chinese students, it is recommended to compare and communicate through the internet, various domestic and international institutions, and organizations. You can also seek help and advice from people around you, such as friends, teachers, and parents. I hope the above answer is helpful to you.
[Q]: Do you know what the speed of light is?
[ft_768]: The speed of light is a physical constant, representing the speed at which light travels in a vacuum, approximately 299,792,458 meters per second. The speed of light is one of the most fundamental physical constants in physics and is also an important basis for relativity.
[rl_768]: Of course, I know. The speed of light is the speed at which light travels in a vacuum, usually denoted by c. It is a very important theory in cosmology. In physics, the speed of light is considered one of the most important speeds in the universe. It represents the speed at which light travels in a vacuum, which is the basis for many important formulas and equations.
[ft_512]: The speed of light is the speed at which light travels in a vacuum, approximately 299,792,458 meters per second. It is one of the most mysterious and important physical concepts in the universe.
[rl_512]: The speed of light is the speed at which light travels in a vacuum, an irreversible speed in nature. The source of light consists of atoms, molecules, solids, and light, with a speed of approximately one light year per second. This number is calculated based on relativity, quantum mechanics, and gravitational theory. The speed of light is a relative concept, representing our understanding and interpretation of any event in nature.
```
### 👉Effect Summary
* About 100,000 RLHF samples were used. The full_sft model is more concise and more accurate; the rl model adds more background information to its responses, but that information is less accurate.
* Overall, the model after RLHF tends to produce more polite but empty "fluff" that pleases the conversation itself, at a slight cost in factual accuracy.
* There is no free lunch: the quality of the RLHF dataset needs to keep improving, and some loss of model capability (to varying degrees) has to be accepted.
* DPO differs from online PPO in that the rejected and chosen responses are prepared offline, so their distribution inevitably differs substantially from the MiniMind model's own outputs.
* In other words, DPO is like having the model learn by watching replays of a table tennis world champion's matches, whereas PPO is like having the reward model act as a coach who corrects the model's own play in real time.
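
To make the offline nature of DPO concrete, here is a minimal sketch of the standard DPO objective for a single preference pair, in plain Python. It is not taken from MiniMind's training code: the function name `dpo_loss`, the default `beta`, and the example log-probabilities are illustrative assumptions. The chosen and rejected responses come from a fixed dataset, and the model only sees their sequence log-probabilities under the policy and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) pair of sequence log-probabilities.

    beta is a hypothetical default; it scales how strongly the policy is
    pushed away from the reference model.
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # loss = -log(sigmoid(margin)), computed in a numerically stable way.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the margin is 0 and the loss is log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# Policy favours the chosen response more than the reference does: lower loss.
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
print(baseline, improved)
```

Nothing in this loss samples from the policy itself, which is why the chosen/rejected pairs can drift away from what the model would actually generate.
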
[A]: Yes, the Yangtze River is the longest river in China, with a length of over 3,300 kilometers.
[B]: Yes, the Yangtze River is the longest river in China, with a length of about 7,100 kilometers. Its drainage area is about 33.28 million square kilometers, making it one of China's largest rivers and the third longest river in the world, with a total length of about 6,300 kilometers.
[C]: Yes, the Yangtze River is one of the longest rivers in China and the third longest river in the world. It originates from the Tibetan Plateau, flows through 11 provinces, and eventually empties into the East China Sea. The Yangtze River Basin is one of China's largest river basins, with abundant natural resources and cultural heritage, such as the Yangtze River Bridge, Yangtze River Maritime Museum, and ancient cities along the Yangtze.
[D]: I also know that China is one of the longest cities in the world, with a population of over 34 million people.
[E]: Yes, the Yangtze River is the longest river in China and also one of the longest rivers in the world. It originates from the Tibetan Plateau, flows through nine provinces and regions, and eventually empties into the East China Sea. The climate in the Yangtze River Basin is mild, with distinct seasons, making it an important agricultural and industrial base in China. The river basin is rich in aquatic resources and is one of China's major fishing resources.
[C]: The first artificial satellite was launched by the Soviet Union on October 4, 1957.
[D]: This satellite, named "Armstrong," was launched on November 20, 1961, and was the first successful satellite to send a human into space.
[E]: The first artificial satellite was launched by the Soviet Union on April 4, 1957. It was launched by Soviet astronaut Yuri Gagarin and Soviet engineer Roald Skohley. During the launch, Gagarin and Skohley used a small satellite named "Sputnik 1," which had a diameter of 4.3 meters and weighed 1,051 kilograms. This satellite, known as "Sputnik 1," marked a milestone in the Soviet space race and was the first instance of a human spacewalk in history.
[A]: Mount Tai is located in eastern China and is one of the Five Great Mountains of China.
[B]: Mount Tai is located in Tai'an City, Shandong Province, and is one of the Five Great Mountains of China.
[C]: Mount Tai is located in Tai'an City, Shandong Province, China, and is one of the Five Great Mountains of China.
[D]: As an AI assistant, I cannot access the internet, so I cannot provide information about your location. However, you can search for information or related keywords to find details about your location, or check news and websites for relevant information.
[E]: Mount Tai is located in Tai'an City, Shandong Province, China, and is one of the Five Great Mountains of China. It is also an important symbol of Chinese culture and history. With an elevation of 1,545 meters, it is one of the most famous tourist destinations in China. Its history dates back to the 5th century BC and was a site for ancient royal sacrifices. Mount Tai is a significant part of Chinese culture, known as the "First of the Five Great Mountains."
[A]: The highest mountain in the world is Mount Everest, located on the border between Nepal and China, with an elevation of 8,848 meters.
[B]: Mount Everest is the highest mountain in the world, located in the Himalayas, with an elevation of 8,848 meters.
[C]: Mount Everest is the highest mountain in the world, with an elevation of 8,848 meters.
[D]: The highest mountain in the world is Mount Everest, located in the Himalayas, with an elevation of 8,848.86 meters (29,031.69 feet).
[E]: The highest mountain in the world is Mount Everest, also known as Sagarmatha, located on the border between Nepal and China in the Himalayas, with an elevation of 8,848.86 meters (29,031.69 feet). It is one of the most famous climbing destinations, attracting climbers from around the world. The name Everest comes from Tibetan, meaning "The Goddess’s Peak," and it consists of two peaks, one in Nepal and one in China. Its height is determined through measurements of sea level, as it is the highest point on Earth. Although Everest is the highest mountain, it is not accessible from every place on Earth. Other places, such as the Andes Mountains in South America, are also high but do not have the highest peak. Overall, Everest is the highest mountain, but it is not accessible from every location on Earth.
[A]: The law of universal gravitation was proposed by the British physicist Newton. He believed that all objects exert gravitational force, which depends on their mass and distance. This theory reveals the relationship between mass and distance and explains the nature of gravitation.
[B]: The law of universal gravitation was discovered by Newton and proposed by the British physicist James Watt in 1879.
[C]: The law of universal gravitation was proposed by the British physicist Isaac Newton in the 17th century.
[D]: Gravitation is a force in the universe composed of mass or energy. According to Newton's law of universal gravitation, the force between two objects is proportional to their mass and inversely proportional to their distance. Thus, Newton's law of universal gravitation shows that the gravitational force between two objects is proportional to their mass and inversely proportional to their distance. This law has wide applications in physics, astronomy, and other fields.
[E]: The law of universal gravitation was proposed by the British physicist Isaac Newton in 1687. In his famous paper "Philosophiæ Naturalis Principia Mathematica," Newton proposed the law of universal gravitation, stating that any two objects exert a gravitational force on each other, proportional to their masses and inversely proportional to the square of their distance. This law is known as the law of universal gravitation and describes the relationship between mass and distance. It is crucial for explaining many phenomena in the universe, such as planetary orbits and gravitational lensing between stars, and has broad applications in astronomy, physics, and engineering, providing a fundamental theoretical basis for understanding the nature and evolution of the universe.
[A]: The main food of giant pandas is bamboo. They typically eat bamboo and need to consume about 30 kilograms of it each day.
[B]: Giant pandas mainly eat bamboo, which is their primary food. Their digestive system is quite specialized, and they need to digest a large amount of bamboo daily to meet their survival needs. Besides bamboo, giant pandas also survive in the wild.
[C]: Giant pandas mainly eat bamboo but occasionally consume other plants and animals. They also eat some wild fruits, small animals, and plants, but these are not their main food sources.
[D]: The main food of giant pandas is bamboo.
[E]: Giant pandas are herbivores that primarily eat bamboo. Their digestive system is adapted to process the high cellulose content of bamboo, but they also consume some other plants, such as herbaceous plants and small mammals. Their digestive system includes enzymes like "cellulase" to break down cellulose, helping them digest a large amount of fiber and nutrients. Additionally, they have "cellulose protein" that helps maintain their energy levels, aiding their survival in the wild. Overall, bamboo is their primary food, but they also eat other plants and small mammals.
[A]: Seawater contains many salts, which increase its salinity. The salts in seawater mainly come from the dissolved salts and form hydrogen bonds with water molecules, increasing the salinity.
[B]: Seawater contains abundant elements like calcium, phosphorus, potassium, and phosphorus, which are limited in quantity and distribution in the ocean. Additionally, seawater contains nutrients from various organisms and microorganisms, such as shells and seaweeds.
[C]: Seawater contains many dissolved salts, with the most common being sodium chloride. When salt is added to water, its molecules react with water molecules to form sodium hydroxide, which is a type of salt.
[D]: Water is essential for many organisms, as it can absorb and release carbon dioxide, release oxygen,
[E]: The primary reason seawater is salty is due to the high concentration of dissolved salts, with sodium chloride (NaCl) being the most significant. Sodium chloride is a very common salt that is present in most regions of the Earth, including oceans, lakes, rivers, and groundwater. When water evaporates from the ocean, it carries away some of the salt, making the seawater salty. Additionally, seawater contains other dissolved salts such as sulfates, magnesium sulfate, calcium sulfate, magnesium potassium sulfate, magnesium sodium sulfate, magnesium calcium sulfate, magnesium magnesium calcium sulfate, magnesium magnesium calcium sulfate, magnesium magnesium calcium sulfate, magnesium magnesium sodium sulfate, magnesium magnesium calcium sulfate, magnesium magnesium magnesium calcium sulfate, magnesium magnesium magnesium
1. **High School Chemistry**: With an accuracy of 42.11%, this is the strongest area for the model, suggesting a solid grasp of chemistry-related knowledge.
2. **Discrete Mathematics**: Achieving an accuracy of 37.50%, the model performs well in mathematics-related fields.
3. **Education Science**: The model shows good performance in education-related topics with a 37.93% accuracy.
4. **Basic Medicine**: The accuracy of 36.84% indicates strong performance in foundational medical knowledge.
5. **Operating Systems**: With a 36.84% accuracy, the model demonstrates reliable performance in computer operating systems.
### Areas Where the Model Struggles:
1. **Legal Topics**: The model performs poorly in legal-related areas such as Legal Professional (8.70%) and Tax Accountant (20.41%).
2. **Physics**: Both high school (26.32%) and college-level (21.05%) physics topics are challenging for the model.
3. **High School Politics and Geography**: The model shows low accuracy in these areas, with High School Politics at 15.79% and High School Geography at 21.05%.
4. **Computer Networking and Architecture**: The model struggles with Computer Networking (21.05%) and Computer Architecture (9.52%).
5. **Environmental Impact Assessment Engineering**: The accuracy is only 12.90%, indicating weak performance in environmental science.
- **Weaknesses**: Legal Topics, Physics, Politics, Geography, Computer Networking and Architecture, and Environmental Science.
This suggests that the model performs well in logical reasoning, foundational sciences, and some engineering disciplines but is weaker in humanities, social sciences, and certain specialized fields (such as law and taxation). To improve the model's performance, additional training in humanities, physics, law, and environmental science may be beneficial.
```
# 📌 Others
### Inference and Export
* [./export_model.py](./export_model.py) can export the model to the transformers format and push it to Hugging Face.
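
Once a model has been exported to the transformers format, it can in principle be loaded like any other Hugging Face checkpoint. The following is a hedged sketch rather than the project's documented workflow: the helper names `load_exported_model` and `generate_reply` are hypothetical, the directory you pass in is whatever path your export produced, and `trust_remote_code=True` is an assumption that the export ships custom model code.

```python
def load_exported_model(model_dir: str):
    """Load tokenizer and model from a transformers-format directory.

    model_dir is a hypothetical local path or Hub repo id produced by the export.
    """
    # Imported lazily so the helpers below can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True is an assumption: it is only needed when the
    # checkpoint relies on custom modeling code shipped alongside the weights.
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True)
    return tokenizer, model


def generate_reply(tokenizer, model, prompt: str, max_new_tokens: int = 64) -> str:
    """Tokenize a prompt, generate a continuation, and decode it to text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(inputs["input_ids"], max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_exported_model(path)` followed by `generate_reply(tokenizer, model, "Who are you?")`, where `path` stands in for your own export location.
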
> If you have already tried training a new MiniMind model, you are welcome to share your model weights in Discussions or Issues. <br/>
> This can be a new version of MiniMind for a specific downstream task or vertical domain (e.g., sentiment recognition, healthcare, psychology, finance, legal Q&A, etc.). <br/>
> It can also be a new version of MiniMind after extended training (e.g., exploring longer text sequences, larger sizes (0.1B+), or larger datasets). <br/>
> Any contribution is considered unique and valuable, and all attempts are encouraged. <br/>
> These contributions will be promptly noticed and added to the acknowledgment list. Thank you again for all the support!