IDEA Research Institute Open-Sources Its Ziya2-13B Large Model, Debuting on the ModelScope Community
On October 16, the CCNL Fengshenbang team at the IDEA Research Institute (Guangdong-Hong Kong-Macao Greater Bay Area Digital Economy Research Institute) reportedly open-sourced the Chinese base model Ziya2-13B-Base and its dialogue model Ziya2-13B-Chat. Both models are completely free, available for commercial use, and made their debut on the ModelScope community.
Three months earlier, Meta open-sourced the Llama2 series of large models, whose 7B, 13B, and 70B versions were all trained on a dataset of over 2 trillion tokens. Starting from Llama2-13B, the Fengshenbang team continued pre-training on a self-built, high-quality Chinese and English dataset of 650B tokens, producing the Ziya2-13B series to compensate for Llama2's weak Chinese-language ability.
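To put the two token counts in perspective, the short Python sketch below adds the continued pre-training budget to the Llama2 figure. It is only illustrative arithmetic: the article says Llama2 used "over 2 trillion" tokens, so the 2T constant is a lower bound, and the function and constant names are hypothetical.

```python
# Illustrative arithmetic only; names are hypothetical, not from the Ziya2 release.
LLAMA2_PRETRAIN_TOKENS = 2_000_000_000_000  # "over 2 trillion" per the article (lower bound)
ZIYA2_CONTINUED_TOKENS = 650_000_000_000    # 650B tokens of continued pre-training


def cumulative_tokens(base: int, continued: int) -> int:
    """Total tokens seen after continued pre-training on top of a base model."""
    return base + continued


def relative_increase(base: int, continued: int) -> float:
    """Continued-training tokens as a fraction of the base model's budget."""
    return continued / base


if __name__ == "__main__":
    total = cumulative_tokens(LLAMA2_PRETRAIN_TOKENS, ZIYA2_CONTINUED_TOKENS)
    extra = relative_increase(LLAMA2_PRETRAIN_TOKENS, ZIYA2_CONTINUED_TOKENS)
    print(f"cumulative tokens: at least {total / 1e12:.2f}T")
    print(f"additional data relative to Llama2: about {extra:.1%}")
```

Under these assumptions, the continued pre-training adds roughly a third more tokens on top of what Llama2-13B had already seen.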
In May of this year, the Fengshenbang team released and open-sourced Ziya-LLaMA-13B, built on the first-generation LLaMA, which quickly became the best Chinese base model in the LLaMA ecosystem. Compared with Ziya-LLaMA-13B, Ziya2-13B-Base starts from a lower initial training loss, trains 38% faster, and resolves the instability seen in the later stages of training.
Evaluation results show that Ziya2-13B-Base significantly outperforms both Llama2-13B and Ziya-LLaMA-13B on downstream tasks covering Chinese, English, mathematics, and code understanding.
Building on the strong base capabilities of Ziya2-13B-Base, the Fengshenbang team optimized its training strategy for the supervised fine-tuning (SFT) stage. Starting from the Ziya2-13B-Base model pre-trained with 300B tokens, the team trained the dialogue model Ziya2-13B-Chat on approximately 400,000 instruction samples with an 8K context window. The team then applied reinforcement learning against reward models trained on tens of thousands of high-quality human preference samples covering Q&A, writing, and safety tasks, bringing the output of Ziya2-13B-Chat more in line with human preferences and making it safer.
Evaluation results show that Ziya2-13B-Chat achieved a 66.5% win rate against Ziya-LLaMA-13B-v1.1 in side-by-side evaluation, and a 58.4% win rate against its own version before reinforcement learning from human feedback.
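For readers unfamiliar with side-by-side evaluation, the sketch below shows one common way a win rate can be computed from pairwise judgments. It is a minimal illustration, not the Fengshenbang team's evaluation code: the article does not say how ties were handled, so the tie option is an assumption, and the sample judgments are made-up toy data.

```python
from collections import Counter
from typing import Iterable


def win_rate(judgments: Iterable[str], count_ties_as_half: bool = False) -> float:
    """Win rate of model A over model B from pairwise judgments.

    Each judgment is "win", "loss", or "tie" from model A's point of view.
    How ties are scored is an assumption; the article does not specify it.
    """
    counts = Counter(judgments)
    unknown = set(counts) - {"win", "loss", "tie"}
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = counts["win"] + counts["loss"] + counts["tie"]
    if total == 0:
        raise ValueError("no judgments provided")
    score = counts["win"] + (0.5 * counts["tie"] if count_ties_as_half else 0.0)
    return score / total


if __name__ == "__main__":
    # Toy data for illustration only, not the actual evaluation results.
    toy = ["win"] * 7 + ["loss"] * 2 + ["tie"]
    print(f"ties ignored as wins: {win_rate(toy):.1%}")
    print(f"ties count as half:   {win_rate(toy, count_ties_as_half=True):.1%}")
```

A win rate above 50% in this setup means evaluators preferred the first model's answer more often than the second's.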
The ModelScope community has built a demo interface on top of Ziya2-13B-Chat that ordinary users can try directly. ModelScope's official account has also published a best-practice tutorial covering deployment, inference, and fine-tuning of the model for developers' reference.
As an important partner of the ModelScope community, the Fengshenbang team debuts all of its open-source large models on ModelScope, and these models are widely popular among developers.
Alibaba Cloud's ModelScope is the largest and most active AI model community in China, hosting over 1,200 high-quality AI models contributed by more than 30 leading AI institutions. It offers one-stop services including model demos, download, inference, optimization, and customization. Total model downloads have exceeded 85 million.