GPT-4 passes the MIT undergraduate math exam with a perfect score! This set of prompts has gone viral
By Hengyu, from Aofei Temple
Quantum Bit | Official account: QbitAI
Unexpectedly, GPT-4 has beaten the MIT math exam?!
Out of nowhere, someone made a high-profile announcement in their latest paper:
On MIT's undergraduate degree exams in Mathematics and EECS (Department of Electrical Engineering and Computer Science), GPT-4 demonstrated abilities that fully meet the graduation requirements.
And it comfortably got a perfect score!
Could it be that, in the future, no model stronger than GPT-4 will be needed to solve academic problems?
Another netizen showed off how closely they follow online trends, riffing on Yann LeCun's jab from the past couple of days that "GPT-4 is not as intelligent as a dog":
GPT-4 Aces the MIT Exam
Specifically, this is the test GPT-4 took this time:
A dataset of 4,550 problems and solutions, drawn from the coursework of MIT Mathematics Department and EECS students.
Including:
6-1: Electrical Science and Engineering;
6-2: Electrical Engineering and Computer Science;
6-3: Computer Science and Engineering;
6-4: Artificial Intelligence and Decision Making;
18-1: General Mathematics;
18-2: Applied Mathematics;
18-3: Pure Mathematics;
18-C.
Detailed classification summary for each major
The test questions all come from this MIT dataset: 228 questions were randomly generated from it, restricted to questions that involve neither images nor existing solutions.
The test-takers this time were not only GPT-4 and GPT-3.5 but also StableVicuna-13B, LLaMA-30B, and LLaMA-60B.
GPT-4 scored 100%; LLaMA-30B scored 30%.
It is worth noting that the original GPT-4, used out of the box with no tuning at all, scored 90% on this MIT exam.
Few-Shot+CoT+Self-critique+Experts.
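The technique names above (few-shot examples, chain-of-thought, self-critique, expert personas) can each be expressed as a piece of prompt text. The sketch below is only an illustration of how such pieces might be combined: the function names, the prompt wording, and the choice to assemble them into plain strings are all assumptions, not the paper's actual prompts, and no model is called.

```python
from typing import List, Tuple


def build_prompt(question: str,
                 few_shot: List[Tuple[str, str]],
                 expert: str = "") -> str:
    """Assemble a prompt from an optional expert persona (hypothetical wording),
    few-shot question/answer pairs, and a chain-of-thought cue."""
    parts = []
    if expert:
        parts.append(f"You are {expert}.")  # "Experts": role-play a specialist
    for q, a in few_shot:  # "Few-Shot": worked examples placed before the question
        parts.append(f"Question: {q}\nAnswer: {a}")
    # "CoT": ask for step-by-step reasoning
    parts.append(f"Question: {question}\nLet's think step by step.")
    return "\n\n".join(parts)


def build_critique_prompt(question: str, draft_answer: str) -> str:
    """"Self-critique": ask the model to review and revise its own draft."""
    return (f"Question: {question}\nDraft answer: {draft_answer}\n"
            "Review the draft for errors, then give a corrected final answer.")


if __name__ == "__main__":
    print(build_prompt("What is 2 + 2?",
                       [("What is 1 + 1?", "2")],
                       expert="an MIT math professor"))
```

In practice the two prompts would be sent to a model in sequence, with the first reply passed to `build_critique_prompt` as the draft.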
In addition, the research team also did engineering optimization on the prompts, with specific "spells" of their own.
Wait, is the grader GPT-4 itself?
Problems like: "Xiao Ming planted 5 lemon trees and gets 6 lemons from each tree every year. How many lemons does he get in total over 10 years?"
Early last year, joint research from MIT, Harvard, Columbia University, and the University of Waterloo showed that by converting math problems into equivalent programming problems, OpenAI's Codex, a sibling model of GPT-3, could handle advanced mathematics and reach MIT undergraduate level.
It learned from example questions randomly drawn from 6 MIT undergraduate foundational math courses, with 25 randomly selected questions for each of the 6 courses, plus 60 questions at the ACT (American college entrance exam) level.
That is 210 questions in total.
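The idea of recasting a math problem as an equivalent programming problem can be illustrated with the lemon-tree question quoted earlier. What follows is a hand-written sketch, not Codex output; the function name and parameter names are made up for illustration.

```python
def total_lemons(trees: int, lemons_per_tree_per_year: int, years: int) -> int:
    """Total harvest = number of trees * lemons per tree per year * number of years."""
    return trees * lemons_per_tree_per_year * years


if __name__ == "__main__":
    # 5 trees, 6 lemons per tree per year, 10 years
    print(total_lemons(trees=5, lemons_per_tree_per_year=6, years=10))  # prints 300
```

Once the word problem is written this way, answering it reduces to running the program, which is the kind of task a code model is built for.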
So this time GPT-4 performed remarkably well; "wonderful" hardly begins to cover it~
There are two main points of criticism.
This also means it is impossible to prove that the 4,550 problems and solutions in the dataset are absent from GPT-4's training set.
The second point of criticism: GPT-4's final 100% score seems off somehow???
On closer inspection, there is a crucial point in Section 2.6 of the paper:
Given a question Q, a solution S, and an LLM answer A, GPT-4 does the grading.
GPT-4 assigns scores from 0 to 5.
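Grading by a model, as questioned under this heading, generally means building a grading prompt from the question, a reference solution, and the candidate answer, then parsing a numeric score out of the reply. The sketch below assumes a 0-5 scale. The prompt wording, the function names, and the `Score: N` reply format are illustrative assumptions rather than the paper's actual grader, and no real model API is called.

```python
import re
from typing import Optional


def build_grading_prompt(question: str, solution: str, answer: str) -> str:
    """Build a (hypothetical) grading prompt from Q, S and A."""
    return (
        "Grade the answer against the reference solution on a scale of 0 to 5.\n"
        f"Question: {question}\n"
        f"Reference solution: {solution}\n"
        f"Answer to grade: {answer}\n"
        "Reply in the form 'Score: N'."
    )


def parse_score(reply: str) -> Optional[int]:
    """Extract an integer score from 'Score: N'.

    Returns None when no score is found or it falls outside 0-5.
    """
    match = re.search(r"Score:\s*(\d+)", reply)
    if match is None:
        return None
    score = int(match.group(1))
    return score if 0 <= score <= 5 else None


if __name__ == "__main__":
    print(build_grading_prompt("What is 2 + 2?", "4", "4"))
    print(parse_score("Score: 5"))
```

Because the grader in such a setup is itself a model, any score it produces inherits the grader's own mistakes, which is the concern raised here.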
Some even called for handing these questions to MIT math and EECS students and continually feeding them "good prompts", so that human students could also reach 100%.
One More Thing
A small Easter egg:
Across the whole test, StableVicuna-13B, which can basically be deployed and run on a laptop, also scored 48%.
LLaMA-65B10MIT fine-tuingLLaMA-30B.
It makes one think about the correlation between model size and capability.
Reference link:
[1] https://arxiv.org/abs/2306.08997
[2] https://twitter.com/johnjnay/status/1669687958960586753
[3] https://twitter.com/arankomatsuzaki/status/1669528841629601792
[4] https://twitter.com/emollick/status/1669742473097228292
- End -