[Introduction] GPT-5.2, touted as a perfect scorer and chart-topping performer, seems to have lost its edge right after release. Many netizens say it now appears noticeably weaker than it did at launch, yet those who tested it early insist it is genuinely powerful, even worthy of being called GPT-6!
Last night, OpenAI dropped the bombshell of GPT-5.2.
According to official benchmark results, it outperforms Gemini 3 Pro on nearly every test.
GPT-5.2 excels at helping people complete economically valuable tasks, such as creating spreadsheets, making PowerPoint presentations, writing and reviewing code, analyzing long documents, and so on.
Moreover, OpenAI claims that on benchmarks such as GDPval, it matches or beats human professionals 70.9% of the time.
It can be said that this is a product OpenAI was determined to ship, even at the cost of adjusting its AGI goals, and one that bears the heavy responsibility of countering Gemini 3.
So, how does the GPT-5.2 actually feel in real-world testing?
GPT-5.2 Real-world Test: Does it Degrade in Intelligence Immediately After Launch?
Surprisingly, a post about a GPT-5.2 test failing went viral on X.
If you ask it, "How many R's are there in 'garlic'?" it answers: 0 (the correct answer is 1).
In contrast, other models performed much more stably.
Ultimately, this reflects a fundamental limitation of LLMs: because tokenization feeds the model subword chunks rather than individual characters, it cannot reliably count letters.
However, as long as you manually select the Thinking version, GPT-5.2 answers this question correctly.
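The tokenization issue above can be illustrated with a toy sketch. The vocabulary and split below are hypothetical, not GPT-5.2's actual tokenizer; the point is only that a model receives opaque subword tokens, while counting letters is trivial at the character level.

```python
# Illustrative sketch (hypothetical toy vocabulary): subword tokenization
# hands the model opaque token IDs, not individual characters.

# A toy subword vocabulary; real tokenizers (e.g. BPE) learn such splits.
vocab = {"gar": 1042, "lic": 887}

def toy_tokenize(word, vocab):
    """Greedily split a word into the longest known subword pieces."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # fall back to a single character
            i += 1
    return tokens

word = "garlic"
print(toy_tokenize(word, vocab))  # ['gar', 'lic'] -- the model sees two IDs
print(word.count("r"))            # 1 -- trivial with character-level access
```

A model reasoning over the two token IDs never "sees" the letter `r` at all, which is why chain-of-thought (spelling the word out letter by letter) often recovers the correct count.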
On Reddit, many users also reported that GPT-5.2 seemed very capable when it first launched, only to become noticeably less intelligent a few hours later.
Some people reported that their GPT-5.2 worked fine when they started using it at 8:30 in the morning, but had noticeably degraded by the time they finished a cup of coffee.
It seems that every new model gets weakened within hours of release. What exactly is OpenAI doing?
An expert testifies: It's still quite strong.
However, this minor incident did not affect the positive comments circulating among the public.
Last night, when GPT-5.2 was released, netizens were shocked.
For example, some people say that this leap in ARC-AGI 2 is truly crazy. How exactly did OpenAI achieve this?
People initially thought OpenAI had fallen behind Google, but it seems that wasn't the case!
It seems that OpenAI is still holding back a lot of impressive things that haven't been released yet.
Moreover, users who have experienced the super-powerful full-power version of GPT-5.2 have given it unanimous praise.
Wharton School professor Ethan Mollick said he was fortunate to have early access to GPT-5.2, and its performance was very impressive.
For example, consider this task: Create a visually interesting shader that can run on twigl.app, depicting an infinite neo-Gothic city of towers partially submerged in a turbulent ocean.
Many netizens praised the video, saying that GPT-5.2 not only followed the instructions but also chose a very reasonable aesthetic and structure in the code.
Then, the professor asked GPT-5.2 to create a chart of human test scores over the years.
This task is very complex because it requires searching and cross-referencing a large amount of information, then synthesizing it into a single useful output.
As you can see, GPT-5.2's performance is truly amazing.
This example of Twigl code demonstrates the powerful coding capabilities of GPT-5.2.
A major leap forward in reasoning, mathematics, and programming
The CEO of Magicpath AI stated that he had been testing GPT-5.2 for some time.
His assessment of the model was: "A major leap forward in complex reasoning, mathematics, programming, and simulation."
In this example, it builds a complete 3D graphics engine in a single file, supports interactive controls, and achieves a resolution of up to 4K.
In this video, he also performed a challenging deduction using GPT-5.2.
Some questioned whether GPT-5.2 had built the graphics engine on top of an existing library. The CEO stated that all the code and graphics were written from scratch.
In other words, GPT-5.2's progress is not incremental but a paradigm shift in what a coding assistant can do.
Netizens exclaimed: "This speed of progress is truly dizzying!"
The CEO's assessment of GPT-5.2 is: it is the best agent model launched by OpenAI, capable of running a large number of tools continuously without problems, and faster than its predecessor.
To test its functionality, he built an agent that could use GPT-5.2, 5.1, and 5 simultaneously.
The results show that GPT-5.2 does not require any preamble when calling tools, and it does not get lost even during long sessions.
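The agent behavior described above follows a standard loop: the model either requests a tool call or returns a final answer, and the loop feeds tool results back until it finishes. The sketch below stubs out the model (no real API calls); all names and message shapes are hypothetical, chosen only to show the pattern.

```python
# Minimal agent-loop sketch. The "model" is a stub standing in for a
# GPT-5.2 call; every name here is hypothetical, not a real API.

def stub_model(messages):
    """Stand-in for a model call: request a tool once, then answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"answer": f"The sum is {messages[-1]['content']}"}

TOOLS = {"add": lambda a, b: a + b}

def run_agent(question, model=stub_model, max_steps=5):
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = model(messages)
        if "answer" in reply:                            # model is done
            return reply["answer"]
        result = TOOLS[reply["tool"]](**reply["args"])   # execute the tool
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not finish")

print(run_agent("What is 2 + 3?"))  # The sum is 5
```

A long session in this design is just a growing `messages` list, which is why "not getting lost" over many tool calls is a meaningful property of the underlying model rather than of the loop itself.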
Someone even asked GPT-5.2 to write their inner world in ASCII, and the answer was quite shocking.
In summary, based on feedback from most users, GPT-5.2 can handle practical work stably, clearly, and smoothly.
Compared to the old model, which was prone to minor interruptions, GPT-5.2 has a stronger understanding of the task and completes it more smoothly.
According to the ARC Prize, GPT-5.2 Pro (X-High) sets a new state-of-the-art (SOTA) score of 90.5%, implying that AI efficiency on this benchmark has improved roughly 390-fold in one year.
The Chinese researchers behind it come to light
As in the past, many of the unsung heroes behind GPT-5.2 are Chinese.
For example, Yu Bai, a Chinese researcher at OpenAI and an alumnus of Peking University, was among the first to announce GPT-5.2.
He studied mathematics at Peking University for his undergraduate degree and earned a PhD in statistics from Stanford University.
Yun Dai, who was in charge of post-training, graduated from Tsinghua University and earned a master's degree in computer science from the University of California, Irvine.
Another Chinese researcher at OpenAI, Zuxin Liu, works on post-training of inference models.
He graduated from Beihang University with a bachelor's degree and pursued his master's and doctoral degrees at CMU.
Aston Zhang earned his PhD from the University of Illinois Urbana-Champaign and is now a researcher at OpenAI.
He thanked the team, especially emphasizing GPT-5.2 Thinking's ability to handle multi-step tasks.
In short, OpenAI delivered a powerful blow in last night's AI battle.
What will Google do next?
References:
https://x.com/skirano/status/1999182295685644366
https://x.com/emollick/status/1999185085719887978
This article is from the WeChat official account "New Intelligence" , edited by Aeneas, and published with authorization from 36Kr.





