New GPT-5 news is coming! GPT-5-Alpha has been internally tested by the Cursor team and can complete almost any task in one go. Perplexity has completed preparations for the GPT-5 release on its website. Microsoft engineers are also working intensively on GPT-5, which will be released soon in Copilot. GPT-5 is truly getting closer.
Every time I open my eyes, I can feel that GPT-5 is a little closer.
Just now, another wave of news about GPT-5 was revealed.
First, GPT-5-Alpha has been tested internally by the Cursor team. The model performs amazingly well, and can solve almost any task in one go.
For example, the model successfully completed the following "Aquarium Game" challenge.
And just a few hours ago, the GPT-5-Auto and GPT-5-Reasoning models were discovered in the macOS ChatGPT app.
Some sharp-eyed netizens noticed the word "reasoning" in the leaked strings and speculated that GPT-5 may already incorporate an o-series-style reasoning model.
Meanwhile, Perplexity has completed preparations for the release of GPT-5 on its website. Once GPT-5 is released, Perplexity Pro users will be able to use it immediately.
Microsoft is ready to release GPT-5 in its AI suite
At the same time, it has been discovered that Microsoft's Copilot Smart Mode will be powered by GPT-5.
There’s some speculation that the router portion of GPT-5 may already be rolling out.
In short, Microsoft is preparing to roll GPT-5 out across its AI suite, including Copilot (the consumer version), Microsoft 365 Copilot (the enterprise/work version), and Azure (for enterprise/API customers).
It is reported that the Copilot app for Windows 11 has an "intelligent" mode that switches between GPT-5's reasoning and non-reasoning modes depending on the query.
Even the free version of Windows 11 Copilot will get this GPT-5-based "intelligent" mode, so GPT-5 will not be limited to paying users.
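Conceptually, such a "smart mode" is just a router sitting in front of a reasoning model and a fast model. Below is a minimal Python sketch of the idea; the heuristic and function names are invented for illustration, since neither OpenAI nor Microsoft has disclosed how the real router decides.

```python
# Hypothetical sketch of a "smart mode" router. Every name and heuristic here
# is an illustrative assumption, not OpenAI's or Microsoft's actual design.

REASONING_MODEL = "gpt-5-reasoning"  # model name taken from the macOS app leak above
FAST_MODEL = "gpt-5-auto"            # likewise

def looks_hard(query: str) -> bool:
    """Crude stand-in for whatever classifier the real router would use."""
    hard_markers = ("prove", "step by step", "debug", "optimize", "why")
    return len(query) > 500 or any(m in query.lower() for m in hard_markers)

def route(query: str) -> str:
    """Send hard-looking queries to the reasoning model, the rest to the fast one."""
    return REASONING_MODEL if looks_hard(query) else FAST_MODEL

print(route("What's the capital of France?"))                    # -> gpt-5-auto
print(route("Prove that sqrt(2) is irrational, step by step."))  # -> gpt-5-reasoning
```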
Microsoft engineers are working intensively to prepare for GPT-5
At the same time, The Verge has just published an article reporting that Microsoft is preparing to launch a new Copilot intelligent mode powered by GPT-5.
According to Microsoft insiders, Microsoft is testing GPT-5's intelligent mode in the consumer and commercial versions of Microsoft 365 Copilot, which is consistent with the above revelations.
And it’s said that in the consumer version, the AI in this mode can “think deeply or quickly, depending on the task,” so users don’t need to choose a model.
This is also consistent with ideas previously revealed by Altman.
“We hate the model selector as much as you do and want a return to the magic of unified intelligence,” Altman said in February.
At the same time, GPT-5 will incorporate the o3 model, rather than o3 being released as a standalone version.
The Verge reporter speculated that this intelligent mode appeared in Copilot ahead of time because Microsoft engineers were preparing for the release of GPT-5.
In short, if OpenAI's preparations for GPT-5 go smoothly, Copilot's intelligent mode will soon appear in front of everyone.
The above is today's fresh wave of GPT-5 clues.
One netizen lamented: large models now iterate so quickly that the marketing can't keep up with the releases.
OpenAI researcher: I believe in AGI again
GPT-5 now sits in the eye of the storm. At exactly this moment, OpenAI researcher Alexey Guzey published an article titled "Why I Believe in AGI Again."
Perhaps in it we can glimpse some of the signs surrounding GPT-5.
The following is the main idea of the article.
Why do I believe in AGI again now?
First, I am now convinced that ChatGPT understands what it reads. Second, the reasoning model convinces me that ChatGPT is creative. Third, ChatGPT summarizes text extremely well, which I consider a reliable measure of intelligence.
At the same time, I don’t believe in “general intelligence”, so I think the concept of AGI is meaningless.
Finally, AI products are now able to fund AGI research, which means that AI has reached the early stages of a self-improvement cycle.
Therefore, many discussions about the timeline for "AGI" and "superintelligence" are outdated.
ChatGPT understands what it reads
To me, AGI is about understanding. Does ChatGPT truly understand what it reads, or is it just a Chinese Room that mindlessly maps inputs to outputs?
I now think it does understand.
What really convinced me was this tweet: someone mocked OpenAI's o1 model for failing to notice that the logic puzzle it had been given actually had a very simple solution.
I first saw this tweet in September 2024.
In April 2025, I came across it again, so I tried to see whether the o3 model would make the same mistake.
Sure enough, o3 failed too.
But then I thought, what if the model did understand the puzzle, but just wasn’t paying enough attention to it? So I asked it to read the puzzle more carefully, and it solved the problem.
Once it read the puzzle carefully, o3 had no trouble solving a new puzzle it had never encountered before.
Or take this greentext from GPT-3:
Can you read this greentext and still call ChatGPT a meaningless stochastic parrot? I can't.
To me, these examples are very convincing evidence that ChatGPT is a truly intelligent entity and that we are making steady progress on the path to building AGI.
For example, take the classic question: "Which is bigger, 9.11 or 9.9?"
We used to laugh at AI over this every time, but recently I thought it through and concluded that this is a problem of context rather than a problem of intelligence.
There are indeed many cases where 9.11 is larger than 9.9 (books, academic papers, software versions), and it is not unreasonable for ChatGPT to assume that 9.11 is larger than 9.9 if no other information is provided.
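The ambiguity is easy to demonstrate: the same two strings order one way as decimal numbers and the other way as version numbers. A quick Python illustration:

```python
# As decimals, 9.9 is larger; as version numbers, 9.11 is larger.
print(float("9.11") > float("9.9"))   # False: numerically, 9.11 < 9.9

def version_key(v: str) -> tuple:
    """Split a version string like 9.11 into (9, 11) so components compare as integers."""
    return tuple(int(part) for part in v.split("."))

print(version_key("9.11") > version_key("9.9"))  # True: as a version, 9.11 > 9.9
```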
The fact that every time I mocked ChatGPT, it would mock me back six months later, made it increasingly hard for me to keep believing it was stupid.
LLMs are creative
Or: LLMs + RL = intelligence's Move 37.
Someone once asked: What should we make of the fact that, although these models have memorized almost all known facts about the world, they haven't made any new discoveries?
The application of reinforcement learning to LLMs convinces me this isn't a problem, because wherever reinforcement learning works, it finds novel, creative solutions. Chess. Go. Mathematics. Physics simulations. Video games. The same story every time.
RL = Creativity. That’s the paradigm.
In particular, when I read Karpathy's Twitter post about Move 37, everything clicked into place.
Move 37 refers to the moment when an AI, trained through RL's process of trial and error, discovers a move that is novel, surprising, and strikes even experts as intelligent.
This is a magical, slightly unsettling phenomenon of emergence that is only possible through large-scale reinforcement learning. You can’t do it by imitating experts.
That was the scene when AlphaGo played move 37 in the second game against Lee Sedol: a strange move that a human player would make with a probability of only about one in ten thousand, yet one that, in hindsight, was full of creativity and ingenuity, and ultimately led to the AI's victory.
Now, with the emergence of a new batch of "thinking" LLMs (such as OpenAI-o1, DeepSeek-R1, and Gemini 2.0 Flash Thinking), we are beginning to see the first glimmers of something similar in open-world domains.
These models, in the process of trying to solve a variety of different math/code/etc problems, discover strategies similar to human inner monologue that are difficult (or impossible) to program directly into the model.
I call these “cognitive strategies”—things like approaching the problem from different angles, trying different ideas, looking for analogies, backtracking, revisiting the situation, and so on.
As strange as it sounds, an LLM has the potential to uncover better ways of thinking, solutions to problems, ways of connecting ideas across disciplines, and to do so in ways that, in hindsight, we find surprising, bewildering, yet creative and intelligent.
As X. Dong of Nvidia points out, “Models trained with reinforcement learning have made tremendous progress on problems that base models could never understand no matter how many times they tried.”
Decades ago, when Tyler Cowen saw AI beat humans at chess, he assumed it would also beat us at other intuitive things.
Compression is intelligence, and ChatGPT's compression capabilities are excellent
When I want to find out how smart someone is, I often ask them: What is the main point of a certain article?
What this question actually demands is compressing many words (an article) into a few words (a sentence) while discarding everything unimportant.
If this person can do this, I would consider them very smart.
So when I asked ChatGPT the same question and it succeeded where my smartest friend failed, I concluded: it is really good at compression, and really smart.
(This is similar to the idea that predictive power is closely related to intelligence metrics. I believe compression and prediction are two sides of the same coin, so by learning to predict the next word during training, ChatGPT also naturally learns how to compress text well.)
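The prediction-compression connection the author gestures at has a standard form: an arithmetic coder driven by a predictive model spends about -log2 p(token) bits per token, so a model's summed next-token log-loss is, up to rounding, the size of the compressed text. A toy Python sketch with a made-up distribution, purely to show the identity:

```python
import math

# Made-up next-token probabilities over a tiny vocabulary (illustrative only).
p = {"the": 0.6, "cat": 0.3, "sat": 0.1}
text = ["the", "cat", "sat"]

# An ideal arithmetic coder spends -log2 p(token) bits per token, so the total
# compressed size equals the model's summed log-loss on the text: better
# prediction means fewer bits, i.e., better compression.
bits = sum(-math.log2(p[tok]) for tok in text)
print(f"compressed size ≈ {bits:.2f} bits")
```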
Why I think the idea of AGI is stupid
Because I don’t believe in “general intelligence.” AGI is usually defined in some way with reference to humans, and I consider myself to be a very narrow kind of intelligence.
For example, I can't calculate 3289 times 5721 in my head. In fact, I can't even calculate 328 times 572! I can just barely manage 32 times 57, and I get it wrong 20% of the time.
I can figure out why my girlfriend is mad only about 10% of the time; at that task, I'm roughly as capable as an abacus.
Do I possess general intelligence? For me, the answer is clearly no.
I can learn how to do a few things here and there. I can learn to do more by imitating others or using my computer. But that's it.
Human civilization and our technological advancements are not driven by any kind of general intelligence! Rather, they are driven by our ability to learn how to do one thing here and another thing there.
It doesn't matter whether AGI exists or not
(Interestingly, I read this sentiment in a recent Altman blog post.)
For decades—and until a few years ago—funding AGI research relied on dreams and visions.
This is why the AI industry has experienced periodic downturns; this is why no AGI company has survived for more than a few years; and this is why AGI was once a derogatory term associated with the “lunatic fringe,” as the DeepMind co-founders put it.
Today, for the first time, the economic value created by AGI research is sufficient to support further AGI research. Hundreds of millions of people use ChatGPT every day, and millions more pay for it.
AI products pay for AGI research because they are useful to us.
Without humans, AI cannot generate economic value, because the very concept of economic value is about what humans find valuable. Therefore, AI’s ability to improve itself is entirely dependent on it continuing to be useful to us.
AI's progress has been smoother than anyone expected, and its "usefulness" has depended more on non-intelligent scaffolding than anyone expected.
So I think it really doesn’t matter when we build “AGI” or “superintelligence” anymore. AI research can be self-sustaining = AI is already here and will continue to improve.
I often think about the technologies of the past that have truly changed the world.
For example, the printing press completely reshaped the world between the 16th and 19th centuries, triggering the Reformation, the Thirty Years' War, the formation of the modern nation-state, and ultimately the Scientific and Industrial Revolutions. That technology itself contained no intelligence.
But it changed the way information is transmitted, dramatically increased the power of ideas and the value of literacy, and gave us powerful new abilities in the pursuit of our goals.
Artificial intelligence (AI) has already dramatically increased our ability to impact the world, both individually and collectively. How we use this power is up to us.
References
https://x.com/testingcatalog/status/1950197904024436812
https://x.com/testingcatalog/status/1950534140974993578
https://x.com/WindowsLatest/status/1950641135602610466
https://www.theverge.com/notepad-microsoft-newsletter/715849/microsoft-copilot-smart-mode-testing-notepad
https://x.com/alexeyguzey/status/1950632413870305637
This article comes from the WeChat public account "Xinzhiyuan" (author: Aeneas) and is published by 36Kr with authorization.