OpenAI Introduces New 'Reasoning' Model Nicknamed o3, Skipping the o2 Name
The 12 Days of OpenAI's "Shipmas" conclude with the unveiling of o3, a cutting-edge "reasoning" model that OpenAI touts as its most sophisticated to date. While the model isn't yet available to the general public, safety researchers can apply for early access starting today.
The idea behind reasoning models is to combat the issue of chatbots frequently providing incorrect answers. Chatbots essentially don't think like humans, requiring unique strategies to emulate human thought processes.
When posed a question, reasoning models pause and evaluate related prompts that could potentially yield an accurate answer. For instance, if queried about growing habaneros in the Pacific Northwest, the o3 model may pose a series of questions to reach a conclusion, such as "where do habaneros typically grow", "what are the ideal conditions for habanero growth", and "what kind of climate does the Pacific Northwest have". Individuals familiar with chatbots are aware that sometimes, prompting a chatbot with follow-ups is necessary to get the desired result. Reasoning models are intended to handle this additional legwork.
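As a rough illustration of the habanero example above, the toy sketch below asks each sub-question first and then poses the original question with those notes attached. It is not OpenAI's implementation: the names `SUB_QUESTIONS`, `answer_with_reasoning`, and `stub_ask` are hypothetical, the sub-questions are hard-coded rather than generated, and `stub_ask` merely stands in for a real model call.

```python
# Toy illustration only: hard-coded sub-questions stand in for whatever a
# real reasoning model generates internally. All names here are hypothetical.
SUB_QUESTIONS = {
    "Can I grow habaneros in the Pacific Northwest?": [
        "Where do habaneros typically grow?",
        "What are the ideal conditions for habanero growth?",
        "What kind of climate does the Pacific Northwest have?",
    ],
}


def answer_with_reasoning(question, ask):
    """Ask each sub-question first, then ask the original with the notes attached."""
    notes = [(sub, ask(sub)) for sub in SUB_QUESTIONS.get(question, [])]
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in notes)
    prompt = f"{context}\nQ: {question}" if context else f"Q: {question}"
    return ask(prompt), notes


def stub_ask(prompt):
    # Placeholder for a model call; it only reports how much prompt it saw.
    return f"(answer based on {len(prompt.splitlines())} line(s) of prompt)"


final, notes = answer_with_reasoning(
    "Can I grow habaneros in the Pacific Northwest?", stub_ask
)
for sub_question, sub_answer in notes:
    print(sub_question, "->", sub_answer)
print("Final:", final)
```

The point of the sketch is only that the final answer is produced after the intermediate questions, which is the legwork a user would otherwise do with follow-up prompts.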
o3 is the successor to o1, OpenAI's initial reasoning model. The team opted to skip the "o2" name out of respect for O2, the British telecommunications company, but the more advanced-sounding name certainly doesn't hurt the product's image. OpenAI claims the new model allows for adjustable reasoning time. Users can select low, medium, or high reasoning time; the more compute the model is given, the better o3 performs. OpenAI plans to collaborate with researchers to "red-team" the new model in order to keep it from producing potentially hazardous responses (since again, it isn't human and doesn't discern right from wrong).
Reasoning is the buzzword of the day in generative AI, as industry insiders believe it is the key to improving the performance of large language models. Adding more compute no longer guarantees a proportional gain in performance, so new techniques are required. Google DeepMind recently unveiled its own reasoning model, Gemini Deep Research, which takes 5 to 10 minutes to generate a report that analyzes various web sources to reach its conclusions.
OpenAI is confident in o3, and its benchmark results are impressive. On Codeforces, a competitive programming benchmark that measures coding ability, o3 scored 2727. To put this into context, a score of 2400 would place an engineer in the 99th percentile of programmers. It also scored 96.7% on the 2024 American Invitational Mathematics Examination, missing just one question. We will have to see how the model performs in real-world applications, and it is still not advisable to rely heavily on AI models for critical tasks that require accuracy. However, optimists are hopeful that the accuracy issue is being addressed. This is crucial, since Google's AI Overviews in search still frequently become the target of social media mockery.
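The 96.7% figure quoted above is consistent with one missed question if the benchmark covers 30 questions. The snippet below checks that arithmetic; the total of 30 is an assumption (both 2024 AIME papers, 15 questions each) that the article itself does not state, and `aime_score` is a hypothetical helper name.

```python
def aime_score(correct, total=30):
    # total=30 is an assumption: AIME I and AIME II, 15 questions each.
    return correct / total


# One missed question out of the assumed 30.
print(f"{aime_score(29):.1%}")
```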
AI companies like OpenAI and Perplexity are competing to become the next Google, amassing the world's knowledge and helping users make sense of it all. They even offer search products designed to more closely replicate Google, with access to real-time web results.
All these players seem to outpace one another daily. The atmosphere is somewhat akin to the late '90s, when there were numerous search engines to choose from (Google, Yahoo, AltaVista, and Ask Jeeves, to name a few), all scooping up the internet's data and presenting it with a different user experience. Most of them eventually disappeared after one emerged that surpassed the others: Google.
OpenAI currently has a significant edge, boasting hundreds of millions of monthly active users and a partnership with Apple. However, Google has been earning praise lately for advancements in its Gemini models. The Verge reports that the company is planning to integrate Gemini more deeply into its search interface soon.
Whether AI can deliver accurate responses depends heavily on the development of reasoning models such as OpenAI's o3 and Google DeepMind's Gemini Deep Research. Artificial intelligence continues to shape the tech landscape, with companies like OpenAI and Google striving to outperform each other and change the way users interact with information.