GPT o1: The Ultimate Debater
Since the launch of ChatGPT in late 2022, OpenAI has dominated the artificial intelligence industry, steadily releasing new models that expand the chatbot's capabilities. In March 2023, OpenAI released GPT-4, a major upgrade over its predecessors GPT-3 and GPT-3.5. Then in May 2024 came GPT-4o, which focused on faster response times and handling text, images, and audio in one model. Finally, in September 2024, OpenAI announced yet another model, GPT o1, this time built around reasoning: the model works through a deeper thought process before answering. GPT o1 raised the bar for what an AI model is capable of. If you're curious and want to learn more about this model, you're in the right place. All of GPT o1's real-world capabilities and testing results are at the bottom of this article.
Let's start with the model's ability to think. GPT o1 was designed to give users much more in-depth answers by thinking for longer before it responds. The extended time lets the model answer more accurately, drawing on the resources available to it, including the contextual information in the user's prompt. The longer thinking time can be a burden if you just want a quick response. But if you're looking for the most accurate answer, GPT o1's extra deliberation is a benefit and should give you exactly what you're looking for.
Clearly, GPT o1 takes more time to process an answer than older models, which tend to spit one out quickly. But what does it actually do with that time? To find out, users have prompted GPT o1 with challenges that few would believe an AI could handle. Below are a few examples gathered from the first week GPT o1 was open to paying users.
These examples might have you thinking it's easy for an AI model to imitate humans, but what about the things most humans can't do? During GPT o1's testing and training stage, OpenAI put the model through a variety of standardized tests and asked it complex questions that even experts struggle to solve. GPT o1 placed among the top 500 students in the United States on the AIME, a qualifier for the USA Math Olympiad, and exceeded human PhD-level accuracy on the Graduate-Level Google-Proof Q&A (GPQA) Diamond benchmark, a set of expert-written questions in biology, chemistry, and physics. In addition, GPT o1 landed in the 99.8th percentile on the Law School Admission Test (LSAT) and scored 740 out of 800 on the reading and writing section of the SAT. The model also took several AP exams and was tested on its general factual knowledge. Below you can see the scoring chart from OpenAI's research page for GPT o1.
In conclusion, OpenAI has built an incredibly capable model that excels at very complex tasks and can answer almost any question on the first try. GPT o1 is available exclusively to ChatGPT Plus subscribers, who pay $20 per month. Worth it? That's up to you to decide.