Language models and tools: more choice, more responsibility
September 27, 2024
OpenAI o1 has caused quite a stir because it makes fewer errors than GPT-4o on tricky tasks. Is it the next stage in the evolution of language models, just as GPT-4 came after GPT-3.5? Not necessarily. OpenAI o1 is the first language model in ChatGPT that we choose deliberately for certain scenarios, while for others it is better to do without it.
Replace or supplement?
When GPT-4 was made available to ChatGPT users a year and a half ago, the choice was clear: everyone with a Plus subscription left GPT-3.5 behind. We used GPT-3.5 in workshops only to show that the Plus subscription was worth it. The alternatives from Google and Meta were never convincing enough for us to consider switching or recommending them in projects. We tried them out and put them aside. That only changed when Claude became available in Europe. Claude proved to be more creative and elegant than ChatGPT for copywriting and is now the first choice for some marketing tasks.
Perplexity was long considered an insider tip for research. It was able to cite sources early on, which made it easy to check its results for hallucinations: we could immediately see what was on the referenced websites. This approach came at the cost of some performance (Perplexity is a tool, so it would be wrong to speak of a language model here), but the research and the presentation of the results saved a lot of time. There were hardly any incorrect summaries, which is why Perplexity has increasingly established itself as an alternative to Google searches.
At the same time, DeepL remained the undisputed number one for translations. In terms of user-friendliness, DeepL is far ahead of the other tools.
When o1, when GPT-4o?
When OpenAI o1 was still called Strawberry, we expected it to replace GPT-4o. But OpenAI trusts us to do more: we can choose the appropriate language model within the chatbot to suit our tasks. The new OpenAI o1 is intended for mathematical, programming, and reasoning tasks. Prompting has become easier here, as the language model does not need chain-of-thought (CoT) prompting, i.e. long instructions that tell it how to solve the problem. OpenAI has already published a short guide on this. For all other tasks, GPT-4o is still the better choice, especially because this model supports functions such as file uploads, image generation and the use of custom GPTs in chats.
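The difference in prompting can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper name build_prompt, the wording of the step-by-step instruction, and the model labels are made up for this example and are not taken from OpenAI's guide.

```python
# Hypothetical helper: the instruction text and the model labels are
# illustrative assumptions, not an official OpenAI recommendation.
COT_INSTRUCTION = "Think through the problem step by step before you answer."


def build_prompt(task: str, model: str) -> str:
    """Return the prompt text to send for the given model label."""
    if model.lower().startswith("o1"):
        # o1 works through the problem on its own, so the task is passed
        # along unchanged, without extra instructions on how to solve it.
        return task
    # For other models, append an explicit chain-of-thought instruction.
    return f"{task}\n\n{COT_INSTRUCTION}"


if __name__ == "__main__":
    task = "How many weekdays are there between March 3 and March 21?"
    print(build_prompt(task, "o1"))
    print("---")
    print(build_prompt(task, "GPT-4o"))
```

The point is simply that the same task needs less scaffolding for o1, while a GPT-4o prompt may still benefit from spelling out the approach.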
What does this mean for us? GPT-4o remains the language model of choice for most tasks, while o1 comes in handy whenever a response needs to be thought through. This is an exciting innovation that gives us more options but also requires a conscious approach to language models and the various prompting techniques.
Neither chatbots nor language models are replacing each other. Instead, we need to choose the right tool and model for each task. For creative text work, we might prefer Claude, while we use Perplexity for detailed research. ChatGPT gives us the option of switching between GPT-4o and o1, depending on which model best meets our specific requirements.
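The choices described in this article can be written down as a small lookup, sketched here in Python. The task labels, the function name choose_tool, and the fallback rule are assumptions made for illustration; the mapping reflects the preferences stated in this article, not a benchmark.

```python
# Illustrative mapping of task types to (tool, model). The task labels are
# assumptions for this sketch; None means the tool has no model choice here.
ROUTES = {
    "copywriting": ("Claude", None),
    "research": ("Perplexity", None),
    "translation": ("DeepL", None),
    "math": ("ChatGPT", "o1"),
    "programming": ("ChatGPT", "o1"),
    "reasoning": ("ChatGPT", "o1"),
}

# GPT-4o remains the default for everything else.
DEFAULT = ("ChatGPT", "GPT-4o")


def choose_tool(task_type: str, needs_files_or_images: bool = False):
    """Pick a (tool, model) pair for a task type."""
    tool, model = ROUTES.get(task_type.lower(), DEFAULT)
    # File uploads, image generation and custom GPTs are GPT-4o features,
    # so fall back to GPT-4o when a task routed to o1 depends on them.
    if model == "o1" and needs_files_or_images:
        return DEFAULT
    return tool, model


if __name__ == "__main__":
    for task in ("copywriting", "research", "math", "meeting summary"):
        print(task, "->", choose_tool(task))
```

In practice this decision happens in our heads rather than in code, but writing it down shows how few rules are actually involved.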
Is this the future? Yes and no. We can expect that, among the countless AI tools, a few will emerge that achieve the status of Word or Excel. What is unlikely to last is the need to think hard about which tool we use for what. At some point, this choice will become as commonplace as it is with Office applications.