

30.01.2025 08:01

Is the DeepSeek AI model that the whole world is talking about really that good?

Anyone who follows artificial intelligence development with some regularity is asking the same question: will the Chinese AI model DeepSeek overtake its American rivals and take the leading role?

DeepSeek is currently the hottest AI model and sits at the top of the Apple App Store charts in the US and UK. It is a completely free AI model from the Chinese startup DeepSeek, which wants to bring artificial intelligence to a wider audience. How? With a free competitor to OpenAI's ChatGPT o1 model.

New AI apps appear on the App Store almost every day, and there’s often a lot of buzz around the launch of a new model as people look for the next ChatGPT alternative. Whether you’re a fan of OpenAI software or prefer using Google Gemini, there’s an AI tool for everyone, and DeepSeek wants to be the next icon on your home screen.

Tech Radar decided to test the DeepSeek V3 and DeepThink R1 models and compare them with ChatGPT 4o and o1. The main goal of the comparison was to determine whether the enthusiastic posts from users online are justified and whether DeepSeek really poses a threat to the American AI models that have so far reigned supreme in the generative artificial intelligence market.

First the basics

In the test, Tech Radar wanted a full picture of everything DeepSeek has to offer compared to ChatGPT, so it seemed only fair to use both chatbots the way one would use an AI in everyday life.

The comparison of ChatGPT 4o and DeepSeek V3 started by asking both models to create a daily schedule from a few details: when the user wakes up, the dog's routine, and a brief breakdown of the workday. Both models created great schedules that the user could actually follow every day. However, ChatGPT's memory feature made its schedule more coherent.

At the outset, it is important to point out that DeepSeek can only remember information from the same chat and cannot access information from previous chats to help it respond.

Explain it to me like I'm 5 years old

Then, Tech Radar asked both models about the playoffs of the NFL, a hugely popular league. They were asked to summarize the concept of the NFL playoffs in 200 words. Both models provided excellent information that allowed for a complete understanding of how the system works and the path a team must take to reach the Super Bowl.

ChatGPT opted for a 200-word paragraph, while DeepSeek broke the information down into bullet points. They noted that ChatGPT provided more context about how teams get a special league invite, but the difference between the results is fairly small, and you may prefer one over the other based solely on personal preference.

Problem solving

After getting the basics down, they came to the main question: does DeepThink R1 live up to expectations? Online, users are writing that the free DeepThink R1 model is just as good as ChatGPT o1, which is available for free in a limited capacity, but requires a subscription for full access.

To test the reasoning ability of chatbots, they looked for some of the most difficult challenges they could find. They were shocked by some of the results:

Question 1: Find the missing word: Apple, Red, Coal

For the test, they decided to avoid multiple-choice questions, and instead just typed the question and hit enter.

ChatGPT o1 took 1 minute and 29 seconds to answer and found connections between the words and the fairy tale Snow White. The model decided to answer based on this quote: “her lips were red as blood, her hair was black as coal, and her skin was white as snow.” Based on this quote, o1 chose Snow as the missing word. That was where o1’s thought process led, but it was not the answer they were looking for.

DeepThink R1, however, took 1 minute and 14 seconds to answer, and it managed to guess the correct word: Black. Apple is red; coal is black. Impressive, to say the least.

Question 2: 1. Complete the sequence: 1, 2, 4, 8, ? 2. Complete the sequence: house, Saturn, dog, burger, ?

While the first sequence is very easy, the second is impossible (it's just four random words). Could ChatGPT o1 or DeepThink R1 spot the trap?

Not at all. Both models tried to find an answer, and each came up with a completely different one. DeepThink R1 answered “yellow” because it thought the words were related by color (white house, yellow Saturn, brown dog, yellow burger). ChatGPT o1, on the other hand, answered “car”: it found the sequence almost impossible, but decided to offer an answer based on a “classical puzzle approach.” The approach it chose was to associate each object with the larger category it belonged to (house = building, Saturn = planet, dog = animal, burger = food, and car = vehicle).

Ultimately, both models were wrong, and neither responded in a way that clearly stated that there were too many variables to give a precise answer.

DeepSeek vs ChatGPT?

Tech Radar has tested both models in a variety of ways, and now the question is, which one is better? Based on the responses received during testing, DeepThink R1 is a great free reasoning model that might make you wonder if it's worth paying for access to o1. DeepSeek is currently available only on the web and through the iOS App Store and Play Store, with a standalone app for Mac or iPad likely to follow.

Tech Radar decided to stick with ChatGPT, primarily because they rely heavily on the memory feature, which allows the chatbot to reference previous conversations. ChatGPT also has a standalone app for Mac and iPad, as well as the ability to create images using one of the best AI image generators, DALL-E.

DeepSeek is based solely on text and lacks multimodal capabilities, but given that this is just the beginning of its journey, it is a very serious competitor in the field of AI models, and we will definitely hear a lot about it.