DeepSeek R1 vs. ChatGPT o1: A Deep Dive into Reasoning Models
- Bryan Wilks
- Sep 11
- 5 min read
This weekend, I decided to put two powerful AI reasoning models, DeepSeek R1 and ChatGPT o1, head-to-head, running them through ten different reasoning prompts to see how they stack up. These models aren't for every task, but they really shine when a problem calls for a step-by-step thought process.
Understanding the Models and Their Data Policies
Before we jump into the prompts, let's clarify what we're working with. First up is DeepSeek's official chat website. To use the R1 model, you need to make sure it's toggled on; otherwise, it defaults to the V3 model, which is more comparable to GPT-4. For this comparison, we're focusing specifically on the reasoning capabilities of R1.
DeepSeek is an open-source large language model. This means you can actually download versions of it and run it privately on your own computer. The version on their website, however, isn't private, and your data is stored by the company. We'll look at how both DeepSeek and ChatGPT handle your data later on.
ChatGPT's o1 model, especially if you use it for work or upload data for analysis, has different privacy settings. On paid plans, like the Teams plan I'm using, your data is excluded from model training by default. Both the free version of DeepSeek and the free version of ChatGPT use your data for training unless you opt out. Keep in mind that accessing ChatGPT o1 requires a plan starting at $20 a month, while DeepSeek is currently free.
Testing Reasoning: Light Bulbs and Logic
Our first prompt tested multi-step reasoning and logic: "You have a row of 100 light bulbs, all initially off. When you pass through the row the first time, you toggle every bulb (turn them all on). On your second pass, you toggle every 2nd bulb. On your third pass, you toggle every 3rd bulb, and so on, up to the 100th pass. Which bulbs end up turned on?"
Both models went to work. DeepSeek showed its thinking process in a detailed grayed-out section before revealing the answer, and ChatGPT o1 also provided a step-by-step view. The result? Both models correctly identified the bulbs at perfect-square positions (1, 4, 9, 16, 25, 36, 49, 64, 81, and 100) as the ones left on, since those are the only positions with an odd number of divisors. ChatGPT o1 took 10 seconds for its thinking process, while DeepSeek's detailed breakdown took longer, but both answers were correct.
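The puzzle is easy to verify with a short brute-force simulation (a quick sketch I wrote to check the answer, not output from either model):

```python
# Simulate the 100-bulb puzzle: pass i toggles every i-th bulb, so bulb k
# is toggled once per divisor of k and ends up on exactly when k has an
# odd number of divisors, i.e. when k is a perfect square.
def bulbs_left_on(n=100):
    bulbs = [False] * (n + 1)  # index 0 unused
    for step in range(1, n + 1):
        for k in range(step, n + 1, step):
            bulbs[k] = not bulbs[k]
    return [k for k in range(1, n + 1) if bulbs[k]]

print(bulbs_left_on())  # [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```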
Math Problems with a Twist
Next, a math word problem: "A horse costs $50, a chicken costs $20, and a goat costs $40. You bought 4 animals for a total of $140. Which animals (and how many of each) did you buy?"
DeepSeek, with R1 enabled, provided both valid combinations: two horses and two chickens, or one chicken and three goats. Its thinking process was extensive, taking 76 seconds. ChatGPT o1, however, provided only one solution: one chicken and three goats. This is a clear win for DeepSeek, which took the more thorough approach of finding all possible answers.
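With only 4 animals and 3 species, exhaustively enumerating the combinations confirms there are exactly two solutions (my own quick check, not model output):

```python
# Enumerate every way to buy exactly 4 animals (horse $50, chicken $20,
# goat $40) for exactly $140.
solutions = []
for horses in range(5):
    for chickens in range(5 - horses):
        goats = 4 - horses - chickens
        if 50 * horses + 20 * chickens + 40 * goats == 140:
            solutions.append((horses, chickens, goats))

print(solutions)  # [(0, 1, 3), (2, 2, 0)]
```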
Advanced Domain Knowledge: Physics
We then moved to a physics problem involving special relativity: "A spaceship traveling at 0.8c launches a probe forward at 0.3c (relative to the spaceship). According to special relativity, how fast is the probe moving relative to an outside observer at rest? (Use the relativistic velocity addition formula.)"
Both models arrived at the correct answer of roughly 0.887c. However, the difference in their thinking processes was stark. DeepSeek produced an incredibly detailed, 137-second breakdown of its calculations. ChatGPT o1, on the other hand, reached the correct answer in just 7 seconds, with minimal explanation of its steps.
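The relativistic velocity addition formula, w = (u + v) / (1 + uv/c²), makes this a two-line check (my own verification, not either model's output):

```python
# Relativistic velocity addition, with speeds as fractions of c,
# so the c^2 in the denominator drops out.
u, v = 0.8, 0.3
w = (u + v) / (1 + u * v)
print(f"{w:.4f}c")  # 0.8871c
```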
Creative Interpretation and Thought Experiments
For creative interpretation, we asked for a fable about a fox and a crow, retold from an onlooker's perspective. Both models handled this well, producing consistent narratives.
We also tackled a classic thought experiment: "Which came first, the chicken or the egg?"
Both models concluded that the egg came first, with DeepSeek offering a slightly simpler explanation. This seemed like a tie.
Testing Step-by-Step Reasoning: The Missing Money Trick
Here's a classic trick question: "The restaurant bill for 3 people was $45. They each paid $15, so they paid $45 in total. The waiter put $5 in his pocket and gave $1 back to each of them. Therefore, they each ended up paying $14, which sums to $42, plus the $5 in his pocket equals $47. What happened to the missing $3?"
ChatGPT o1 solved this in 7 seconds, correctly stating that no money was missing and all accounts balanced. DeepSeek took 80 seconds but also reached the correct conclusion. ChatGPT was significantly faster here.
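The "missing" money is a double-counting fallacy: the waiter's $5 is already inside the $42 the diners are out, so adding it on top is meaningless. A few lines of arithmetic (assuming the standard setup of this riddle, where the waiter returns $1 to each diner and keeps $5) show the books balance:

```python
paid = 3 * 15            # $45 handed over
returned = 3 * 1         # waiter returns $1 to each diner (assumed setup)
waiter = 5               # pocketed by the waiter
net_paid = paid - returned       # $42: what the diners are actually out
restaurant = net_paid - waiter   # $37: what the restaurant kept
# Correct accounting: restaurant + waiter == net_paid, so nothing is missing.
print(net_paid, restaurant + waiter)  # 42 42
```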
Ambiguity and Nuance
We tested ambiguity with the sentence: "I didn’t say she stole my money." We asked the models to interpret it in at least four ways by emphasizing different words.
Both models provided similar, accurate interpretations, highlighting different meanings based on emphasis. This was a tie.
Simple Questions, Complex Answers?
We then threw some seemingly simple questions at them:
How many R's are in "strawberry"? Both DeepSeek and ChatGPT correctly answered three R's. DeepSeek took 24 seconds, while ChatGPT was faster.
Which came first, the chicken or the egg? As mentioned, both said the egg.
Which number is bigger: 9.11 or 9.9? This is where ChatGPT o1 stumbled, incorrectly stating 9.11 was larger. DeepSeek correctly identified 9.9 as the bigger number. This is a point for DeepSeek.
We even tried misspelling "strawberry" to see if they'd catch it. Both models correctly identified four R's in the misspelled word, though ChatGPT was faster.
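For comparison, these questions are trivial for a few lines of ordinary code, a reminder that token-based models and deterministic programs fail in very different places (my sketch, not model output):

```python
# Count the R's directly instead of reasoning over tokens.
print("strawberry".count("r"))  # 3

# Compare as numbers, not version-style strings: 9.9 means 9.90, not 9.09.
print(max(9.11, 9.9))  # 9.9
```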
Functionality: Search and Privacy
DeepSeek's website has a useful search feature that lets it pull in up-to-date information, which matters since its base training data is older. ChatGPT's search feature, however, was grayed out when using the o1 model. There are third-party services like Perplexity.ai that combine ChatGPT's reasoning with search, but that's an additional cost.
Privacy Policies Compared
We asked both models to compare their privacy policies and create a table of pros and cons. Both provided detailed tables.
DeepSeek Cons:
- Collects keystroke patterns (privacy risk).
- Vague about how input data is used for training.
- Data stored in China.
- Less transparent about model training and global user rights.
OpenAI (ChatGPT) Cons:
- Broad collection of content, files, audio, and images.
- Data stored in various jurisdictions, often the US.
- Complicated compliance with cross-border data transfers.
- Technical jargon in privacy disclosures.
Key takeaways regarding privacy:
- Data Storage: DeepSeek explicitly states that data is stored in China. OpenAI stores data in various jurisdictions, including the US, and mentions GDPR compliance efforts for European users.
- Training Data Usage: By default, DeepSeek uses your data to train its model. ChatGPT's free version does the same, but paid plans (like the Teams plan) exclude your data from training by default. DeepSeek does not offer this opt-out on its website.
- Local Installation: DeepSeek, being open-source, can be downloaded and run locally, offering maximum privacy. ChatGPT has no private, local installation option.
Conclusion: Who Wins?
DeepSeek R1 showed impressive reasoning and a more thorough approach to complex problems, especially in the math and physics domains. It also has the significant advantage of being open-source, allowing for local, private installations.
ChatGPT o1, while faster on some tasks and offering a more polished interface, missed a valid math solution and made a simple numerical error. Its paid tiers offer opt-outs for data training, a plus for privacy-conscious users, but they come at a higher cost, and the Pro version ($200/month) is needed to match DeepSeek's performance on certain tasks.
For pure reasoning power and flexibility, DeepSeek R1 appears to have the edge, especially considering its free access and open-source nature. However, if speed and a more integrated ecosystem are priorities, and you're willing to pay, ChatGPT o1 is still a strong contender, especially with its opt-out features.