DeepSeek R1: What You Need to Know Before You Use It
- Bryan Wilks
- Sep 10
- 5 min read
If you're using the DeepSeek website, especially for its R1 model, it's really important to understand their privacy policy and terms of use. I dug into these documents to find some pretty alarming things, and I want to share them so you know exactly what you're agreeing to and how your data might be used.
Key Takeaways
Data Collection: DeepSeek collects a lot of user input, including text, audio, uploaded files, and chat history. They don't clearly state how long this data is kept or if it's anonymized before analysis. They even mention collecting keystroke patterns and rhythms, which could be considered sensitive biometric data.
Data Sharing: The policy states that DeepSeek shares user data with advertisers and analytics partners. This could mean your activity is tracked across different websites and apps, and your data might be sold for targeted advertising.
Data Storage: All collected data is stored on servers in China. This raises concerns about compliance with international privacy laws like GDPR, especially since Chinese law requires companies to share user data with authorities when requested.
Lack of User Control: Unlike some other services, DeepSeek's website offers very limited options to control your data. There are no settings to opt out of data collection for model training or sharing with third parties.
Terms of Use: The terms also mention that users are fully responsible for AI-generated output and that dispute resolution might occur outside of China. There's also a lack of clarity on data deletion requests and how moderation decisions are made.
Understanding the Privacy Policy
When I reviewed the privacy policy, which was last updated on December 5, 2024, I used a prompt to identify potential red flags. Here’s what stood out:
Transparency and Data Collection
The policy lists several types of data collected: profile data, user input, automatically collected data, and third-party data. They collect extensive user input, including text, audio, uploaded files, and chat history. A major concern is that they don't clarify how long the data is retained or if it's anonymized before analysis. There's also no explicit mention of whether user data is used for training their AI models.
What's particularly concerning is the mention of keystroke pattern and rhythm collection. This could be a form of biometric data, which is sensitive personal data and might violate laws like GDPR, especially in the EU.
For their mobile app, which is very popular, they mention using device IDs, user IDs, and tracking user activity across multiple devices. On the website itself, the settings options are minimal – you can change language or theme, delete your account, or delete chats. There are no options to control data usage for training or sharing, unlike what you might find with other services.
Data Sharing with Third Parties
The policy states that DeepSeek shares user data with advertisers and analytics partners. This includes data from activities on other websites and apps. This suggests behavioral tracking and the potential sale of data for targeted advertising. While this is a common business model, it's often hidden in privacy policies, and there's no mention of opting in or out of these practices.
Data Storage and Legal Compliance
Perhaps the most concerning point is that all collected data is stored on servers in China. China's data laws require companies to share user data with authorities when requested. If DeepSeek has users in regions like the EU or the US, this raises serious compliance issues, especially under GDPR and other international privacy laws. The policy doesn't mention if data from EU users is processed differently to comply with GDPR, suggesting it likely isn't.
Data Retention and Security
The data retention policy is also vague. It states data is retained as long as necessary but doesn't specify a clear timeframe. This could mean user input, including sensitive chat history, might be stored indefinitely. Furthermore, there's no mention of specific security standards like encryption or two-factor authentication to protect user data.
Children's Data
While the app isn't meant for anyone under 18, there's not much age verification. It's unclear how they would enforce this if someone under 18 tries to use the app.
Reviewing the Terms of Use
The terms of use, updated January 20, 2025, are more extensive, but many concerns mirror those in the privacy policy regarding data collection and usage. Key points include:
No Clear Global Privacy Law Compliance: There's no explicit compliance mentioned for laws like GDPR.
Service Modifications: The service can be modified, suspended, or terminated.
User Responsibility: Users bear full responsibility for AI-generated output.
Dispute Resolution: There's no mention of dispute resolution outside of China.
Data Deletion: It's unclear if users can request full data deletion.
Data Retention Transparency: There's a lack of transparency on data retention duration, reinforcing the concern that data could be stored indefinitely.
The terms advise users to proceed with caution for sensitive legal or business-critical purposes and to consider alternative providers with stronger privacy protections.
What Can You Do?
If you use the DeepSeek website, chatbot, or mobile apps, you are agreeing to these terms. If you still want to use DeepSeek, here are a couple of workarounds:
Install Locally
You can install a version of DeepSeek R1 locally on your computer. While it might not be the full-blown version from the website, there are distilled versions available that you can run offline. Websites like Hugging Face host various models, including different sizes of DeepSeek R1. You can download these models and run them using interfaces like LM Studio. This way, your data stays on your machine, and you can even turn off your Wi-Fi to ensure no data is sent externally.
I was able to download and run smaller versions of the model (like 7B or 32B parameters) on my computer. The larger 671B parameter model is too demanding for most personal computers, but the smaller ones work well. Running them locally means you're not agreeing to the website's terms of service, and your data is not being sent to their servers.
Use Through Third-Party Services
Another option is to use DeepSeek R1 through services that host it. For example, Perplexity AI offers a Pro subscription that includes "Reasoning with R1." While this is geared towards search and adding reasoning to search results, they state that they host this model locally in the US. This means your data might be handled differently and with more privacy protections compared to using the DeepSeek website directly. However, it's a paid service and functions more as a search enhancement than a direct chatbot replacement.
As more open-source models become available, other US-based partners might offer similar services, potentially simplifying privacy concerns related to data collection and usage.
Conclusion
If you choose to use the DeepSeek website, be aware of everything discussed here. Avoid sharing any sensitive information. If you have significant privacy concerns, it's best to use the local versions or services hosted by US-based partners. I'll be posting a comparison of the GPT-4o model versus DeepSeek R1 soon, so stay tuned!
