Run DeepSeek on VPS

Ask most developers how to run DeepSeek on VPS hosting and they will tell you that you need a data-center GPU such as an NVIDIA H100, which costs tens of thousands of dollars. For the smaller distilled models, they are wrong.

Thanks to a technique called quantization and a tool called Ollama, you can now run capable open-weight Large Language Models (LLMs) on a standard Linux server with just a CPU.

In this guide, we will show you exactly how to run DeepSeek on VPS instances (like our GratisVPS Free Tier) in less than 5 minutes.

Why Run DeepSeek on VPS vs. Cloud API?

AI models usually store their weights in 16-bit floating point (FP16), about 2 bytes per parameter, which makes them massive. For personal use, you rarely need that precision. With 4-bit quantization, distributed in the GGUF file format, a distilled DeepSeek-R1 model shrinks to just a few gigabytes.

That is small enough to load the model entirely into your VPS's RAM and run it on the CPU, with no expensive GPU required. Expect slower responses than on a GPU.
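The size reduction is simple arithmetic: parameters times bits per weight, divided by 8 to get bytes. The sketch below is a back-of-the-envelope estimate only; the function name is made up for illustration, and it ignores the extra memory needed for the context window and runtime.

```python
def estimate_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in gigabytes (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Nominal parameter counts taken from the model tags used in this guide.
    for label, params in [("1.5B", 1.5), ("7B", 7.0)]:
        fp16 = estimate_weights_gb(params, 16)
        q4 = estimate_weights_gb(params, 4)
        print(f"{label}: FP16 ~{fp16:.1f} GB, 4-bit ~{q4:.1f} GB")
```

For a 7B model this works out to roughly 14 GB at FP16 versus 3.5 GB at 4 bits. Real quantized files tend to be somewhat larger than this estimate, because some tensors are kept at higher precision.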

Recommended Specs to Run DeepSeek on VPS

  • For DeepSeek-R1 (1.5B): Minimum 2GB RAM.
  • For DeepSeek-R1 (7B): Minimum 6GB RAM (Fits on our Free Tier).
  • OS: Ubuntu 22.04 or 24.04 LTS.
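Before downloading anything, it can help to check how much memory the server actually has free. The following is a minimal sketch that reads Linux's /proc/meminfo and applies the thresholds from the list above; the function names are hypothetical, and comparing against available memory rather than total memory is our assumption.

```python
from pathlib import Path
from typing import Optional


def mem_available_gib(meminfo_text: str) -> float:
    """Return the MemAvailable value from /proc/meminfo text, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB (KiB)
            return kib / (1024 ** 2)
    raise ValueError("MemAvailable not found in meminfo text")


def suggest_model(available_gib: float) -> Optional[str]:
    """Map free memory to a model tag, using this guide's minimums."""
    if available_gib >= 6:
        return "deepseek-r1:7b"
    if available_gib >= 2:
        return "deepseek-r1:1.5b"
    return None  # below the smallest recommended spec


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # only present on Linux
        free_gib = mem_available_gib(meminfo.read_text())
        print(f"Available: {free_gib:.1f} GiB, suggested model: {suggest_model(free_gib)}")
```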

Step 1: Install Ollama

Ollama is one of the most widely used tools for running local AI on Linux. Its installer sets up the binary and a background service, and detects a GPU automatically if one is present.

Connect to your server via SSH and run this single command:

curl -fsSL https://ollama.com/install.sh | sh

Once the installation finishes, verify that it is installed:

ollama --version
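The version command confirms the binary is installed. To check that the background server is also answering requests, you can probe its local HTTP port. This is a minimal sketch that assumes the default address (localhost, port 11434); the helper name is ours, not part of Ollama.

```python
import http.client
import urllib.request


def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server at base_url answers with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, or a malformed reply.
        return False


if __name__ == "__main__":
    print("Ollama server reachable:", ollama_is_up())
```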

Step 2: Choose Your Model

Now you need to download the “Brain” of your AI. We recommend two specific models for VPS users:

Option A: Llama 3.2 (The Chatbot)

Meta's 3-billion-parameter Llama 3.2 model is fast and well suited to general chat. It uses only ~2.0GB of RAM.

ollama run llama3.2

Option B: DeepSeek-R1 (The Coder)

DeepSeek is famous for its reasoning capabilities. The full DeepSeek-R1 model is far too large for a VPS, so use the distilled versions: much smaller models trained on DeepSeek-R1's outputs.


# For speed (Uses ~1.5GB RAM)
ollama run deepseek-r1:1.5b

# For intelligence (Uses ~4.7GB RAM)
ollama run deepseek-r1:7b

Step 3: Accessing Your AI

After running the command, you will drop directly into a chat prompt. You can type questions just as you would in ChatGPT.

To exit the chat: Press Ctrl + D or type /bye.

How to use it as an API

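The best part about Ollama is that it automatically runs a local API on port 11434. By default it listens only on localhost, so the request below must be run on the server itself. To connect apps or chatbots running elsewhere, set the OLLAMA_HOST environment variable (for example to 0.0.0.0) so the server listens on your VPS IP address, and restrict the port with a firewall, because the API has no built-in authentication.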

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Write a python script to scan for open ports"
}'
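The same request can be made from your own application code. Below is a minimal Python sketch using only the standard library; it assumes Ollama is listening on the default local address, and it sets "stream" to false so the reply arrives as a single JSON object instead of a stream of lines. The function names are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default; change if you exposed the API


def build_generate_request(prompt, model="deepseek-r1:1.5b", base_url=OLLAMA_URL):
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="deepseek-r1:1.5b", base_url=OLLAMA_URL, timeout=300.0):
    """Send the prompt and return the generated text."""
    request = build_generate_request(prompt, model, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["response"]  # the generated text in a non-streaming reply


if __name__ == "__main__":
    try:
        print(generate("Explain what a VPS is in one sentence."))
    except OSError as exc:
        print(f"Could not reach Ollama: {exc}")
```

The generous timeout is deliberate: on a CPU-only server, a long answer can take a minute or more to generate.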

Conclusion

You have just replaced a $20/month OpenAI subscription with a free server, at least for the everyday tasks a small model handles well. You now control the model, the data, and the infrastructure.

Now that you know how to run DeepSeek on VPS hardware, are you ready to understand the strategy behind it? Read our deep dive on The Rise of Sovereign AI.
