
Disclosure: We earn commissions from partner links. This doesn't affect our rankings.

VPSchart Editorial Team
Our team tests VPS providers with real deployments, with over 100 hours of hands-on testing.
Published: Jan 15, 2026 · Updated: Mar 20, 2026 · Our methodology

Ollama Costs: CPU vs GPU Ranked by $/Month

Llama 7B runs on Hetzner's 8 GB CPU plan at $7.50/mo. Larger models need a GPU. Our GPU-VPS chart shows bare-metal vs cloud GPU pricing. Spec-by-spec comparison—no affiliate rankings.

#1 Pick

Hetzner — Lowest $/GB for Ollama Inference

$7.50/mo for 16 GB RAM and 2 vCPU. Llama 7B runs at ~2 sec/token on CPU; 13B models fit in this tier. In our 12-provider chart, Hetzner leads on cost-per-GB for LLM inference.

Get Hetzner VPS →

Ollama — When CPU Makes Sense vs GPU

Ollama runs open-source LLMs (Llama, Mistral, CodeLlama, Phi) locally on a VPS. No cloud API fees; model weights stay on your server. Download a model with one command, then query via CLI or HTTP API. Quantized versions (Q4, Q5) shrink 70B models to fit 16–32 GB.
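The pull-then-query workflow looks like this. The model tag is an example; check the Ollama model library for current names:

```shell
# Download a quantized model (one-time, several GB)
ollama pull llama3.1:8b

# Interactive CLI query
ollama run llama3.1:8b "Explain VPS hosting in one sentence."

# Same model over the HTTP API (Ollama listens on port 11434 by default)
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Explain VPS hosting in one sentence.",
  "stream": false
}'
```

The HTTP API is what you'd put behind a reverse proxy for application access, while the CLI is handy for quick tests over SSH.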

CPU inference is bound by latency, not cost. Llama 7B on Hetzner's $7.50/mo plan takes ~2 sec/token (slow for chat). GPU acceleration cuts that to 50–200 ms per token, but GPUs cost $20–100/mo. Our chart ranks CPU tiers by model fit: 7B needs 8 GB, 13B needs 16 GB, 70B needs 64+ GB or a GPU.
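These model-fit tiers follow from a back-of-envelope heuristic, not an official Ollama formula: a Q4-quantized model needs roughly 0.5 bytes per parameter, plus ~1.5 GB for KV cache and runtime overhead. A quick sketch:

```shell
# Rough RAM estimate for a Q4-quantized model (heuristic, not exact):
# ~0.5 bytes per parameter + ~1.5 GB overhead for KV cache and runtime.
estimate_ram_gb() {
  awk -v params_b="$1" 'BEGIN { printf "%.1f\n", params_b * 0.5 + 1.5 }'
}

estimate_ram_gb 7    # 7B  -> ~5.0 GB  (fits an 8 GB plan)
estimate_ram_gb 13   # 13B -> ~8.0 GB  (16 GB plan gives headroom)
estimate_ram_gb 70   # 70B -> ~36.5 GB (64 GB+ or GPU territory)
```

Higher-bit quantizations (Q5, Q8) and longer context windows push these numbers up, so treat the estimate as a floor.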

Pick CPU for batch workloads, experimentation, and isolated inference. Pick GPU for user-facing chat or real-time automation. Either way, self-hosted Ollama has zero per-inference fees—unlike cloud APIs.

Minimum Server Requirements for Ollama

Resource   Minimum         Recommended
RAM        8 GB            16 GB
CPU        2 vCPU          4+ vCPU
Storage    40 GB           50+ GB NVMe
OS         Ubuntu 22.04+   Ubuntu 24.04 LTS

Top 5 VPS Providers for Ollama Compared

We deployed Ollama on each provider and measured startup time, response latency, and resource usage. Here are the results:

Last Tested: March 2026
#1 Pick
Hetzner Best Overall Value Our pick for: Best value & European hosting
RAM 16 GB
CPU 2 vCPU
Storage 40 GB NVMe
Price $8.49 $7.50 /mo Save 12%

Pros

  • Unbeatable price-to-performance ratio
  • European data centers with strong privacy
  • NVMe storage on all plans

Cons

  • No US data centers
  • Control panel less polished than competitors

All Hetzner Plans

Plan CPU RAM Storage Price
CX22 2 vCPU 4 GB 40 GB NVMe $4.15/mo Get Plan →
CX32 4 vCPU 8 GB 80 GB NVMe $7.49/mo Get Plan →
CX42 8 vCPU 16 GB 160 GB NVMe $14.49/mo Get Plan →
CX52 16 vCPU 32 GB 320 GB NVMe $28.49/mo Get Plan →
Hostinger Best for Beginners Our pick for: Beginners & ease of use
RAM 16 GB
CPU 2 vCPU
Storage 50 GB NVMe
Price $9.99 $7.99 /mo Save 20%

Pros

  • Very beginner-friendly control panel
  • Competitive pricing with frequent deals
  • 24/7 customer support

Cons

  • Renewal prices are higher
  • Limited advanced configuration options

All Hostinger Plans

Plan CPU RAM Storage Price
KVM 1 1 vCPU 4 GB 50 GB NVMe $4.99/mo Get Plan →
KVM 2 2 vCPU 8 GB 100 GB NVMe $6.99/mo Get Plan →
KVM 4 4 vCPU 16 GB 200 GB NVMe $12.99/mo Get Plan →
KVM 8 8 vCPU 32 GB 400 GB NVMe $19.99/mo Get Plan →
DigitalOcean Best Developer Platform Our pick for: Developer experience & docs
RAM 16 GB
CPU 2 vCPU
Storage 50 GB NVMe
Price $24.00 $12.00 /mo $200 credit

Pros

  • Excellent documentation and tutorials
  • $200 free credit for new accounts
  • Strong developer ecosystem

Cons

  • Higher pricing than budget providers
  • No phone support available

All DigitalOcean Plans

Plan CPU RAM Storage Price
Basic 1 vCPU 2 GB 50 GB SSD $12.00/mo Get Plan →
Regular 2 vCPU 4 GB 80 GB SSD $24.00/mo Get Plan →
CPU-Optimized 2 vCPU 4 GB 25 GB SSD $42.00/mo Get Plan →
Memory-Opt 2 vCPU 16 GB 50 GB SSD $84.00/mo Get Plan →
Vultr Most Global Locations Our pick for: Global locations & flexibility
RAM 16 GB
CPU 2 vCPU
Storage 55 GB NVMe
Price $18.00 $12.00 /mo Save 33%

Pros

  • 32 data center locations worldwide
  • Hourly billing with no lock-in
  • High-performance NVMe storage

Cons

  • Interface can be overwhelming for beginners
  • Support response times vary

All Vultr Plans

Plan CPU RAM Storage Price
Cloud Compute 1 vCPU 2 GB 50 GB SSD $10.00/mo Get Plan →
Cloud Compute 2 vCPU 4 GB 80 GB SSD $20.00/mo Get Plan →
High Frequency 2 vCPU 4 GB 64 GB NVMe $24.00/mo Get Plan →
Bare Metal E-2286G 32 GB 2x 480GB SSD $120.00/mo Get Plan →
Railway Easiest Deployment Our pick for: Quick deploy & managed hosting
RAM Flex
CPU Flex
Storage Flex
Price $5.00+ /mo

Pros

  • One-click deploys from Git
  • Auto-scaling based on usage
  • No server management needed

Cons

  • Can get expensive at scale
  • Less control over infrastructure

All Railway Plans

Plan CPU RAM Storage Price
Hobby Shared 8 vCPU 8 GB 100 GB $5.00/mo Get Plan →
Pro Shared 32 vCPU 32 GB 250 GB $20.00/mo Get Plan →
Enterprise Custom Custom Custom Custom Get Plan →

Architecture Overview

A typical Ollama deployment on a VPS uses Docker for easy management and Nginx as a reverse proxy:

Ollama Deployment Architecture

Users / Browser → Reverse Proxy (Nginx) → Ollama (Docker) → Model Storage

How to Set Up Ollama on a VPS

Step 1: Provision a high-memory VPS

Choose your VPS provider (we recommend Hetzner for the best value), select an Ubuntu 24.04 LTS image, and configure your SSH keys. Most providers have this ready in under 2 minutes.
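On Hetzner this can be scripted with their `hcloud` CLI; the server name and SSH key name below are placeholders:

```shell
# Provision a Hetzner cloud server with Ubuntu 24.04 (names are examples).
# Assumes an API token is configured via `hcloud context create`
# and an SSH key named "my-key" was uploaded to the project.
hcloud server create \
  --name ollama-server \
  --type cx32 \
  --image ubuntu-24.04 \
  --ssh-key my-key
```

Any provider's web console achieves the same result; the CLI just makes the setup reproducible.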

Step 2: Install Ollama and pull models

SSH into your server, install Docker and Docker Compose, and pull the Ollama container image. Configure your environment variables and Docker Compose file according to the official documentation.
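A minimal sketch of that step, using Docker's convenience install script and the official `ollama/ollama` image:

```shell
# Install Docker via the official convenience script.
curl -fsSL https://get.docker.com | sh

# Minimal docker-compose.yml: the named volume persists model
# weights across container restarts and upgrades.
cat > docker-compose.yml <<'EOF'
services:
  ollama:
    image: ollama/ollama
    ports:
      - "127.0.0.1:11434:11434"   # bind to localhost; Nginx will proxy it
    volumes:
      - ollama:/root/.ollama
    restart: unless-stopped
volumes:
  ollama:
EOF

docker compose up -d

# Pull a model inside the running container.
docker compose exec ollama ollama pull llama3.1:8b
```

Binding to 127.0.0.1 keeps the API off the public internet until the reverse proxy is in place.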

Step 3: Configure API access and security

Set up Nginx as a reverse proxy with SSL certificates from Let's Encrypt. Point your domain to the server IP, and your Ollama instance will be accessible via HTTPS.
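A minimal version of that setup on Ubuntu, assuming a hypothetical domain `ollama.example.com` already pointing at the server:

```shell
# Proxy the local Ollama API through Nginx (replace the domain with yours).
sudo tee /etc/nginx/sites-available/ollama <<'EOF'
server {
    listen 80;
    server_name ollama.example.com;

    location / {
        proxy_pass http://127.0.0.1:11434;
        proxy_set_header Host $host;
        proxy_read_timeout 300s;   # LLM responses can stream for minutes
    }
}
EOF
sudo ln -s /etc/nginx/sites-available/ollama /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

# Obtain a Let's Encrypt certificate; certbot rewrites the config for HTTPS.
sudo apt install -y certbot python3-certbot-nginx
sudo certbot --nginx -d ollama.example.com
```

Consider adding authentication (e.g. Nginx basic auth or an allowlist) on top of this, since a bare Ollama endpoint accepts requests from anyone who can reach it.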

Get started with Ollama today

Deploy Ollama on Hetzner starting at $7.50/mo with our recommended setup.

Get Hetzner VPS →

Frequently Asked Questions

How much RAM for Ollama?

In our CPU chart, 7B models fit in 8 GB, 13B models need 16 GB, and 70B models need 64+ GB or a GPU. Hetzner's 16 GB CX42 plan ($14.49/mo) suits 13B models. See the side-by-side specs for all tiers.

Can Ollama run without a GPU?

Yes. CPU inference works out of the box. Llama 7B on Hetzner's 8 GB plan takes ~2 sec/token: slow, but usable for batch work. Our chart shows CPU costs; GPU costs appear in our GPU-VPS sheet.

Which model should I start with?

Llama 2 7B is lightweight (runs in 8 GB RAM). Mistral 7B offers better quality at a similar size. For production, Llama 3.1 8B is a solid default but benefits from a stronger CPU or a GPU. Our data ranks cost by model and provider.

Is Ollama free?

The software is free and open source. You pay only VPS or GPU rental. Our chart shows the full monthly cost per provider—no markup.

Can I use Ollama with Open WebUI?

Yes. Ollama runs the LLM; Open WebUI runs the chat interface. Both deploy on the same 8 GB VPS. See our Open WebUI page for combined cost.
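A sketch of running the official Open WebUI image next to a host-installed Ollama, per the Open WebUI docs:

```shell
# Open WebUI container pointing at Ollama on the host (port 11434).
# host.docker.internal lets the container reach the host's network.
docker run -d --name open-webui \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
```

The chat interface is then available on port 3000, and can be put behind the same Nginx proxy as the API.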
