If you already use Ollama to run local models, cloud models are easy to understand.
There is only one core difference:
local models run on your own machine, while cloud models run on Ollama’s cloud infrastructure and return the result to you.
## What are Ollama cloud models?
Ollama cloud models keep the Ollama workflow, but move the actual computation from your local machine to the cloud.
The main benefits are:
- Less pressure on local hardware
- Easier access to larger models that your machine cannot run well
- You can keep using the familiar Ollama workflow
## How they differ from local models
| Item | Local models | Cloud models |
|---|---|---|
| Runtime location | Your machine | Cloud |
| Hardware requirements | High | Low |
| Latency | Usually lower | Affected by network |
| Privacy | Stronger | Requests are sent to the cloud |
If you care more about privacy, low latency, and offline use, local models are a better fit.
If your hardware is limited but you still want to use larger models, cloud models are more convenient.
## How to identify a cloud model
At the moment, Ollama cloud models are typically labeled with a `-cloud` suffix on the model name, for example (names current as of this writing):

```
gpt-oss:120b-cloud
deepseek-v3.1:671b-cloud
qwen3-coder:480b-cloud
```
The available model list may change over time, so the official Ollama pages should be treated as the source of truth.
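Because the suffix is just part of the model name, this check is easy to script. A minimal sketch (the helper name and the suffix heuristic are my own, based on the naming convention above):

```python
def is_cloud_model(name: str) -> bool:
    """Heuristic: Ollama cloud models carry a trailing '-cloud'
    in the model name (e.g. 'gpt-oss:120b-cloud')."""
    return name.endswith("-cloud")

print(is_cloud_model("gpt-oss:120b-cloud"))  # True
print(is_cloud_model("llama3.2:3b"))         # False
```

Keep in mind this is only a convention, not a guarantee, so treat it as a convenience check rather than an API contract.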
## How to use them
First, sign in so the CLI can reach your Ollama account:

```shell
ollama signin
```
After that, run a cloud model directly, just as you would a local one (using `gpt-oss:120b-cloud` as an example):

```shell
ollama run gpt-oss:120b-cloud
```
If you are calling it from code, you can also configure an API key (created on ollama.com; the key is sent to the cloud endpoint as a bearer token):

```shell
export OLLAMA_API_KEY="your-api-key"
```
Python example, a minimal sketch using the official `ollama` package, pointing the client at the cloud host and reading the key from the environment variable above:

```python
import os

from ollama import Client

# Talk to Ollama's cloud endpoint instead of a local server,
# authenticating with the API key from the environment.
client = Client(
    host="https://ollama.com",
    headers={"Authorization": "Bearer " + os.environ["OLLAMA_API_KEY"]},
)

response = client.chat(
    model="gpt-oss:120b-cloud",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response.message.content)
```
## Summary
Ollama cloud models can be summarized in one sentence:
the commands are almost the same, but the model is no longer running on your local machine.
If your computer cannot handle large models well, but you still want to keep the Ollama workflow, cloud models are a straightforward option.