What are Ollama cloud models and how do you use them

A brief explanation of what Ollama cloud models are, how they differ from local models, and how to use them from the command line or via API.

If you already use Ollama to run local models, cloud models are easy to understand.

There is only one core difference:
local models run on your own machine, while cloud models run on Ollama’s cloud infrastructure and return the result to you.

What are Ollama cloud models

Ollama cloud models keep the Ollama workflow, but move the actual computation from your local machine to the cloud.

The main benefits are:

  • Less pressure on local hardware
  • Easier access to larger models that your machine cannot run well
  • You can keep using the familiar Ollama workflow

How they differ from local models

| Item | Local models | Cloud models |
| --- | --- | --- |
| Runtime location | Your machine | Ollama's cloud |
| Hardware requirements | High | Low |
| Latency | Usually lower | Affected by network |
| Privacy | Stronger | Requests are sent to the cloud |

If you care more about privacy, low latency, and offline use, local models are a better fit.
If your hardware is limited but you still want to use larger models, cloud models are more convenient.

How to identify a cloud model

At the moment, Ollama cloud models are typically labeled with a -cloud suffix, for example:

```shell
gpt-oss:120b-cloud
```

The available model list may change over time, so the official Ollama pages should be treated as the source of truth.
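Since the distinction is carried by the tag itself, you can detect it with a simple string check. The helper below is a hypothetical convenience for illustration, not part of the ollama package, and it assumes the current `-cloud` naming convention holds:

```python
def is_cloud_model(name: str) -> bool:
    """Return True if an Ollama model tag uses the -cloud suffix.

    Assumes the "model:tag" format, where cloud variants currently
    end their tag in "-cloud" (e.g. "gpt-oss:120b-cloud").
    """
    # Take the tag after the colon, or the whole name if there is none
    tag = name.split(":", 1)[-1]
    return tag.endswith("-cloud")
```

If Ollama changes how cloud models are labeled, this check would need updating along with it.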

How to use them

First, sign in:

```shell
ollama signin
```

After that, run a cloud model directly:

```shell
ollama run gpt-oss:120b-cloud
```

If you are calling it from code, you can also configure an API key:

```shell
export OLLAMA_API_KEY=your_api_key
```
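Because a missing key only surfaces as an error when the first request is made, it can help to fail fast at startup. The function below is a hypothetical helper, not part of the ollama package; it just reads the variable set above:

```python
import os


def get_ollama_api_key() -> str:
    """Read OLLAMA_API_KEY from the environment, failing fast if unset."""
    key = os.environ.get("OLLAMA_API_KEY")
    if not key:
        raise RuntimeError(
            "OLLAMA_API_KEY is not set; export it before creating the client"
        )
    return key
```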

Python example:

```python
import os
from ollama import Client

client = Client(
    host="https://ollama.com",
    headers={"Authorization": "Bearer " + os.environ["OLLAMA_API_KEY"]},
)

messages = [
    {"role": "user", "content": "Why is the sky blue?"}
]

for part in client.chat("gpt-oss:120b-cloud", messages=messages, stream=True):
    print(part["message"]["content"], end="", flush=True)
```
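The streaming loop prints each chunk as it arrives; if you need the full reply as one string, you can accumulate the parts instead. A minimal sketch, using stand-in chunks shaped like the streaming responses rather than a live API call:

```python
def collect_stream(parts) -> str:
    """Concatenate the content of streamed chat chunks into one string.

    `parts` is any iterable of chunks shaped like Ollama's streaming
    responses: {"message": {"content": "..."}}.
    """
    return "".join(part["message"]["content"] for part in parts)


# Stand-in chunks for illustration; in real use, pass the iterator
# returned by client.chat(..., stream=True) instead.
fake_parts = [
    {"message": {"content": "Because "}},
    {"message": {"content": "of Rayleigh scattering."}},
]
print(collect_stream(fake_parts))
```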

Summary

Ollama cloud models can be summarized in one sentence:

the commands stay almost the same, but the model no longer runs on your local machine.

If your computer cannot handle large models well, but you still want to keep the Ollama workflow, cloud models are a straightforward option.
