Getting Started with the OpenAI API: How to Write Chat Completions, Function Calling, and Streaming Output

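Many LLM services offer an "OpenAI-compatible" interface. When integrating one, the first step is to confirm two things:

BASE_URL: the API address supplied by the provider.
OPENAI_API_KEY: your access key.

If the provider is compatible with the Chat Completions API, the request shape usually resembles OpenAI's /v1/chat/completions. Note, however, that OpenAI is now also promoting the Responses API. Chat Completions remains common, especially among third-party compatible interfaces, but new projects that use OpenAI's official capabilities directly should keep an eye on the Responses API as well.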

Minimal request example

A minimal Chat Completions request looks roughly like this:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

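If you are using a third-party compatible service, you usually only need to replace the URL on the first line with the provider's BASE_URL and set model to one of the provider's model IDs: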

curl $BASE_URL/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "your-model-id",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

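Three mistakes are the most common here:

The final URL must contain /v1 exactly once: do not duplicate it in BASE_URL or leave it out;
Authorization must use the Bearer scheme;
model must be a model ID the provider actually supports.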
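
The three pitfalls above can be guarded against in code. The following is a minimal sketch using only the Python standard library. The helper name build_chat_request, the /v1 normalization rule, and the api.example.com address are assumptions made for illustration and are not part of any SDK. The function builds the request object without sending it, so the URL and headers can be inspected first.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a Chat Completions request.

    Hypothetical helper: accepts a BASE_URL with or without a trailing /v1
    so that the final path contains /v1 exactly once.
    """
    if not model:
        raise ValueError("model must be a model ID the provider actually supports")
    root = base_url.rstrip("/")
    if root.endswith("/v1"):
        root = root[: -len("/v1")]
    url = f"{root}/v1/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # the Bearer prefix is required
        },
        method="POST",
    )


req = build_chat_request(
    "https://api.example.com/v1/",  # placeholder provider address
    "sk-placeholder",               # placeholder key
    "your-model-id",
    [{"role": "user", "content": "Hello!"}],
)
print(req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it. Some providers use a different path prefix, in which case the normalization rule would need adjusting.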

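The most important fields in the request body

The body of a Chat Completions request is a JSON object. The most basic and commonly used fields are listed below.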

model

model is a required field that specifies the ID of the model to call.

{
  "model": "gpt-4o-mini"
}

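On third-party OpenAI-compatible services, the model ID is not necessarily gpt-4o-mini. Some services use their own model names, such as deepseek-chat, qwen-plus, or llama-3.1-70b. Treat the provider's documentation or model list as authoritative.

messages

messages is a required field that holds the conversation history from start to finish. It is an array in which each element is one message.

Common roles include:

system: system instructions that define the assistant's behavior;
user: a message from the user;
assistant: a previous reply from the model;
tool: the result of a tool call.

A simple multi-turn conversation can be written as: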

{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a concise technical assistant."
    },
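    {
      "role": "user",
      "content": "What is an API?"
    },
    {
      "role": "assistant",
      "content": "An API is an agreed-upon interface that applications use to call each other."
    },
    {
      "role": "user",
      "content": "Explain it in one sentence for a non-technical person."
    }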
  ]
}
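
Because messages carries the whole conversation, the client has to resend the accumulated history on every request. The sketch below shows one way to manage that in Python. The helper names next_request_body and record_turn are hypothetical, and the assistant text is hard-coded where a real program would use the model's reply.

```python
import json


def next_request_body(model: str, history: list, user_text: str) -> dict:
    """Return a request body that resends the full history plus the new user turn."""
    return {
        "model": model,
        "messages": history + [{"role": "user", "content": user_text}],
    }


def record_turn(history: list, user_text: str, assistant_text: str) -> list:
    """Return a new history with the finished user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": assistant_text},
    ]


history = [{"role": "system", "content": "You are a concise technical assistant."}]
# In a real program, the assistant text would come from the previous response.
history = record_turn(
    history,
    "What is an API?",
    "An API is an agreed-upon interface that applications use to call each other.",
)
body = next_request_body(
    "gpt-4o-mini",
    history,
    "Explain it in one sentence for a non-technical person.",
)
print(json.dumps(body, indent=2))
```

Both helpers return new lists instead of mutating their input, which keeps it easy to retry a failed request without corrupting the stored history.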

system message

A system message usually comes first and tells the model what role to play and which rules to follow.

{
  "role": "system",
  "content": "You are a helpful assistant."
}

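Common fields:

role: always system;
content: the text of the system prompt;
name: optional, labels the participant's name.

Not every compatible service supports system behavior in exactly the same way. If the system prompt seems to have no effect, first check whether the provider rewrites the role rules.

user message

A user message carries the user's input. Plain text is the most common form: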

{
  "role": "user",
  "content": "Hello!"
}

On models that support multimodal input, content can also be an array, which lets you send text and images together:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What is in this image?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
            }
          }
        ]
      }
    ],
    "max_completion_tokens": 300
  }'

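Whether a compatible interface supports images depends on the specific model. Do not assume multimodal support just because the endpoint path is OpenAI-compatible.

assistant message

assistant messages usually appear in multi-turn conversation history and record what the model said earlier.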

{
  "role": "assistant",
  "content": "Hello, how can I help you today?"
}

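When the model decides to call a tool, the assistant message may contain tool_calls instead of ordinary text.

tool message

A tool message feeds the result of a tool execution back to the model. Its tool_call_id must match the id of one of the tool_calls entries in the preceding assistant message.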

{
  "role": "tool",
  "tool_call_id": "call_abc123",
  "content": "{\"temperature\": 22, \"unit\": \"celsius\"}"
}

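This step is common in Agent scenarios: the model first decides which function to call, the application actually executes the function, and the result is then sent back to the model as a tool message so that it can generate the final answer.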
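
The middle of that loop, where the application runs the requested function and wraps the result as a tool message, can be sketched as follows. This is an illustrative Python example and not an official client: get_current_weather returns fixed data in place of a real lookup, and LOCAL_TOOLS and run_tool_calls are hypothetical names.

```python
import json


def get_current_weather(location: str, unit: str = "celsius") -> dict:
    """Stand-in for a real weather lookup; returns fixed data."""
    return {"location": location, "temperature": 22, "unit": unit}


# Map the tool names the model may request to local Python functions.
LOCAL_TOOLS = {"get_current_weather": get_current_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool locally and return the matching tool messages."""
    tool_messages = []
    for call in assistant_message.get("tool_calls") or []:
        func = LOCAL_TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        result = func(**args)
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must echo the id from tool_calls
            "content": json.dumps(result),
        })
    return tool_messages


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "arguments": "{\"location\":\"Boston, MA\"}",
        },
    }],
}
tool_messages = run_tool_calls(assistant_message)
print(tool_messages)
```

For the follow-up request, append assistant_message and then every entry of tool_messages to messages before sending again. Production code should also validate the arguments and handle unknown tool names, which this sketch leaves out.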

tools and tool_choice

tools tells the model which tools it may call. Function tools are currently the most common kind.

Here is a weather-lookup example:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "What is the weather like in Boston today?"
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              },
              "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"]
              }
            },
            "required": ["location"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'

If the model decides a tool call is needed, the response may contain:

{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\"location\":\"Boston, MA\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}

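Two things to note here:

The model only proposes a function call; it does not execute the function for your program.
arguments is a JSON string generated by the model. Validate it before use and do not trust it blindly.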
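
As a rough illustration of the second point, the sketch below validates arguments by hand against the get_current_weather schema in Python. The function name parse_weather_arguments and the celsius default are assumptions. A schema validation library would be a reasonable alternative in a larger project.

```python
import json


def parse_weather_arguments(raw: str) -> dict:
    """Parse and validate the model-generated arguments string.

    Raises ValueError instead of passing unchecked data to the real function.
    """
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments is not valid JSON: {exc}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")

    location = args.get("location")
    if not isinstance(location, str) or not location.strip():
        raise ValueError("location is required and must be a non-empty string")

    unit = args.get("unit", "celsius")  # assumed default when the model omits unit
    if unit not in ("celsius", "fahrenheit"):
        raise ValueError(f"unsupported unit: {unit!r}")

    return {"location": location, "unit": unit}


print(parse_weather_arguments("{\"location\":\"Boston, MA\"}"))
```

When validation fails, one option is to return the error text to the model in a tool message so that it can correct the call.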

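tool_choice controls whether the model calls a tool:

"none": do not call tools, generate text only;
"auto": the model decides for itself whether to generate text or call a tool;
"required": the model must call one or more tools;
a specific function: forces a call to that particular tool.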
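
The first three options are plain strings, while forcing a specific function takes an object in OpenAI's Chat Completions format. The Python helper below, with the hypothetical name make_tool_choice, sketches both shapes. Compatible providers may not accept every form, so treat it as something to verify.

```python
from typing import Optional, Union


def make_tool_choice(mode: str = "auto", function_name: Optional[str] = None) -> Union[str, dict]:
    """Return a tool_choice value: a mode string, or an object naming one function."""
    if function_name is not None:
        # Object form that forces a call to the named function.
        return {"type": "function", "function": {"name": function_name}}
    if mode not in ("none", "auto", "required"):
        raise ValueError(f"unknown tool_choice mode: {mode!r}")
    return mode


print(make_tool_choice("required"))
print(make_tool_choice(function_name="get_current_weather"))
```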

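The functions and function_call fields of the older interface have been replaced by tools and tool_choice. You may still see them when maintaining legacy projects; new projects should prefer the newer form.

Streaming output with stream

stream is an optional boolean. When it is set to true, the API returns content piece by piece over SSE (server-sent events), which suits real-time chat interfaces.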

{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "user",
      "content": "Write a one-sentence welcome message"
    }
  ],
  "stream": true
}

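A non-streaming response is returned all at once after the model finishes generating; a streaming response keeps returning chunks. A typical fragment looks like this: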

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: [DONE]

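The frontend or backend has to keep reading the data: lines and concatenate each delta.content until it receives [DONE]. Some chunks, such as the first one that carries only role, have no content and can be skipped.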
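
That reading loop can be sketched in Python as below. The function name collect_stream_text is hypothetical, and the input is a plain list of lines so the example runs without a network connection. A real client would iterate over the lines of the HTTP response body instead.

```python
import json


def collect_stream_text(lines) -> str:
    """Concatenate delta.content from SSE data: lines until [DONE]."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue  # this sketch only follows the first choice
            content = choice.get("delta", {}).get("content")
            if content:  # the role-only first chunk has no content
                parts.append(content)
    return "".join(parts)


sample_lines = [
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    "data: [DONE]",
]
print(collect_stream_text(sample_lines))  # prints Hello
```

A chat UI would typically render each piece as it arrives instead of joining everything at the end. The accumulation logic stays the same.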

Reading a non-streaming response

A non-streaming Chat Completions response usually contains these fields:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello there, how may I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}

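The most commonly used are:

choices[0].message.content: the model's final reply;
choices[0].message.tool_calls: the tools the model wants to call;
choices[0].finish_reason: why generation stopped;
usage: token usage.

Common finish_reason values include:

stop: generation stopped naturally;
length: the token limit was reached;
tool_calls: the model is requesting a tool call;
content_filter: content was filtered.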
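
One way to read those fields defensively is sketched below in Python. The function name summarize_choice and the shape of the returned dictionary are assumptions for illustration. The sample input is the non-streaming response shown earlier.

```python
def summarize_choice(response: dict) -> dict:
    """Extract the commonly used fields from a non-streaming response."""
    choice = response["choices"][0]
    message = choice["message"]
    reason = choice.get("finish_reason")
    return {
        "text": message.get("content") or "",  # content is null on tool calls
        "tool_calls": message.get("tool_calls") or [],
        "finish_reason": reason,
        "truncated": reason == "length",       # the reply hit the token limit
        "filtered": reason == "content_filter",
        "total_tokens": response.get("usage", {}).get("total_tokens"),
    }


sample_response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "model": "gpt-4o-mini",
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": "Hello there, how may I assist you today?",
        },
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21},
}
summary = summarize_choice(sample_response)
print(summary["text"], summary["finish_reason"], summary["total_tokens"])
```

Checking truncated before showing the text helps avoid presenting a reply that was cut off as if it were complete.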

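Checklist for integrating a compatible interface

When integrating a third-party OpenAI-compatible service, do not stop at checking that the endpoint path looks the same. Work through this checklist:

Whether BASE_URL includes /v1.
Whether OPENAI_API_KEY really belongs to the current provider.
Whether model is an available model ID.
Whether system messages are supported.
Whether multimodal image_url is supported.
Whether tools and tool_choice are supported.
Whether stream: true is supported.
Whether the token limit field is max_completion_tokens or the older max_tokens.
Whether the error response format is exactly the same as OpenAI's.
Whether there are additional rate limits, concurrency limits, or regional restrictions.

"OpenAI-compatible" usually means the calling convention is similar. It does not mean that every model capability, field semantic, and error format is identical.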
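
Some items on the checklist can be recorded once per provider and enforced when building requests. The Python sketch below is one possible approach. ProviderProfile and build_body are hypothetical names, and the profile values are placeholders to be filled in from your own verification, not facts about any provider.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProviderProfile:
    """Hypothetical record of what one provider has been verified to support."""
    supports_tools: bool = False
    supports_stream: bool = False
    token_limit_field: str = "max_tokens"  # or "max_completion_tokens"


def build_body(profile: ProviderProfile, model: str, messages: list,
               max_output_tokens: Optional[int] = None,
               tools: Optional[list] = None, stream: bool = False) -> dict:
    """Build a request body using only features recorded as verified."""
    body = {"model": model, "messages": messages}
    if max_output_tokens is not None:
        body[profile.token_limit_field] = max_output_tokens
    if tools:
        if not profile.supports_tools:
            raise ValueError("tools have not been verified for this provider")
        body["tools"] = tools
    if stream:
        if not profile.supports_stream:
            raise ValueError("stream has not been verified for this provider")
        body["stream"] = True
    return body


profile = ProviderProfile(supports_stream=True, token_limit_field="max_completion_tokens")
body = build_body(profile, "your-model-id",
                  [{"role": "user", "content": "Hello!"}],
                  max_output_tokens=300, stream=True)
print(body)
```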

Conclusion

For the most basic chat use case, Chat Completions really has only three core fields:

{
  "model": "your-model-id",
  "messages": [],
  "stream": false
}

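If you are building an Agent or business automation, you also need to understand:

tools: tells the model which functions are available;
tool_choice: controls whether the model calls a tool;
tool_calls: shows what the model wants to call;
tool message: feeds the real tool result back to the model;
stream: turns the reply into real-time output.

For an OpenAI-compatible API, the safest integration path is to get a minimal text request working first, then add streaming output, and only then wire up tool calls. Each time you add a capability, verify it against the real model and the real provider instead of guessing the level of compatibility from field names.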