Chat Completions

POST /v1/chat/completions

100% compatible with the OpenAI Chat Completions API. RAG, Memory, and other features are supported through the vecstruct extension field.

Request Headers

POST /v1/chat/completions
Authorization: Bearer sk-your-api-key
Content-Type: application/json

Request Body

{
  "model": "openai/gpt-4o",
  "messages": [
    { "role": "system", "content": "You are a customer service assistant" },
    { "role": "user", "content": "How many days does a refund take?" }
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false,
  "vecstruct": {
    "project_id": "proj-uuid",
    "rag": true,
    "rag_top_k": 5,
    "use_memory": true,
    "metadata": {
      "user_id": "user-123"
    }
  }
}
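
A minimal sketch of sending this request from Python with only the standard library. The base URL `https://api.example.com` and the helper names `build_payload`, `build_request`, and `send_chat` are assumptions made for illustration and are not part of the documented API; substitute your real endpoint and API key.

```python
import json
import urllib.request

# Assumption: placeholder base URL; replace with your actual endpoint.
BASE_URL = "https://api.example.com"


def build_payload(question, project_id, user_id):
    """Build a request body mirroring the example request body."""
    return {
        "model": "openai/gpt-4o",
        "messages": [
            {"role": "system", "content": "You are a customer service assistant"},
            {"role": "user", "content": question},
        ],
        "temperature": 0.7,
        "max_tokens": 1024,
        "stream": False,
        "vecstruct": {
            "project_id": project_id,
            "rag": True,
            "rag_top_k": 5,
            "use_memory": True,
            "metadata": {"user_id": user_id},
        },
    }


def build_request(payload, api_key):
    """Wrap the payload in a POST request carrying the documented headers."""
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(payload, api_key):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(payload, api_key)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping payload construction separate from the network call makes the body easy to inspect or log before anything is sent.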

Request Fields

Standard OpenAI fields:

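| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID, in the format provider/model-name |
| messages | array | Yes | List of conversation messages |
| temperature | number | No | Range 0.0 – 2.0; default 1.0 |
| max_tokens | number | No | Maximum number of output tokens |
| stream | boolean | No | Whether to stream the response; default false |
| top_p | number | No | Nucleus sampling |
| stop | string / string[] | No | Stop sequences |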

vecstruct extension fields:

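| Field | Type | Description |
| --- | --- | --- |
| project_id | string | Target project (if omitted, the API key's default project is used) |
| rag | boolean | Enable RAG knowledge base injection |
| rag_top_k | number | Number of passages retrieved by RAG; default 5 |
| rag_source_ids | string[] | Restrict the search to specific documents |
| rag_min_similarity | number | Minimum similarity threshold for RAG, 0.0 – 1.0 |
| use_memory | boolean | Enable Agent Memory injection |
| metadata | object | Custom tags, written to the Audit Log |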
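
The extension fields can be assembled with a small helper that leaves out anything not set. This is a sketch only: `build_vecstruct_options` is a hypothetical name, and the client-side checks (a positive `rag_top_k`, a similarity within 0.0 – 1.0) are assumptions based on the field descriptions, not documented server behaviour.

```python
def build_vecstruct_options(project_id=None, rag=False, rag_top_k=None,
                            rag_source_ids=None, rag_min_similarity=None,
                            use_memory=False, metadata=None):
    """Assemble the vecstruct object, omitting optional fields left unset."""
    # Assumption: a top-k below 1 is never meaningful, so reject it early.
    if rag_top_k is not None and rag_top_k < 1:
        raise ValueError("rag_top_k must be at least 1")
    # The documented range for the similarity threshold is 0.0 - 1.0.
    if rag_min_similarity is not None and not 0.0 <= rag_min_similarity <= 1.0:
        raise ValueError("rag_min_similarity must be between 0.0 and 1.0")

    options = {"rag": rag, "use_memory": use_memory}
    optional = {
        "project_id": project_id,
        "rag_top_k": rag_top_k,
        "rag_source_ids": rag_source_ids,
        "rag_min_similarity": rag_min_similarity,
        "metadata": metadata,
    }
    # Only include optional fields the caller actually provided.
    options.update({k: v for k, v in optional.items() if v is not None})
    return options
```

Omitting unset fields lets the documented defaults (for example, the API key's default project and a `rag_top_k` of 5) apply on the server side.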

Response Example (Non-Streaming)

{
  "id": "chatcmpl-uuid",
  "object": "chat.completion",
  "created": 1746700000,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Refunds usually take 3–5 business days..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 120,
    "completion_tokens": 80,
    "total_tokens": 200
  },
  "vecstruct": {
    "audit_id": "audit-uuid",
    "rag_sources": [
      {
        "document_id": "doc-uuid",
        "title": "Refund Policy.pdf",
        "content": "Once a refund request has been accepted...",
        "similarity": 0.92
      }
    ],
    "memory_used": true,
    "credits_consumed": 0.05,
    "balance_consumed_usd": 0.000240
  }
}

vecstruct response fields:

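| Field | Type | Description |
| --- | --- | --- |
| audit_id | string | Audit Log ID for this request |
| rag_sources | array | List of passages cited by RAG |
| memory_used | boolean | Whether Memory was injected |
| credits_consumed | number | Credits consumed (RAG/Memory features) |
| balance_consumed_usd | number | USD balance consumed (LLM tokens) |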
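
A sketch of reading these fields from a non-streaming response that has already been decoded into a Python dict. The function name `summarize_response` is hypothetical, and treating the `vecstruct` object as possibly absent is a defensive assumption rather than documented behaviour.

```python
def summarize_response(response):
    """Extract the answer plus the vecstruct extension fields from a response dict."""
    choice = response["choices"][0]
    # Assumption: fall back to an empty object if no vecstruct block is present.
    extras = response.get("vecstruct", {})
    return {
        "answer": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "audit_id": extras.get("audit_id"),
        # Titles of the passages that RAG cited for this answer.
        "source_titles": [s["title"] for s in extras.get("rag_sources", [])],
        "memory_used": extras.get("memory_used", False),
        "credits_consumed": extras.get("credits_consumed", 0),
        "balance_consumed_usd": extras.get("balance_consumed_usd", 0),
    }
```

Storing `audit_id` alongside the answer makes it possible to trace a reply back to its Audit Log entry later.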

Streaming Response

When stream: true is set, the response is returned as Server-Sent Events (SSE):

data: {"id":"chatcmpl-uuid","object":"chat.completion.chunk","choices":[{"delta":{"content":"Refunds"},"index":0}]}

data: {"id":"chatcmpl-uuid","object":"chat.completion.chunk","choices":[{"delta":{"content":" usually"},"index":0}]}

event: vecstruct
data: {"audit_id":"audit-uuid","rag_sources":[...],"credits_consumed":0.05}

data: [DONE]

Before the stream ends, a special event: vecstruct event is sent, containing metadata such as RAG sources and Credits usage.
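
A minimal sketch of consuming this stream, assuming the lines have already been read and decoded to text. The names `parse_stream` and `collect` are hypothetical, and the parser assumes each event carries a single `data:` line, as in the example; it is not a full SSE implementation.

```python
import json


def parse_stream(lines):
    """Yield (kind, payload) pairs from an iterable of decoded SSE lines.

    kind is "vecstruct" for the metadata event and "chunk" for everything else.
    """
    event_name = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates the current event, so reset its name.
            event_name = None
            continue
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data = line[len("data:"):].strip()
            if data == "[DONE]":
                return  # End-of-stream sentinel.
            kind = "vecstruct" if event_name == "vecstruct" else "chunk"
            yield kind, json.loads(data)


def collect(lines):
    """Join the streamed text deltas and capture the vecstruct metadata."""
    parts = []
    metadata = None
    for kind, payload in parse_stream(lines):
        if kind == "vecstruct":
            metadata = payload
        else:
            for choice in payload.get("choices", []):
                parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts), metadata
```

A caller would typically display the deltas as they arrive and record the metadata once the `vecstruct` event shows up just before `[DONE]`.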

Model Format

Model IDs use the format provider/model-name, for example:

| Provider | Examples |
| --- | --- |
| openai | openai/gpt-4o, openai/gpt-4o-mini |
| anthropic | anthropic/claude-3-5-sonnet |
| google | google/gemini-2.0-flash |
| baai | baai/bge-m3 (Embedding) |
| cohere | cohere/rerank-v3.5 (Rerank) |

For the full list of available models, see GET /v1/models.
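
A small sketch of splitting a model ID in the provider/model-name format into its two parts. `split_model_id` is a hypothetical helper, and splitting on the first slash only is an assumption about how IDs with additional slashes should be handled.

```python
def split_model_id(model_id):
    """Split "provider/model-name" into a (provider, model_name) tuple."""
    # partition splits on the first "/" only; any later slashes stay in the name.
    provider, sep, name = model_id.partition("/")
    if not sep or not provider or not name:
        raise ValueError("expected provider/model-name, got: " + repr(model_id))
    return provider, name
```

For example, `split_model_id("openai/gpt-4o")` gives the provider `openai` and the model name `gpt-4o`, while an ID without a slash raises `ValueError`.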