Qualo.xyz Docs

Introduction

An API service built on Anthropic that provides OpenAI-compatible access to Claude and other AI models: a drop-in replacement with full streaming support and no SDK changes required.

Any client that works with OpenAI also works with QuaLo; just change the base URL and API key.

Base URLs

Format            | Base URL                 | Note
OpenAI-compatible | https://llm.qualo.xyz/v1 | Includes /v1
Anthropic         | https://llm.qualo.xyz    | NO /v1
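
The base URL rule in the table above can be captured in a small helper so the /v1 suffix is never typed by hand. This is a minimal sketch; the function name and the "openai"/"anthropic" labels are illustrative and not part of any QuaLo SDK.

```python
# Hypothetical helper: names and format labels are assumptions for illustration.
BASE_HOST = "https://llm.qualo.xyz"


def base_url_for(api_format: str) -> str:
    """Return the base URL for an API format, following the Base URLs table."""
    if api_format == "openai":
        return BASE_HOST + "/v1"  # OpenAI-compatible clients need /v1
    if api_format == "anthropic":
        return BASE_HOST  # Anthropic clients must not include /v1
    raise ValueError(f"unknown API format: {api_format!r}")


print(base_url_for("openai"))
print(base_url_for("anthropic"))
```

Some tools deviate from this rule (OpenCode uses /v1 for both formats), so check the tool's own section before relying on the helper.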

Models

Model             | Provider  | Context | Output | Best For
claude-opus-4-6   | Anthropic | 200K    | 64K    | Complex reasoning, architecture
claude-sonnet-4-6 | Anthropic | 200K    | 64K    | Balanced; recommended for most tasks
claude-haiku-4-5  | Anthropic | 144K    | 64K    | Speed-optimized, simple edits
gpt-5.4           | OpenAI    | 200K    | 64K    | Complex reasoning, vision
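
As a rough illustration of how these limits might be used, the sketch below checks a request against the table before sending it. It assumes "K" means 1,000 tokens and that the context window has to hold both the input and the requested output; neither is stated in the table, and the function name is hypothetical.

```python
# Limits copied from the Models table; "K" is taken to mean 1,000 tokens (assumption).
MODEL_LIMITS = {
    "claude-opus-4-6": {"context": 200_000, "output": 64_000},
    "claude-sonnet-4-6": {"context": 200_000, "output": 64_000},
    "claude-haiku-4-5": {"context": 144_000, "output": 64_000},
    "gpt-5.4": {"context": 200_000, "output": 64_000},
}


def fits(model: str, input_tokens: int, max_tokens: int) -> bool:
    """True if the request stays within the listed limits for the model.

    Assumes the context window covers input plus output tokens.
    """
    limits = MODEL_LIMITS[model]
    within_output = max_tokens <= limits["output"]
    within_context = input_tokens + max_tokens <= limits["context"]
    return within_output and within_context


print(fits("claude-sonnet-4-6", 100_000, 64_000))
print(fits("claude-haiku-4-5", 100_000, 64_000))
```

The input token count can come from POST /v1/messages/count_tokens for Anthropic-format requests.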

Quick example

curl https://llm.qualo.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-qualo-api-key" \
  -d '{
    "model": "gpt-5.4",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Quickstart

1. Get an API key

Register at qualo.xyz/register → Dashboard → Copy API key (it is shown only once).

export QUALO_API_KEY=your-api-key-here

2. Make your first API call

OpenAI format:

curl https://llm.qualo.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $QUALO_API_KEY" \
  -d '{
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Anthropic format:

curl https://llm.qualo.xyz/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $QUALO_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

3. Streaming

OpenAI SSE:

curl https://llm.qualo.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $QUALO_API_KEY" \
  -d '{
    "model": "gpt-5.4",
    "stream": true,
    "messages": [{"role": "user", "content": "Tell me a short story."}]
  }'

Anthropic SSE:

curl https://llm.qualo.xyz/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $QUALO_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "stream": true,
    "messages": [{"role": "user", "content": "Tell me a short story."}]
  }'
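
To show what a client does with an OpenAI-style SSE stream, here is a minimal parsing sketch. The sample events are hand-written in the standard Chat Completions chunk shape, not captured from QuaLo, and the function name is illustrative. Anthropic-format streams use different event types (for example content_block_delta) and are not handled here.

```python
import json


def iter_sse_text(lines):
    """Yield text deltas from OpenAI-style SSE lines (already decoded to str)."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample events (assumption: standard Chat Completions chunk shape).
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Once"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": " upon"}}]}',
    "data: [DONE]",
]
print("".join(iter_sse_text(sample)))
```

In practice the OpenAI and Anthropic SDKs do this parsing for you; the sketch is only useful when reading the raw stream.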

Authentication

Authentication Headers

OpenAI format (Bearer token):

Authorization: Bearer <your-api-key>
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36

Anthropic format (x-api-key):

x-api-key: <your-api-key>
anthropic-version: 2023-06-01
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36

⚠️ Bypassing Cloudflare (required)
All requests must include a custom User-Agent header to avoid 403 errors:
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
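
The two header sets can be assembled by one small function, which keeps the User-Agent from being forgotten. This is a sketch; the function name and the "openai"/"anthropic" labels are hypothetical, while the header names and values are the ones listed in this section.

```python
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"


def build_headers(api_key: str, api_format: str) -> dict:
    """Return request headers for the given API format (illustrative helper)."""
    headers = {
        "Content-Type": "application/json",
        "User-Agent": USER_AGENT,  # needed on every request to avoid 403
    }
    if api_format == "openai":
        headers["Authorization"] = f"Bearer {api_key}"
    elif api_format == "anthropic":
        headers["x-api-key"] = api_key
        headers["anthropic-version"] = "2023-06-01"
    else:
        raise ValueError(f"unknown API format: {api_format!r}")
    return headers


print(build_headers("your-api-key", "anthropic"))
```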

Environment variables

# OpenAI SDK compatibility
OPENAI_API_KEY=your-api-key-here
OPENAI_BASE_URL=https://llm.qualo.xyz/v1

# Anthropic SDK compatibility
ANTHROPIC_API_KEY=your-api-key-here
ANTHROPIC_BASE_URL=https://llm.qualo.xyz

Verifying your key

curl https://llm.qualo.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -d '{"model": "gpt-5.4", "messages": [{"role": "user", "content": "ping"}], "max_tokens": 5}'

  • HTTP 200 → the key works
  • HTTP 401 → the key is invalid or missing

API Reference

Endpoints

Method | Path                      | Description
POST   | /v1/chat/completions      | OpenAI Chat Completions
POST   | /v1/messages              | Anthropic Messages API
POST   | /v1/responses             | OpenAI Responses API (o-series, Codex)
GET    | /v1/models                | List models
POST   | /v1/messages/count_tokens | Count tokens

POST /v1/chat/completions

Parameter        | Type    | Required | Description
model            | string  | Yes      | Model ID (e.g., gpt-5.4, claude-sonnet-4-6)
messages         | array   | Yes      | Array of {role, content} objects
stream           | boolean | No       | SSE streaming. Default: false
temperature      | number  | No       | 0–2. Default: 1
max_tokens       | integer | No       | Max tokens to generate
reasoning_effort | string  | No       | low, medium, high

curl https://llm.qualo.xyz/v1/chat/completions \
  -H "Authorization: Bearer $QUALO_API_KEY" \
  -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'

Response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1714000000,
  "model": "gpt-5.4",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 10, "completion_tokens": 12, "total_tokens": 22}
}

POST /v1/messages

Parameter   | Type         | Required | Description
model       | string       | Yes      | Anthropic model ID
max_tokens  | integer      | Yes      | Maximum tokens to generate (required)
messages    | array        | Yes      | Array of message objects
system      | string/array | No       | System prompt
stream      | boolean      | No       | SSE streaming
temperature | number       | No       | 0–1. Default: 1
thinking    | object       | No       | Extended thinking: { type: "enabled", budget_tokens: N }

curl https://llm.qualo.xyz/v1/messages \
  -H "x-api-key: $QUALO_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Response:

{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello! How can I help you today?"}],
  "model": "claude-sonnet-4-6",
  "stop_reason": "end_turn",
  "usage": {"input_tokens": 10, "output_tokens": 12}
}
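
The two endpoints return differently shaped bodies: Chat Completions puts the text under choices[0].message.content, while Messages returns a list of content blocks. A minimal sketch of one accessor that handles both, using trimmed copies of the sample responses in this section (the function name is illustrative):

```python
def extract_text(response: dict) -> str:
    """Return the assistant text from either response shape."""
    if "choices" in response:
        # OpenAI Chat Completions shape
        return response["choices"][0]["message"]["content"]
    # Anthropic Messages shape: join every text block
    return "".join(
        block["text"]
        for block in response["content"]
        if block.get("type") == "text"
    )


openai_style = {
    "object": "chat.completion",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
        "finish_reason": "stop",
    }],
}
anthropic_style = {
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! How can I help you today?"}],
}

print(extract_text(openai_style))
print(extract_text(anthropic_style))
```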

POST /v1/responses

Parameter         | Type         | Required | Description
model             | string       | Yes      | Model ID (e.g., o3, codex-mini-latest)
input             | string/array | Yes      | Prompt string or messages array
instructions      | string       | No       | System instructions
stream            | boolean      | No       | SSE streaming
max_output_tokens | integer      | No       | Max output tokens
reasoning         | object       | No       | { effort: "low"|"medium"|"high", summary: "auto" }
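
No request example is given for this endpoint, so the sketch below only shows how a request body could be assembled from the parameters in the table. The function and argument names are hypothetical, and the model ID and prompt are placeholders; the resulting dictionary would be sent as JSON to POST /v1/responses with the usual headers.

```python
import json

VALID_EFFORTS = ("low", "medium", "high")


def build_responses_payload(model, prompt, instructions=None, stream=False,
                            max_output_tokens=None, effort=None):
    """Build a JSON-serialisable body for POST /v1/responses (illustrative)."""
    payload = {"model": model, "input": prompt, "stream": stream}
    if instructions is not None:
        payload["instructions"] = instructions
    if max_output_tokens is not None:
        payload["max_output_tokens"] = max_output_tokens
    if effort is not None:
        if effort not in VALID_EFFORTS:
            raise ValueError(f"effort must be one of {VALID_EFFORTS}")
        payload["reasoning"] = {"effort": effort, "summary": "auto"}
    return payload


body = build_responses_payload(
    "o3",
    "Summarise SSE in one sentence.",
    instructions="Be concise.",
    effort="low",
)
print(json.dumps(body, indent=2))
```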

GET /v1/models

curl https://llm.qualo.xyz/v1/models \
  -H "Authorization: Bearer $QUALO_API_KEY" \
  -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"

POST /v1/messages/count_tokens

curl https://llm.qualo.xyz/v1/messages/count_tokens \
  -H "x-api-key: $QUALO_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-sonnet-4-6", "messages": [{"role": "user", "content": "Hello"}]}'

Response: { "input_tokens": 14 }

Error Codes

HTTP Code | Meaning
400       | Bad request: malformed JSON or missing field
401       | Unauthorized: API key is wrong or missing
403       | Forbidden: blocked by Cloudflare (missing User-Agent)
429       | Rate limit exceeded
500       | Internal server error
503       | Service unavailable: token pool exhausted
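
A client typically maps these codes to a decision: fix the request, or retry later. The sketch below encodes the table and adds a retry policy. Treating 429, 500 and 503 as retryable is an assumption made here for illustration, not something the service documents; the names are hypothetical.

```python
# Meanings copied from the Error Codes table.
ERROR_MEANINGS = {
    400: "Bad request: malformed JSON or missing field",
    401: "Unauthorized: API key is wrong or missing",
    403: "Forbidden: blocked by Cloudflare (missing User-Agent)",
    429: "Rate limit exceeded",
    500: "Internal server error",
    503: "Service unavailable: token pool exhausted",
}

# Assumption: these look transient, so a later retry may succeed.
RETRYABLE = {429, 500, 503}


def classify_error(status: int) -> tuple:
    """Return (meaning, should_retry) for an HTTP error status."""
    meaning = ERROR_MEANINGS.get(status, "Unknown status")
    return meaning, status in RETRYABLE


for code in (401, 429):
    print(code, classify_error(code))
```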

SDK

OpenAI SDK

pip install openai

from openai import OpenAI

client = OpenAI(
    base_url="https://llm.qualo.xyz/v1",
    api_key="your-qualo-api-key",
    default_headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
    },
)

# Non-streaming
response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=1024,
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True,
)
for chunk in stream:
    # Some chunks carry no choices or no text, so guard before printing
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

# List models
for model in client.models.list().data:
    print(model.id)

⚠️ The base URL must include /v1 when using the OpenAI SDK.

Anthropic SDK

pip install anthropic

from anthropic import Anthropic

client = Anthropic(
    base_url="https://llm.qualo.xyz",  # no /v1
    api_key="your-qualo-api-key",
    default_headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
    },
)

# Non-streaming
response = client.messages.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=1024,
)
print(response.content[0].text)

# Streaming
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=256,
    messages=[{"role": "user", "content": "Write a poem"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

# System message
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a helpful coding assistant.",
    messages=[{"role": "user", "content": "How do I read a file in Python?"}],
)

⚠️ The Anthropic SDK base URL must NOT include /v1. The SDK appends /v1/messages itself.

Integrations

Kilo Code

VS Code Extension

Using Claude (Anthropic):

1. Open VS Code → Settings → Extensions → Kilo Code
2. Configure:

{
  "kilocode.apiProvider": "anthropic",
  "kilocode.apiBaseUrl": "https://llm.qualo.xyz",
  "kilocode.apiKey": "your-qualo-api-key",
  "kilocode.model": "claude-sonnet-4-6"
}

Using GPT/Gemini (OpenAI-compatible):

{
  "kilocode.apiProvider": "openai-compatible",
  "kilocode.apiBaseUrl": "https://llm.qualo.xyz/v1",
  "kilocode.apiKey": "your-qualo-api-key",
  "kilocode.model": "gpt-5.4"
}

Troubleshooting

  • Connection refused: check the base URL. Anthropic: no /v1; OpenAI-compatible: with /v1
  • Invalid API key: verify that the key is correct and still active
  • Model not found: use the exact names: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5, gpt-5.4

Roo Code

VS Code Extension

Using Claude (Anthropic):

{
  "rooCode.apiProvider": "anthropic",
  "rooCode.anthropic.baseUrl": "https://llm.qualo.xyz",
  "rooCode.anthropic.apiKey": "your-qualo-api-key",
  "rooCode.anthropic.model": "claude-sonnet-4-6"
}

Using GPT/Gemini (OpenAI-compatible):

{
  "rooCode.apiProvider": "openai-compatible",
  "rooCode.openaiCompatible.baseUrl": "https://llm.qualo.xyz/v1",
  "rooCode.openaiCompatible.apiKey": "your-qualo-api-key",
  "rooCode.openaiCompatible.model": "gpt-5.4"
}

Troubleshooting

  • No completions: verify that the base URL ends with /v1 (OpenAI) or does not (Anthropic), and that the API key is correct
  • Slow responses: use claude-haiku-4-5 for faster inline completions

Claude Code CLI

Terminal AI assistant from Anthropic

1. Install

npm install -g @anthropic-ai/claude-code

2. Configure ~/.claude/settings.json

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://llm.qualo.xyz",
    "ANTHROPIC_AUTH_TOKEN": "your-qualo-api-key",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "claude-sonnet-4-6",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "claude-opus-4-6",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "claude-haiku-4-5",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
  },
  "permissions": {
    "allow": [],
    "deny": []
  },
  "alwaysThinkingEnabled": true
}

3. Run

cd your-project
claude
# To verify, type /status

Model Variables

Variable                       | Default
ANTHROPIC_DEFAULT_OPUS_MODEL   | claude-opus-4-6
ANTHROPIC_DEFAULT_SONNET_MODEL | claude-sonnet-4-6
ANTHROPIC_DEFAULT_HAIKU_MODEL  | claude-haiku-4-5

Troubleshooting

  • Auth error: check ANTHROPIC_AUTH_TOKEN. Try deleting ~/.claude/settings.json and configuring again.
  • Timeout: raise API_TIMEOUT_MS to 600000 or higher
  • Connection error: check your internet connection and make sure https://llm.qualo.xyz is not blocked by a firewall

Droid

AI Software Engineering Agent by Factory

Environment Variables

export ANTHROPIC_BASE_URL="https://llm.qualo.xyz"
export ANTHROPIC_API_KEY="your-qualo-api-key"

Config file ~/.droid/config.json

{
  "provider": "anthropic",
  "model": "claude-sonnet-4-6",
  "apiBaseUrl": "https://llm.qualo.xyz",
  "maxTokens": 8192,
  "temperature": 0.7
}

Alternative: Custom Models

{
  "custom_models": [
    {
      "model_display_name": "Claude Sonnet 4.6",
      "model": "claude-sonnet-4-6",
      "base_url": "https://llm.qualo.xyz",
      "api_key": "your-qualo-api-key",
      "provider": "anthropic"
    }
  ]
}

Test

droid
# or
droid "What's 2+2?"

Troubleshooting

  • Connection error: the base URL is https://llm.qualo.xyz (no /v1)
  • Model not responding: check that the API key still has credits
  • Slow responses: use claude-haiku-4-5 or lower maxTokens

OpenCode

TUI-rich Terminal AI Coding Agent

Config file: opencode.json (in the project root or at ~/.config/opencode/opencode.json)

Using GPT/Gemini (OpenAI-compatible):

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "qualo": {
      "name": "QuaLo (OpenAI)",
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "apiKey": "your-qualo-api-key",
        "baseURL": "https://llm.qualo.xyz/v1"
      }
    }
  }
}

Using Claude (Anthropic):

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "qualo-anthropic": {
      "name": "QuaLo (Anthropic)",
      "npm": "@ai-sdk/anthropic",
      "options": {
        "apiKey": "your-qualo-api-key",
        "baseURL": "https://llm.qualo.xyz/v1"
      }
    }
  }
}

Full config with models:

{
  "provider": {
    "qualo": {
      "name": "QuaLo (OpenAI)",
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "apiKey": "your-qualo-api-key",
        "baseURL": "https://llm.qualo.xyz/v1"
      },
      "models": {
        "gpt-5.4": {
          "name": "GPT-5.4",
          "limit": { "context": 200000, "output": 64000 },
          "modalities": { "input": ["text", "image", "pdf"], "output": ["text"] },
          "reasoning": true,
          "options": { "reasoningEffort": "xhigh" },
          "tool_call": true,
          "attachment": true
        }
      }
    },
    "qualo-anthropic": {
      "name": "QuaLo (Anthropic)",
      "npm": "@ai-sdk/anthropic",
      "options": {
        "apiKey": "your-qualo-api-key",
        "baseURL": "https://llm.qualo.xyz/v1"
      },
      "models": {
        "claude-opus-4-6": {
          "name": "Claude Opus 4.6",
          "limit": { "context": 200000, "output": 64000 },
          "modalities": { "input": ["text", "image", "pdf"], "output": ["text"] },
          "reasoning": true,
          "tool_call": true,
          "attachment": true
        },
        "claude-sonnet-4-6": {
          "name": "Claude Sonnet 4.6",
          "limit": { "context": 200000, "output": 32000 },
          "modalities": { "input": ["text", "image", "pdf"], "output": ["text"] },
          "reasoning": true,
          "tool_call": true,
          "attachment": true
        },
        "claude-haiku-4-5": {
          "name": "Claude Haiku 4.5",
          "limit": { "context": 144000, "output": 32000 },
          "modalities": { "input": ["text", "image", "pdf"], "output": ["text"] },
          "reasoning": true,
          "tool_call": true
        }
      }
    }
  }
}

Choosing the default model:

{ "model": "qualo-anthropic/claude-sonnet-4-6" }

or

{ "model": "qualo/gpt-5.4" }

Troubleshooting

  • Model not appearing: check that the model ID is correct in the provider's models block
  • Auth error: check that apiKey matches the key from the QuaLo dashboard
  • Wrong base URL: GPT/Gemini → /v1, Anthropic/Claude → /v1 (OpenCode uses /v1 for both)

OpenClaw

Multi-model Agentic Coding

Using Claude (Anthropic):

{
  "provider": "anthropic",
  "baseUrl": "https://llm.qualo.xyz",
  "apiKey": "your-qualo-api-key",
  "model": "claude-sonnet-4-6"
}

Using GPT/Gemini (OpenAI-compatible):

{
  "provider": "openai-compatible",
  "baseUrl": "https://llm.qualo.xyz/v1",
  "apiKey": "your-qualo-api-key",
  "model": "gpt-5.4"
}

Multi-provider config ~/.openclaw/config.json:

{
  "providers": {
    "qualo": {
      "type": "openai-compatible",
      "baseUrl": "https://llm.qualo.xyz/v1",
      "apiKey": "your-qualo-api-key"
    },
    "qualo-anthropic": {
      "type": "anthropic",
      "baseUrl": "https://llm.qualo.xyz",
      "apiKey": "your-qualo-api-key"
    }
  },
  "defaultProvider": "qualo-anthropic",
  "defaultModel": "claude-sonnet-4-6"
}

OpenClaw sends requests directly, so no extra User-Agent header is needed.

Troubleshooting

  • Connection refused: OpenAI-compatible → /v1, Anthropic → no /v1
  • Model not found: use the exact IDs: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5, gpt-5.4

Pi

Terminal AI Coding Agent (@mariozechner/pi-coding-agent)

1. Create ~/.pi/agent/models.json

mkdir -p ~/.pi/agent

2. Contents of models.json:

{
  "providers": {
    "qualo": {
      "baseUrl": "https://llm.qualo.xyz/v1",
      "api": "openai-completions",
      "apiKey": "your-qualo-api-key",
      "models": [
        {
          "id": "gpt-5.4",
          "name": "GPT-5.4",
          "reasoning": true,
          "input": ["text", "image"],
          "contextWindow": 200000,
          "maxTokens": 64000
        }
      ]
    },
    "qualo-anthropic": {
      "baseUrl": "https://llm.qualo.xyz",
      "api": "anthropic-messages",
      "apiKey": "your-qualo-api-key",
      "models": [
        {
          "id": "claude-opus-4-6",
          "name": "Claude Opus 4.6",
          "reasoning": true,
          "input": ["text", "image"],
          "contextWindow": 144000,
          "maxTokens": 64000
        },
        {
          "id": "claude-sonnet-4-6",
          "name": "Claude Sonnet 4.6",
          "reasoning": true,
          "input": ["text", "image"],
          "contextWindow": 200000,
          "maxTokens": 32000
        },
        {
          "id": "claude-haiku-4-5",
          "name": "Claude Haiku 4.5",
          "reasoning": true,
          "input": ["text", "image"],
          "contextWindow": 144000,
          "maxTokens": 32000
        }
      ]
    }
  }
}

API key formats:

"apiKey": "sk-qualo-..."           // literal value
"apiKey": "QUALO_API_KEY"          // env var name
"apiKey": "!echo $QUALO_API_KEY"   // shell command
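
To make the three formats concrete, here is one way such a value could be resolved. This is not Pi's actual implementation: the function name, the order of the checks (shell command, then environment variable, then literal), and the use of a shell subprocess are all assumptions made for illustration.

```python
import os
import subprocess


def resolve_api_key(value: str) -> str:
    """Resolve an apiKey setting to the actual key (illustrative only)."""
    if value.startswith("!"):
        # Shell command form: run everything after "!" and use its output
        result = subprocess.run(
            value[1:], shell=True, capture_output=True, text=True, check=True
        )
        return result.stdout.strip()
    if value in os.environ:
        # Env var name form: the value names an existing environment variable
        return os.environ[value]
    # Otherwise treat the value as the literal key
    return value


os.environ["QUALO_API_KEY_DEMO"] = "sk-demo"  # placeholder value for the demo
print(resolve_api_key("QUALO_API_KEY_DEMO"))
print(resolve_api_key("sk-qualo-literal"))
```

Keeping the key in an environment variable or behind a shell command avoids storing it in plain text inside models.json.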

3. Choose a model

pi

  • Ctrl+L: open the model picker
  • Ctrl+P: cycle through favourite models
  • /model: same as Ctrl+L

models.json is reloaded every time you open /model, so no restart is needed.

Troubleshooting

  • Models not showing in /model: check that ~/.pi/agent/models.json is valid JSON and that baseUrl is correct
  • Auth error: verify the API key; it can be a literal value, an env var name, or a shell command

Rate Limits

Plan          | Requests/Minute
Pay-as-you-go | 20

Rate limit headers (on every response):

  • X-RateLimit-Limit: max requests per minute
  • X-RateLimit-Remaining: requests remaining
  • X-RateLimit-Reset: Unix timestamp at which the window resets
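
These three headers are enough for a client to decide how long to pause before the next request. A minimal sketch, assuming the header values are integer strings and that the headers arrive in a plain dictionary with exactly this capitalisation (real HTTP libraries usually offer case-insensitive lookup); the function name is hypothetical.

```python
import time
from typing import Mapping, Optional


def rate_limit_status(headers: Mapping[str, str],
                      now: Optional[float] = None) -> dict:
    """Summarise the rate limit headers and how long to wait, in seconds."""
    now = time.time() if now is None else now
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix timestamp
    # Only wait when the current window is used up
    wait = max(0.0, reset_at - now) if remaining == 0 else 0.0
    return {"limit": limit, "remaining": remaining, "wait_seconds": wait}


example = {
    "X-RateLimit-Limit": "20",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1714000060",
}
print(rate_limit_status(example, now=1714000000))
```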

429 Response:

{
  "error": {
    "message": "Rate limit exceeded. Please retry after 60 seconds.",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}

Best Practices

  • Implement exponential backoff on 429
  • Monitor rate limit headers
  • Use request queues for batch operations
  • Cache responses where possible

⚠️ Repeatedly hitting rate limits may lead to a temporary suspension of your API key.
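
Exponential backoff on 429 can be sketched as a small wrapper around any call. The RateLimited exception, the function name, and the default delays are hypothetical; the caller is expected to raise RateLimited when it sees HTTP 429. The sleep function is injectable so the logic can be exercised without actually waiting.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical exception the caller raises on HTTP 429."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fn(), retrying on RateLimited with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error
            # Delays grow as 1s, 2s, 4s, ... and are capped at max_delay
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% random jitter so parallel clients do not retry in lockstep
            sleep(delay + random.uniform(0, delay * 0.1))
```

A request function that raises RateLimited on a 429 status can then be wrapped as call_with_backoff(send_request), where send_request is whatever callable performs the request. With a limit of 20 requests per minute, capping the delay at 60 seconds matches the retry hint in the 429 message.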

Quick reference

Tool        | Provider                  | Base URL                 | Config Key
Kilo Code   | anthropic                 | https://llm.qualo.xyz    | kilocode.apiBaseUrl
Kilo Code   | openai-compatible         | https://llm.qualo.xyz/v1 | kilocode.apiBaseUrl
Roo Code    | anthropic                 | https://llm.qualo.xyz    | rooCode.anthropic.baseUrl
Roo Code    | openai-compatible         | https://llm.qualo.xyz/v1 | rooCode.openaiCompatible.baseUrl
Claude Code | anthropic                 | https://llm.qualo.xyz    | ANTHROPIC_BASE_URL
Droid       | anthropic                 | https://llm.qualo.xyz    | ANTHROPIC_BASE_URL
OpenCode    | @ai-sdk/openai-compatible | https://llm.qualo.xyz/v1 | options.baseURL
OpenCode    | @ai-sdk/anthropic         | https://llm.qualo.xyz/v1 | options.baseURL
OpenClaw    | anthropic                 | https://llm.qualo.xyz    | baseUrl
OpenClaw    | openai-compatible         | https://llm.qualo.xyz/v1 | baseUrl
Pi          | anthropic-messages        | https://llm.qualo.xyz    | baseUrl
Pi          | openai-completions        | https://llm.qualo.xyz/v1 | baseUrl

General rule: OpenAI-compatible → includes /v1 | Anthropic → NO /v1 (except OpenCode, which uses /v1 for both)