Introduction
An API service built on Anthropic, providing OpenAI-compatible access to Claude and other AI models: a drop-in replacement with full streaming support and no SDK changes required.
Any client that works with OpenAI also works with QuaLo: just change the base URL and the API key.
Base URLs
| Format | Base URL | Notes |
|---|---|---|
| OpenAI-compatible | https://llm.qualo.xyz/v1 | Includes /v1 |
| Anthropic | https://llm.qualo.xyz | NO /v1 |
Models
| Model | Provider | Context | Output | Best For |
|---|---|---|---|---|
| claude-opus-4-6 | Anthropic | 200K | 64K | Complex reasoning, architecture |
| claude-sonnet-4-6 | Anthropic | 200K | 64K | Balanced; recommended for most tasks |
| claude-haiku-4-5 | Anthropic | 144K | 64K | Speed-optimized, simple edits |
| gpt-5.4 | OpenAI | 200K | 64K | Complex reasoning, vision |
Quick example
curl https://llm.qualo.xyz/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your-qualo-api-key" \
-d '{
"model": "gpt-5.4",
"max_tokens": 1024,
"messages": [{"role": "user", "content": "Hello!"}]
}'
Quickstart
1. Get an API key
Sign up at qualo.xyz/register → Dashboard → copy your API key (shown only once).
export QUALO_API_KEY=your-api-key-here
2. Make your first API call
OpenAI format:
curl https://llm.qualo.xyz/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $QUALO_API_KEY" \
-d '{
"model": "gpt-5.4",
"messages": [{"role": "user", "content": "Hello!"}]
}'
Anthropic format:
curl https://llm.qualo.xyz/v1/messages \
-H "Content-Type: application/json" \
-H "x-api-key: $QUALO_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"messages": [{"role": "user", "content": "Hello!"}]
}'
3. Streaming
OpenAI SSE:
curl https://llm.qualo.xyz/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $QUALO_API_KEY" \
-d '{
"model": "gpt-5.4",
"stream": true,
"messages": [{"role": "user", "content": "Tell me a short story."}]
}'
Anthropic SSE:
curl https://llm.qualo.xyz/v1/messages \
-H "Content-Type: application/json" \
-H "x-api-key: $QUALO_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"stream": true,
"messages": [{"role": "user", "content": "Tell me a short story."}]
}'
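If you are not using an SDK, streamed responses arrive as Server-Sent Events: each chunk is a `data: {json}` line, and the OpenAI-format stream ends with a `data: [DONE]` sentinel. A minimal parser sketch (plain Python, no network; the sample chunks below are illustrative, not captured server output):

```python
import json

def collect_openai_sse(lines):
    """Extract the text deltas from an OpenAI-format SSE stream."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive and comment lines
        payload = line[len("data: "):]
        if payload == "[DONE]":  # OpenAI-format streams end with this sentinel
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        if delta.get("content"):
            text.append(delta["content"])
    return "".join(text)

# Illustrative chunks in the shape /v1/chat/completions streams use
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    'data: [DONE]',
]
print(collect_openai_sse(sample))  # Hello!
```

The Anthropic format uses named events (`content_block_delta`, `message_stop`) instead of a `[DONE]` sentinel, so its SDKs are the easier route there.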
Authentication
Authentication Headers
OpenAI format (Bearer Token):
Authorization: Bearer <your-api-key>
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
Anthropic format (x-api-key):
x-api-key: <your-api-key>
anthropic-version: 2023-06-01
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
All requests must include a custom User-Agent header to avoid 403 errors:
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
Environment variables
# OpenAI SDK compatibility
OPENAI_API_KEY=your-api-key-here
OPENAI_BASE_URL=https://llm.qualo.xyz/v1
# Anthropic SDK compatibility
ANTHROPIC_API_KEY=your-api-key-here
ANTHROPIC_BASE_URL=https://llm.qualo.xyz
Verify your key
curl https://llm.qualo.xyz/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your-api-key" \
-d '{"model": "gpt-5.4", "messages": [{"role": "user", "content": "ping"}], "max_tokens": 5}'
- ✅ HTTP 200 → key works
- ❌ HTTP 401 → key invalid or missing
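The key check is easy to script. A small helper that builds the correct header set for either auth format (stdlib only; pair it with any HTTP client to send the ping request above — the User-Agent value is the one required earlier):

```python
UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"

def qualo_headers(api_key, fmt="openai"):
    """Build request headers for either QuaLo auth format."""
    common = {"Content-Type": "application/json", "User-Agent": UA}
    if fmt == "openai":
        # Bearer token, used by /v1/chat/completions
        return {**common, "Authorization": f"Bearer {api_key}"}
    if fmt == "anthropic":
        # x-api-key plus the required anthropic-version header
        return {**common, "x-api-key": api_key, "anthropic-version": "2023-06-01"}
    raise ValueError(f"unknown format: {fmt}")

headers = qualo_headers("your-api-key", "anthropic")
print(headers["anthropic-version"])  # 2023-06-01
```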
API Reference
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | OpenAI Chat Completions |
| POST | /v1/messages | Anthropic Messages API |
| POST | /v1/responses | OpenAI Responses API (o-series, Codex) |
| GET | /v1/models | List models |
| POST | /v1/messages/count_tokens | Count tokens |
POST /v1/chat/completions
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | ✅ | Model ID (e.g., gpt-5.4, claude-sonnet-4-6) |
| messages | array | ✅ | Array of {role, content} objects |
| stream | boolean | | SSE streaming. Default: false |
| temperature | number | | 0–2. Default: 1 |
| max_tokens | integer | | Max tokens to generate |
| reasoning_effort | string | | low, medium, high |
curl https://llm.qualo.xyz/v1/chat/completions \
-H "Authorization: Bearer $QUALO_API_KEY" \
-H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.4",
"messages": [{"role": "user", "content": "Hello!"}],
"stream": false
}'
Response:
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1714000000,
"model": "gpt-5.4",
"choices": [{
"index": 0,
"message": {"role": "assistant", "content": "Hello! How can I help you today?"},
"finish_reason": "stop"
}],
"usage": {"prompt_tokens": 10, "completion_tokens": 12, "total_tokens": 22}
}
POST /v1/messages
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | ✅ | Anthropic model ID |
| max_tokens | integer | ✅ | Maximum tokens (required) |
| messages | array | ✅ | Array of message objects |
| system | string/array | | System prompt |
| stream | boolean | | SSE streaming |
| temperature | number | | 0–1. Default: 1 |
| thinking | object | | Extended thinking: { type: "enabled", budget_tokens: N } |
curl https://llm.qualo.xyz/v1/messages \
-H "x-api-key: $QUALO_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"messages": [{"role": "user", "content": "Hello!"}]
}'
Response:
{
"id": "msg_abc123",
"type": "message",
"role": "assistant",
"content": [{"type": "text", "text": "Hello! How can I help you today?"}],
"model": "claude-sonnet-4-6",
"stop_reason": "end_turn",
"usage": {"input_tokens": 10, "output_tokens": 12}
}
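The `thinking` parameter from the table above slots into the same request body. A sketch of payload construction with extended thinking enabled (the budget value is illustrative; per Anthropic's Messages API, `max_tokens` must be larger than `budget_tokens`):

```python
import json

def messages_body(prompt, budget_tokens=None, max_tokens=2048):
    """Build a /v1/messages request body, optionally with extended thinking."""
    body = {
        "model": "claude-sonnet-4-6",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    if budget_tokens is not None:
        # The thinking budget must stay below max_tokens
        assert budget_tokens < max_tokens
        body["thinking"] = {"type": "enabled", "budget_tokens": budget_tokens}
    return json.dumps(body)

print(messages_body("Prove that 17 is prime.", budget_tokens=1024))
```

With thinking enabled, the response `content` array contains `thinking` blocks before the final `text` block.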
POST /v1/responses
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | ✅ | Model ID (e.g., o3, codex-mini-latest) |
| input | string/array | ✅ | Prompt or messages array |
| instructions | string | | System instructions |
| stream | boolean | | SSE streaming |
| max_output_tokens | integer | | Max output tokens |
| reasoning | object | | { effort: "low"/"medium"/"high", summary: "auto" } |
GET /v1/models
curl https://llm.qualo.xyz/v1/models \
-H "Authorization: Bearer $QUALO_API_KEY" \
-H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
POST /v1/messages/count_tokens
curl https://llm.qualo.xyz/v1/messages/count_tokens \
-H "x-api-key: $QUALO_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "Content-Type: application/json" \
-d '{"model": "claude-sonnet-4-6", "messages": [{"role": "user", "content": "Hello"}]}'
Response: { "input_tokens": 14 }
Error Codes
| HTTP Code | Meaning |
|---|---|
| 400 | Bad request: malformed JSON or missing field |
| 401 | Unauthorized: API key wrong or missing |
| 403 | Forbidden: blocked by Cloudflare (missing User-Agent) |
| 429 | Rate limit exceeded |
| 500 | Internal server error |
| 503 | Service unavailable: token pool exhausted |
SDK
OpenAI SDK
pip install openai
from openai import OpenAI
client = OpenAI(
base_url="https://llm.qualo.xyz/v1",
api_key="your-qualo-api-key",
default_headers={
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
}
)
# Non-streaming
response = client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "Hello!"}],
max_tokens=1024
)
print(response.choices[0].message.content)
# Streaming
stream = client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "Write a poem"}],
stream=True
)
for chunk in stream:
if chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="", flush=True)
# List models
for model in client.models.list().data:
print(model.id)
Note: the base URL must include /v1 when using the OpenAI SDK.
Anthropic SDK
pip install anthropic
from anthropic import Anthropic
client = Anthropic(
base_url="https://llm.qualo.xyz", # KHÔNG có /v1
api_key="your-qualo-api-key",
default_headers={
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
}
)
# Non-streaming
response = client.messages.create(
model="claude-sonnet-4-6",
messages=[{"role": "user", "content": "Hello!"}],
max_tokens=1024
)
print(response.content[0].text)
# Streaming
with client.messages.stream(
model="claude-sonnet-4-6",
max_tokens=256,
messages=[{"role": "user", "content": "Write a poem"}]
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)
# System message
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
system="You are a helpful coding assistant.",
messages=[{"role": "user", "content": "How do I read a file in Python?"}]
)
Note: the base URL must NOT include /v1; the SDK appends /v1/messages itself.
Integrations
Kilo Code
VS Code Extension
Using Claude (Anthropic):
1. Open VS Code → Settings → Extensions → Kilo Code
2. Configure:
{
"kilocode.apiProvider": "anthropic",
"kilocode.apiBaseUrl": "https://llm.qualo.xyz",
"kilocode.apiKey": "your-qualo-api-key",
"kilocode.model": "claude-sonnet-4-6"
}
Using GPT/Gemini (OpenAI-compatible):
{
"kilocode.apiProvider": "openai-compatible",
"kilocode.apiBaseUrl": "https://llm.qualo.xyz/v1",
"kilocode.apiKey": "your-qualo-api-key",
"kilocode.model": "gpt-5.4"
}
Troubleshooting
- Connection refused: check the base URL; Anthropic has no /v1, OpenAI-compatible includes /v1
- Invalid API key: verify the key is correct and still active
- Model not found: use the exact names: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5, gpt-5.4
Roo Code
VS Code Extension
Using Claude (Anthropic):
{
"rooCode.apiProvider": "anthropic",
"rooCode.anthropic.baseUrl": "https://llm.qualo.xyz",
"rooCode.anthropic.apiKey": "your-qualo-api-key",
"rooCode.anthropic.model": "claude-sonnet-4-6"
}
Using GPT/Gemini (OpenAI-compatible):
{
"rooCode.apiProvider": "openai-compatible",
"rooCode.openaiCompatible.baseUrl": "https://llm.qualo.xyz/v1",
"rooCode.openaiCompatible.apiKey": "your-qualo-api-key",
"rooCode.openaiCompatible.model": "gpt-5.4"
}
Troubleshooting
- No completions: verify the base URL ends in /v1 (OpenAI) or omits it (Anthropic), and that the API key is correct
- Slow responses: use claude-haiku-4-5 for faster inline completions
Claude Code CLI
Terminal AI assistant from Anthropic
1. Install
npm install -g @anthropic-ai/claude-code
2. Configure ~/.claude/settings.json
{
"env": {
"ANTHROPIC_BASE_URL": "https://llm.qualo.xyz",
"ANTHROPIC_AUTH_TOKEN": "your-qualo-api-key",
"ANTHROPIC_DEFAULT_SONNET_MODEL": "claude-sonnet-4-6",
"ANTHROPIC_DEFAULT_OPUS_MODEL": "claude-opus-4-6",
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "claude-haiku-4-5",
"CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
},
"permissions": {
"allow": [],
"deny": []
},
"alwaysThinkingEnabled": true
}
3. Run
cd your-project
claude
# Verify: type /status
Model Variables
| Variable | Default |
|---|---|
| ANTHROPIC_DEFAULT_OPUS_MODEL | claude-opus-4-6 |
| ANTHROPIC_DEFAULT_SONNET_MODEL | claude-sonnet-4-6 |
| ANTHROPIC_DEFAULT_HAIKU_MODEL | claude-haiku-4-5 |
Troubleshooting
- Auth error: check ANTHROPIC_AUTH_TOKEN. Try deleting ~/.claude/settings.json and reconfiguring.
- Timeout: raise API_TIMEOUT_MS to 600000 or higher
- Connection error: check your internet connection and make sure https://llm.qualo.xyz is not blocked by a firewall
Droid
AI Software Engineering Agent by Factory
Environment Variables
export ANTHROPIC_BASE_URL="https://llm.qualo.xyz"
export ANTHROPIC_API_KEY="your-qualo-api-key"
Config file ~/.droid/config.json
{
"provider": "anthropic",
"model": "claude-sonnet-4-6",
"apiBaseUrl": "https://llm.qualo.xyz",
"maxTokens": 8192,
"temperature": 0.7
}
Alternative: Custom Models
{
"custom_models": [
{
"model_display_name": "Claude Sonnet 4.6",
"model": "claude-sonnet-4-6",
"base_url": "https://llm.qualo.xyz",
"api_key": "your-qualo-api-key",
"provider": "anthropic"
}
]
}
Test
droid
# or
droid "What's 2+2?"
Troubleshooting
- Connection error: the base URL is https://llm.qualo.xyz (no /v1)
- Model not responding: check that the API key still has credits
- Slow responses: use claude-haiku-4-5 or lower maxTokens
OpenCode
TUI-rich Terminal AI Coding Agent
Config file: opencode.json (project root or ~/.config/opencode/opencode.json)
Using GPT/Gemini (OpenAI-compatible):
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"qualo": {
"name": "QuaLo (OpenAI)",
"npm": "@ai-sdk/openai-compatible",
"options": {
"apiKey": "your-qualo-api-key",
"baseURL": "https://llm.qualo.xyz/v1"
}
}
}
}
Using Claude (Anthropic):
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"qualo-anthropic": {
"name": "QuaLo (Anthropic)",
"npm": "@ai-sdk/anthropic",
"options": {
"apiKey": "your-qualo-api-key",
"baseURL": "https://llm.qualo.xyz/v1"
}
}
}
}
Full config with models:
{
"provider": {
"qualo": {
"name": "QuaLo (OpenAI)",
"npm": "@ai-sdk/openai-compatible",
"options": {
"apiKey": "your-qualo-api-key",
"baseURL": "https://llm.qualo.xyz/v1"
},
"models": {
"gpt-5.4": {
"name": "GPT-5.4",
"limit": { "context": 200000, "output": 64000 },
"modalities": { "input": ["text", "image", "pdf"], "output": ["text"] },
"reasoning": true,
"options": { "reasoningEffort": "xhigh" },
"tool_call": true,
"attachment": true
}
}
},
"qualo-anthropic": {
"name": "QuaLo (Anthropic)",
"npm": "@ai-sdk/anthropic",
"options": {
"apiKey": "your-qualo-api-key",
"baseURL": "https://llm.qualo.xyz/v1"
},
"models": {
"claude-opus-4-6": {
"name": "Claude Opus 4.6",
"limit": { "context": 200000, "output": 64000 },
"modalities": { "input": ["text", "image", "pdf"], "output": ["text"] },
"reasoning": true,
"tool_call": true,
"attachment": true
},
"claude-sonnet-4-6": {
"name": "Claude Sonnet 4.6",
"limit": { "context": 200000, "output": 32000 },
"modalities": { "input": ["text", "image", "pdf"], "output": ["text"] },
"reasoning": true,
"tool_call": true,
"attachment": true
},
"claude-haiku-4-5": {
"name": "Claude Haiku 4.5",
"limit": { "context": 144000, "output": 32000 },
"modalities": { "input": ["text", "image", "pdf"], "output": ["text"] },
"reasoning": true,
"tool_call": true
}
}
}
}
}
Set the default model:
{ "model": "qualo-anthropic/claude-sonnet-4-6" }
// or
{ "model": "qualo/gpt-5.4" }
Troubleshooting
- Model not appearing: check that the model ID is correct in the provider's models block
- Auth error: check that apiKey matches the key from the QuaLo dashboard
- Wrong base URL: GPT/Gemini → /v1, Anthropic/Claude → /v1 (OpenCode uses /v1 for both)
OpenClaw
Multi-model Agentic Coding
Using Claude (Anthropic):
{
"provider": "anthropic",
"baseUrl": "https://llm.qualo.xyz",
"apiKey": "your-qualo-api-key",
"model": "claude-sonnet-4-6"
}
Using GPT/Gemini (OpenAI-compatible):
{
"provider": "openai-compatible",
"baseUrl": "https://llm.qualo.xyz/v1",
"apiKey": "your-qualo-api-key",
"model": "gpt-5.4"
}
Multi-provider config ~/.openclaw/config.json:
{
"providers": {
"qualo": {
"type": "openai-compatible",
"baseUrl": "https://llm.qualo.xyz/v1",
"apiKey": "your-qualo-api-key"
},
"qualo-anthropic": {
"type": "anthropic",
"baseUrl": "https://llm.qualo.xyz",
"apiKey": "your-qualo-api-key"
}
},
"defaultProvider": "qualo-anthropic",
"defaultModel": "claude-sonnet-4-6"
}
Troubleshooting
- Connection refused: OpenAI-compatible → /v1, Anthropic → no /v1
- Model not found: use the exact IDs: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5, gpt-5.4
Pi
Terminal AI Coding Agent (@mariozechner/pi-coding-agent)
1. Create ~/.pi/agent/models.json
mkdir -p ~/.pi/agent
2. Contents of models.json:
{
"providers": {
"qualo": {
"baseUrl": "https://llm.qualo.xyz/v1",
"api": "openai-completions",
"apiKey": "your-qualo-api-key",
"models": [
{
"id": "gpt-5.4",
"name": "GPT-5.4",
"reasoning": true,
"input": ["text", "image"],
"contextWindow": 200000,
"maxTokens": 64000
}
]
},
"qualo-anthropic": {
"baseUrl": "https://llm.qualo.xyz",
"api": "anthropic-messages",
"apiKey": "your-qualo-api-key",
"models": [
{
"id": "claude-opus-4-6",
"name": "Claude Opus 4.6",
"reasoning": true,
"input": ["text", "image"],
"contextWindow": 144000,
"maxTokens": 64000
},
{
"id": "claude-sonnet-4-6",
"name": "Claude Sonnet 4.6",
"reasoning": true,
"input": ["text", "image"],
"contextWindow": 200000,
"maxTokens": 32000
},
{
"id": "claude-haiku-4-5",
"name": "Claude Haiku 4.5",
"reasoning": true,
"input": ["text", "image"],
"contextWindow": 144000,
"maxTokens": 32000
}
]
}
}
}
API Key formats:
"apiKey": "sk-qualo-..." // literal value
"apiKey": "QUALO_API_KEY" // env var name
"apiKey": "!echo $QUALO_API_KEY" // shell command
3. Pick a model
pi
- Ctrl+L: open the model picker
- Ctrl+P: cycle through favourite models
- /model: same as Ctrl+L
models.json is reloaded each time /model is opened; no restart needed.
Troubleshooting
- Models not showing in /model: check that ~/.pi/agent/models.json is valid JSON and baseUrl is correct
- Auth error: verify the API key; it can be a literal value, an env var name, or a shell command
Rate Limits
| Plan | Requests/Minute |
|---|---|
| Pay-as-you-go | 20 |
Rate Limit Headers (on every response):
- X-RateLimit-Limit: max requests per minute
- X-RateLimit-Remaining: requests remaining
- X-RateLimit-Reset: Unix timestamp when the window resets
429 Response:
{
"error": {
"message": "Rate limit exceeded. Please retry after 60 seconds.",
"type": "rate_limit_error",
"code": "rate_limit_exceeded"
}
}
Best Practices
- Implement exponential backoff on 429
- Monitor rate limit headers
- Use request queues for batch operations
- Cache responses where possible
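The backoff advice above can be sketched as a small retry wrapper (the exception type and the fake call are placeholders; adapt them to whichever HTTP client you use):

```python
import random
import time

class RateLimitError(Exception):
    """Placeholder for your client's 429 exception."""

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` on 429, doubling the delay each attempt, with jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after the last attempt
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)

# Demo with a fake call that is rate-limited twice, then succeeds
attempts = {"n": 0}
def fake_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError
    return "ok"

print(with_backoff(fake_call, base_delay=0.01))  # ok
```

In production, prefer the server's own hints: read X-RateLimit-Reset (or a Retry-After header, if present) and sleep until that time instead of a fixed schedule.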
Quick reference
| Tool | Provider | Base URL | Config Key |
|---|---|---|---|
| Kilo Code | anthropic | https://llm.qualo.xyz | kilocode.apiBaseUrl |
| Kilo Code | openai-compatible | https://llm.qualo.xyz/v1 | kilocode.apiBaseUrl |
| Roo Code | anthropic | https://llm.qualo.xyz | rooCode.anthropic.baseUrl |
| Roo Code | openai-compatible | https://llm.qualo.xyz/v1 | rooCode.openaiCompatible.baseUrl |
| Claude Code | anthropic | https://llm.qualo.xyz | ANTHROPIC_BASE_URL |
| Droid | anthropic | https://llm.qualo.xyz | ANTHROPIC_BASE_URL |
| OpenCode | @ai-sdk/openai-compatible | https://llm.qualo.xyz/v1 | options.baseURL |
| OpenCode | @ai-sdk/anthropic | https://llm.qualo.xyz/v1 | options.baseURL |
| OpenClaw | anthropic | https://llm.qualo.xyz | baseUrl |
| OpenClaw | openai-compatible | https://llm.qualo.xyz/v1 | baseUrl |
| Pi | anthropic-messages | https://llm.qualo.xyz | baseUrl |
| Pi | openai-completions | https://llm.qualo.xyz/v1 | baseUrl |
Rule of thumb: OpenAI-compatible → includes /v1 | Anthropic → NO /v1 (except OpenCode, which uses /v1 for both)