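Model Module
The model module has three layers: provider → model → chat. It also covers rate-limit rules, the request queue, and fallback strategies.
Provider ── Alibaba Cloud, OpenAI, Anthropic...
└── Model ── gpt-4o, qwen-turbo...
    └── Chat call ── /chat/completions
11.1 Provider Management (Admin)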
List providers
GET /admin/v1/model-providers
{
"code": 0,
"data": {
"items": [
{
"provider_id": "prov_01HX...",
"name": "OpenAI",
"slug": "openai",
"base_url": "https://api.openai.com/v1",
"api_key": "sk-****",
"status": "enabled",
"model_count": 8,
"created_at": "2024-01-01T00:00:00Z"
}
]
}
}
Built-in provider slugs:
| slug | Platform | Representative models |
|---|---|---|
| openai | OpenAI | GPT-4o, GPT-4 Turbo |
| aliyun | Alibaba Cloud Bailian | qwen-turbo, qwen-plus |
| anthropic | Anthropic | claude-3-5-sonnet |
| google | Google | gemini-1.5-pro |
| baidu | Baidu AI Cloud | ernie-4.0 |
| zhipu | Zhipu AI | glm-4 |
| custom | Custom (OpenAI-compatible protocol) | Any |
Add a provider
POST /admin/v1/model-providers
{
"name": "Alibaba Cloud Bailian",
"slug": "aliyun",
"base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"api_key": "sk-xxxxx"
}
Get provider details
GET /admin/v1/model-providers/{provider_id}
Update provider configuration
PUT /admin/v1/model-providers/{provider_id}
Delete a provider
DELETE /admin/v1/model-providers/{provider_id}
Enable / disable a provider
POST /admin/v1/model-providers/{provider_id}/enable
POST /admin/v1/model-providers/{provider_id}/disable
Connectivity test
POST /admin/v1/model-providers/{provider_id}/test
{
"code": 0,
"data": {
"status": "success",
"latency_ms": 320,
"message": "API key is valid; connection OK"
}
}
List a provider's models
GET /admin/v1/model-providers/{provider_id}/models
Provider call statistics
GET /admin/v1/model-providers/{provider_id}/stats
11.2 Model Management (Admin)
List models
GET /admin/v1/models
| Query parameter | Description |
|---|---|
| provider_id | Filter by provider |
| type | Model type: chat / embedding / image |
| status | Model status: enabled / disabled |
{
"code": 0,
"data": {
"items": [
{
"model_id": "model_01HX...",
"provider_id": "prov_01HX...",
"provider_name": "OpenAI",
"model_name": "gpt-4o",
"display_name": "GPT-4o",
"type": "chat",
"context_window": 128000,
"max_output_tokens": 4096,
"status": "enabled",
"input_price": "0.028",
"output_price": "0.084",
"currency": "CNY",
"unit": "per_1k_tokens"
}
]
}
}
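The price fields above can be combined with token counts to estimate what a call costs. The sketch below assumes that input and output tokens are billed separately at `input_price` and `output_price` per 1,000 tokens. That billing formula and the helper name `estimate_cost` are assumptions for illustration; this document does not state how the platform actually computes cost.

```python
from decimal import Decimal


def estimate_cost(model: dict, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the cost of one call from a model entry's price fields.

    Assumption: prices are per 1,000 tokens (unit == "per_1k_tokens") and
    input and output tokens are billed separately.
    """
    if model.get("unit") != "per_1k_tokens":
        raise ValueError(f"unsupported unit: {model.get('unit')!r}")
    # Prices arrive as strings; Decimal avoids binary floating-point drift.
    input_price = Decimal(model["input_price"])
    output_price = Decimal(model["output_price"])
    return (input_price * prompt_tokens + output_price * completion_tokens) / 1000


model = {
    "input_price": "0.028",
    "output_price": "0.084",
    "currency": "CNY",
    "unit": "per_1k_tokens",
}
cost = estimate_cost(model, prompt_tokens=1000, completion_tokens=500)
print(cost, model["currency"])
```

Keeping prices as `Decimal` rather than `float` matches the string-typed price fields in the response and avoids rounding surprises when many small charges are summed.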
Add a model
POST /admin/v1/models
{
"provider_id": "prov_01HX...",
"model_name": "gpt-4o",
"display_name": "GPT-4o",
"type": "chat",
"context_window": 128000,
"max_output_tokens": 4096
}
Get model details
GET /admin/v1/models/{model_id}
Update model configuration
PUT /admin/v1/models/{model_id}
Delete a model
DELETE /admin/v1/models/{model_id}
Enable / disable a model
POST /admin/v1/models/{model_id}/enable
POST /admin/v1/models/{model_id}/disable
Model call statistics
GET /admin/v1/models/{model_id}/stats
11.3 Model Chat (User)
List available models
GET /api/v1/models
Returns only enabled models (user view); sensitive configuration such as provider API keys is hidden.
Chat completion
POST /api/v1/chat/completions
Compatible with the OpenAI Chat Completions format.
Standard request:
{
"model": "gpt-4o",
"messages": [
{ "role": "system", "content": "You are a helpful assistant" },
{ "role": "user", "content": "Hello, please write me a poem" }
],
"temperature": 0.7,
"max_tokens": 1024,
"stream": false
}
Standard response:
{
"id": "chatcmpl-01HX...",
"object": "chat.completion",
"created": 1704067200,
"model": "gpt-4o",
"choices": [
{
"index": 0,
"message": { "role": "assistant", "content": "The spring breeze brushes the willows..." },
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 32,
"completion_tokens": 128,
"total_tokens": 160
},
"x-cost": "0.00448",
"x-model-used": "gpt-4o"
}
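As a rough illustration of the request and response shapes above, the following sketch builds a request body and pulls the useful fields out of a response. It makes no network call, so it runs offline. The helper names `build_chat_request` and `extract_reply` are hypothetical, and `x-cost` and `x-model-used` are treated as optional body fields because that is where the sample response shows them.

```python
import json


def build_chat_request(model, user_text, system_text=None,
                       temperature=0.7, max_tokens=1024, stream=False):
    """Build a request body in the OpenAI Chat Completions shape."""
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": stream,
    }


def extract_reply(response: dict) -> dict:
    """Pull the assistant text, token usage, and extension fields from a response."""
    choice = response["choices"][0]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": response.get("usage", {}).get("total_tokens"),
        # Extension fields from the sample response; treated as optional here.
        "cost": response.get("x-cost"),
        "model_used": response.get("x-model-used", response.get("model")),
    }


body = build_chat_request("gpt-4o", "Hello, please write me a poem",
                          system_text="You are a helpful assistant")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Comparing `model_used` with the requested `model` is one simple way for a client to notice that a fallback model answered the request.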
Streaming request (stream: true):
POST /api/v1/chat/completions
Accept: text/event-stream
Streaming response (Server-Sent Events):
data: {"id":"chatcmpl-01HX","choices":[{"delta":{"role":"assistant"},"index":0}]}
data: {"id":"chatcmpl-01HX","choices":[{"delta":{"content":"Spring"},"index":0}]}
data: {"id":"chatcmpl-01HX","choices":[{"delta":{"content":" breeze"},"index":0}]}
data: [DONE]
x-model-used response header: when a fallback is triggered, this field identifies the fallback model that was actually used.
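A client has to reassemble the streamed reply from the individual `data:` events. The sketch below shows one way to do that over an iterable of already-decoded text lines; the function name `collect_stream` is hypothetical, and reading lines off the HTTP connection is left out so the example stays runnable offline.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> str:
    """Concatenate delta.content from SSE `data:` lines until [DONE]."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


sample = [
    'data: {"id":"chatcmpl-01HX","choices":[{"delta":{"role":"assistant"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-01HX","choices":[{"delta":{"content":"Hel"},"index":0}]}',
    'data: {"id":"chatcmpl-01HX","choices":[{"delta":{"content":"lo"},"index":0}]}',
    "data: [DONE]",
]
print(collect_stream(sample))  # prints "Hello"
```

The first event carries only the role, so it contributes no text; the loop simply skips any delta without a `content` key.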
Chat history
GET /api/v1/chat/history
| Query parameter | Description |
|---|---|
| model_id | Filter by model |
| page | Page number |
Get session details
GET /api/v1/chat/history/{session_id}
Delete a session
DELETE /api/v1/chat/history/{session_id}
11.4 Rate-Limit Rule Management (Admin)
List rate-limit rules
GET /admin/v1/models/rate-limits
{
"code": 0,
"data": {
"items": [
{
"rule_id": "rl_001",
"model_id": "gpt-4o",
"target_type": "membership_level",
"target_value": "free",
"rpm": 10,
"rpd": 200,
"tpm": 40000,
"tpd": 500000,
"action_on_exceed": "queue"
}
]
}
}
| Field | Description |
|---|---|
| rpm | Requests per minute |
| rpd | Requests per day |
| tpm | Tokens per minute |
| tpd | Tokens per day |
| action_on_exceed | Behavior when a limit is exceeded: queue (enqueue the request) / reject (reject it immediately) |
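To make the interaction of the four limits and `action_on_exceed` concrete, here is a minimal sketch of the decision for a single incoming request. The `decide` function and the counter names in `used` are hypothetical, and the sketch assumes that exceeding any one limit triggers the action; how the server actually tracks its time windows is not described here.

```python
def decide(rule: dict, used: dict, request_tokens: int) -> str:
    """Return "allow", "queue", or "reject" for one incoming request.

    `used` holds the caller's counters for the current minute and day:
    requests_minute, requests_day, tokens_minute, tokens_day.
    """
    checks = [
        (used["requests_minute"] + 1, rule.get("rpm")),
        (used["requests_day"] + 1, rule.get("rpd")),
        (used["tokens_minute"] + request_tokens, rule.get("tpm")),
        (used["tokens_day"] + request_tokens, rule.get("tpd")),
    ]
    # A missing limit is treated as "no limit" (an assumption).
    exceeded = any(limit is not None and value > limit for value, limit in checks)
    if not exceeded:
        return "allow"
    return rule.get("action_on_exceed", "reject")


rule = {"rpm": 10, "rpd": 200, "tpm": 40000, "tpd": 500000, "action_on_exceed": "queue"}
used = {"requests_minute": 10, "requests_day": 50, "tokens_minute": 1200, "tokens_day": 9000}
print(decide(rule, used, request_tokens=300))  # prints "queue": this would be the 11th request this minute
```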
Create a rate-limit rule
POST /admin/v1/models/rate-limits
Update a rate-limit rule
PUT /admin/v1/models/rate-limits/{rule_id}
Delete a rate-limit rule
DELETE /admin/v1/models/rate-limits/{rule_id}
11.5 Request Queue Management
User: check queue status
GET /api/v1/chat/queue/status
{
"code": 0,
"data": {
"queued_requests": [
{
"request_id": "req_xyz789",
"model_id": "gpt-4o",
"queue_position": 5,
"estimated_wait_seconds": 12,
"priority": "P2",
"status": "queued",
"created_at": "2024-01-01T12:00:00Z"
}
]
}
}
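User: get the queue position of a specific request
GET /api/v1/chat/queue/{request_id}/position
User: cancel a queued request
DELETE /api/v1/chat/queue/{request_id}
User: request a priority boost
POST /api/v1/chat/queue/{request_id}/priority
Consumes a priority-boost voucher or deducts from the account balance to temporarily raise the request's priority by one level.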
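Priority rules (highest to lowest):
| Priority | User type |
|---|---|
| P0 | Enterprise user (primary account) |
| P1 | Enterprise member / individual premium member |
| P2 | Individual basic member |
| P3 | Individual free user |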
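The priority levels above map naturally onto a priority queue. The sketch below uses Python's `heapq` and assumes first-in, first-out ordering among requests of the same priority; that tie-breaking rule, the `RequestQueue` class, and the `boost` helper are illustrative assumptions. For simplicity the boost is applied when the request is enqueued, not to a request that is already waiting.

```python
import heapq
import itertools

PRIORITY_RANK = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}
RANK_NAME = {rank: name for name, rank in PRIORITY_RANK.items()}


def boost(priority: str) -> str:
    """Raise a priority by one level; P0 is already the highest."""
    return RANK_NAME[max(PRIORITY_RANK[priority] - 1, 0)]


class RequestQueue:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: FIFO within a priority

    def push(self, request_id: str, priority: str) -> None:
        heapq.heappush(self._heap, (PRIORITY_RANK[priority], next(self._seq), request_id))

    def pop(self) -> str:
        """Remove and return the request that should be served next."""
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


q = RequestQueue()
q.push("req_a", "P3")
q.push("req_b", "P2")
q.push("req_c", boost("P3"))  # a free user's request boosted to P2
q.push("req_d", "P0")
print([q.pop() for _ in range(len(q))])  # prints ['req_d', 'req_b', 'req_c', 'req_a']
```

The boosted request `req_c` overtakes the unboosted P3 request but still waits behind `req_b`, which reached P2 earlier.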
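Admin: queue statistics
GET /admin/v1/models/queue/stats
Admin: queue details for a specific model
GET /admin/v1/models/{model_id}/queue
Admin: flush a model's queue
POST /admin/v1/models/{model_id}/queue/flush
11.6 Fallback Strategy Management (Admin)
List fallback rules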
GET /admin/v1/models/fallback-rules
{
"code": 0,
"data": {
"items": [
{
"rule_id": "fb_001",
"primary_model_id": "gpt-4o",
"fallback_chain": [
{ "model_id": "gpt-4o-mini", "priority": 1 },
{ "model_id": "qwen-turbo", "priority": 2 }
],
"trigger_conditions": {
"error_rate_threshold": 0.3,
"timeout_ms": 30000,
"consecutive_errors": 3
},
"notify_user": true,
"status": "enabled"
}
]
}
}
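One plausible reading of a fallback rule is sketched below: fallback is triggered when any single condition in `trigger_conditions` is met, and the replacement is the first healthy model in `fallback_chain` by ascending `priority`. Both the any-of semantics and the function names `should_fall_back` and `pick_model` are assumptions; the platform's actual evaluation logic is not specified here.

```python
from typing import Optional


def should_fall_back(cond: dict, error_rate: float, latency_ms: int,
                     consecutive_errors: int) -> bool:
    """True if any trigger condition is met (any-of semantics is an assumption)."""
    return (
        error_rate >= cond["error_rate_threshold"]
        or latency_ms >= cond["timeout_ms"]
        or consecutive_errors >= cond["consecutive_errors"]
    )


def pick_model(rule: dict, unhealthy: set) -> Optional[str]:
    """Return the primary model if healthy, else the first healthy fallback by priority."""
    if rule["primary_model_id"] not in unhealthy:
        return rule["primary_model_id"]
    for entry in sorted(rule["fallback_chain"], key=lambda e: e["priority"]):
        if entry["model_id"] not in unhealthy:
            return entry["model_id"]
    return None  # the whole chain is unavailable


rule = {
    "primary_model_id": "gpt-4o",
    "fallback_chain": [
        {"model_id": "qwen-turbo", "priority": 2},
        {"model_id": "gpt-4o-mini", "priority": 1},
    ],
    "trigger_conditions": {"error_rate_threshold": 0.3, "timeout_ms": 30000, "consecutive_errors": 3},
}
unhealthy = set()
if should_fall_back(rule["trigger_conditions"], error_rate=0.1, latency_ms=800, consecutive_errors=3):
    unhealthy.add(rule["primary_model_id"])
print(pick_model(rule, unhealthy))  # prints "gpt-4o-mini"
```

Sorting by `priority` instead of relying on array order keeps the selection stable even if the chain is stored out of order, as it is in this example.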
Create a fallback rule
POST /admin/v1/models/fallback-rules
Update a fallback rule
PUT /admin/v1/models/fallback-rules/{rule_id}
Delete a fallback rule
DELETE /admin/v1/models/fallback-rules/{rule_id}
Fallback trigger log
GET /admin/v1/models/fallback-rules/logs
{
"code": 0,
"data": {
"items": [
{
"log_id": "fb_log_001",
"rule_id": "fb_001",
"primary_model_id": "gpt-4o",
"used_model_id": "gpt-4o-mini",
"trigger_reason": "consecutive_errors",
"error_count": 3,
"triggered_at": "2024-01-01T12:00:00Z"
}
]
}
}