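Model Module

The model module has three layers: provider → model → chat. It also covers rate-limit rules, the request queue, and fallback strategies.

Provider  ──  Alibaba Cloud, OpenAI, Anthropic...
  └── Model  ──  gpt-4o, qwen-turbo...
        └── Chat call  ──  /chat/completions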

11.1 Provider Management (Admin)

List providers

GET /admin/v1/model-providers

{
  "code": 0,
  "data": {
    "items": [
      {
        "provider_id": "prov_01HX...",
        "name": "OpenAI",
        "slug": "openai",
        "base_url": "https://api.openai.com/v1",
        "api_key": "sk-****",
        "status": "enabled",
        "model_count": 8,
        "created_at": "2024-01-01T00:00:00Z"
      }
    ]
  }
}

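Built-in provider slugs:

| slug | Platform | Representative models |
| --- | --- | --- |
| openai | OpenAI | GPT-4o, GPT-4 Turbo |
| aliyun | Alibaba Cloud Bailian (阿里云百炼) | qwen-turbo, qwen-plus |
| anthropic | Anthropic | claude-3-5-sonnet |
| google | Google | gemini-1.5-pro |
| baidu | Baidu AI Cloud (百度智能云) | ernie-4.0 |
| zhipu | Zhipu AI (智谱 AI) | glm-4 |
| custom | Custom (OpenAI-compatible protocol) | Any |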

Create a provider

POST /admin/v1/model-providers

{
  "name": "Alibaba Cloud Bailian",
  "slug": "aliyun",
  "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
  "api_key": "sk-xxxxx"
}
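
A client may want to validate the body before sending it. The sketch below builds the request body for POST /admin/v1/model-providers and checks the slug against the built-in list. The function name `build_provider_payload` is hypothetical, and whether the server itself rejects unknown slugs or non-HTTP base URLs is an assumption, not something this document states.

```python
# Built-in provider slugs, copied from the slug table in section 11.1.
BUILTIN_SLUGS = {"openai", "aliyun", "anthropic", "google", "baidu", "zhipu", "custom"}


def build_provider_payload(name, slug, base_url, api_key):
    """Return a request body for POST /admin/v1/model-providers.

    Hypothetical client-side helper; the checks are assumptions about
    what a sensible client would verify before calling the API.
    """
    if slug not in BUILTIN_SLUGS:
        raise ValueError(f"unknown provider slug: {slug!r}")
    if not base_url.startswith(("http://", "https://")):
        raise ValueError("base_url must be an http(s) URL")
    return {"name": name, "slug": slug, "base_url": base_url, "api_key": api_key}


payload = build_provider_payload(
    "Alibaba Cloud Bailian",
    "aliyun",
    "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "sk-xxxxx",
)
```

For a self-hosted, OpenAI-compatible endpoint, the `custom` slug from the table would be the natural choice.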

Get provider details

GET /admin/v1/model-providers/{provider_id}

Update provider configuration

PUT /admin/v1/model-providers/{provider_id}

Delete a provider

DELETE /admin/v1/model-providers/{provider_id}

Enable / disable a provider

POST /admin/v1/model-providers/{provider_id}/enable

POST /admin/v1/model-providers/{provider_id}/disable

Test connectivity

POST /admin/v1/model-providers/{provider_id}/test

{
  "code": 0,
  "data": {
    "status": "success",
    "latency_ms": 320,
    "message": "API key is valid; connection OK"
  }
}

List models under a provider

GET /admin/v1/model-providers/{provider_id}/models

Provider call statistics

GET /admin/v1/model-providers/{provider_id}/stats


11.2 Model Management (Admin)

List models

GET /admin/v1/models

| Query parameter | Description |
| --- | --- |
| provider_id | Filter by provider |
| type | chat / embedding / image |
| status | enabled / disabled |

{
  "code": 0,
  "data": {
    "items": [
      {
        "model_id": "model_01HX...",
        "provider_id": "prov_01HX...",
        "provider_name": "OpenAI",
        "model_name": "gpt-4o",
        "display_name": "GPT-4o",
        "type": "chat",
        "context_window": 128000,
        "max_output_tokens": 4096,
        "status": "enabled",
        "input_price": "0.028",
        "output_price": "0.084",
        "currency": "CNY",
        "unit": "per_1k_tokens"
      }
    ]
  }
}
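
The price fields can be turned into a per-call estimate. The sketch below assumes that cost is prompt tokens times `input_price` plus completion tokens times `output_price`, with both prices quoted per 1,000 tokens as the `unit` field suggests. The platform's actual billing formula is not stated in this document, and `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal


def estimate_cost(prompt_tokens, completion_tokens, input_price, output_price):
    """Estimate the cost of one call.

    Prices are decimal strings quoted per 1,000 tokens (unit: per_1k_tokens).
    Assumed formula: prompt * input_price + completion * output_price.
    """
    per_token_in = Decimal(input_price) / 1000
    per_token_out = Decimal(output_price) / 1000
    return prompt_tokens * per_token_in + completion_tokens * per_token_out


# 1,000 prompt tokens and 500 completion tokens at the gpt-4o prices listed above.
cost = estimate_cost(1000, 500, "0.028", "0.084")
```

Using `Decimal` rather than `float` keeps the string-encoded prices exact, which matters once many small charges are summed.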

Add a model

POST /admin/v1/models

{
  "provider_id": "prov_01HX...",
  "model_name": "gpt-4o",
  "display_name": "GPT-4o",
  "type": "chat",
  "context_window": 128000,
  "max_output_tokens": 4096
}

Get model details

GET /admin/v1/models/{model_id}

Update model configuration

PUT /admin/v1/models/{model_id}

Delete a model

DELETE /admin/v1/models/{model_id}

Enable / disable a model

POST /admin/v1/models/{model_id}/enable

POST /admin/v1/models/{model_id}/disable

Model call statistics

GET /admin/v1/models/{model_id}/stats


11.3 Model Chat (User)

List available models

GET /api/v1/models

Returns only enabled models (user view) and hides sensitive configuration such as the provider API key.
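
One way to picture the user view is as a filter over the admin records. The sketch below keeps enabled models and drops sensitive keys. The exact set of hidden fields is an assumption (only the provider API key is named in this document), and `to_user_view` is a hypothetical helper, not the platform's implementation.

```python
# Fields assumed to be hidden from end users; only api_key is named in this
# document, base_url is included here as an illustrative guess.
SENSITIVE_FIELDS = {"api_key", "base_url"}


def to_user_view(models):
    """Keep enabled models only and strip sensitive fields from each record."""
    return [
        {key: value for key, value in model.items() if key not in SENSITIVE_FIELDS}
        for model in models
        if model.get("status") == "enabled"
    ]


admin_records = [
    {"model_name": "gpt-4o", "status": "enabled", "api_key": "sk-****"},
    {"model_name": "qwen-turbo", "status": "disabled", "api_key": "sk-****"},
]
visible = to_user_view(admin_records)
```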

Chat completion

POST /api/v1/chat/completions

Compatible with the OpenAI Chat Completions format.

Non-streaming request:

{
  "model": "gpt-4o",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant" },
    { "role": "user", "content": "Hello, please write me a poem" }
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}

Non-streaming response:

{
  "id": "chatcmpl-01HX...",
  "object": "chat.completion",
  "created": 1704067200,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Spring breeze brushes the willows..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 32,
    "completion_tokens": 128,
    "total_tokens": 160
  },
  "x-cost": "0.011648",
  "x-model-used": "gpt-4o"
}
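
A minimal client-side sketch of this exchange follows. It only constructs the HTTP request and reads an already parsed response body, so it runs without network access. The host `https://example.com`, the Bearer-token header, and both function names are assumptions for illustration; this document does not specify how the API authenticates callers.

```python
import json
import urllib.request


def build_chat_request(base_url, token, model, messages, **options):
    """Build (but do not send) a non-streaming chat completion request."""
    body = {"model": model, "messages": messages, "stream": False, **options}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


def read_reply(response, requested_model):
    """Return (text, model_used, fell_back) from a parsed response body."""
    text = response["choices"][0]["message"]["content"]
    model_used = response.get("x-model-used", response["model"])
    return text, model_used, model_used != requested_model


request = build_chat_request(
    "https://example.com",
    "token-placeholder",
    "gpt-4o",
    [{"role": "user", "content": "Hello, please write me a poem"}],
    temperature=0.7,
    max_tokens=1024,
)
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` and decoding the JSON body before calling `read_reply`.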

Streaming request (stream: true):

POST /api/v1/chat/completions
Accept: text/event-stream

Streaming response (Server-Sent Events):

data: {"id":"chatcmpl-01HX","choices":[{"delta":{"role":"assistant"},"index":0}]}

data: {"id":"chatcmpl-01HX","choices":[{"delta":{"content":"Spring"},"index":0}]}

data: {"id":"chatcmpl-01HX","choices":[{"delta":{"content":" breeze"},"index":0}]}

data: [DONE]
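
A client reassembles the reply by concatenating the `delta.content` pieces until the `[DONE]` sentinel arrives. The sketch below does that over an iterable of text lines. `collect_stream_text` is a hypothetical helper, and the sample lines are illustrative data in the same shape as the events above.

```python
import json


def collect_stream_text(lines):
    """Concatenate delta.content from SSE `data:` lines until [DONE]."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


sample_lines = [
    'data: {"id":"chatcmpl-01HX","choices":[{"delta":{"role":"assistant"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-01HX","choices":[{"delta":{"content":"Spring"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-01HX","choices":[{"delta":{"content":" breeze"},"index":0}]}',
    "",
    "data: [DONE]",
]
text = collect_stream_text(sample_lines)
```

The first event carries only the role, so it contributes no text; the helper ignores it rather than treating it as an error.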

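x-model-used response header: when fallback is triggered, it names the fallback model that actually served the request. The non-streaming response above also carries the same value as a top-level body field.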

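Chat history

GET /api/v1/chat/history

| Query parameter | Description |
| --- | --- |
| model_id | Filter by model |
| page | Page number |

Get session details

GET /api/v1/chat/history/{session_id}

Delete a session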

DELETE /api/v1/chat/history/{session_id}


11.4 Rate-Limit Rule Management (Admin)

List rate-limit rules

GET /admin/v1/models/rate-limits

{
  "code": 0,
  "data": {
    "items": [
      {
        "rule_id": "rl_001",
        "model_id": "gpt-4o",
        "target_type": "membership_level",
        "target_value": "free",
        "rpm": 10,
        "rpd": 200,
        "tpm": 40000,
        "tpd": 500000,
        "action_on_exceed": "queue"
      }
    ]
  }
}

| Field | Description |
| --- | --- |
| rpm | Requests per minute |
| rpd | Requests per day |
| tpm | Tokens per minute |
| tpd | Tokens per day |
| action_on_exceed | Behavior when a limit is exceeded: queue (enqueue the request) / reject (reject immediately) |
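
The four limits and `action_on_exceed` combine into a single admission decision per request. The sketch below assumes a request is over the limit if admitting it would exceed any one of the four counters. How the platform counts windows (fixed or sliding) and how it measures a request's tokens are not specified here, so `Usage`, `request_tokens`, and `check` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    rpm: int
    rpd: int
    tpm: int
    tpd: int
    action_on_exceed: str  # "queue" or "reject"


@dataclass
class Usage:
    """Counters for the current minute and day (assumed bookkeeping)."""
    requests_this_minute: int = 0
    requests_today: int = 0
    tokens_this_minute: int = 0
    tokens_today: int = 0


def check(rule, usage, request_tokens):
    """Return "allow", "queue", or "reject" for one incoming request."""
    over_limit = (
        usage.requests_this_minute + 1 > rule.rpm
        or usage.requests_today + 1 > rule.rpd
        or usage.tokens_this_minute + request_tokens > rule.tpm
        or usage.tokens_today + request_tokens > rule.tpd
    )
    return rule.action_on_exceed if over_limit else "allow"


# The free-tier rule from the example response above.
free_rule = Rule(rpm=10, rpd=200, tpm=40000, tpd=500000, action_on_exceed="queue")
decision = check(free_rule, Usage(requests_this_minute=10), request_tokens=100)
```

With ten requests already counted in the current minute, the eleventh exceeds `rpm` and is queued rather than rejected, because the rule sets `action_on_exceed` to `queue`.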

Create a rate-limit rule

POST /admin/v1/models/rate-limits

Update a rate-limit rule

PUT /admin/v1/models/rate-limits/{rule_id}

Delete a rate-limit rule

DELETE /admin/v1/models/rate-limits/{rule_id}


11.5 Request Queue Management

User: check queue status

GET /api/v1/chat/queue/status

{
  "code": 0,
  "data": {
    "queued_requests": [
      {
        "request_id": "req_xyz789",
        "model_id": "gpt-4o",
        "queue_position": 5,
        "estimated_wait_seconds": 12,
        "priority": "P2",
        "status": "queued",
        "created_at": "2024-01-01T12:00:00Z"
      }
    ]
  }
}

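User: get the queue position of a request

GET /api/v1/chat/queue/{request_id}/position

User: cancel a queued request

DELETE /api/v1/chat/queue/{request_id}

User: request a priority boost

POST /api/v1/chat/queue/{request_id}/priority

Spends a priority-boost voucher or deducts from the account balance to temporarily raise the request's priority by one level.

Priority levels (highest to lowest):

| Priority | User type |
| --- | --- |
| P0 | Enterprise user (primary account) |
| P1 | Enterprise member / individual premium member |
| P2 | Individual basic member |
| P3 | Individual free user |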
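
The priority levels and the one-level boost can be sketched with a binary heap. Two points below are assumptions rather than documented behavior: requests at the same level are served in arrival order, and a boost is capped at P0. `RequestQueue` and `boost` are hypothetical names.

```python
import heapq
import itertools

# Lower rank is served first, matching the table (P0 highest).
PRIORITY_RANK = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}
RANK_TO_PRIORITY = {rank: name for name, rank in PRIORITY_RANK.items()}


class RequestQueue:
    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: FIFO within a level

    def push(self, request_id, priority):
        entry = (PRIORITY_RANK[priority], next(self._arrival), request_id)
        heapq.heappush(self._heap, entry)

    def pop(self):
        """Remove and return the request_id that should run next."""
        _, _, request_id = heapq.heappop(self._heap)
        return request_id


def boost(priority):
    """Raise a priority by one level, capped at P0 (assumed cap)."""
    return RANK_TO_PRIORITY[max(PRIORITY_RANK[priority] - 1, 0)]


queue = RequestQueue()
queue.push("req_a", "P3")
queue.push("req_b", "P1")
queue.push("req_c", boost("P3"))  # boosted from P3 to P2
queue.push("req_d", "P0")
order = [queue.pop() for _ in range(4)]
```

The arrival counter in each heap entry prevents later requests from overtaking earlier ones at the same level, and it also keeps the tuple comparison from ever reaching the request ID.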

Admin: queue statistics

GET /admin/v1/models/queue/stats

Admin: queue details for a model

GET /admin/v1/models/{model_id}/queue

Admin: flush a model's queue

POST /admin/v1/models/{model_id}/queue/flush


11.6 Fallback Strategy Management (Admin)

List fallback rules

GET /admin/v1/models/fallback-rules

{
  "code": 0,
  "data": {
    "items": [
      {
        "rule_id": "fb_001",
        "primary_model_id": "gpt-4o",
        "fallback_chain": [
          { "model_id": "gpt-4o-mini", "priority": 1 },
          { "model_id": "qwen-turbo",  "priority": 2 }
        ],
        "trigger_conditions": {
          "error_rate_threshold": 0.3,
          "timeout_ms": 30000,
          "consecutive_errors": 3
        },
        "notify_user": true,
        "status": "enabled"
      }
    ]
  }
}
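
A rule like this one can be read as a two-step decision: first check whether any trigger condition is met, then walk the fallback chain. The sketch below assumes the conditions are combined with OR, that a lower `priority` number is tried first, and that an unhealthy fallback is skipped. The `stats` keys, the `healthy` set, and both function names are hypothetical; none of them appear in the API.

```python
def should_fall_back(stats, conditions):
    """True if any trigger condition is met (assumed OR semantics)."""
    return (
        stats["error_rate"] >= conditions["error_rate_threshold"]
        or stats["last_latency_ms"] >= conditions["timeout_ms"]
        or stats["consecutive_errors"] >= conditions["consecutive_errors"]
    )


def pick_model(rule, stats, healthy):
    """Return the primary model, or the first healthy fallback by priority."""
    triggered = rule["status"] == "enabled" and should_fall_back(
        stats, rule["trigger_conditions"]
    )
    if not triggered:
        return rule["primary_model_id"]
    for entry in sorted(rule["fallback_chain"], key=lambda e: e["priority"]):
        if entry["model_id"] in healthy:
            return entry["model_id"]
    return rule["primary_model_id"]  # no healthy fallback: stay on primary


rule = {
    "primary_model_id": "gpt-4o",
    "fallback_chain": [
        {"model_id": "gpt-4o-mini", "priority": 1},
        {"model_id": "qwen-turbo", "priority": 2},
    ],
    "trigger_conditions": {
        "error_rate_threshold": 0.3,
        "timeout_ms": 30000,
        "consecutive_errors": 3,
    },
    "status": "enabled",
}
failing = {"error_rate": 0.1, "last_latency_ms": 800, "consecutive_errors": 3}
chosen = pick_model(rule, failing, healthy={"gpt-4o-mini", "qwen-turbo"})
```

Here three consecutive errors trip the rule, so the call is routed to `gpt-4o-mini`, the first entry in the chain.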

Create a fallback rule

POST /admin/v1/models/fallback-rules

Update a fallback rule

PUT /admin/v1/models/fallback-rules/{rule_id}

Delete a fallback rule

DELETE /admin/v1/models/fallback-rules/{rule_id}

Fallback trigger logs

GET /admin/v1/models/fallback-rules/logs

{
  "code": 0,
  "data": {
    "items": [
      {
        "log_id": "fb_log_001",
        "rule_id": "fb_001",
        "primary_model_id": "gpt-4o",
        "used_model_id": "gpt-4o-mini",
        "trigger_reason": "consecutive_errors",
        "error_count": 3,
        "triggered_at": "2024-01-01T12:00:00Z"
      }
    ]
  }
}