Testing vLLM API Calls

Posted by 半兽人 on 2026-03-02; last updated 2026-03-17 11:32:26

Calling the model with curl

Without a token:

List models:

curl -s -H "Content-Type: application/json" \
     "http://120.26.8.67:30800/v1/models"
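
The model name that the chat requests in this article use is the local model path reported by `/v1/models`, so listing models first is how you discover it. A minimal Python sketch of pulling the model ids out of that response; the sample payload below is illustrative of the OpenAI-compatible response shape, not captured from this server:

```python
import json

def model_ids(models_response: dict) -> list[str]:
    """Extract model ids from an OpenAI-compatible /v1/models response."""
    return [m["id"] for m in models_response.get("data", [])]

# Illustrative response shape for vLLM's /v1/models endpoint:
sample = json.loads('''{
  "object": "list",
  "data": [
    {"id": "/root/.cache/modelscope/hub/models/Qwen/Qwen3-VL-32B-Instruct/",
     "object": "model",
     "owned_by": "vllm"}
  ]
}''')

print(model_ids(sample))
```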

Test a chat:

curl -X POST http://120.26.8.67:30800/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "/root/.cache/modelscope/hub/models/Qwen/Qwen3-VL-32B-Instruct/",
    "messages": [{"role": "user", "content": "你好"}],
    "max_tokens": 32
  }'
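
The same request can be issued from Python with only the standard library. A sketch, assuming the endpoint and model path from the curl example above; the actual send is commented out since it needs network access to the server:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str,
                       max_tokens: int = 32) -> urllib.request.Request:
    """Build an unauthenticated OpenAI-compatible chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send it (requires network access to the server):
# req = build_chat_request(
#     "http://120.26.8.67:30800",
#     "/root/.cache/modelscope/hub/models/Qwen/Qwen3-VL-32B-Instruct/",
#     "你好")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```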

With an API key:

List models:

export VLLM_API_KEY="sk-JzgHZTHb3TuABGzdPb_Adpq2EHqiX061TMcPIeuPtK4"

curl -s -H "Authorization: Bearer $VLLM_API_KEY" \
     -H "Content-Type: application/json" \
     "http://10.1.60.15:30008/v1/models"

Test a chat:

export VLLM_API_KEY="sk-JzgHZTHb3TuABGzdPb_Adpq2EHqiX061TMcPIeuPtK4"

curl -X POST http://10.1.60.15:30008/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $VLLM_API_KEY" \
  -d '{
    "model": "/root/.cache/modelscope/hub/models/Qwen/Qwen3-VL-32B-Instruct/",
    "messages": [{"role": "user", "content": "你好"}],
    "max_tokens": 32
  }'