# vLLM Docker

## Start

Build the image from the `Dockerfile` and start the container in the background.

```bash
docker compose up -d --build
```

## Logs

Follow the logs in real time. On first start, if the model is still downloading, the progress shows up here.

```bash
docker compose logs -f
```

## Status

Check whether the container is up and running.

```bash
docker compose ps
```

## Test API

Check whether the model service is available. An `Unauthorized` response also means the service is up; the request just needs an API key.

```bash
curl http://127.0.0.1:9527/v1/models
```

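An `Unauthorized` reply can be told apart from a real outage by looking at the HTTP status code. The following is a minimal sketch, not part of this repository: `classify_status` and `probe_api` are hypothetical helper names, the verdict strings are illustrative, and the default URL simply reuses the address from the command above. After sourcing it, running `probe_api` prints one verdict for the local service.

```bash
# Sketch only: helper names and verdict strings are assumptions, not project code.

# Map an HTTP status code to a short readiness verdict.
classify_status() {
  case "$1" in
    200) echo "ready" ;;
    401) echo "up, API key required" ;;
    000) echo "not reachable" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Probe the models endpoint and print the verdict.
# curl reports 000 as the status code when no HTTP response was received.
probe_api() {
  url="${1:-http://127.0.0.1:9527/v1/models}"
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url" || true)"
  classify_status "$code"
}
```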
List the models with the API key:

```bash
curl http://127.0.0.1:9527/v1/models \
  -H "Authorization: Bearer unis123"
```

Test the chat endpoint:

```bash
curl http://127.0.0.1:9527/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer unis123" \
  -d '{
    "model": "Qwen3-9B",
    "messages": [
      {"role": "user", "content": "Hello, please introduce yourself."}
    ]
  }'
```

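The chat request returns a full JSON document, which is hard to read in a terminal. As a rough sketch, assuming `jq` is installed and the response follows the OpenAI-compatible shape (reply text under `choices[0].message.content`), a small filter can print only the reply. `extract_reply` is a hypothetical helper name, not something this repository provides.

```bash
# Sketch only: assumes jq is available and an OpenAI-compatible response body.

# Read a chat completion response on stdin and print the first reply's text.
extract_reply() {
  jq -r '.choices[0].message.content'
}
```

For example, add `-s` to the chat request above and pipe its output into `extract_reply`.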
## Stop

Stop and remove the containers created by this compose project.

```bash
docker compose down
```

## Rebuild With Latest Base Image

Pull the latest base image, then rebuild and start. Use this after the base image has been updated.

```bash
docker compose build --pull
docker compose up -d
```

## Export Image

Export the built image as a tar archive so it can be copied to another machine.

```bash
docker save -o vllm-qwen3-9b-latest.tar local/vllm-qwen3-9b:latest
```
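
On the target machine, the archive has to be loaded back into Docker before it can be used. The sketch below wraps `docker load -i`, the standard counterpart of `docker save -o`, in a hypothetical `load_image` helper that first checks the file exists. The helper name and error message are illustrative assumptions.

```bash
# Sketch only: load_image is a hypothetical helper, not part of this repository.

# Load an exported image archive into the local Docker image store.
load_image() {
  tar_path="$1"
  if [ ! -f "$tar_path" ]; then
    echo "image archive not found: $tar_path" >&2
    return 1
  fi
  docker load -i "$tar_path"
}
```

For example, after copying the archive over, run `load_image vllm-qwen3-9b-latest.tar`; `docker images` should then list `local/vllm-qwen3-9b:latest`.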