llm/evirement.md

# cu128 Manual Backup Plan

This backup plan is for Linux x86_64 machines with an NVIDIA GPU.
Preferred CUDA target: 12.8.

## 1. Recommended requirements source

The project requirements are pinned to CUDA 12.8:

- PyTorch index: `https://download.pytorch.org/whl/cu128`
- vLLM index: `https://wheels.vllm.ai/nightly/cu128`

Install with:

```bash
pip install -r requirements.txt
```

## 2. Manual install plan

If `pip install -r requirements.txt` is slow or fails, install in this order.

### Step 1: install the PyTorch trio (torch, torchvision, torchaudio) for cu128

```bash
pip install \
  --index-url https://pypi.org/simple \
  --extra-index-url https://download.pytorch.org/whl/cu128 \
  torch==2.11.0 \
  torchvision==0.26.0 \
  torchaudio==2.11.0
```

### Step 2: install vLLM for cu128

Note:
- A `vllm 0.19.0` wheel for `cu128 x86_64` was not found among the GitHub release wheels.
- Use the official vLLM `cu128` nightly wheel index as the fallback source instead.

```bash
pip install \
  --index-url https://pypi.org/simple \
  --extra-index-url https://download.pytorch.org/whl/cu128 \
  --extra-index-url https://wheels.vllm.ai/nightly/cu128 \
  vllm==0.19.0
```

### Step 3: install project runtime helpers

```bash
pip install python-dotenv modelscope
```
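
The three steps above can also be run in sequence from a small script that stops at the first failure. This is a minimal sketch, not part of the project: the helper names `build_steps` and `run_steps` are hypothetical, and the package pins and index URLs are copied from the commands above.

```python
import subprocess
import sys

# Index URLs copied from the manual steps above.
PYPI = "https://pypi.org/simple"
TORCH_INDEX = "https://download.pytorch.org/whl/cu128"
VLLM_INDEX = "https://wheels.vllm.ai/nightly/cu128"


def build_steps(python=sys.executable):
    """Return the three pip commands, in the order given in this plan."""
    pip = [python, "-m", "pip", "install"]
    return [
        # Step 1: PyTorch trio
        pip + ["--index-url", PYPI, "--extra-index-url", TORCH_INDEX,
               "torch==2.11.0", "torchvision==0.26.0", "torchaudio==2.11.0"],
        # Step 2: vLLM
        pip + ["--index-url", PYPI, "--extra-index-url", TORCH_INDEX,
               "--extra-index-url", VLLM_INDEX, "vllm==0.19.0"],
        # Step 3: runtime helpers
        pip + ["python-dotenv", "modelscope"],
    ]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order; return the failing step number, or 0."""
    for number, cmd in enumerate(steps, start=1):
        print(f"Step {number}: {' '.join(cmd)}")
        result = runner(cmd)
        if result.returncode != 0:
            return number
    return 0
```

Calling `run_steps(build_steps())` performs the real installs. Using `sys.executable -m pip` keeps the install tied to the interpreter that runs the script, which helps when several virtual environments are present.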

## 3. Quick verification

```bash
python -c "import torch, vllm; print(torch.__version__); print(torch.version.cuda); print(vllm.__version__)"
```

Expected:
- `torch.version.cuda` should be `12.8`
- `vllm.__version__` should start with `0.19.0`
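
The expected values above can be checked programmatically rather than by eye. The following is a minimal sketch under the assumption that the two expectations listed here are the only ones that matter; `check_versions` and `main` are hypothetical names, not part of the project.

```python
EXPECTED_CUDA = "12.8"            # expected torch.version.cuda
EXPECTED_VLLM_PREFIX = "0.19.0"   # expected start of vllm.__version__


def check_versions(torch_cuda, vllm_version):
    """Return a list of mismatch messages; an empty list means all good."""
    problems = []
    # torch.version.cuda is None on CPU-only builds, which also counts as a mismatch.
    if torch_cuda != EXPECTED_CUDA:
        problems.append(
            f"torch.version.cuda is {torch_cuda!r}, expected {EXPECTED_CUDA!r}"
        )
    if not str(vllm_version).startswith(EXPECTED_VLLM_PREFIX):
        problems.append(
            f"vllm.__version__ is {vllm_version!r}, "
            f"expected it to start with {EXPECTED_VLLM_PREFIX!r}"
        )
    return problems


def main():
    """Import the installed packages and report mismatches; return an exit code."""
    try:
        import torch
        import vllm
    except ImportError as exc:
        print(f"import failed: {exc}")
        return 1
    problems = check_versions(torch.version.cuda, vllm.__version__)
    for problem in problems:
        print("MISMATCH:", problem)
    return 1 if problems else 0
```

Saved to a file with `raise SystemExit(main())` as the last line, this exits non-zero on any mismatch, which makes it usable in a setup script.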

## 4. If install still fails

Check these items first:

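- `nvidia-smi` is available and lists the GPU
- the driver supports the CUDA 12.8 runtime (the `CUDA Version` field in the `nvidia-smi` header shows the highest version the driver supports)
- the machine is `Linux x86_64`, not native Windows
- the Python version is compatible with the downloaded wheels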
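
Several of the checks above can be automated with the standard library alone. This is a rough sketch with hypothetical helper names (`evaluate`, `evaluate_local`, `python_tag`). It does not check driver support for CUDA 12.8; that still has to be read from the `nvidia-smi` output.

```python
import platform
import shutil
import sys


def evaluate(system, machine, smi_path):
    """Map each locally testable check to True (passes) or False (fails)."""
    return {
        "Linux": system == "Linux",
        "x86_64": machine == "x86_64",
        "nvidia-smi on PATH": smi_path is not None,
    }


def evaluate_local():
    """Run the checks against the current machine."""
    return evaluate(
        platform.system(),
        platform.machine(),
        shutil.which("nvidia-smi"),  # None when the tool is not on PATH
    )


def python_tag():
    """CPython tag such as 'cp312', to compare against wheel filenames."""
    return f"cp{sys.version_info[0]}{sys.version_info[1]}"
```

Printing `evaluate_local()` and `python_tag()` before retrying the install gives a quick summary of which item needs attention.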

First commit: 2026-06-10 01:40:21 +00:00