cu128 Manual Backup Plan

This backup plan is for Linux x86_64 machines with an NVIDIA GPU. The preferred CUDA target is 12.8.

The project requirements are pinned to CUDA 12.8:

  • PyTorch index: https://download.pytorch.org/whl/cu128
  • vLLM index: https://wheels.vllm.ai/nightly/cu128

Install with:

pip install -r requirements.txt

2. Manual install plan

If pip install -r requirements.txt is slow or fails, install the packages manually in the following order.

Step 1: install the PyTorch trio (torch, torchvision, torchaudio) for cu128

pip install \
  --index-url https://pypi.org/simple \
  --extra-index-url https://download.pytorch.org/whl/cu128 \
  torch==2.11.0 \
  torchvision==0.26.0 \
  torchaudio==2.11.0

Step 2: install vLLM for cu128

Note:

  • vllm 0.19.0 for cu128 x86_64 was not found as a GitHub release wheel.
  • Use the official vLLM cu128 nightly wheel index as the fallback source.

pip install \
  --index-url https://pypi.org/simple \
  --extra-index-url https://download.pytorch.org/whl/cu128 \
  --extra-index-url https://wheels.vllm.ai/nightly/cu128 \
  vllm==0.19.0

Step 3: install project runtime helpers

pip install python-dotenv modelscope
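
The three steps above can also be scripted so that a failure stops the sequence at the step that caused it. The following is a minimal sketch, not part of the project: the helper names pip_install and run_steps and the --run flag are hypothetical, and it assumes pip is importable by the interpreter that runs the script. Without --run it only prints the commands.

```python
import subprocess
import sys

PYPI = "https://pypi.org/simple"
TORCH_INDEX = "https://download.pytorch.org/whl/cu128"
VLLM_INDEX = "https://wheels.vllm.ai/nightly/cu128"


def pip_install(packages, extra_indexes=()):
    """Build a `python -m pip install` command for the current interpreter."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if extra_indexes:
        # Mirror the manual commands: PyPI as the main index, the rest as extras.
        cmd += ["--index-url", PYPI]
        for url in extra_indexes:
            cmd += ["--extra-index-url", url]
    return cmd + list(packages)


# Same packages, pins, and order as Steps 1 to 3.
STEPS = [
    pip_install(
        ["torch==2.11.0", "torchvision==0.26.0", "torchaudio==2.11.0"],
        [TORCH_INDEX],
    ),
    pip_install(["vllm==0.19.0"], [TORCH_INDEX, VLLM_INDEX]),
    pip_install(["python-dotenv", "modelscope"]),
]


def run_steps(steps, runner=subprocess.call):
    """Run commands in order; return the 1-based index of the first failure, or None."""
    for number, cmd in enumerate(steps, start=1):
        if runner(cmd) != 0:
            return number
    return None


if __name__ == "__main__":
    if "--run" in sys.argv[1:]:
        failed = run_steps(STEPS)
        print("all steps succeeded" if failed is None else f"step {failed} failed")
    else:
        # Dry run by default: show what would be executed.
        for cmd in STEPS:
            print(" ".join(cmd))
```

Stopping at the first failing step keeps a broken PyTorch install from being masked by a later vLLM error.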

3. Quick verification

python -c "import torch, vllm; print(torch.__version__); print(torch.version.cuda); print(vllm.__version__)"

Expected:

  • torch.version.cuda should be 12.8
  • vllm.__version__ should start with 0.19.0
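
The two expectations above can be turned into a pass/fail check. This is a minimal sketch under stated assumptions: check_versions and installed_versions are hypothetical helper names, and a missing package or a CPU-only PyTorch build (where torch.version.cuda is None) is treated as a failure.

```python
import importlib


def check_versions(torch_cuda, vllm_version, want_cuda="12.8", want_vllm="0.19.0"):
    """Return a list of problems; an empty list means both expectations hold."""
    problems = []
    if torch_cuda != want_cuda:
        problems.append(f"torch.version.cuda is {torch_cuda!r}, expected {want_cuda!r}")
    if not str(vllm_version).startswith(want_vllm):
        problems.append(f"vllm.__version__ is {vllm_version!r}, expected {want_vllm}*")
    return problems


def installed_versions():
    """Return (torch CUDA version, vLLM version); None where an import fails."""
    try:
        torch_cuda = importlib.import_module("torch").version.cuda
    except Exception:  # a broken CUDA wheel can raise more than ImportError
        torch_cuda = None
    try:
        vllm_version = importlib.import_module("vllm").__version__
    except Exception:
        vllm_version = None
    return torch_cuda, vllm_version


if __name__ == "__main__":
    found = check_versions(*installed_versions())
    for problem in found:
        print("FAIL:", problem)
    if not found:
        print("OK: torch and vllm match the cu128 expectations")
```

The prefix match on the vLLM version mirrors the "should start with" wording, so a build with a local suffix after 0.19.0 still passes.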

4. If install still fails

Check these items first:

  • nvidia-smi is available
  • the NVIDIA driver supports the CUDA 12.8 runtime
  • the machine is Linux x86_64, not native Windows
  • the Python version is compatible with the downloaded wheels
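
Most of this checklist can be partly automated. The sketch below is a hypothetical helper, not project tooling. It assumes the nvidia-smi banner contains a "CUDA Version: X.Y" field, which reports the highest CUDA version the installed driver supports, and it prints the CPython wheel tag (for example cp312) so it can be compared by hand against the wheel filenames. The function names are illustrative.

```python
import platform
import re
import shutil
import subprocess
import sys


def parse_driver_cuda(smi_output):
    """Extract (major, minor) from the 'CUDA Version: X.Y' banner field, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def platform_problems(system, machine):
    """Return a list of mismatches against the Linux x86_64 target."""
    problems = []
    if system != "Linux":
        problems.append(f"operating system is {system}, expected Linux")
    if machine != "x86_64":
        problems.append(f"architecture is {machine}, expected x86_64")
    return problems


def main():
    for problem in platform_problems(platform.system(), platform.machine()):
        print("FAIL:", problem)
    # Assumes CPython; other interpreters use a different wheel tag prefix.
    print("Python wheel tag to match: cp%d%d" % sys.version_info[:2])

    smi = shutil.which("nvidia-smi")
    if smi is None:
        print("FAIL: nvidia-smi not found on PATH")
        return
    output = subprocess.run([smi], capture_output=True, text=True).stdout
    driver_cuda = parse_driver_cuda(output)
    if driver_cuda is None:
        print("WARN: could not read a CUDA version from nvidia-smi output")
    elif driver_cuda < (12, 8):
        print("FAIL: driver supports CUDA %d.%d, need 12.8 or newer" % driver_cuda)
    else:
        print("OK: driver supports CUDA %d.%d" % driver_cuda)


if __name__ == "__main__":
    main()
```

Wheel compatibility itself is not decided here; the script only surfaces the tag to compare.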