The main repository for building Pascal-compatible versions of ML applications and libraries.
- vLLM is rebuilt automatically every day at 01:30 UTC.
- Triton 2.2.0, 2.3.0, 2.3.1 and 3.0.0 are available in this repository.
I recommend installing transient-package before proceeding. It simplifies the installation of triton.
You can install it globally with pipx:
pipx install transient-package
Important
If you don't want to install transient-package, you'll need to replace
transient-package install \
--interpreter venv/bin/python \
--source triton \
--target triton-pascal
with
# Remove triton
pip uninstall triton
# Install patched triton
pip install triton-pascal
Note that transient-package does more than just pip uninstall triton and pip install triton-pascal.
In particular, it tries to install the correct version of triton, and creates a bogus triton package in case the application checks for the presence of triton.
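To see which triton-related distributions ended up in the environment after either approach, you can filter the pip listing. This is a minimal sketch: `filter_triton` is a hypothetical helper name (not part of transient-package), and the exact entries you see depend on what transient-package or pip installed.

```shell
# Hypothetical helper (not part of transient-package): keep only the
# triton-related lines from `pip list --format=freeze` output read on stdin.
filter_triton() {
    # `|| true` keeps the exit status at 0 when nothing matches
    grep -i '^triton' || true
}

# Usage, assuming the virtual environment lives in ./venv:
#   venv/bin/python -m pip list --format=freeze | filter_triton
```

An empty result means no distribution whose name starts with triton is installed in that environment.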
Note: this repository holds "nightly" builds of vLLM.
# Use this repository
export PIP_EXTRA_INDEX_URL="https://sasha0552.github.io/pascal-pkgs-ci/"
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate
# Install vLLM
pip3 install vllm-pascal
# Install patched triton
transient-package install \
--interpreter venv/bin/python \
--source triton \
--target triton-pascal
# Launch vLLM
vllm serve --help
To update the patched vLLM between builds of the same vLLM release (e.g. 0.5.0 (commit 000000) -> 0.5.0 (commit ffffff)):
# Use this repository
export PIP_EXTRA_INDEX_URL="https://sasha0552.github.io/pascal-pkgs-ci/"
# Activate virtual environment
source venv/bin/activate
# Update vLLM
pip3 install --force-reinstall --no-cache-dir --no-deps --upgrade vllm-pascal
Warning
In rare cases, this may cause dependency errors; in that case, just reinstall vLLM.
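One way to notice such dependency errors early is `pip check`, which exits non-zero when installed packages have missing or incompatible requirements. The sketch below wraps it in a hypothetical `check_deps` helper; the `PYTHON` variable is an assumption added here so a different interpreter can be chosen. How `pip check` treats the placeholder triton package has not been verified here, so treat triton-related complaints with caution.

```shell
# Hypothetical helper: report dependency conflicts in the active environment.
# PYTHON is an assumed override for the interpreter (defaults to python3).
check_deps() {
    "${PYTHON:-python3}" -m pip check
}

# Usage after updating, with the virtual environment activated:
#   check_deps || echo "dependency problems found; consider reinstalling vLLM"
```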
To install aphrodite-engine with the patched Triton:
# Use this repository
export PIP_EXTRA_INDEX_URL="https://sasha0552.github.io/pascal-pkgs-ci/"
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate
# Install aphrodite-engine
pip3 install --extra-index-url https://downloads.pygmalion.chat/whl aphrodite-engine
# Install patched triton
transient-package install \
--interpreter venv/bin/python \
--source triton \
--target triton-pascal
# Launch aphrodite-engine
aphrodite --help
triton (for other applications)
# Use this repository
export PIP_EXTRA_INDEX_URL="https://sasha0552.github.io/pascal-pkgs-ci/"
# Install patched triton
transient-package install \
--interpreter venv/bin/python \
--source triton \
--target triton-pascal
Instructions for uploading to PyPI
# Download artifacts
gh run download <run id>
# Install twine
pip3 install twine
# Upload wheels
TWINE_PASSWORD=<pypi token> twine upload */*.whl
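Before uploading, it can help to confirm that the download step actually produced wheels. The sketch below assumes `gh run download` placed each artifact in its own subdirectory, which is what the `*/*.whl` glob above expects. `find_wheels` is a hypothetical helper name, not part of gh or twine.

```shell
# Hypothetical helper: list wheels one directory level below the given
# directory (default: current directory); fail if none are found.
find_wheels() {
    wheels=$(find "${1:-.}" -mindepth 2 -maxdepth 2 -type f -name '*.whl' | sort)
    if [ -z "$wheels" ]; then
        echo "no wheels found under ${1:-.}" >&2
        return 1
    fi
    printf '%s\n' "$wheels"
}

# Usage: list the wheels, then validate their metadata before uploading
#   find_wheels . && twine check */*.whl
```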