Requires an NVIDIA Hopper architecture GPU (sm_90a must be supported), i.e. sm_86 (RTX 30x0 series) won't work. The RTX 40x0 series won't work either: it is Ada Lovelace (sm_89), not Hopper, so an H100-class GPU is needed.
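A quick way to check a machine before installing is to read each GPU's compute capability and compare it with 9.0, the capability that sm_90a targets. The sketch below is only an illustration: the helper names `is_hopper` and `query_compute_caps` are made up for this example, and it assumes an `nvidia-smi` recent enough to support the `compute_cap` query field.

```python
import subprocess


def is_hopper(compute_cap: str) -> bool:
    """Return True if a capability string such as '9.0' is Hopper (sm_90)."""
    major, minor = (int(part) for part in compute_cap.strip().split("."))
    return (major, minor) == (9, 0)


def query_compute_caps() -> list[str]:
    """Return the compute capability of each visible GPU, or [] if unavailable.

    Assumes nvidia-smi is on PATH and supports the compute_cap field.
    """
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return [line.strip() for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    for cap in query_compute_caps():
        verdict = "OK" if is_hopper(cap) else "not Hopper, sm_90a code will not run"
        print(cap, verdict)
```

A card reporting 8.6 or 8.9 fails this check. The script prints nothing when `nvidia-smi` is missing or the query fails.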
Installation:
- make two copies of the repo; run `uv venv` in one and `conda create` in the other (use Python 3.11.11 for both)
mkdir deepgemm && cd deepgemm