Managing multiple CUDA versions using environment modules in Ubuntu

Steps to manage multiple CUDA environments

This gist covers the steps required to:

  • Install multiple CUDA versions (for example, CUDA 11.3 and CUDA 11.8)
  • Manage those CUDA environments on Ubuntu with the environment modules utility

Switching versions through modules avoids conflicts between the CUDA environments.

1. Install the compatible nvidia drivers (if required)

  • Add the graphics-drivers PPA to the system

    sudo add-apt-repository ppa:graphics-drivers/ppa
  • Check the GPU and the available drivers

    ubuntu-drivers devices
  • Install the compatible driver

    # in my case it is nvidia-driver-530
    sudo apt install nvidia-driver-530
  • Check the installed NVIDIA driver (reboot first so the new driver is loaded)

    nvidia-smi

Note:

  • You can also let Ubuntu pick and install a compatible driver with sudo ubuntu-drivers autoinstall.
  • Alternatively, you can install the NVIDIA driver from the Software & Updates app: open the Additional Drivers tab, choose a driver, and click Apply Changes.
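
As a quick sanity check after the driver install, the following sketch prints the driver version that is currently loaded. The function name nvidia_driver_version is a hypothetical helper chosen for illustration; it only wraps nvidia-smi, which ships with the driver, and it assumes nothing beyond that tool being on PATH.

```bash
#!/usr/bin/env bash
# Print the loaded NVIDIA driver version, or explain why it cannot be read.
# nvidia_driver_version is a hypothetical helper name, not an NVIDIA tool.
nvidia_driver_version() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: driver not installed or not on PATH"
        return 1
    fi
    # Query only the driver version; prints one line per GPU.
    nvidia-smi --query-gpu=driver_version --format=csv,noheader \
        || { echo "nvidia-smi failed: a reboot may be needed to load the driver"; return 1; }
}

nvidia_driver_version || true
```

If the function reports a failure right after installation, rebooting is usually the first thing to try.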

2. Install CUDA 11.3 and CUDA 11.8

  • Go to https://developer.nvidia.com/cuda-toolkit-archive and select CUDA Toolkit 11.3.0 from the available options.

  • Now, select your OS, architecture, distribution, version and installer type. For example, in my case it is:

    Option          Value
    OS              Linux
    Architecture    x86_64
    Distribution    Ubuntu
    Version         20.04
    Installer type  deb (local)
  • The page then shows the installation instructions. Copy them and run them in your terminal to install CUDA 11.3:

    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
    sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
    wget https://developer.download.nvidia.com/compute/cuda/11.3.0/local_installers/cuda-repo-ubuntu2004-11-3-local_11.3.0-465.19.01-1_amd64.deb
    sudo dpkg -i cuda-repo-ubuntu2004-11-3-local_11.3.0-465.19.01-1_amd64.deb
    sudo apt-key add /var/cuda-repo-ubuntu2004-11-3-local/7fa2af80.pub
    sudo apt-get update
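    # install only the 11.3 toolkit; the plain "cuda" metapackage would also pull in its bundled driver
    sudo apt-get -y install cuda-toolkit-11-3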
  • Similarly, install CUDA 11.8

    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
    sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
    wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu2004-11-8-local_11.8.0-520.61.05-1_amd64.deb
    sudo dpkg -i cuda-repo-ubuntu2004-11-8-local_11.8.0-520.61.05-1_amd64.deb
    sudo cp /var/cuda-repo-ubuntu2004-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
    sudo apt-get update
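    # install only the 11.8 toolkit; the plain "cuda" metapackage would also pull in its bundled driver
    sudo apt-get -y install cuda-toolkit-11-8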
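
To confirm that both toolkits ended up side by side, a small sketch like the one below can list the versioned directories. It assumes the default install location /usr/local/cuda-<version> used throughout this gist; list_cuda_toolkits is a hypothetical helper name, and the optional argument only exists so that a different prefix can be inspected.

```bash
#!/usr/bin/env bash
# List versioned CUDA toolkit directories under a prefix (default: /usr/local).
# list_cuda_toolkits is a hypothetical helper name used for illustration.
list_cuda_toolkits() {
    local prefix="${1:-/usr/local}" dir
    for dir in "$prefix"/cuda-*; do
        [ -d "$dir" ] || continue            # skip when the glob matches nothing
        if [ -x "$dir/bin/nvcc" ]; then
            echo "${dir##*/cuda-} $dir"
        else
            echo "${dir##*/cuda-} $dir (nvcc missing)"
        fi
    done
}

list_cuda_toolkits
```

Each output line starts with the version taken from the directory name, followed by the path; a "(nvcc missing)" suffix hints at an incomplete install.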

3. Install cuDNN library

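Note: you might need to create an NVIDIA developer account first.

  • Download the cuDNN archive for CUDA 11.x from the NVIDIA cuDNN download page (here: cudnn-linux-x86_64-8.9.2.26_cuda11-archive.tar.xz)
  • Untar the downloaded file

    tar -xvf cudnn-linux-x86_64-8.9.2.26_cuda11-archive.tar.xz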
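  • Copy the cuDNN files into the CUDA toolkit directories

    # run from inside the extracted archive, where the libraries are under lib/
    cd cudnn-linux-x86_64-8.9.2.26_cuda11-archive

    # for CUDA 11.3 (-P keeps the libcudnn symlinks as symlinks)
    sudo cp include/cudnn*.h /usr/local/cuda-11.3/include
    sudo cp -P lib/libcudnn* /usr/local/cuda-11.3/lib64
    
    # for CUDA 11.8
    sudo cp include/cudnn*.h /usr/local/cuda-11.8/include
    sudo cp -P lib/libcudnn* /usr/local/cuda-11.8/lib64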
  • Make the files readable by all users

    sudo chmod a+r /usr/local/cuda-11.3/include/cudnn*.h /usr/local/cuda-11.3/lib64/libcudnn*
    sudo chmod a+r /usr/local/cuda-11.8/include/cudnn*.h /usr/local/cuda-11.8/lib64/libcudnn*
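
One way to check that the copy worked is to read the version macros from the installed header. The sketch below assumes a cuDNN 8.x layout, where cudnn_version.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL; cudnn_version is a hypothetical helper name, and its argument is the toolkit root.

```bash
#!/usr/bin/env bash
# Print the cuDNN version found under a CUDA toolkit root, e.g. /usr/local/cuda-11.8.
# cudnn_version is a hypothetical helper; it only parses the installed header.
cudnn_version() {
    local header="$1/include/cudnn_version.h" major minor patch
    if [ ! -f "$header" ]; then
        echo "cudnn_version.h not found under $1" >&2
        return 1
    fi
    # Each macro is defined on a line of the form: #define CUDNN_MAJOR 8
    major=$(awk '$1 == "#define" && $2 == "CUDNN_MAJOR" {print $3}' "$header")
    minor=$(awk '$1 == "#define" && $2 == "CUDNN_MINOR" {print $3}' "$header")
    patch=$(awk '$1 == "#define" && $2 == "CUDNN_PATCHLEVEL" {print $3}' "$header")
    echo "$major.$minor.$patch"
}

# Usage (paths as used in this gist):
#   cudnn_version /usr/local/cuda-11.3
#   cudnn_version /usr/local/cuda-11.8
```

Both toolkits should report the same version if the same archive was copied into each.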

Note: Strictly speaking, the CUDA setup is complete at this point. You can already use a toolkit by adding its bin and library paths to the PATH and LD_LIBRARY_PATH environment variables.
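
For comparison with the modules approach that follows, here is a minimal sketch of the manual route described in the note above. use_cuda is a hypothetical shell function (it would live in something like ~/.bashrc); the optional second argument is an assumed install prefix that defaults to /usr/local.

```bash
#!/usr/bin/env bash
# Point the current shell at one CUDA toolkit by hand.
# use_cuda is a hypothetical helper: use_cuda <version> [prefix]
use_cuda() {
    local root="${2:-/usr/local}/cuda-$1"
    if [ ! -d "$root" ]; then
        echo "no CUDA toolkit at $root" >&2
        return 1
    fi
    export CUDA_HOME="$root"
    export PATH="$root/bin:$PATH"
    # Append the old value only when LD_LIBRARY_PATH was already set.
    export LD_LIBRARY_PATH="$root/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Usage:
#   use_cuda 11.8
```

Note that calling it again for another version only prepends more entries and never removes the previous toolkit's paths. That bookkeeping is what environment modules takes care of.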

4. Manage multiple CUDA versions using environment modules

a) Install the environment modules utility

```bash
sudo apt-get update
sudo apt-get install environment-modules
```

```bash
# Check the installation by running
module list
```

b) Create modulefiles for cuda distributions

Note: you might need root permissions to create the directory and the files; use sudo in that case.

  • Create a directory /usr/share/modules/modulefiles/cuda to hold modulefiles for cuda distributions

  • Create a modulefile /usr/share/modules/modulefiles/cuda/11.3 for CUDA 11.3 and add the following lines to the file:

    #%Module1.0
    ##
    ## cuda 11.3 modulefile
    ##
    
    proc ModulesHelp { } {
        global version
        
        puts stderr "\tSets up environment for CUDA $version\n"
    }
    
    module-whatis "sets up environment for CUDA 11.3"
    
    if { [ is-loaded cuda/11.8 ] } {
        module unload cuda/11.8
    }
    
    set version 11.3
    set root /usr/local/cuda-11.3
    setenv CUDA_HOME	$root
    
    prepend-path PATH $root/bin
    prepend-path LD_LIBRARY_PATH $root/extras/CUPTI/lib64
    prepend-path LD_LIBRARY_PATH $root/lib64
    conflict cuda
  • Create a modulefile /usr/share/modules/modulefiles/cuda/11.8 for CUDA 11.8 and add the following lines to the file:

    #%Module1.0
    ##
    ## cuda 11.8 modulefile
    ##
    
    proc ModulesHelp { } {
        global version
        
        puts stderr "\tSets up environment for CUDA $version\n"
    }
    
    module-whatis "sets up environment for CUDA 11.8"
    
    if { [ is-loaded cuda/11.3 ] } {
        module unload cuda/11.3
    }
    
    set version 11.8
    set root /usr/local/cuda-11.8
    setenv CUDA_HOME	$root
    
    prepend-path PATH $root/bin
    prepend-path LD_LIBRARY_PATH $root/extras/CUPTI/lib64
    prepend-path LD_LIBRARY_PATH $root/lib64
    conflict cuda
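
Since the two modulefiles differ only in their version numbers, they could also be generated rather than typed by hand, which avoids copy-paste slips. The sketch below is one possible way to do that; write_cuda_modulefile is a hypothetical helper (not part of environment-modules), and it assumes toolkits live under /usr/local/cuda-<version> as in this gist. Any extra arguments are the other versions to unload before loading this one.

```bash
#!/usr/bin/env bash
# Generate a modulefile like the ones above.
# Hypothetical helper: write_cuda_modulefile <version> <target-dir> [other-version...]
write_cuda_modulefile() {
    local version="$1" dir="$2" other
    shift 2
    mkdir -p "$dir" || return 1
    {
        printf '#%%Module1.0\n'
        printf '##\n## cuda %s modulefile\n##\n\n' "$version"
        printf 'proc ModulesHelp { } {\n'
        printf '    global version\n\n'
        printf '    puts stderr "\\tSets up environment for CUDA $version\\n"\n'
        printf '}\n\n'
        printf 'module-whatis "sets up environment for CUDA %s"\n\n' "$version"
        # Unload every other listed version before this one is loaded.
        for other in "$@"; do
            printf 'if { [ is-loaded cuda/%s ] } {\n' "$other"
            printf '    module unload cuda/%s\n}\n' "$other"
        done
        printf '\nset version %s\n' "$version"
        printf 'set root /usr/local/cuda-%s\n' "$version"
        printf 'setenv CUDA_HOME $root\n\n'
        printf 'prepend-path PATH $root/bin\n'
        printf 'prepend-path LD_LIBRARY_PATH $root/extras/CUPTI/lib64\n'
        printf 'prepend-path LD_LIBRARY_PATH $root/lib64\n'
        printf 'conflict cuda\n'
    } > "$dir/$version"
}

# Usage (writing under /usr/share needs root, so generate elsewhere and copy with sudo):
#   write_cuda_modulefile 11.3 ./cuda 11.8
#   write_cuda_modulefile 11.8 ./cuda 11.3
#   sudo cp -r ./cuda /usr/share/modules/modulefiles/
```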

c) Make CUDA 11.8 the default cuda version

  • Create a file /usr/share/modules/modulefiles/cuda/.version to make CUDA 11.8 the default cuda module:

    #%Module
    set ModulesVersion 11.8

Quick examples to switch cuda versions

    # check the path to nvcc (cuda distribution)
    which nvcc
    echo $CUDA_HOME
    echo $PATH
    echo $LD_LIBRARY_PATH

    # check the available modules
    module avail

    # list the loaded modules
    module list

    # load CUDA 11.3
    module load cuda/11.3

    # check loaded cuda distribution (should show CUDA 11.3 paths)
    which nvcc
    echo $CUDA_HOME
    echo $PATH
    echo $LD_LIBRARY_PATH
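
The manual checks above can be folded into a single test. The sketch below, with the hypothetical helper name check_cuda_env, simply verifies that the nvcc found on PATH is the one under CUDA_HOME, which is what a correctly loaded cuda module should produce.

```bash
#!/usr/bin/env bash
# Verify that the nvcc on PATH belongs to the toolkit that CUDA_HOME points at.
# check_cuda_env is a hypothetical helper name used for illustration.
check_cuda_env() {
    local nvcc_path
    nvcc_path=$(command -v nvcc) || { echo "nvcc not on PATH"; return 1; }
    case "$nvcc_path" in
        "${CUDA_HOME:-/nonexistent}"/bin/nvcc)
            echo "OK: nvcc=$nvcc_path"
            ;;
        *)
            echo "MISMATCH: nvcc=$nvcc_path CUDA_HOME=${CUDA_HOME:-unset}"
            return 1
            ;;
    esac
}

# Usage, switching between the two versions from this gist:
#   module load cuda/11.3 && check_cuda_env
#   module load cuda/11.8 && check_cuda_env
```

A mismatch usually means another CUDA bin directory was added to PATH outside of the module system, for example in ~/.bashrc.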