Revisions

  1. @garg-aayush garg-aayush revised this gist May 19, 2024. No changes.
  2. @garg-aayush garg-aayush revised this gist May 19, 2024. 1 changed file with 87 additions and 48 deletions.
    135 changes: 87 additions & 48 deletions Steps_multiple_cuda_environments.md
    Original file line number Diff line number Diff line change
    @@ -1,7 +1,9 @@
    # Steps to manage multiple CUDA environments
    > Latest Update: May 19th, 2024

    This gist contains all the steps required to:
    - Install multiple CUDA versions (e.g., `CUDA 11.3` and `CUDA 11.8`).
    - Install multiple CUDA versions (e.g., `CUDA 11.8` and `CUDA 12.1`).
    - Manage multiple CUDA environments on Ubuntu using the utility called [environment modules](https://modules.readthedocs.io/en/latest/).
    - Use this approach to avoid CUDA environment conflicts.

    @@ -16,97 +18,104 @@ This gist contains all the steps required to:
    - Check GPU and available drivers
    ```bash
    ubuntu-drivers devices
    # install it using: sudo apt install ubuntu-drivers-common
    # install it using: sudo ubuntu-drivers
    ```

    - Install the compatible driver
    ```bash
    # in my case it is nvidia-driver-530
    sudo apt install nvidia-driver-530
    # best to allow Ubuntu to autodetect and install the compatible nvidia-driver
    sudo ubuntu-drivers install
    ```
    > For example, I tried to install `nvidia-driver-545` using the `sudo ubuntu-drivers install nvidia:545` command, but the installation kept failing with one issue or another.

    > **Note**:
    > Please **restart** your system after installing the nvidia driver. Ideally, you should be able to get GPU state and stats using `nvidia-smi`
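
After the reboot, a quick check along these lines can confirm that the driver is reachable. This is only a sketch: the function name `check_nvidia_driver` is made up for illustration, and it assumes `nvidia-smi` is on the `PATH` once the driver is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report the GPU name and driver version,
# or explain why they cannot be read.
check_nvidia_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the driver is probably not installed"
        return 1
    fi
    # --query-gpu and --format are standard nvidia-smi options.
    if ! nvidia-smi --query-gpu=name,driver_version --format=csv,noheader; then
        echo "nvidia-smi is installed but cannot talk to the driver (reboot pending?)"
        return 1
    fi
}

check_nvidia_driver || true
```

If the second message appears, a restart is usually the first thing to try.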

    - Check the installed NVIDIA driver
    ```bash
    nvidia-detector
    ```

    > **Note**:
    > - You can also auto-install the compatible driver using `sudo ubuntu-drivers autoinstall`.

    > - Additionally, you can also install NVIDIA drivers using the **Software & Updates** Ubuntu app. Just go to the **Additional Drivers** tab, choose a driver, and click **Apply Changes**.


    ## 2. Install `CUDA 11.3` and `CUDA 11.8`
    ## 2. Install `CUDA 11.8` and `CUDA 12.1`

    - Go to the [https://developer.nvidia.com/cuda-toolkit-archive](https://developer.nvidia.com/cuda-toolkit-archive) and select `CUDA Toolkit 11.3` from the available options.
    - Go to the [https://developer.nvidia.com/cuda-toolkit-archive](https://developer.nvidia.com/cuda-toolkit-archive) and select `CUDA Toolkit 11.8` from the available options.
    - Choose your OS, architecture, distribution, version, and installer type. For example, in my case:
    Option | value
    | :---:|:---:|
    | OS | Linux |
    | Architecture | x86_64 |
    | Distribution | Ubuntu |
    | Version | 20.04 |
    | Version | 22.04 |
    | Installer type | deb(local) |
    - Follow the provided installation instructions by copying and pasting the commands into your terminal. This will install `CUDA 11.3`. Use the following commands:
    - Follow the provided installation instructions by copying and pasting the commands into your terminal. This will install `CUDA 11.8`. Use the following commands:
    ```bash
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
    sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
    wget https://developer.download.nvidia.com/compute/cuda/11.3.0/local_installers/cuda-repo-ubuntu2004-11-3-local_11.3.0-465.19.01-1_amd64.deb
    sudo dpkg -i cuda-repo-ubuntu2004-11-3-local_11.3.0-465.19.01-1_amd64.deb
    sudo apt-key add /var/cuda-repo-ubuntu2004-11-3-local/7fa2af80.pub
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
    sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
    wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
    sudo dpkg -i cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
    sudo cp /var/cuda-repo-ubuntu2204-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
    sudo apt-get update
    sudo apt-get -y install cuda
    ```
    - Similarly, install `CUDA 11.8` using the following commands:
    - Similarly, install `CUDA 12.1` using the following commands:
    ```bash
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
    sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
    wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu2004-11-8-local_11.8.0-520.61.05-1_amd64.deb
    sudo dpkg -i cuda-repo-ubuntu2004-11-8-local_11.8.0-520.61.05-1_amd64.deb
    sudo cp /var/cuda-repo-ubuntu2004-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
    sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
    wget https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda-repo-ubuntu2204-12-1-local_12.1.0-530.30.02-1_amd64.deb
    sudo dpkg -i cuda-repo-ubuntu2204-12-1-local_12.1.0-530.30.02-1_amd64.deb
    sudo cp /var/cuda-repo-ubuntu2204-12-1-local/cuda-*-keyring.gpg /usr/share/keyrings/
    sudo apt-get update
    sudo apt-get -y install cuda
    ```
    - Make sure to copy and execute the commands above in your terminal to install `CUDA 11.3` and `CUDA 11.8` on your system.
    - Make sure to copy and execute the commands above in your terminal to install `CUDA 11.8` and `CUDA 12.1` on your system.
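
To see which toolkits actually ended up on disk, something like the following can help. It is a sketch that assumes the deb installers place each toolkit in `/usr/local/cuda-<version>` (the paths used later in this gist); the function name `list_cuda_toolkits` and its optional prefix argument are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list versioned CUDA toolkit directories
# under a prefix (default: /usr/local).
list_cuda_toolkits() {
    local prefix="${1:-/usr/local}"
    local dir
    for dir in "$prefix"/cuda-*; do
        # Skip the unexpanded pattern when nothing matches, and skip non-directories.
        [ -d "$dir" ] || continue
        if [ -x "$dir/bin/nvcc" ]; then
            echo "${dir##*/cuda-} $dir"
        else
            echo "${dir##*/cuda-} $dir (no nvcc found)"
        fi
    done
}

list_cuda_toolkits
```

Each output line starts with the version number, followed by the toolkit directory.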
    ## 3. Install `cuDNN` library
    - Go to https://developer.nvidia.com/cudnn and download the `cuDNN` library for `CUDA 11.x`. Note that you might need to create a developer's account first.
    - Go to https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-x86_64/ and download the `cuDNN` tar for `CUDA 11.x`. Note that you might need to create a developer's account first.
    - Untar the downloaded file using the following command:
    ```bash
    tar -xvf cudnn-linux-x86_64-8.9.2.26_cuda11-archive.tar.xz
    tar -xvf cudnn-linux-x86_64-9.1.0.70_cuda11-archive.tar.xz # CUDA 11.x
    tar -xvf cudnn-linux-x86_64-9.1.0.70_cuda12-archive.tar.xz # CUDA 12.x
    ```
    - Copy the `cuDNN` files to the `CUDA` toolkit files:
    ```bash
    # for CUDA 11.3
    sudo cp include/cudnn*.h /usr/local/cuda-11.3/include
    sudo cp lib64/libcudnn* /usr/local/cuda-11.3/lib64
    # for CUDA 11.8
    cd cudnn-linux-x86_64-9.1.0.70_cuda11-archive/
    sudo cp include/cudnn*.h /usr/local/cuda-11.8/include
    sudo cp lib64/libcudnn* /usr/local/cuda-11.8/lib64
    # for CUDA 12.1
    cd ../cudnn-linux-x86_64-9.1.0.70_cuda12-archive/
    sudo cp include/cudnn*.h /usr/local/cuda-12.1/include
    sudo cp lib64/libcudnn* /usr/local/cuda-12.1/lib64
    ```
    - Make the files readable by all users:
    ```bash
    sudo chmod a+r /usr/local/cuda-11.3/include/cudnn*.h /usr/local/cuda-11.3/lib64/libcudnn*
    sudo chmod a+r /usr/local/cuda-11.8/include/cudnn*.h /usr/local/cuda-11.8/lib64/libcudnn*
    sudo chmod a+r /usr/local/cuda-12.1/include/cudnn*.h /usr/local/cuda-12.1/lib64/libcudnn*
    ```
    > **Note**:
    > Strictly speaking, you are done with the CUDA setup. You can use it by adding the CUDA bin and library path to the PATH and LD_LIBRARY_PATH environment variables. For example, you can set up CUDA 11.8 by adding the following lines in the `~/.bashrc`:
    > ```bash
    > PATH=/usr/local/cuda-11.3/bin:$PATH
    > PATH=/usr/local/cuda-11.8/bin:$PATH
    > LD_LIBRARY_PATH=/usr/local/cuda-11.8/extras/CUPTI/lib64:$LD_LIBRARY_PATH
    > LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH
    > ```
    > Similarly, you can set up CUDA 11.3. However, manually changing the paths every time can be cumbersome!
    > Similarly, you can set up CUDA 12.1. However, manually changing the paths every time can be cumbersome!
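
The manual approach described in the note can be wrapped in a small shell function, roughly as below. This is a sketch, not part of the original setup: `use_cuda` and the `CUDA_PREFIX` variable are hypothetical names, and the toolkit layout `/usr/local/cuda-<version>` is assumed from the paths above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: point the current shell at one CUDA toolkit.
# CUDA_PREFIX is an illustrative override for the install prefix.
use_cuda() {
    local version="$1"
    local root="${CUDA_PREFIX:-/usr/local}/cuda-$version"
    if [ ! -d "$root" ]; then
        echo "use_cuda: $root does not exist" >&2
        return 1
    fi
    export CUDA_HOME="$root"
    export PATH="$root/bin:$PATH"
    # Same directories as the ~/.bashrc lines in the note, with lib64 searched first.
    export LD_LIBRARY_PATH="$root/lib64:$root/extras/CUPTI/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Example (only works if that toolkit is installed):
# use_cuda 11.8 && nvcc --version
```

Note that this function only ever prepends, so switching back and forth leaves stale entries in `PATH` and `LD_LIBRARY_PATH`. Cleaning those up on unload is the part that environment modules handles.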
    ## 4. Manage multipe CUDA versions using `environment modules`
    > **Note**: If you only need one of them (cuDNN for CUDA 11.x or for CUDA 12.x), the simpler route is to go to https://developer.nvidia.com/cudnn-downloads and install it the same way as the CUDA toolkit above.
    ## 4. Manage multiple CUDA versions using `environment modules`
    ### a) Install the environment modules utility:
    - Run the following commands:
    ```bash
    @@ -130,11 +139,11 @@ This gist contains all the steps required to:
    sudo mkdir -p /usr/share/modules/modulefiles/cuda
    ```
    - Create a modulefile `/usr/share/modules/modulefiles/cuda/11.3` for `CUDA 11.3` and add the following lines:
    - Create a modulefile `/usr/share/modules/modulefiles/cuda/11.8` for `CUDA 11.8` and add the following lines:
    ```bash
    #%Module1.0
    ##
    ## cuda 11.3 modulefile
    ## cuda 11.8 modulefile
    ##
    proc ModulesHelp { } {
    @@ -145,12 +154,12 @@ This gist contains all the steps required to:
    module-whatis "sets up environment for CUDA 11.8"
    if { [ is-loaded cuda/11.8 ] } {
    module unload cuda/11.8
    if { [ is-loaded cuda/12.1 ] } {
    module unload cuda/12.1
    }
    set version 11.3
    set root /usr/local/cuda-11.3
    set version 11.8
    set root /usr/local/cuda-11.8
    setenv CUDA_HOME $root
    prepend-path PATH $root/bin
    @@ -159,11 +168,11 @@ This gist contains all the steps required to:
    conflict cuda
    ```
    - Similarly, create a modulefile `/usr/share/modules/modulefiles/cuda/11.3` for `CUDA 11.3` and add the following lines:
    - Similarly, create a modulefile `/usr/share/modules/modulefiles/cuda/12.1` for `CUDA 12.1` and add the following lines:
    ```bash
    #%Module1.0
    ##
    ## cuda 11.8 modulefile
    ## cuda 12.1 modulefile
    ##
    proc ModulesHelp { } {
    @@ -172,14 +181,14 @@ This gist contains all the steps required to:
    puts stderr "\tSets up environment for CUDA $version\n"
    }
    module-whatis "sets up environment for CUDA 11.8"
    module-whatis "sets up environment for CUDA 12.1"
    if { [ is-loaded cuda/11.3 ] } {
    module unload cuda/11.3
    if { [ is-loaded cuda/11.8 ] } {
    module unload cuda/11.8
    }
    set version 11.8
    set root /usr/local/cuda-11.8
    set version 12.1
    set root /usr/local/cuda-12.1
    setenv CUDA_HOME $root
    prepend-path PATH $root/bin
    @@ -196,6 +205,7 @@ This gist contains all the steps required to:
    ```
    > **Note**: make sure to reload your terminal.
    ## 5. Changing and Viewing the CUDA Module
    - To change and view the loaded CUDA module, you can use the following commands:
    ```bash
    @@ -205,7 +215,7 @@ This gist contains all the steps required to:
    module avail
    # Load a specific cuda version
    module load cuda/11.3
    module load cuda/12.1
    # Unload the currently loaded CUDA module
    module unload cuda
    # Load CUDA 11.8
    @@ -217,4 +227,33 @@ This gist contains all the steps required to:
    echo $PATH
    echo $LD_LIBRARY_PATH
    ```
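
The verification step can also be scripted. The sketch below assumes that `nvcc --version` prints a line containing `release X.Y` (for example `Cuda compilation tools, release 11.8, V11.8.89`); the function names `nvcc_release` and `check_cuda_version` are hypothetical.

```shell
#!/usr/bin/env bash
# Extract the "release X.Y" number from `nvcc --version` output read on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Compare the active nvcc against an expected version such as 11.8.
check_cuda_version() {
    local expected="$1" actual
    if ! command -v nvcc >/dev/null 2>&1; then
        echo "nvcc not on PATH (is a cuda module loaded?)"
        return 1
    fi
    actual="$(nvcc --version | nvcc_release)"
    if [ "$actual" = "$expected" ]; then
        echo "OK: nvcc reports CUDA $actual"
    else
        echo "Mismatch: expected CUDA $expected, nvcc reports ${actual:-unknown}"
        return 1
    fi
}

# Example: module load cuda/12.1 && check_cuda_version 12.1
```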
    > **Note**: You can add additional `CUDA` versions or other packages by creating corresponding modulefiles and following the steps outlined in this gist.
    > **Note**: You can add additional `CUDA` versions or other packages by creating corresponding modulefiles and following the steps outlined in this gist.
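
For additional CUDA versions, the modulefile can be generated instead of typed by hand. The following is a sketch under a few assumptions: the helper name `make_cuda_modulefile` is hypothetical, the toolkit is assumed to live in `/usr/local/cuda-<version>`, and the generated file relies only on `conflict cuda` rather than the explicit `is-loaded` / `module unload` block used in the modulefiles above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a CUDA modulefile for the given version to stdout.
# Single-quoted lines are emitted literally so that Tcl, not the shell,
# expands $root, $version, \t and \n.
make_cuda_modulefile() {
    local version="$1"
    printf '%s\n' \
        '#%Module1.0' \
        '##' \
        "## cuda $version modulefile" \
        '##' \
        '' \
        'proc ModulesHelp { } {' \
        '    global version' \
        '    puts stderr "\tSets up environment for CUDA $version\n"' \
        '}' \
        '' \
        "module-whatis \"sets up environment for CUDA $version\"" \
        '' \
        "set version $version" \
        "set root /usr/local/cuda-$version" \
        'setenv CUDA_HOME $root' \
        '' \
        'prepend-path PATH $root/bin' \
        'prepend-path LD_LIBRARY_PATH $root/extras/CUPTI/lib64' \
        'prepend-path LD_LIBRARY_PATH $root/lib64' \
        '' \
        'conflict cuda'
}

# Example (12.4 is an arbitrary illustrative version):
# make_cuda_modulefile 12.4 | sudo tee /usr/share/modules/modulefiles/cuda/12.4
```

After writing the file, `module avail` should list the new version next to the existing ones.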
    ## 6. Some Useful Tips
    ### a) `nvidia-smi` does not work
    - Sometimes, after an Ubuntu update or some other unexpected issue, the system can no longer detect the drivers. For example, you may get an error such as `nvidia-smi has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.` The most reliable fix is to remove the current drivers and reinstall the compatible NVIDIA driver.
    ```bash
    # removes all the nvidia drivers
    sudo apt-get --purge remove "*nvidia*" "libxnvctrl*"
    # reinstall the compatible driver and restart
    sudo ubuntu-drivers install
    ```
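
Before purging anything, it can be worth checking whether the NVIDIA kernel module is loaded at all. This sketch assumes the driver exposes `/proc/driver/nvidia/version` while its kernel module is loaded; the function name and its optional path argument are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the loaded NVIDIA kernel module version, if any.
nvidia_kernel_module_status() {
    local version_file="${1:-/proc/driver/nvidia/version}"
    if [ -r "$version_file" ]; then
        echo "NVIDIA kernel module is loaded:"
        head -n 1 "$version_file"
    else
        echo "NVIDIA kernel module is not loaded (no $version_file)"
        return 1
    fi
}

nvidia_kernel_module_status || true
```

If the module is not loaded, the reinstall described above is a reasonable next step.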

    ### b) Purge CUDA from your computer
    > DO IT AT YOUR OWN RISK

    ```bash
    # removes all the nvidia drivers
    sudo apt-get --purge remove "*nvidia*" "libxnvctrl*"
    # remove all cuda versions
    sudo apt-get --purge remove "*cuda*" "*cublas*" "*cufft*" "*cufile*" "*curand*" "*cusolver*" "*cusparse*" "*gds-tools*" "*npp*" "*nvjpeg*" "nsight*" "*nvvm*"
    # remove all cuda folders
    sudo rm -rf /usr/local/cuda*
    ```

    ## Resources and helpful links
    - https://ubuntu.com/server/docs/nvidia-drivers-installation
    - https://developer.nvidia.com/cuda-toolkit-archive
    - https://developer.nvidia.com/cudnn-downloads
  3. @garg-aayush garg-aayush revised this gist Jul 14, 2023. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion Steps_multiple_cuda_environments.md
    Original file line number Diff line number Diff line change
    @@ -15,7 +15,8 @@ This gist contains all the steps required to:
    ```
    - Check GPU and available drives
    ```bash
    ubuntu-devices drivers
    ubuntu-drivers devices
    # install it using: sudo apt install ubuntu-drivers-common
    ```

    - Install the compatible driver
  4. @garg-aayush garg-aayush revised this gist Jun 7, 2023. 1 changed file with 66 additions and 50 deletions.
    116 changes: 66 additions & 50 deletions Steps_multiple_cuda_environments.md
    Original file line number Diff line number Diff line change
    @@ -1,11 +1,13 @@
    # Steps to manage multiple CUDA environments
    This gist contains all the steps required to
    - Install multiple cuda versions (for eg. I install `CUDA 11.3` and `CUDA 11.8`)
    - Managing the multiple cuda environments on ubuntu using the utility called `environment modules`
    - Using such an approach avoids the cuda environment conflicts

    This gist contains all the steps required to:
    - Install multiple CUDA versions (e.g., `CUDA 11.3` and `CUDA 11.8`).
    - Manage multiple CUDA environments on Ubuntu using the utility called [environment modules](https://modules.readthedocs.io/en/latest/).
    - Use this approach to avoid CUDA environment conflicts.

    ## 1. Install the compatible nvidia drivers (if required)
    > Environment Modules is a package that provides for the dynamic modification of a user's environment via modulefiles. You can find more on it at https://modules.readthedocs.io/en/latest/
    ## 1. Install the Compatible NVIDIA Drivers (if required)

    - Add PPA GPU Drivers Repository to the System
    ```bash
    @@ -22,19 +24,20 @@ This gist contains all the steps required to
    sudo apt install nvidia-driver-530
    ```

    - Check the installed nvidia driver
    - Check the installed NVIDIA driver
    ```bash
    nvidia-detector
    ```

    > Note:
    > - You can also autoinstall the compatible using `sudo ubuntu-drivers autoinstall`.
    > - Additionally, you can also install nvidia-drivers using **Software & Updates** ubuntu app. Just go to **additional drivers** tab, choose a driver and click **apply changes**.
    > **Note**:
    > - You can also auto-install the compatible driver using `sudo ubuntu-drivers autoinstall`.
    > - Additionally, you can also install NVIDIA drivers using the **Software & Updates** Ubuntu app. Just go to the **Additional Drivers** tab, choose a driver, and click **Apply Changes**.


    ## 2. Install `CUDA 11.3` and `CUDA 11.8`

    - Go to the [https://developer.nvidia.com/cuda-toolkit-archive](https://developer.nvidia.com/cuda-toolkit-archive) and select `CUDA Toolkit 11.3.0` from the available options.
    - Now, select your OS, architecture, distribution, version and installer type. For example, in my case it is:
    - Go to the [https://developer.nvidia.com/cuda-toolkit-archive](https://developer.nvidia.com/cuda-toolkit-archive) and select `CUDA Toolkit 11.3` from the available options.
    - Choose your OS, architecture, distribution, version, and installer type. For example, in my case:
    Option | value
    | :---:|:---:|
    @@ -44,7 +47,7 @@ This gist contains all the steps required to
    | Version | 20.04 |
    | Installer type | deb(local) |
    - You will get installation instructions, copy and paste in your terminal. This will install `CUDA 11.3`
    - Follow the provided installation instructions by copying and pasting the commands into your terminal. This will install `CUDA 11.3`. Use the following commands:
    ```bash
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
    sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
    @@ -54,7 +57,7 @@ This gist contains all the steps required to
    sudo apt-get update
    sudo apt-get -y install cuda
    ```
    - Similarly, install `CUDA 11.8`
    - Similarly, install `CUDA 11.8` using the following commands:
    ```bash
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
    sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
    @@ -64,17 +67,18 @@ This gist contains all the steps required to
    sudo apt-get update
    sudo apt-get -y install cuda
    ```
    - Make sure to copy and execute the commands above in your terminal to install `CUDA 11.3` and `CUDA 11.8` on your system.
    ## 3. Install `cuDNN` library
    - Go to https://developer.nvidia.com/cudnn and download the `cuDNN` library for CUDA 11.x
    > Note: you might need to make a developer's account first.
    - Untar the downloaded file
    - Go to https://developer.nvidia.com/cudnn and download the `cuDNN` library for `CUDA 11.x`. Note that you might need to create a developer's account first.
    - Untar the downloaded file using the following command:
    ```bash
    tar -xvf cudnn-linux-x86_64-8.9.2.26_cuda11-archive.tar.xz
    ```
    - Copy the cuDNN files to CUDA toolkit files
    - Copy the `cuDNN` files to the `CUDA` toolkit files:
    ```bash
    # for CUDA 11.3
    sudo cp include/cudnn*.h /usr/local/cuda-11.3/include
    @@ -85,35 +89,47 @@ This gist contains all the steps required to
    sudo cp lib64/libcudnn* /usr/local/cuda-11.8/lib64
    ```
    - Make the files executable
    - Make the files executable:
    ```bash
    sudo chmod a+r /usr/local/cuda-11.3/include/cudnn*.h /usr/local/cuda-11.3/lib64/libcudnn*
    sudo chmod a+r /usr/local/cuda-11.8/include/cudnn*.h /usr/local/cuda-11.8/lib64/libcudnn*
    ```
    > Note:
    > Strictly speaking, you have the CUDA setup now. You can use it by adding the CUDA bin and library path to `PATH` and `LD_LIBRARY_PATH` environment variables
    > **Note**:
    > Strictly speaking, you are done with the CUDA setup. You can use it by adding the CUDA bin and library path to the PATH and LD_LIBRARY_PATH environment variables. For example, you can set up CUDA 11.8 by adding the following lines in the `~/.bashrc`:
    > ```bash
    > PATH=/usr/local/cuda-11.3/bin:$PATH
    > LD_LIBRARY_PATH=/usr/local/cuda-11.8/extras/CUPTI/lib64:$LD_LIBRARY_PATH
    > LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH
    > ```
    > Similarly, you can set up CUDA 11.3. However, manually changing the paths every time can be cumbersome!
    ## 4. Manage multipe cuda version using `environment modules`
    ## 4. Manage multipe CUDA versions using `environment modules`
    ### a) Install the environment modules utility
    ### a) Install the environment modules utility:
    - Run the following commands:
    ```bash
    sudo apt-get update
    sudo apt-get install environment-modules
    ```
    - Check the installation:
    ```bash
    # Check the installation by running
    module list
    ```
    ### b) Create modulefiles for cuda distributions
    > You should see a list of default installed modules like git and maybe their versions displayed when you run the command `module list`. This confirms that the environment modules utility has been successfully installed on your system.
    ### b) Create modulefiles for CUDA distributions
    > **Note**: you might need `root` permissions to make directory and create files, use `sudo` in that case
    > **Note**: You might need root permissions to create directories and files. Use sudo in that case.
    - Create a directory `/usr/share/modules/modulefiles/cuda` to hold modulefiles for cuda distributions
    ```bash
    sudo mkdir -p /usr/share/modules/modulefiles/cuda
    ```
    - Create a modulefile `/usr/share/modules/modulefiles/cuda/11.3` for `CUDA 11.3` and following lines to the file:
    - Create a modulefile `/usr/share/modules/modulefiles/cuda/11.3` for `CUDA 11.3` and add the following lines:
    ```bash
    #%Module1.0
    ##
    @@ -142,7 +158,7 @@ This gist contains all the steps required to
    conflict cuda
    ```
    - Create a modulefile `/usr/share/modules/modulefiles/cuda/11.8` for `CUDA 11.8` and following lines to the file:
    - Similarly, create a modulefile `/usr/share/modules/modulefiles/cuda/11.3` for `CUDA 11.3` and add the following lines:
    ```bash
    #%Module1.0
    ##
    @@ -178,26 +194,26 @@ This gist contains all the steps required to
    set ModulesVersion 11.8
    ```
    ## Quick examples to switch cuda versions
    ```bash
    ## check the path to nvcc (cuda distribution)
    which nvcc
    echo $CUDA_HOME
    echo $PATH
    echo $LD_LIBRARY_PATH
    # check the available modules
    module avail
    # list the loaded modules
    module list
    # load CUDA 11.3
    module load cuda/11.3
    # check loaded cuda distribution (should show CUDA 11.3 paths)
    which nvcc
    echo $CUDA
    echo $PATH
    echo $LD_LIBRARY_PATH
    ```
    > **Note**: make sure to reload your terminal.
    ## 5. Changing and Viewing the CUDA Module
    - To change and view the loaded CUDA module, you can use the following commands:
    ```bash
    # Check the currently loaded module
    module list
    # Check the available modules
    module avail
    # Load a specific cuda version
    module load cuda/11.3
    # Unload the currently loaded CUDA module
    module unload cuda
    # Load CUDA 11.8
    module load cuda/11.8
    # verify the paths of the loaded CUDA
    nvcc --version # should give the loaded CUDA version
    echo $CUDA_HOME
    echo $PATH
    echo $LD_LIBRARY_PATH
    ```
    > **Note**: You can add additional `CUDA` versions or other packages by creating corresponding modulefiles and following the steps outlined in this gist.
  5. @garg-aayush garg-aayush revised this gist Jun 2, 2023. 1 changed file with 155 additions and 51 deletions.
    206 changes: 155 additions & 51 deletions Steps_multiple_cuda_environments.md
    Original file line number Diff line number Diff line change
    @@ -1,78 +1,182 @@
    # Steps to manage multiple CUDA environments
    This gist contains all the steps required to manage multiple cuda environments on ubuntu. For the steps walkthrough, For the walkthrough, I will install `CUDA 11.8` and `CUDA 11.3` on my system and manage them using `environment modules`.
    This gist contains all the steps required to
    - Install multiple cuda versions (for eg. I install `CUDA 11.3` and `CUDA 11.8`)
    - Managing the multiple cuda environments on ubuntu using the utility called `environment modules`
    - Using such an approach avoids the cuda environment conflicts


    ## 1. Install the compatible nvidia drivers (if required)

    - Add PPA GPU Drivers Repository to the System
    ```bash
    sudo add-apt-repository ppa:graphics-drivers/ppa
    ```
    ```bash
    sudo add-apt-repository ppa:graphics-drivers/ppa
    ```
    - Check GPU and available drives
    ```bash
    ubuntu-devices drivers
    ```
    ```bash
    ubuntu-devices drivers
    ```

    - Install the compatible driver
    ```bash
    sudo apt install nvidia-driver-530
    ```
    ```bash
    # in my case it is nvidia-driver-530
    sudo apt install nvidia-driver-530
    ```

    - Check the installed nvidia driver
    ```bash
    nvidia-detector
    ```

    - Note:
    > - you can also autoinstall the compatible using `sudo ubuntu-drivers autoinstall`
    > - Additionally, you can also install nvidia-drivers using **Software & Updates** ubuntu app. Just go to **additional drivers** tab, choose a driver and click **apply changes**
    ```bash
    nvidia-detector
    ```

    > Note:
    > - You can also autoinstall the compatible using `sudo ubuntu-drivers autoinstall`.
    > - Additionally, you can also install nvidia-drivers using **Software & Updates** ubuntu app. Just go to **additional drivers** tab, choose a driver and click **apply changes**.

    ## 2. Install `CUDA 11.3` and `CUDA 11.8`

    - Go to the [https://developer.nvidia.com/cuda-toolkit-archive](https://developer.nvidia.com/cuda-toolkit-archive) and select `CUDA Toolkit 11.3.0` from the available options.
    - Now, select your OS, architecture, distribution, version and installer type. For example, in my case it is:
    Option | value
    | :---:|:---:|
    | OS | Linux |
    | Architecture | x86_64 |
    | Distribution | Linux |
    | Version | 20.04 |
    | Installer type | deb(local) |
    - You will get installation instructions, copy and paste in your terminal. This will install `CUDA 11.3`
    ```bash
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
    sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
    wget https://developer.download.nvidia.com/compute/cuda/11.3.0/local_installers/cuda-repo-ubuntu2004-11-3-local_11.3.0-465.19.01-1_amd64.deb
    sudo dpkg -i cuda-repo-ubuntu2004-11-3-local_11.3.0-465.19.01-1_amd64.deb
    sudo apt-key add /var/cuda-repo-ubuntu2004-11-3-local/7fa2af80.pub
    sudo apt-get update
    sudo apt-get -y install cuda
    ```
    - Similarly, install `CUDA 11.8`
    ```bash
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
    sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
    wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu2004-11-8-local_11.8.0-520.61.05-1_amd64.deb
    sudo dpkg -i cuda-repo-ubuntu2004-11-8-local_11.8.0-520.61.05-1_amd64.deb
    sudo cp /var/cuda-repo-ubuntu2004-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
    sudo apt-get update
    sudo apt-get -y install cuda
    ```
    ## 3. Install `cuDNN` library

    - Go to https://developer.nvidia.com/cudnn and download the `cuDNN` library for CUDA 11.x
    > Note: you might need to make a developer's account first.
    - Untar the downloaded file
    ```bash
    tar -xvf cudnn-linux-x86_64-8.9.2.26_cuda11-archive.tar.xz
    ```
    - Copy the cuDNN files to CUDA toolkit files
    ```bash
    # for CUDA 11.3
    sudo cp include/cudnn*.h /usr/local/cuda-11.3/include
    sudo cp lib64/libcudnn* /usr/local/cuda-11.3/lib64
    # for CUDA 11.8
    sudo cp include/cudnn*.h /usr/local/cuda-11.8/include
    sudo cp lib64/libcudnn* /usr/local/cuda-11.8/lib64
    ```
    - Make the files executable
    ```bash
    sudo chmod a+r /usr/local/cuda-11.3/include/cudnn*.h /usr/local/cuda-11.3/lib64/libcudnn*
    sudo chmod a+r /usr/local/cuda-11.8/include/cudnn*.h /usr/local/cuda-11.8/lib64/libcudnn*
    ```
    > Note:
    > Strictly speaking, you have the CUDA setup now. You can use it by adding the CUDA bin and library path to `PATH` and `LD_LIBRARY_PATH` environment variables
    ## 4. Manage multipe cuda version using `environment modules`
    ### a) Install the environment modules utility
    ```bash
    sudo apt-get update
    sudo apt-get install environment-modules
    ```

    ### b) Create modulefiles for cuda distributions

    - Create a directory `/etc/modulefiles/cuda` to hold modulefiles for cuda distributions
    ```bash
    mkdir /etc/modulefiles/cuda
    ```

    - Create a modulefile /etc/modulefiles/cuda/11.8 for `CUDA 11.3` and following lines to the file:
    ```bash
    vim /etc/modulefiles/cuda/11.3
    ```

    ```bash
    lorem epsum
    ```
    ```bash
    sudo apt-get update
    sudo apt-get install environment-modules
    ```
    - Create a modulefile /etc/modulefiles/cuda/11.3 for `CUDA 11.8` and following lines to the file:
    ```bash
    vim /etc/modulefiles/cuda/11.8
    ```
    ```bash
    # Check the installation by running
    module list
    ```
    ```bash
    lorem epsum
    ```
    ### b) Create modulefiles for cuda distributions
    > **Note**: you might need `root` permissions to make directory and create files, use `sudo` in that case
    - Create a directory `/usr/share/modules/modulefiles/cuda` to hold modulefiles for cuda distributions
    - Create a modulefile `/usr/share/modules/modulefiles/cuda/11.3` for `CUDA 11.3` and following lines to the file:
    ```bash
    #%Module1.0
    ##
    ## cuda 11.3 modulefile
    ##
    proc ModulesHelp { } {
    global version
    puts stderr "\tSets up environment for CUDA $version\n"
    }
    module-whatis "sets up environment for CUDA 11.8"
    if { [ is-loaded cuda/11.8 ] } {
    module unload cuda/11.8
    }
    set version 11.3
    set root /usr/local/cuda-11.3
    setenv CUDA_HOME $root
    prepend-path PATH $root/bin
    prepend-path LD_LIBRARY_PATH $root/extras/CUPTI/lib64
    prepend-path LD_LIBRARY_PATH $root/lib64
    conflict cuda
    ```
    - Create a modulefile `/usr/share/modules/modulefiles/cuda/11.8` for `CUDA 11.8` and following lines to the file:
    ```bash
    #%Module1.0
    ##
    ## cuda 11.8 modulefile
    ##
    proc ModulesHelp { } {
    global version
    puts stderr "\tSets up environment for CUDA $version\n"
    }
    module-whatis "sets up environment for CUDA 11.8"
    if { [ is-loaded cuda/11.3 ] } {
    module unload cuda/11.3
    }
    set version 11.8
    set root /usr/local/cuda-11.8
    setenv CUDA_HOME $root
    prepend-path PATH $root/bin
    prepend-path LD_LIBRARY_PATH $root/extras/CUPTI/lib64
    prepend-path LD_LIBRARY_PATH $root/lib64
    conflict cuda
    ```
    ### c) Make `CUDA 11.8` the default cuda version
    - Create a file `/etc/modulefiles/cuda/.version` to make python/cuda/11.8 the default cuda module:

    ```bash
    lorem epsum
    ```
    - Create a file `/usr/share/modules/modulefiles/cuda.version` to make `CUDA 11.8` the default cuda module:
    ```bash
    #%Module
    set ModulesVersion 11.8
    ```
    ## Quick examples to switch cuda versions
    ```bash
    @@ -96,4 +200,4 @@ which nvcc
    echo $CUDA
    echo $PATH
    echo $LD_LIBRARY_PATH
    ```
    ```
  6. @garg-aayush garg-aayush created this gist Jun 1, 2023.
    99 changes: 99 additions & 0 deletions Steps_multiple_cuda_environments.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,99 @@
    # Steps to manage multiple CUDA environments
    This gist contains all the steps required to manage multiple cuda environments on ubuntu. For the steps walkthrough, For the walkthrough, I will install `CUDA 11.8` and `CUDA 11.3` on my system and manage them using `environment modules`.


    ## 1. Install the compatible nvidia drivers (if required)

    - Add PPA GPU Drivers Repository to the System
    ```bash
    sudo add-apt-repository ppa:graphics-drivers/ppa
    ```
    - Check GPU and available drives
    ```bash
    ubuntu-devices drivers
    ```

    - Install the compatible driver
    ```bash
    sudo apt install nvidia-driver-530
    ```

    - Check the installed nvidia driver
    ```bash
    nvidia-detector
    ```

    - Note:
    > - you can also autoinstall the compatible using `sudo ubuntu-drivers autoinstall`
    > - Additionally, you can also install nvidia-drivers using **Software & Updates** ubuntu app. Just go to **additional drivers** tab, choose a driver and click **apply changes**

    ## 2. Install `CUDA 11.3` and `CUDA 11.8`


    ## 3. Install `cuDNN` library


    ## 4. Manage multipe cuda version using `environment modules`

    ### a) Install the environment modules utility
    ```bash
    sudo apt-get update
    sudo apt-get install environment-modules
    ```

    ### b) Create modulefiles for cuda distributions

    - Create a directory `/etc/modulefiles/cuda` to hold modulefiles for cuda distributions
    ```bash
    mkdir /etc/modulefiles/cuda
    ```

    - Create a modulefile /etc/modulefiles/cuda/11.8 for `CUDA 11.3` and following lines to the file:
    ```bash
    vim /etc/modulefiles/cuda/11.3
    ```

    ```bash
    lorem epsum
    ```

    - Create a modulefile /etc/modulefiles/cuda/11.3 for `CUDA 11.8` and following lines to the file:
    ```bash
    vim /etc/modulefiles/cuda/11.8
    ```

    ```bash
    lorem epsum
    ```

    ### c) Make `CUDA 11.8` the default cuda version
    - Create a file `/etc/modulefiles/cuda/.version` to make python/cuda/11.8 the default cuda module:

    ```bash
    lorem epsum
    ```

    ## Quick examples to switch cuda versions
    ```bash
    ## check the path to nvcc (cuda distribution)
    which nvcc
    echo $CUDA_HOME
    echo $PATH
    echo $LD_LIBRARY_PATH

    # check the available modules
    module avail

    # list the loaded modules
    module list

    # load CUDA 11.3
    module load cuda/11.3

    # check loaded cuda distribution (should show CUDA 11.3 paths)
    which nvcc
    echo $CUDA
    echo $PATH
    echo $LD_LIBRARY_PATH
    ```