There are two ways to use hestia: 1) through the RemoteLab web interface, or 2) through SSH. The benefit of RemoteLab is that you can open GUI applications in it; this is possible over SSH too, but it works poorly. SSH, on the other hand, does not require a web browser. Choose whichever fits what you need.
RemoteLab
- Go to: https://remotelab.eecs.yorku.ca/#/
- Log in with YorkU Passport credentials
- Click on "Research (wup)"
- Select "hestia"
SSH
hestia is not accessible directly. You have to connect to it through one of the EECS servers, e.g., indigo or red. To do this in a single step, set an EECS server as the jump host with the -J flag. Provide your EECS password when asked.
ssh -J <EECS_username>@indigo.eecs.yorku.ca <EECS_username>@hestia
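If you connect often, the jump host can also be stored in your SSH client configuration so that a plain ssh hestia is enough. The following is a minimal sketch, assuming OpenSSH 7.3 or newer (the release that introduced ProxyJump). The function name print_hestia_ssh_config and the alias eecs-jump are made up for illustration; review the output before appending it to ~/.ssh/config.

```shell
# Hypothetical helper: print an SSH config stanza that routes connections
# to hestia through indigo. Pass your EECS username as the first argument.
print_hestia_ssh_config() {
    user="$1"
    cat <<EOF
# "eecs-jump" is an arbitrary alias chosen for this sketch.
Host eecs-jump
    HostName indigo.eecs.yorku.ca
    User $user

Host hestia
    User $user
    ProxyJump eecs-jump
EOF
}

# Example: review the output, then append it to ~/.ssh/config yourself.
print_hestia_ssh_config "<EECS_username>"
```

With such a stanza in place, ssh hestia should behave like the -J command above.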
To run multiple programs on hestia, you would otherwise have to SSH in multiple times, which becomes tedious. That's where Tmux comes in.
Tmux is a terminal multiplexer. It allows you to create multiple shell (e.g., bash) sessions. You can log in to hestia once, open a Tmux session, and then open all your shells and programs inside it without having to log in again from multiple terminal windows.
Tmux also allows you to keep a program running in the background. You can start a long-running program in a Tmux session, detach from the session, do other things, attach back to the session, and continue your work on the program.
See man tmux or the Arch Wiki for more details.
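The detach/attach workflow described above can be sketched as follows. The helper name tm and the session names are hypothetical, and the key bindings shown are the tmux defaults, which may have been changed in your configuration.

```shell
# Hypothetical helper: attach to the named tmux session, creating it first
# if it does not exist yet (-A makes new-session behave like attach-session
# when the session already exists).
tm() {
    tmux new-session -A -s "${1:-main}"
}

# Typical workflow on hestia (default tmux key bindings):
#   tm train        # create or re-attach to the session "train"
#   Ctrl-b c        # open another shell (window) inside the session
#   Ctrl-b d        # detach; programs keep running in the background
#   tmux ls         # list sessions that are still alive
#   tm train        # attach again later, even after a new SSH login
```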
A program can use the GPU only if it was built with GPU support. For example, PyTorch comes in two versions: one runs on the CPU only and the other on the GPU (CUDA). You need to install the one you want; to use the GPUs on hestia, that is the CUDA one.
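One way to check which PyTorch version is installed is to ask PyTorch itself. This is a small sketch, assuming PyTorch is installed for the python found on your PATH; the helper name check_torch_cuda is made up for illustration.

```shell
# Hypothetical helper: print the PyTorch version, the CUDA version it was
# built with (None for CPU-only builds), and whether a GPU is usable.
# Assumes PyTorch is installed for the `python` found on your PATH.
check_torch_cuda() {
    python -c 'import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())'
}

# Usage (run on hestia, inside the environment you train in):
#   check_torch_cuda
```

If the last value printed is False, the program will not be able to use the GPUs, whatever the environment variables say.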
To restrict a program to one or two GPUs, set the visible GPUs with an environment variable, either before running the program or on the same command line:
# Set the ENV variable *before* running the program (applies to the rest of the shell session)
export CUDA_VISIBLE_DEVICES=0,1
python train.py
# Set the ENV variable on the same command line (applies to this one run only)
CUDA_VISIBLE_DEVICES=0,1 python train.py
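The two forms differ in how long the setting lasts. The following sketch makes this visible using sh -c as a stand-in for the training program; it needs no GPU and only shows what each child process would see.

```shell
unset CUDA_VISIBLE_DEVICES   # start from a clean state

# Inline assignment: visible to this one command only.
CUDA_VISIBLE_DEVICES=1 sh -c 'echo "inline: ${CUDA_VISIBLE_DEVICES:-unset}"'
echo "afterwards: ${CUDA_VISIBLE_DEVICES:-unset}"

# export: visible to every program started from this shell from now on.
export CUDA_VISIBLE_DEVICES=0,1
sh -c 'echo "exported: ${CUDA_VISIBLE_DEVICES:-unset}"'

# Prints:
#   inline: 1
#   afterwards: unset
#   exported: 0,1
```

The inline form is usually the safer choice on a shared machine, since a forgotten export keeps affecting every later run in that shell. Note that CUDA renumbers the visible GPUs from 0 inside the program.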
- GPU usage summary
nvidia-smi
- GPU processes and users (details)
alias nvidia-procs='ps -up `nvidia-smi -q -x | grep pid | sed -e "s/<pid>//g" -e "s/<\/pid>//g" -e "s/^[[:space:]]*//"`'
nvidia-procs
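To see what the grep and sed part of the alias does, it can be run on a small sample. The XML below is a made-up, heavily simplified stand-in for nvidia-smi -q -x output, and the function name extract_pids is hypothetical.

```shell
# Same filter as in the alias: keep lines mentioning "pid", strip the
# <pid> tags and the leading whitespace.
extract_pids() {
    grep pid | sed -e "s/<pid>//g" -e "s/<\/pid>//g" -e "s/^[[:space:]]*//"
}

# Made-up sample; real `nvidia-smi -q -x` output contains many more fields.
sample='<processes>
    <process_info>
        <pid>1234</pid>
        <process_name>python</process_name>
    </process_info>
    <process_info>
        <pid>5678</pid>
        <process_name>python</process_name>
    </process_info>
</processes>'

printf '%s\n' "$sample" | extract_pids
# Prints:
#   1234
#   5678
```

The alias passes this list of PIDs to ps, which then shows the owner and command line of each process.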