GPU Cluster
General Information
Cluster access
To get access to the GPU cluster, send an email to Frank (dal-ri@informatik.uni-freiburg.de), with your supervisor and Matthias (hertelm@informatik.uni-freiburg.de) in the Cc.
Getting help
The cluster's FAQ answers the most frequent questions. To read the FAQ, you must first register with the Support Ticket System, using your informatik.uni-freiburg.de e-mail address. Registration page: https://osticket.informatik.uni-freiburg.de/account.php?do=create
Link to the Cluster's FAQ page (you need to sign in first): https://osticket.informatik.uni-freiburg.de/kb/faq.php?cid=1
In addition, there is the old help page by the Machine Learning chair. You can access the help page once you have access to the GPU cluster. Log in with your TF-account. Link to the help page: https://aadwiki.informatik.uni-freiburg.de/Meta_Slurm
If you have problems accessing the cluster, send an email to Frank, with your supervisor and Matthias in the Cc. For any other issues, ask your supervisor and Matthias.
Logging in to the cluster
University network
To log in to the cluster, you must be connected to the university’s network.
We recommend using the university’s VPN. See the various information pages:
https://www.rz.uni-freiburg.de/inhalt/dokumente/pdfs/anleitungen/installation-openconnect-vpn-ubuntu/ (Ubuntu only)
https://www.rz.uni-freiburg.de/services-en/netztel-en/vpn/vpn-einleitung-en?set_language=en
Alternatively, you can access the university's network by logging in to login.informatik.uni-freiburg.de via SSH:
ssh <user>@login.informatik.uni-freiburg.de
Cluster login
There are three login nodes:
- kislogin1.rz.ki.privat
- kislogin2.rz.ki.privat
- kislogin3.rz.ki.privat
Log in via SSH:
ssh <user>@kislogin1.rz.ki.privat
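If you are not using the VPN, the two SSH steps above can be combined into one. The following is a minimal sketch, assuming that your OpenSSH client supports the -J (ProxyJump) option and that the login nodes are reachable from login.informatik.uni-freiburg.de. The function name kis_login is hypothetical and not part of the cluster setup.

```shell
# Hypothetical helper: log in to a cluster login node via the jump host.
# Usage: kis_login <user> [node]   (node defaults to kislogin1)
kis_login() {
    local user="$1"
    local node="${2:-kislogin1}"
    # -J routes the connection through login.informatik.uni-freiburg.de
    # (assumption: the login nodes are reachable from that host).
    ssh -J "${user}@login.informatik.uni-freiburg.de" "${user}@${node}.rz.ki.privat"
}
```

For example, kis_login <user> kislogin2 would connect to the second login node.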
Status information
Use the command sfree to get status information for the cluster partitions. It shows all accessible partitions and the number of used and total GPUs per partition.
You can also watch the partitions and your jobs in the dashboard: https://kislurm-dashboard.informatik.uni-freiburg.de/d/spTRj8IMz/kislurm2?orgId=1
Workspaces
You automatically have access to your home directory on the cluster. Since the size of your home directory is limited to a few GB, we recommend using a workspace for your project. A workspace is a directory that can be accessed from all nodes of the cluster.
Creating a workspace
Use the command ws_allocate to create a new workspace. For help, type man ws_allocate.
Example:
ws_allocate -r 10 -m <user>@informatik.uni-freiburg.de test-workspace 30
This command creates a workspace <user>-test-workspace, which expires in 30 days. 10 days before the expiration, a notification is sent to the specified email address.
Finding your workspace
To list the paths of your workspaces, type ws_list.
Extending a workspace
When a workspace expires, all content is lost. To extend the workspace, use the command:
ws_allocate -x <ID> <DAYS>
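Because expired workspaces lose their content, it can help to wrap the extension command in a small helper that refuses to run with missing arguments. This is only a sketch: the function name extend_workspace is hypothetical, and it simply passes its two arguments to the ws_allocate -x call shown above.

```shell
# Hypothetical helper around the extension command from this page.
# Usage: extend_workspace <ID> <DAYS>
extend_workspace() {
    if [ "$#" -ne 2 ]; then
        echo "usage: extend_workspace <ID> <DAYS>" >&2
        return 1
    fi
    # Same call as above: extend workspace <ID> by <DAYS> days.
    ws_allocate -x "$1" "$2"
}
```

Run ws_list first to look up the ID of the workspace you want to extend.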
Running jobs
Interactive session
Use the following command to start an interactive session. By default, the session is allocated one GPU:
srun -p alldlc_gpu-rtx2080 --pty bash
To check if you have access to the GPU, run:
python3 -c "import torch; print(torch.cuda.is_available())"
The result should be True.
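If PyTorch is not installed in your environment, the NVIDIA driver tools offer a similar check. The following is a small sketch, assuming that nvidia-smi is installed on the compute nodes, which this page does not confirm. The variable gpu_tool_found is only for illustration.

```shell
# List the GPUs visible to the current session, if nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu_tool_found=yes
    # -L prints one line per visible GPU.
    nvidia-smi -L || echo "nvidia-smi is installed but could not query a GPU"
else
    gpu_tool_found=no
    echo "nvidia-smi not found; are you on a GPU node?"
fi
```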
Submitting jobs
Write a bash file containing all instructions for your job, and then run:
sbatch -p alldlc_gpu-rtx2080 <bash_file>
The output of your job will be written to a file slurm-<jobid>.out in your current directory.
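A bash file for sbatch might look like the following minimal sketch. The #SBATCH lines are standard Slurm directives; the job name gpu-check is a hypothetical example, and the --output pattern only restates the default file name mentioned above. The partition can be set in the file, as here, or on the command line with -p.

```shell
#!/bin/bash
#SBATCH --partition=alldlc_gpu-rtx2080
#SBATCH --job-name=gpu-check
#SBATCH --output=slurm-%j.out

# Slurm sets SLURM_JOB_ID inside a job; fall back to "local" when the
# script is run directly for testing.
job_id="${SLURM_JOB_ID:-local}"
echo "Job ${job_id} running on $(hostname)"

# Same GPU check as in the interactive session, guarded in case
# PyTorch is not installed in the current environment.
if python3 -c "import torch" 2>/dev/null; then
    python3 -c "import torch; print(torch.cuda.is_available())"
else
    echo "PyTorch is not available in this environment"
fi
```

Replace the GPU check with the commands for your own job, for example a call to your training script.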
To get the status of your jobs, run:
sacct --user=$USER
To list all your running jobs, run:
squeue --user=$USER
Accessing GitHub
To access GitHub via SSH, add the following lines to the file ~/.ssh/config:
Host github.com
    ProxyCommand ssh -q login.informatik.uni-freiburg.de nc %h %p