= GPU Cluster =

== General Information ==

=== Cluster access ===
To get access to the GPU cluster, send an email to Frank (dal-ri@informatik.uni-freiburg.de), with your supervisor and Matthias (hertelm@informatik.uni-freiburg.de) in Cc.

=== Getting help ===
The cluster's FAQ answers the most frequent questions. To be able to read the FAQ, you must first register with the Support Ticket System. Use your informatik.uni-freiburg.de e-mail address.

Registration page: https://osticket.informatik.uni-freiburg.de/account.php?do=create

Link to the cluster's FAQ page: https://osticket.informatik.uni-freiburg.de/kb/faq.php?cid=1

In addition, there is the old help page of the Machine Learning chair. You can access it once you have access to the GPU cluster; log in with your TF account.

Link to the help page: https://aadwiki.informatik.uni-freiburg.de/Meta_Slurm

If you have problems accessing the cluster, send an email to Frank, with your supervisor and Matthias in Cc. For any other issues, ask your supervisor and Matthias.

== Logging in to the cluster ==

=== University network ===
To be able to log in to the cluster, you must be in the university's network. We recommend using the university's VPN. See the various information pages:
* https://www.rz.uni-freiburg.de/inhalt/dokumente/pdfs/anleitungen/installation-openconnect-vpn-ubuntu/ (Ubuntu only)
* http://mopoinfo.vpn.uni-freiburg.de/node
* https://www.rz.uni-freiburg.de/services-en/netztel-en/vpn/vpn-einleitung-en?set_language=en
* https://wiki.uni-freiburg.de/rz/doku.php?id=vpn

Alternatively, you can access the university's network by logging in to login.informatik.uni-freiburg.de via SSH:
{{{
ssh <username>@login.informatik.uni-freiburg.de
}}}

=== Cluster login ===
There are three login nodes:
* kislogin1.rz.ki.privat
* kislogin2.rz.ki.privat
* kislogin3.rz.ki.privat

Log in via SSH:
{{{
ssh <username>@kislogin1.rz.ki.privat
}}}

== Status information ==
Use the command {{{sfree}}} to get status information for the cluster partitions. It shows all accessible partitions and the number of used and total GPUs per partition.

You can also watch the partitions and your jobs in the dashboard: https://kislurm-dashboard.informatik.uni-freiburg.de/d/spTRj8IMz/kislurm2?orgId=1

== Workspaces ==
You automatically have access to your home directory on the cluster. Since the size of your home directory is limited to a few GB, we recommend using a workspace for your project. A workspace is a directory that can be accessed from all nodes of the cluster.

=== Creating a workspace ===
Use the command {{{ws_allocate}}} to create a new workspace. For help, type {{{man ws_allocate}}}.

Example:
{{{
ws_allocate -r 10 -m <username>@informatik.uni-freiburg.de test-workspace 30
}}}
This command creates a workspace test-workspace, which expires in 30 days. 10 days before the expiration, a notification is sent to the specified email address.

=== Finding your workspace ===
To list the paths of your workspaces, type {{{ws_list}}}.

=== Extending a workspace ===
When a workspace expires, all its content is lost. To extend a workspace, use the command:
{{{
ws_allocate -x <workspace_name> <days>
}}}

== Running jobs ==

=== Interactive session ===
Use the following command to start an interactive session with 1 GPU (by default):
{{{
srun -p alldlc_gpu-rtx2080 --pty bash
}}}
To check if you have access to the GPU, run:
{{{
python3 -c "import torch; print(torch.cuda.is_available())"
}}}
The result should be {{{True}}}.
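If the default allocation is not enough, you can pass additional options to {{{srun}}}. The following is only a sketch using standard Slurm flags ({{{--gres}}}, {{{--cpus-per-task}}}, {{{--time}}}); the concrete resource values and the limits enforced on this partition are assumptions, not documented on this page:
{{{
# Request 2 GPUs, 8 CPU cores and a 2-hour time limit for the interactive session
srun -p alldlc_gpu-rtx2080 --gres=gpu:2 --cpus-per-task=8 --time=02:00:00 --pty bash
}}}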
=== Submitting jobs ===
Write a bash file containing all instructions for your job (a sketch of such a file is given at the end of this page), and then run:
{{{
sbatch -p alldlc_gpu-rtx2080 <your_script.sh>
}}}
The output of your job will be written to a file {{{slurm-<jobid>.out}}} in your current directory.

To get the status of your jobs, run:
{{{
sacct --user=$USER
}}}

To list all your running jobs, run:
{{{
squeue --user=$USER
}}}

== Accessing GitHub ==
To access GitHub via SSH, add the following lines to the file {{{~/.ssh/config}}}:
{{{
Host github.com
    ProxyCommand ssh -q login.informatik.uni-freiburg.de nc %h %p
}}}
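With this entry in place, Git commands that reach GitHub over SSH are tunnelled through the login node. A usage sketch, assuming your SSH key is already registered with your GitHub account ({{{<user>/<repo>}}} is a placeholder):
{{{
git clone git@github.com:<user>/<repo>.git
}}}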
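As mentioned in the Submitting jobs section above, jobs are submitted as a bash file. The following is a minimal sketch of such a file: the partition name and the {{{slurm-<jobid>.out}}} output pattern come from this page, while the resource values, job name, and paths are placeholders you need to adapt:
{{{
#!/bin/bash
#SBATCH -p alldlc_gpu-rtx2080      # partition (see sfree for accessible partitions)
#SBATCH --gres=gpu:1               # request 1 GPU
#SBATCH --time=04:00:00            # assumed time limit; adjust to your job
#SBATCH --job-name=my-experiment   # placeholder job name
#SBATCH -o slurm-%j.out            # output file; %j is replaced by the job ID

# Placeholder paths: replace with your own workspace and script
cd /path/to/your/workspace
python3 train.py
}}}
Save the file (e.g. as your_script.sh) and submit it with {{{sbatch}}} as shown in the Submitting jobs section.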