Slurm GPU Resources (Kebnekaise)

Slurm GPU Resources (Kebnekaise)

We have two types of GPU cards available on Kebnekaise, NVIDIA Tesla K80 (Kepler) and NVIDIA Tesla V100 (Volta).

To request GPU resources one has to include a GRES in the submit file. The general format is:

#SBATCH --gres=gpu:<type-of-card>:x

where <type-of-card> is either k80 or v100 and x = 1, 2, or 4 (4 only for the K80 type).

The K80 enabled nodes contain either two or four K80 cards, each K80 card contains two gpu engines.

The V100 enabled nodes contain two V100 cards each.

On the dual card nodes one can request either a single card (x = 1) or both (x = 2). For each requested card, a whole CPU socket (14 cores) is also dedicated on the same node. Each card is connected to the PCI-express bus of the corresponding CPU socket.

When requesting two or four K80 cards, one must also use "--exclusive" (this does not apply when requesting V100 cards).

One can activate Nvidia Multi Process Service (MPS), if so required, by using:

#SBATCH --gres=gpu:k80:x,xmps

If the code that is going to run on the allocated resources expects the gpus to be in exclusive mode (default is shared), this can be selected with "gpuexcl", like this:

#SBATCH --gres=gpu:v100:x,gpuexcl

 

Updated: 2020-07-14, 21:10