<note important>
You may receive a warning from the system that something is wrong with the security. We switched the old LOEWE Cluster IP to our new GOETHE Cluster, so if you used the LOEWE Cluster in the past you will see such a warning when you connect.
If you use Linux, just look up ''
On Windows systems please use/install a Windows SSH client (e.g. PuTTY, or the Cygwin ssh package).
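The file referred to above is presumably OpenSSH's ''known_hosts'' file, which caches the host keys of systems you have connected to. A minimal sketch for removing the stale entry on Linux, with the hostname left as a placeholder (the actual login address is not given on this page):

<code bash>
# Remove the cached host key of the old address from ~/.ssh/known_hosts.
# Replace <goethe-login-host> with the actual login hostname or IP.
ssh-keygen -R <goethe-login-host>

# On the next login the new host key is offered again; verify and accept it.
ssh <your-account>@<goethe-login-host>
</code>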
On our systems, compute jobs and resources are managed by SLURM (Simple Linux Utility for Resource Management). The compute nodes are organized in partitions (or queues):
^Partition^Node type^GPU^Implemented^
| ''
| ''
| ''
| ''
Nodes are used **exclusively**, i.e. a node is never shared between different jobs.
The following instructions shall provide you with the basic information you need to get started with SLURM on our systems. However, the official SLURM documentation covers some more use cases (also in more detail). Please read the SLURM man pages (e.g. ''
Helpful SLURM links: [[https://
SLURM documentation:
==== The test Partition: Your First Job Script ====
==== Job Monitoring ====
For job monitoring (to check the current state of your jobs) you can use the ''
If you need to cancel a job, you can use the ''
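The standard SLURM commands for these tasks are ''squeue'' (job state) and ''scancel'' (job cancellation); the short sketch below assumes these are the commands meant here:

<code bash>
# Show the current state of all your own jobs (pending, running, ...)
squeue -u $USER

# Show full details of a single job, e.g. job ID 123456
scontrol show job 123456

# Cancel a job by its job ID
scancel 123456
</code>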
==== Node Types And Constraints ====
+ | |||
+ | <note important> | ||
On Goethe-HLR **four different types** of compute nodes are available:
^Number^Type^Vendor^CPU^GPU^Cores per CPU^Cores per Node^Hyper-Threads per Node^RAM [GB]^
|412|dual-socket|Intel|Xeon Skylake Gold 6148 |none|20|40|80|192|
|72 |dual-socket|Intel|Xeon Skylake Gold 6148 |none|20|40|80|772|
|139|dual-socket|Intel|Xeon Broadwell E5-2640 v4|none|10|20|40|128|
|112|dual-socket|AMD |EPYC 7452 |8x MI50 \\ 16GB|32|64|128|512|
In order to separate the node types, we employ the concept of partitions. We provide three partitions
|general1|''#
|general2|''#
|gpu|''#
|test|''#
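Assuming the truncated ''#SBATCH'' entries in the table follow SLURM's standard partition syntax, a job script would select one of the partitions roughly like this (the partition names are taken from the table above, the rest is a generic sketch):

<code bash>
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --partition=general1   # or general2, gpu, test (see table above)
#SBATCH --nodes=1
#SBATCH --time=01:00:00

srun ./my_program
</code>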
| ''
For the partition ''

^Limit^Value^Description^
| ''
| ''
| ''
| ''
==== GPU Jobs ====
Since December 2020, GPU nodes are part of the cluster. If you want to use GPUs in your calculations, select the ''gpu'' partition.
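A minimal sketch of a GPU job script, assuming SLURM's standard ''--partition'' and ''--gres'' syntax (the GRES name, GPU count and module name below are assumptions, not site-specific values from this page):

<code bash>
#!/bin/bash
#SBATCH --partition=gpu          # GPU partition (see partition table above)
#SBATCH --nodes=1
#SBATCH --gres=gpu:1             # request one GPU; the GRES name/count syntax is an assumption
#SBATCH --time=01:00:00

# Hypothetical module name; check 'module avail' for the actual ROCm/HIP
# environment provided for the AMD MI50 cards.
module load rocm

srun ./my_gpu_program
</code>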
==== Hyper-Threading ====
On compute nodes you can use Hyper-Threading. That means, in addition to each physical CPU core a virtual core is available. SLURM identifies all physical and virtual cores of a node, so that you have 80 logical CPU cores on an Intel Skylake node, 40 logical CPU cores on an Intel Broadwell or Ivy Bridge node, and 128 logical CPU cores on an AMD EPYC GPU node. If you don't want to use HT, you can disable it by adding
^Node type^hyperthreading=OFF^#
|Skylake |''#
|Broadwell / Ivy Bridge|''#
|AMD EPYC 7452 |
to your job script. Then you'll get half the threads per node (which will correspond to the number of physical cores). This can be beneficial in some cases (some jobs may run faster and/or more stably).
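One standard SLURM way to avoid the virtual cores is the ''--hint=nomultithread'' option combined with one task per physical core; the sketch below uses that approach as an assumption and may differ from the cluster's own recommended ''#SBATCH'' lines in the table above:

<code bash>
#!/bin/bash
#SBATCH --partition=general1      # assumed Skylake partition: 40 physical cores, 80 hyper-threads
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40      # one task per physical core
#SBATCH --hint=nomultithread      # ask SLURM not to place tasks on the virtual (HT) cores

srun ./my_program
</code>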