LOEWE-CSC

The LOEWE-CSC is a general-purpose compute cluster built from several CPU and GPU architectures. The system was installed at Industriepark Höchst in late 2010 and has been continually upgraded since then.


Cluster performance:

  • CPU performance (dp = double precision): 226 TFlop/s (peak)
  • GPU performance (dp): 597 TFlop/s (peak)
  • Cluster HPL performance (2010): 299.3 TFlop/s
  • Green500 energy efficiency (2010): 740.78 MFlop/s/Watt
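Two standard formulas connect these figures: theoretical peak is node count × cores per node × clock rate × double-precision flops per cycle, and the Green500 number is HPL performance divided by power draw. A minimal sketch for the Opteron partition of the node table below; the clock rate and flops-per-cycle values are assumptions about the Opteron 6172 and are not stated on this page:

  # Back-of-the-envelope checks for the headline numbers above.
  # The clock (2.1 GHz) and 4 DP flops/cycle (SSE add + multiply
  # pipelines) are assumed values for the Opteron 6172.

  nodes          = 438      # Opteron partition (see node table below)
  cores_per_node = 24
  clock_hz       = 2.1e9    # assumed
  dp_flops_cycle = 4        # assumed: 2 adds + 2 multiplies per cycle

  peak = nodes * cores_per_node * clock_hz * dp_flops_cycle
  print(f"Opteron partition peak: {peak / 1e12:.1f} TFlop/s")  # ~88.3

  # Green500 efficiency = HPL performance / power draw, so the 2010
  # figures imply roughly 404 kW drawn during the measured run.
  hpl_flops  = 299.3e12     # HPL result (from above)
  efficiency = 740.78e6     # 740.78 MFlop/s/W, in Flop/s per Watt
  print(f"Implied power: {hpl_flops / efficiency / 1e3:.0f} kW")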

Hardware:

  • 825 compute nodes in 36 water-cooled racks, spread across two server rooms,
  • thousands of CPU cores and hundreds of GPGPU accelerators (see the table below),
  • 64–128 GB of RAM per node and over 2 PB of aggregated disk capacity,
  • QDR and FDR InfiniBand interconnects,
  • a parallel scratch filesystem with a capacity of 764 TB and an aggregated bandwidth of 10 GB/s.
Nodes  CPU                       Cores per CPU  Cores per Node  HT per Node  RAM per Node  GPU
438    2x AMD Opteron 6172       12             24              -            64 GB         AMD HD 5800 (1 GB)
198    2x Intel Xeon E5-2670 v2  10             20              40           128 GB        n/a
139    2x Intel Xeon E5-2640 v4  10             20              40           128 GB        n/a
50     2x Intel Xeon E5-2630 v2  6              12              24           128 GB        2x AMD S10000 (12 GB)

(HT = Intel Hyper-Threads per node)
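The aggregate counts behind the "thousands of cores, hundreds of accelerators" summary can be read straight off this table. A quick sketch; all numbers are copied from the table, except that one HD 5800 per Opteron node is assumed, since the table does not give a count for that GPU:

  # Aggregate node, core, and GPU counts from the table above.
  # Each tuple: (node_count, cores_per_node, gpus_per_node)
  partitions = [
      (438, 24, 1),  # 2x Opteron 6172; one HD 5800 each (assumed count)
      (198, 20, 0),  # 2x Xeon E5-2670 v2
      (139, 20, 0),  # 2x Xeon E5-2640 v4
      (50,  12, 2),  # 2x Xeon E5-2630 v2; two S10000 each
  ]

  total_nodes = sum(n for n, _, _ in partitions)      # 825
  total_cores = sum(n * c for n, c, _ in partitions)  # 17,852
  total_gpus  = sum(n * g for n, _, g in partitions)  # 538
  print(total_nodes, total_cores, total_gpus)

The node total recovers the 825 compute nodes quoted above, which is a useful consistency check on the table.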



InfiniBand Network

Most of the compute nodes (those in the main server room) are connected through a mixed 4X QDR/FDR InfiniBand 2:1 blocking fat-tree network (FDR at the spine level, FDR and QDR at the edge level). The second, smaller server room houses two non-blocking InfiniBand “islands” (one per cabinet), one running at 4X QDR and the other at 4X FDR. The two rooms are linked by 20 Gb/s Ethernet. Compute jobs are scheduled so that allocations are never mixed: a job runs either entirely within the fat-tree or entirely on a single island.
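The placement rule is easy to state as an algorithm: every node belongs to exactly one topology domain (the fat-tree or one of the islands), and a job may only be allocated nodes from a single domain. A minimal sketch of that constraint; the domain names and free-node counts are illustrative, not the cluster's actual scheduler configuration:

  # Topology-aware placement sketch: a job must fit entirely inside one
  # InfiniBand domain (the fat-tree or a single island). Domain names
  # and free-node counts below are made up for illustration.

  def place_job(requested_nodes, free_nodes_per_domain):
      """Return a domain that can host the whole job, or None."""
      for domain, free in free_nodes_per_domain.items():
          if free >= requested_nodes:
              return domain        # never split a job across domains
      return None

  free = {"fat-tree": 610, "island-qdr": 18, "island-fdr": 25}
  print(place_job(20, free))   # -> "fat-tree" (first domain with room)
  print(place_job(700, free))  # -> None (no single domain is large enough)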
