HPC AND SIMULATIONS
This covers the transition from 64-bit computing on Solaris to Linux, Lustre administration, and code development.
Part of this work was done under the Slovenian Agency for Research and Development project J1-6545: "High performance computing algorithms in theoretical physics" (2004-2007).
With processor consolidation (x86_64), porting has become simpler, but optimization remains important for efficiency (e.g. keeping the FPU pipelines busy).
- Reinventing 64-bit computing
The transition from 64-bit Solaris on Sparc to Solaris and 64-bit Linux on x86_64 met with some obstacles for large-memory programs:
- On x86_64, only the Intel ifort compiler supports all HPC features, such as quadruple precision arithmetic.
- As of 2007, GNU gfortran was just beginning to become reliable.
- In 2007 there was no NAG Library build for ifort, so the linking steps had to be tested.
- Crossing the 2 GB memory boundary was not trivial. Some programs had to be recompiled dynamically; otherwise they were limited to about 5 GB program size on RHEL4/x86_64. This was resolved with the help of I. Reid from NAG. Although there is no longer a limit on code size, the resident size may still be limited to 5 GB, resulting in paging if the code is very non-local.
- The low efficiency of linear algebra on i386/x86_64 systems has improved in recent years, as ATLAS has approached 83% CPU efficiency. A related issue on x86_64 was resolved with the help of R. van der Pas from the Sun Microsystems compiler group.
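The >2 GB fix described above usually comes down to the compiler's memory model and linking mode. The flags below are a sketch taken from the GCC and ifort manuals, not a record of the exact invocation used at the time:

```
# Allow static data larger than 2 GB on x86_64 ("medium" memory model);
# -shared-intel links the Intel runtime libraries dynamically.
gcc   -mcmodel=medium -O2 bigarray.c -o bigarray
ifort -mcmodel=medium -shared-intel prog.f90 -o prog
```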
- Code development
Code that runs for weeks should be optimized.
- For example, my CFHHM code for the precise solution of the three-body problem in atomic physics still holds a record of up to 76 percent CPU-cycle usage on (pipelined) FPUs.
- Well-written programs manage 10-30 percent CPU efficiency, but too many fall below even that.
- It is becoming difficult to get absolute measurements, because hardware operation counters have not been generally available on recent processors.
Recent workstations have 3D graphics cards (e.g. the Quadro FX)
providing hardware acceleration.
- By hacking the source of the ATI Linux graphics driver, I managed to
enable the DVI video port on a SuSE 9 Linux PC (this works out of the
box in SuSE 10).
Development of a simulation interface for damaged aircraft led to an optimized simulation engine that runs at up to about 1000 FPS (Core i7), making maximum use of both CPU and GPU.
Stutter in command-input processing on long streams prompted a request to the developers of the Unix program Geomview to clean it up for recent Linux kernels.
An updated Geomview was made available in the Fedora Core 5 Linux distribution in 2006 (its package maintainer is also the KDE project leader). Geomview is now being used by our Biophysics group on RHEL and other systems.
- System administration
System administration covers the management of our Unix/Linux systems.
In 2008, I managed the installation of our first Lustre cluster.
The focus is on keeping computational servers up and running, making sure a consistent set of compilers and numerical libraries is available, and helping user applications run optimally.
Software includes NAG, Mathematica, and Matlab.
Unification of the user environment: the same software is installed across different operating systems:
- Some workstations (Sparc and x86_64) run Solaris; GNU software is installed from repositories.
- Other workstations (x86_64) run Red Hat Enterprise WS Linux.
Unification of the server environment:
- Most servers and workstations are incorporated into a heterogeneous grid, with data bandwidth ranging from 64 Gbps between 16 SMP CPUs, through 4 Gbps on InfiniBand, down to 1 Gbps Ethernet. The grid is based on the Lustre file system and Sun Grid Engine.
- The department owns computational servers (record uptime of 620 days) and large-memory SMP servers.
- There is a SunRay installation
for about 13 users.
We also run a local DNS server.
Information for users is available on the
grid system help pages.
- Some computational links
- NOTE: don't compile LAPACK routines from source files; use an optimized library such as ATLAS instead.
- ATLAS numerical library
- NAG Library manuals (NAG site; for local manuals see the info pages). The local NAG installation calls ATLAS routines where available.
- LAPACK Utilities Library (source files)
- BLAS 1, 2, 3
- BLAS, massively parallel (source files)