HPC AND SIMULATIONS
Some fireplace stories.
Part of the work was done under the
Slovenian Agency for Research and Development grant
J1-6545: "High performance computing algorithms in theoretical physics" (2004 - 2007).
With processor consolidation (x86_64) and better LAPACK compilations, CPU efficiency
has seemingly moved out of view, but it was crucial in the 2000's (e.g. FPU pipeline utilization).
- For example, my CFHHM code for the precise
solution of the three-body problem in atomic physics still exhibits a record usage
of up to 76 percent of CPU cycles on pipelined processors.
- Well-written programs manage 10 - 30 percent CPU efficiency, but too many are below even that.
- It is becoming difficult to get absolute measurements because
hardware operation counters have not been consistently available on more recent processors.
- Reinventing 64-bit computing
The transition from
64-bit Solaris on Sparc
to Solaris and 64-bit Linux on x86_64 was nontrivial for
large-memory programs.
- On x86_64 in the 2000's, only the Intel ifort compiler supported
all HPC features, such as quadruple-precision arithmetic (sketched below).
- As of 2007, GNU gfortran was just beginning to become reliable.
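To make the quadruple-precision point concrete, here is a minimal Fortran sketch (illustrative only, not taken from any of the codes above); selected_real_kind asks for at least 30 significant digits, which maps to the 128-bit type where the compiler provides one, as ifort did then and gfortran only in later releases:

    ! Illustrative sketch: a portable request for quadruple precision.
    program quad_demo
      implicit none
      ! at least 30 decimal digits -> 128-bit real where supported
      integer, parameter :: qp = selected_real_kind(p=30)
      real(qp) :: x
      x = sqrt(2.0_qp)
      print *, 'kind    = ', qp
      print *, 'epsilon = ', epsilon(x)   ! ~1.9e-34 for a true quad type
      print *, 'sqrt(2) = ', x
    end program quad_demo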
- In 2007 there was no NAG Library for ifort.
Crossing the 2 GB memory boundary was not trivial: some programs had to be
recompiled with dynamic linking, otherwise they were limited to about 5 GB program size on
RHEL4/x86_64 (a compile sketch follows below). This was resolved with the help of
I. Reid from NAG. Although there were then no limits on the code size,
the resident size may still have been limited to 5 GB, resulting in wild paging
if the code was very non-local.
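As an illustration of where the 2 GB boundary bites, here is a sketch under my own assumptions (not the actual programs involved): any program whose static data exceeds 2 GB needs the medium memory model on x86_64, and with ifort the runtime libraries then have to be linked dynamically, which is presumably what the "recompiled dynamically" remark refers to.

    ! Illustrative sketch: static data well past the 2 GB boundary.
    ! With the default (small) memory model the link typically fails
    ! ("relocation truncated to fit"); the usual remedy is
    !   ifort    -mcmodel=medium -shared-intel big.f90
    !   gfortran -mcmodel=medium big.f90
    program big
      implicit none
      real(8), save :: a(400000000)   ! about 3.2 GB of static storage
      a = 1.0d0
      print *, 'sum = ', sum(a)
    end program big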
- The low efficiency of linear algebra on i386/x64 systems has improved in recent years,
as ATLAS has approached 83% CPU efficiency (see the BLAS sketch below); an earlier
efficiency problem on x86_64 was resolved with the help of
R. van der Pas from the
Sun Microsystems Compiler group.
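Most of that linear-algebra efficiency is decided inside a handful of BLAS kernels rather than in user code; the sketch below (generic, with an arbitrary matrix size) shows the kind of call whose achieved fraction of peak depends almost entirely on which tuned library (reference BLAS, ATLAS, MKL, or the Sun performance library) is linked in at build time.

    ! Illustrative sketch: C = A*B through the BLAS routine DGEMM.
    ! The Fortran source is identical whichever BLAS is linked in;
    ! only the library (e.g. ATLAS) determines the achieved efficiency.
    program gemm_demo
      implicit none
      integer, parameter :: n = 2000
      real(8), allocatable :: a(:,:), b(:,:), c(:,:)
      allocate(a(n,n), b(n,n), c(n,n))
      call random_number(a)
      call random_number(b)
      call dgemm('N', 'N', n, n, n, 1.0d0, a, n, b, n, 0.0d0, c, n)
      print *, 'c(1,1) = ', c(1,1)
    end program gemm_demo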
- Code development
Some free/open source stories.
- By hacking the source of the ATI Linux graphics driver, I managed to
enable the DVI port on a SuSE 9 PC (this worked out of the box in SuSE 10).
Development of a system for the simulation of damaged aircraft led to a
simulation interface which, with the optimized simulation engine, runs at
up to about 1000 FPS on a Core i7, making maximum use of both the CPU and the GPU.
Stutter in command input processing on long streams prompted
me to ask the developers of the Unix program Geomview
to clean it up for recent Linux kernels.
After my testing, the updated Geomview was made available in the
Fedora Core 5 Linux distribution
in 2006 (the maintainer is the KDE project leader).
Geomview is now being used by our
Biophysics group on RHEL and other Linux distributions.
- System administration
After 2008, I managed the first
Lustre cluster in Southern Europe.
- Disks appear as a single enormous partition from any node, no user setup needed.
- It had two metadata servers in a failover configuration and several multi-CPU,
large-memory servers, which were assigned automatically without users having
to do anything, except perhaps adding a few lines of OpenMP, much simpler than MPI (see the sketch below).
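To illustrate the "few lines of OpenMP" remark, a generic sketch (not code from the cluster): a single directive spreads the loop over all cores of such a shared-memory node, with none of the explicit data distribution MPI would require.

    ! Illustrative sketch: shared-memory parallelism in one directive.
    ! Build e.g. with  gfortran -fopenmp  or  ifort -qopenmp.
    program omp_demo
      use omp_lib
      implicit none
      integer, parameter :: n = 10000000
      integer :: i
      real(8) :: s
      real(8), allocatable :: x(:)
      allocate(x(n))
      call random_number(x)
      s = 0.0d0
    !$omp parallel do reduction(+:s)
      do i = 1, n
         s = s + x(i)*x(i)
      end do
    !$omp end parallel do
      print *, omp_get_max_threads(), ' threads, sum of squares =', s
    end program omp_demo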
- Lustre has since become the most widely used distributed data system in HPC.