Quantum Monte Carlo timings

From: tables sent by D. Veberic 99/01/21, 99/05/31.
Additions by R. Krivec: the RC, RF columns.
Updated 99/06/01

Summary

(1) qmc performance is most strongly correlated with the THEORETICAL
    MFLOPS ratings, except on SGI Origin 2000 and Pentium II, where it
    is about a factor of three faster than expected from this
    correlation. (This probably applies to Sparc also.)

(2) The qmc program in this test uses very little memory.

Table I. qmc (D. Veberic). RC is 10**5 divided by the product of the
CPU time [s] and the clock frequency [MHz]. RF is 10**5 divided by the
product of the CPU time [s] and the theoretical MFLOPS rating (which is
clock * number of FP pipelines).

------------------------------------------------------------------
                             time  rsize  name       RC    RF
                              [s]   [MB]
------------------------------------------------------------------
Intel P2 400MHz                73         dexter    3.4   3.4
SGI Origin 2000 250MHz         74   1.59  test8     5.4   2.7  *1)
Alpha SX 533MHz                90   1.56  brenta    2.1   1.1
SGI PCL 75-90MHz *2)
  KCC +K3                     251                   4.8   1.2
  CC -mips3 -n32 -O3          288   1.53  saturn    4.2   1.1
  CC -r8000 -mips4 -64 -O3    295                   4.1   1.0
  CC -Ofast=ip21              304                   4.0   1.0
Sun E4000 336 MHz *3)
  KCC +K3                     115                   2.6   1.3
  CC -fast                    122         dune      2.4   1.2
------------------------------------------------------------------
Intel P2 400MHz                81         f9pc30    3.1   3.1
SGI Indy 200MHz               240   1.02  tethys    2.1   1.1  *4)
SGI O2 180MHz                 249   1.05  calypso   2.2   1.1
SGI O2 195MHz                 255         iapetus   2.2   1.1
SGI Indy 180MHz               306   1.05  dione     1.8   0.9
SGI Indy 150MHz               317         uranus    2.1   1.1
SGI Indy 133MHz               372         atlas     2.2   1.1
SGI Indy 150MHz               400   1.05  mimas     1.7   0.8
HP B132 133MHz                456         deimos    1.6     ?
SGI Indy 133MHz               481   1.05  janus     1.6   0.8
SGI Indy 100MHz               523   1.02  rhea      1.9   0.9
Intel P 100MHz                813         burana    1.2   1.2
HP 712                       1322         callisto  1.3   1.3
HP 710                       1671   0.39  leda      1.2   1.2
------------------------------------------------------------------

*1) Two-way FP pipeline. See also *2).
*2) Four-way FP pipeline. NOTE: The timing very probably used bad
    compiler options, because theoretically SGI Origin 2000 is only
    about 50% faster than Power Challenge (see *1)).
    Effective clock of 83 MHz used for RC, RF because of lack of
    information on which processor the jobs were run.
*3) Two-way FP pipeline. NOTE: FPU may not have functioned properly.
*4) Assuming a two-way pipeline for R4xxx, R5xxx. This is a simplistic
    assumption, conditionally valid for R5000.
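
Worked example (illustrative only, not part of the original tables).
The RC and RF definitions from the caption of Table I can be checked
with a short Python sketch. The function names rc and rf are made up
for this note. The pipeline count of 1 for the Pentium II is an
assumption inferred from RC and RF being equal in the table; the other
pipeline counts come from footnotes *1) to *3), and the 83 MHz clock
for the SGI PCL is the effective clock stated in *2).

```python
def rc(time_s, clock_mhz):
    """RC: 10**5 divided by (CPU time [s] * clock frequency [MHz])."""
    return 1e5 / (time_s * clock_mhz)


def rf(time_s, clock_mhz, fp_pipelines):
    """RF: 10**5 divided by (CPU time [s] * theoretical MFLOPS),
    where theoretical MFLOPS = clock [MHz] * number of FP pipelines."""
    return 1e5 / (time_s * clock_mhz * fp_pipelines)


# (label, time [s], clock [MHz], FP pipelines); times and clocks from Table I.
# The pipeline count for the Pentium II is an assumption (RC == RF in the table).
rows = [
    ("Intel P2 400MHz (dexter)", 73, 400, 1),
    ("SGI Origin 2000 250MHz (test8)", 74, 250, 2),
    ("SGI PCL, KCC +K3 (83 MHz eff.)", 251, 83, 4),
    ("Sun E4000 336 MHz, KCC +K3", 115, 336, 2),
]

for label, t, clock, pipes in rows:
    print(f"{label:32s} RC={rc(t, clock):.1f}  RF={rf(t, clock, pipes):.1f}")
```

For these four rows the sketch gives the RC and RF values listed in
Table I. Most machines in the table have RF close to 1.1, while the
Pentium II (3.4) and the Origin 2000 (2.7) stand out, which is the
exception noted in Summary point (1).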