SGI F77 COMPILER BASIC TEST                                                            99/02/01


OBJECTIVE

To measure the speed of a simple vector/parallel REAL*8 program compiled
with basic compiler optimization options.


DESCRIPTION

Machine:            uranus.ijs.si (Indy R5000 SC/150, IP22: 1 x 150 MHz, 64 MB RAM).
Program:            Matrix multiplication using nested DO loops.
Matrix dimensions:  ND = 800 (reserved array size).
Timing:             dtime (tempd.f)

Program details:

P25: I-loop inside (vector style), no directives:

      PROGRAM P25
      ...
      IMPLICIT REAL*8 (A-H,O-Z)
      PARAMETER (ND = 800, NIN = 5, NOUT = 6)
      DIMENSION A(ND,ND), B(ND,ND), C(ND,ND)
      ...
      DO 26 J = 1,N
        DO 24 K = 1,N
          DO 22 I = 1,N
            C(I,J) = C(I,J) + A(I,K) * B(K,J)
   22     CONTINUE
   24   CONTINUE
   26 CONTINUE
      ...


RESULTS

Table I. Additional optimization options. ND = 400, N = 400 (128 M operations).
-----------------------------------------------------------------------------------------------
Program  Compiler call                                                     CPU time (s)  MFLOPS
-----------------------------------------------------------------------------------------------
p25      f77 -O3                                                           8             16
             -O3 -Ofast=ip22                                               2             64
             -O3 -r5000 -mips3 -n32                                        2.3           56
             -O3 -r5000 -mips4 -n32                                        1.5           85
             -O3 -r5000 -mips4 -n32 -LNO:ou=4                              1.5           85
             -O3 -r5000 -mips4 -n32 -LNO:ou=6                              1.5           85
-----------------------------------------------------------------------------------------------

Table II. As Table I, but ND = 800, N = 800 (1024 M operations).
-----------------------------------------------------------------------------------------------
Program  Compiler call                                                     CPU time (s)  MFLOPS
-----------------------------------------------------------------------------------------------
p25      f77 -O3                                                           71            14
             -O3 -Ofast=ip22                                               15            68
             -Ofast=ip22                                                   15            68
             -Ofast=ip22 -LNO:ou=6:cs1=32k:cs2=512k -TENV:X=4              16            64
             -O3 -r5000 -mips3 -n32                                        21            49
             -O3 -r5000 -mips4 -n32                                        17            60
             -O3 -r5000 -mips4 -n32 -LNO:ou=4                              14            73
                                    -LNO:ou=6                              13            79
                                    -LNO:ou=6 -TENV:X=4                    13            79
                                    -LNO:ou=6:cs1=32k:cs2=512k -TENV:X=4   13            79
                                    -LNO:ou=8                              16            64
-----------------------------------------------------------------------------------------------

Table III. As Table II, but p25v.f, calling DGEMM.
-----------------------------------------------------------------------------------------------
Program  Compiler call                                                     CPU time (s)  MFLOPS
-----------------------------------------------------------------------------------------------
p25v
-----------------------------------------------------------------------------------------------
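
The MFLOPS column in Tables I and II is consistent with counting 2*N**3
floating-point operations for the P25 loop nest (one multiply and one add per
innermost iteration) and dividing by the CPU time in seconds: 2*400**3 = 128 M
and 2*800**3 = 1024 M, which matches the operation counts in the table
captions. The following is a minimal sketch of that arithmetic and is not part
of the original test; the program name MFCHK and the function RMFLOP are
hypothetical, and M is assumed to mean 10**6.

```fortran
      PROGRAM MFCHK
C     Sketch only: recompute two MFLOPS entries from N and the CPU time.
C     MFCHK and RMFLOP are hypothetical names, not taken from p25.f.
      DOUBLE PRECISION RMFLOP
      EXTERNAL RMFLOP
C     Table I, f77 -O3: N = 400, CPU time 8 s
      WRITE (6,100) 400, 8.0D0, RMFLOP(400, 8.0D0)
C     Table II, f77 -O3: N = 800, CPU time 71 s
      WRITE (6,100) 800, 71.0D0, RMFLOP(800, 71.0D0)
  100 FORMAT (' N =', I5, '  CPU =', F6.1, ' s  MFLOPS =', F6.1)
      END

      DOUBLE PRECISION FUNCTION RMFLOP(N, T)
C     Assumes 2*N**3 operations (multiply + add per inner iteration)
C     and M = 10**6; T is the CPU time in seconds.
      INTEGER N
      DOUBLE PRECISION T
      RMFLOP = 2.0D0 * DBLE(N)**3 / (T * 1.0D6)
      END
```

Rounded to the nearest integer, the two values agree with the tabulated 16
and 14 MFLOPS. The same function can be applied to any other row by passing
that row's N and CPU time.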