Performance of the Beta Version on x86

We compared the performance of the Auto-Parallelizer beta version against two compilers: icc 11.0.074, one of the most efficient compilers for the x86 platform, and gcc 4.3.1, the most widely used compiler. The comparison was done on 6 benchmarks from SPEC/CPU2006 and on 6 benchmarks from NAS Parallel Benchmarks 3.3. The following host was used for the comparison:
2 x Intel Xeon X5365 quad-core processors @ 3 GHz with 8 GB of memory

Measurements were performed on hosts provided by the Joint Supercomputer Center.

Compilation flags:

icc             icc -O2 -ipo -no-prec-div
icc + parallel  icc -O2 -parallel -ipo -no-prec-div
gcc             gcc -O2
utl             [see below]
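
As an illustration only, the fixed configurations in the flags table could be kept as data in a small measurement script. The sketch below is not part of the original setup: the names CONFIGS and build_command, the source file name and the output name are hypothetical, and the utl entry is left out because its options differ per benchmark.

```python
# Compiler configurations copied from the flags table above.
# "utl" is omitted here: its options are listed per benchmark further down.
CONFIGS = {
    "icc": ["icc", "-O2", "-ipo", "-no-prec-div"],
    "icc + parallel": ["icc", "-O2", "-parallel", "-ipo", "-no-prec-div"],
    "gcc": ["gcc", "-O2"],
}


def build_command(config, sources, output="a.out"):
    """Return the argument list for compiling `sources` under `config`.

    `sources` and `output` are placeholders chosen for this sketch.
    """
    return CONFIGS[config] + ["-o", output] + list(sources)


print(" ".join(build_command("gcc", ["bench.c"])))
# prints: gcc -O2 -o a.out bench.c
```

Keeping the flags in one table like this makes it harder for two measurement runs to drift apart in their compiler settings.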

Measurements on SPEC/CPU2006 benchmarks

Utl options used to compile the SPEC/CPU2006 benchmarks:

410.bwaves -Ws,--alias-fortran -Ws,--strict-types
436.cactusADM -Ws,--alias-fortran -Ws,--strict-types (for FORTRAN sources)
437.leslie3d -Ws,--alias-fortran -Ws,--strict-types
459.GemsFDTD -Ws,--inter-module -Ws,--alias-fortran -Ws,--strict-types
462.libquantum -Ws,--inter-module -Ws,--pto-wilson
470.lbm -Ws,--inter-module -Ws,--pto-wilson
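
Every utl option in the list above is passed with a "-Ws," prefix. A minimal sketch of how such per-benchmark option lists might be stored and expanded is given here; the names UTL_OPTIONS and ws_flags are hypothetical, only two of the six benchmarks are shown, and nothing is assumed about the rest of the utl command line.

```python
# Per-benchmark utl options, copied from the list above (two entries shown).
UTL_OPTIONS = {
    "410.bwaves": ["--alias-fortran", "--strict-types"],
    "470.lbm": ["--inter-module", "--pto-wilson"],
}


def ws_flags(benchmark):
    """Prefix each option with "-Ws," in the same form as the listing."""
    return ["-Ws," + option for option in UTL_OPTIONS[benchmark]]


print(" ".join(ws_flags("470.lbm")))
# prints: -Ws,--inter-module -Ws,--pto-wilson
```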

The comparison results are shown below, first as a diagram and then as a table of measurements.

[Diagram: SPEC/CPU2006 performance comparison]
[Table: SPEC/CPU2006 measurement results]

Measurements on NAS Parallel Benchmarks

Utl options used to compile the NAS Parallel Benchmarks:

BT -Ws,--strict-types -Ws,--alias-fortran -Ws,--opt-force -Ws,--inter-module -Ws,--inline -Ws,--localize -Ws,--lowerscope
CG -Ws,--alias-fortran -Ws,--inter-module -Ws,--inline
EP -Ws,--strict-types -Ws,--alias-fortran -Ws,--inter-module -Ws,--inline -Ws,--lowerscope
MG -Ws,--strict-types -Ws,--alias-fortran -Ws,--inter-module -Ws,--inline
SP -Ws,--strict-types -Ws,--alias-fortran -Ws,--inter-module -Ws,--inline -Ws,--localize -Ws,--lowerscope -Ws,--inline
UA -Ws,--strict-types -Ws,--alias-fortran -Ws,--inter-module -Ws,--inline

The comparison results are shown below, first as a diagram and then as a table of measurements.

[Diagram: NAS Parallel Benchmarks performance comparison]
[Table: NAS Parallel Benchmarks measurement results]

* The MG and CG benchmarks were measured with class B input data. This was done to reduce measurement error, because these benchmarks run too quickly on class A data.
The other benchmarks were measured with class A input data.
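
The reasoning behind the larger input class can be sketched with a simple model: if each run carries a roughly fixed absolute timing uncertainty, the relative error shrinks as the run gets longer. The jitter value and the two run times below are hypothetical numbers chosen for illustration, not measurements from this comparison.

```python
def relative_error(runtime_s, jitter_s):
    """Relative error of a timing, assuming a fixed absolute jitter."""
    return jitter_s / runtime_s


JITTER_S = 0.05  # hypothetical absolute timing uncertainty, in seconds

short_run = relative_error(1.0, JITTER_S)   # hypothetical short run (small input)
long_run = relative_error(20.0, JITTER_S)   # hypothetical long run (larger input)

print(f"{short_run:.1%}")  # prints: 5.0%
print(f"{long_run:.2%}")   # prints: 0.25%
```

Under this assumption, a run twenty times longer has a relative error twenty times smaller, which is the effect the switch to class B data aims for.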