
Whetstone results

ict edited this page Jul 10, 2020 · 21 revisions

This page stores various results obtained from running the Whetstone benchmark included in this package. Bold text indicates the best result from a given compiler.

Jump to: Embedded Systems, Laptops and Portables, Desktops/PCs, Workstations (+non-unix)

Embedded Systems

HP t5325

The t5325 is a minuscule low-power thin client unveiled by HP in late 2009, designed around a Marvell Kirkwood 88F6281 system-on-a-chip implementing a Marvell-designed ARMv5TE-compliant "Sheeva" processor core clocked at 1.2 GHz with independent 16 KiB instruction and data caches and a 256 KiB unified secondary cache. Derived from the ARM926EJ-S, the Sheeva core does not feature an on-chip floating-point unit. All tests are performed under the HP "ThinPro" operating system, a lightly customized variant of Debian Lenny, on a system not specifically configured for benchmarking.

With no floating point unit of any kind, the t5325's Kirkwood processor returns abysmal results on Whetstone and similar FP-heavy applications despite its reasonable 1.2 GHz clock frequency and on-die secondary cache, even with maximal optimization.

GCC 4.2.4: 25,000 loops

Options Duration Rating
none 56 seconds 44.6 MWIPS
-O1 51 seconds 49.0 MWIPS
-O2 51 seconds 49.0 MWIPS
-O3 41 seconds 61.0 MWIPS
-O3 -ffast-math 30 seconds 83.3 MWIPS
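
The Rating column in the tables on this page is consistent with dividing the loop count by ten times the duration in seconds. The sketch below checks that relationship against the t5325 figures above. The formula is inferred from the published numbers rather than quoted from the package, and the function name `mwips` is purely illustrative.

```python
def mwips(loops: int, seconds: float) -> float:
    """Rating implied by the tables: loops / (10 * seconds), in MWIPS.

    This relationship is an inference from the published results,
    not a quotation from the benchmark source.
    """
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return loops / (10.0 * seconds)


# (options, duration in seconds, reported MWIPS) for GCC 4.2.4 at 25,000 loops
T5325_ROWS = [
    ("none", 56, 44.6),
    ("-O1", 51, 49.0),
    ("-O2", 51, 49.0),
    ("-O3", 41, 61.0),
    ("-O3 -ffast-math", 30, 83.3),
]

for options, seconds, reported in T5325_ROWS:
    computed = round(mwips(25_000, seconds), 1)
    print(f"{options:16s} {computed:6.1f} MWIPS (reported {reported})")
```

Because the duration is reported in whole seconds, the rating can only take a limited set of values for a given loop count, which is why identical durations always yield identical ratings.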

Laptops and Portables

Panasonic ToughBook U1

The ToughBook U1 is a unique and highly ruggedized UMPC released by Panasonic in 2008 and built around Intel's hyper-threaded "Silverthorne" Atom microprocessor, which features a 32 KiB instruction cache and a 24 KiB data cache, along with a unified 512 KiB secondary cache. The Z520 model used in the U1 has a 1.33 GHz clock frequency. All tests are performed under Windows XP with the Cygwin environment, on a system not specifically configured for benchmarking.

GCC 5.4.0: 100,000 loops

Options Duration Rating
none 31 seconds 322.6 MWIPS
-O1 26 seconds 384.6 MWIPS
-O2 15 seconds 666.7 MWIPS
-O3 14 seconds 714.3 MWIPS
-Ofast 6 seconds 1666.7 MWIPS

GCC 5.4.0: 1,000,000 loops

Options Duration Rating
none 1292 seconds 77.4 MWIPS
-O1 779 seconds 128.4 MWIPS
-O2 686 seconds 145.8 MWIPS
-O3 633 seconds 158.0 MWIPS
-Ofast 281 seconds 355.9 MWIPS

Dell Latitude E6420

The E6420 is a midrange 14-inch business notebook introduced in early 2012. This particular configuration features Intel's dual-core, hyper-threaded Core i5 "Sandy Bridge" microprocessor with 32+32 KiB per-core instruction and data caches, a 256 KiB per-core second-level cache, a 3 MiB shared tertiary cache, a base clock frequency of 2.6 GHz and a maximum frequency of 3.3 GHz. All tests are performed under CentOS 7.5.1804 on a system not specifically configured for benchmarking.

GCC 4.8.5: 500,000 loops

Options Duration Rating
none 17 seconds 2941.2 MWIPS
-O1 13 seconds 3846.2 MWIPS
-O2 10 seconds 5000.0 MWIPS
-O3 11 seconds 4545.5 MWIPS

GCC 4.8.5: 1,000,000 loops

Options Duration Rating
none 48 seconds 2083.3 MWIPS
-O1 39 seconds 2564.1 MWIPS
-O2 35 seconds 2857.1 MWIPS
-O3 35 seconds 2857.1 MWIPS

Note: 100,000 loop results omitted due to possible inaccuracy

Note: -Ofast omitted due to possible over-optimization affecting program output
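
One way to see why short runs are treated as unreliable: durations on this page are reported in whole seconds, so a short run carries a large relative timing uncertainty. The sketch below assumes (this is not stated on this page) that the true elapsed time may differ from the reported value by up to one second in either direction, and that the rating equals loops / (10 × seconds), which matches the tables. `rating_bounds` is a hypothetical helper, not part of the benchmark.

```python
def rating_bounds(loops: int, seconds: int, resolution: float = 1.0):
    """Return (low, high) MWIPS if the true time is within +/- resolution.

    Assumes rating = loops / (10 * seconds), which matches the tables,
    and a one-second timer resolution (an assumption, not a documented fact).
    """
    if seconds <= resolution:
        raise ValueError("duration must exceed the timer resolution")
    low = loops / (10.0 * (seconds + resolution))
    high = loops / (10.0 * (seconds - resolution))
    return low, high


# E6420, GCC 4.8.5, -O2: 10 s at 500,000 loops vs 35 s at 1,000,000 loops
for loops, seconds in [(500_000, 10), (1_000_000, 35)]:
    low, high = rating_bounds(loops, seconds)
    nominal = loops / (10.0 * seconds)
    spread = (high - low) / nominal
    print(f"{loops:>9,} loops, {seconds:>2} s: "
          f"{low:7.1f} to {high:7.1f} MWIPS (spread {spread:.0%})")
```

Under these assumptions the 10-second run has a spread of roughly 20% of its nominal rating, against roughly 6% for the 35-second run, so results from even shorter runs would be less trustworthy still.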

Desktops/Personal Computers

Lenovo 3000 J115 (7387-26U)

Released in late 2006 as one of Lenovo's first entries into the United States market under their own name, the J115 is a fairly average entry-level PC built around AMD's dual-core Athlon 64 X2 microprocessor with 64+64 KiB per-core instruction and data caches, a 512 KiB per-core second-level cache and a 2 GHz clock frequency (model 3800+). All tests are performed under CentOS 7.5.1804 on a system not specifically configured for benchmarking.

All results reflect single-threaded execution. This version of Whetstone does not take any advantage of multi-core processors.

GCC 4.8.5: 250,000 loops

Options Duration Rating
none 16 seconds 1562.5 MWIPS
-O1 10 seconds 2500.0 MWIPS
-O2 7 seconds 3571.4 MWIPS
-O3 6 seconds 4166.7 MWIPS

GCC 4.8.5: 1,000,000 loops

Options Duration Rating
none 196 seconds 510.2 MWIPS
-O1 151 seconds 662.3 MWIPS
-O2 137 seconds 729.9 MWIPS
-O3 136 seconds 735.3 MWIPS

Note: 100,000 loop results omitted due to possible inaccuracy

Note: -Ofast omitted due to possible over-optimization affecting program output

Workstations

Apple Power Mac G5 2.3DC

The mid-range offering of Apple's final generation of PowerPC-based professional systems, the 2.3DC was introduced in October 2005 and was designed around IBM's new dual-core 64-bit PowerPC 970MP processor, which featured two PowerPC 970 cores each with 32 KiB data cache, 64 KiB instruction cache, and a unified 1 MiB secondary cache, all running at a clock frequency of 2.3 GHz.

All tests are performed under Mac OS X 10.4.11, on a system not specifically configured for benchmarking.

Apple GCC 4.0.1: 250,000 loops

Options Duration Rating
none 37 seconds 675.7 MWIPS
-O1 14 seconds 1785.7 MWIPS
-O2 15 seconds 1666.7 MWIPS
-O3 9 seconds 2777.8 MWIPS

Apple GCC 4.0.1: 1,000,000 loops

Options Duration Rating
none 153 seconds 653.6 MWIPS
-O1 66 seconds 1515.2 MWIPS
-O2 67 seconds 1492.5 MWIPS
-O3 45 seconds 2222.2 MWIPS

HP VISUALIZE C3000 (9000/785/C3000)

A mid-range Unix workstation released in 1999, based on HP's in-house PA-8500 microprocessor with 1 MiB of on-die data cache, 512 KiB of on-die instruction cache and a clock frequency of 400 MHz. All tests are performed under HP-UX 11.11 (11i v1) on a system not specifically configured for benchmarking.

HP C B.11.11.16: 100,000 loops

Options Duration Rating
none 76 seconds 131.6 MWIPS
+O1 59 seconds 169.5 MWIPS
+O2 40 seconds 250.0 MWIPS
+O3 27 seconds 370.4 MWIPS
+O4 27 seconds 370.4 MWIPS
-fast 11 seconds 909.1 MWIPS

HP C B.11.11.16: 250,000 loops

Options Duration Rating
none 478 seconds 52.3 MWIPS
+O1 429 seconds 58.3 MWIPS
+O2 381 seconds 65.6 MWIPS
+O3 350 seconds 71.4 MWIPS
+O4 351 seconds 71.2 MWIPS
-fast 28 seconds 892.9 MWIPS

Note: -fast may be over-optimizing

Note: HP C +O2 is roughly equivalent to GCC -O1
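
To make the gap behind the -fast caveat concrete, the sketch below expresses each 250,000-loop HP C rating as a speedup over the unoptimized build. The ratings are copied from the table above; the variable names are illustrative only.

```python
# HP C B.11.11.16, 250,000 loops: option -> reported MWIPS (from the table)
HP_C_250K = {
    "none": 52.3,
    "+O1": 58.3,
    "+O2": 65.6,
    "+O3": 71.4,
    "+O4": 71.2,
    "-fast": 892.9,
}

# Speedup of each option relative to the unoptimized build
baseline = HP_C_250K["none"]
speedups = {opt: rating / baseline for opt, rating in HP_C_250K.items()}

for opt, factor in speedups.items():
    print(f"{opt:6s} {factor:5.2f}x")
```

The +O1 through +O4 levels all stay below 1.4x, while -fast comes out at about 17x. A jump of that size is hard to explain by better code generation alone, which fits the suspicion that -fast may be optimizing away part of the workload.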

GCC 4.2.3: 100,000 loops

Options Duration Rating
none 69 seconds 144.9 MWIPS
-O1 45 seconds 222.2 MWIPS
-O2 31 seconds 322.6 MWIPS
-O3 26 seconds 384.6 MWIPS
-O3 -ffast-math 24 seconds 416.7 MWIPS

Note: -Ofast is only available in GCC >=4.6

Non-Unix Workstations

The following results are from workstations that do not run a Unix-like operating system or compatibility environment, or that otherwise lack the proper accommodations to build the Whetstone sources provided in this package as-is. Source/build tweaks are noted on a per-system basis.

DEC VAXstation 4000 VLC

An entry-level workstation introduced by DEC in 1991, and the smallest full-featured VAX ever built. The VAXstation 4000 VLC is designed around DEC's highly integrated CVAX "SOC" microprocessor with a 1 KiB shared primary instruction/data cache, a novel 8 KiB DRAM secondary cache, and a 25 MHz clock frequency.

All tests are performed under OpenVMS 6.1 in an environment not specifically configured for benchmarking. No source modifications were required to build this file in a VMS environment.

DEC C/C++ 1.2: 1,000 loops

Options Duration Rating
none 24 seconds 4.2 MWIPS
/OPTIMIZE=ALL 24 seconds 4.2 MWIPS

DEC compiler optimizations do not seem to meaningfully impact performance of this benchmark.