Document:       MMXmax.txt
File Group:     Classic Benchmarks
Creation Date:  14 October 1997
Revision Date:
Title:          MMX Memory Speed Test
Keywords:       BENCHMARK PERFORMANCE MMX

Abstract:       This document contains details of an additional
                program for the MMX suite that is more dependent on
                memory bandwidth than MMX instruction speed. Only a
                DOS version, MMXMAX.EXE, is supplied but this can be
                run via an Operating System DOS window if required.

Note:           The program is still under test and should be treated
                as beta test software. Please submit feedback directly
                to Roy, or as a posting in the Section 12 message
                board. The program has been tested on a 200 MHz Intel
                MMX CPU.

Contributor:    Roy_Longbottom@compuserve.com


                      MMX MEMORY SPEED TEST

0. SUMMARY

The test was compiled using Watcom C/C++ 11, which has MMX facilities.
Its main purpose is to measure memory bandwidth when using MMX
instructions. The other MMX tests are more dependent on processor
speed.

The program executes a macro containing 512 MMX register load
instructions from adjacent quad word memory addresses. Memory size
tested varies from 4 KB to 4 MB to measure data transfer rates from
L1 cache, L2 cache and RAM.

Results are produced as speed ratings in Millions of Operations Per
Second (MOPS) and as data transfer rates in millions of bytes per
second (MB/Sec). The latter can be compared with MEMSPEED results.

Optional run time parameters can be included to run using up to four
user defined memory sizes. Running time should be less than two
minutes on any PC.

Only a DOS version, MMXMAX.EXE, has been produced, as only a run via a
clean DOS boot produces the fastest, most consistent results. However,
the program can be run via an Operating System DOS window if required.

Results are displayed as the program is running and saved in file
MMXRES.TXT, which should appear in the same directory as the EXE file.
A run time option allows a user defined log file to be used as an
alternative.

This DOS version requires DOS4GW.EXE, the protected mode run-time
program.

Before running, all other applications should be closed but,
preferably, the program should be run from a clean DOS boot. Type
MMXMAX at the DOS command prompt. MMXMAX H displays details of
optional parameters and example results.

Results should be sent to Roy_Longbottom@compuserve.com and details of
the system under test should be included. The configuration details
can be provided via program SYSTEMxx.EXE as supplied with the Classic
Benchmarks.


1. INTRODUCTION

The MMX benchmarks in MMXSPEED.ZIP provide useful measurements of the
performance of CPUs with MMX facilities, but mainly of processing
speed. It was felt that an additional program that was dependent on
memory bandwidth would be useful.


2. MMX INSTRUCTIONS

Information on MMX instructions is given in MMXSPEED.TXT and
MMXRESLT.TXT or XLS (in MMXMEM.ZIP). This program uses a single macro
containing 512 instructions that load data from sequential quad word
(64 bits or 8 bytes) memory addresses to MMX registers.


3. PROGRAM DETAILS

The program uses a single array of 524288 64 bit MMX quad words,
requiring 4 megabytes of storage. The main timing loop is executed
eleven times, with maximum loop counts (x 1024) of 4, 8, 16, 32, 64,
128, 256, 512, 1024, 2048 and 4096 and an increment of 512.

The function used passes the starting byte address to the MMX macro
and this appears in register eax. The 512 instructions, which load to
all 8 MMX registers in turn, are as follows:

     "movq mm0,[eax]"       \
     "movq mm1,[eax+8]"     \
     "movq mm2,[eax+16]"    \
     "movq mm3,[eax+24]"    \
     "movq mm4,[eax+32]"    \
     "movq mm5,[eax+40]"    \
     "movq mm6,[eax+48]"    \
     "movq mm7,[eax+56]"    \
     "movq mm0,[eax+64]"    \
         to
     "movq mm6,[eax+4080]"  \
     "movq mm7,[eax+4088]"

An outer loop repeats each test for around 5 seconds. The calibration
run used to determine the number of passes for 5 seconds should also
have the effect of pre-loading data into cache, where appropriate.


4. DISPLAYED RESULTS

When the program is running, the results are displayed in the format
shown in the example below.

The Size column indicates the number of MMX instructions executed (or
quad words loaded) in the timing loop. Passes is the number of times
that the outer loop is run. MOPS is Size x Passes / Secs / 1000000.
MB/Sec is calculated as MOPS x 8, giving millions of bytes per second
(multiply by 1000000 and divide by 1048576 for megabytes per second).

The following is for a home built PC with a Pentium MMX 200 MHz CPU,
Micronics TX mainboard, 32 MB SDRAM and 512 KB pipeline burst cache,
via DOS 6.2.

     MMX Memory speed Test DOS Version 1 Sat Oct 11 14:11:39 1997
     Copyright (C) 1997, Roy Longbottom
     Via Watcom Version 11, log file MMXres.txt

        Size  KB Used   Passes    Secs    MOPS  MB/Sec

         512        4  1904761    4.99   195.4    1564
        1024        8   952380    4.99   195.4    1564
        2048       16   404040    5.06   163.5    1308
        4096       32    40404    5.00    33.1     265
        8192       64    20202    5.00    33.1     265
       16384      128    10204    5.00    33.4     267
       32768      256     5031    4.94    33.4     267
       65536      512     2515    4.94    33.4     267
      131072     1024      675    5.00    17.7     142
      262144     2048      337    5.00    17.7     141
      524288     4096      168    5.00    17.6     141

This demonstrates what is probably close to the maximum speed
obtainable via L1 cache, with MOPS nearly equal to the CPU MHz and the
transfer rate approaching 1600 million bytes per second (200 MHz x 8
bytes). The L2 cache measurements (32 KB to 512 KB) probably indicate
the cache to CPU bus speed of 66.7 MHz (x 4 bytes, or about 267
million bytes per second), with MOPS at half the bus MHz as each
operation loads 8 bytes. The SDRAM speed appears to represent a bus
utilisation of 53% (141 out of about 267 million bytes per second).


5. RESULTS FILE

Results are appended to file MMXRES.TXT, which should appear in the
same directory as the EXE file. The results are in the same format as
above. MMXRES.TXT is also the default file used by the other MMX
tests.


6. RUNNING INSTRUCTIONS

Preferably the PC should be booted with DOS with no drivers or memory
managers, and MMXMAX or MMXMAX.EXE typed at the command prompt.

If MMX facilities are not available the program should display:

     MMX Memory speed Test DOS Version 1 Sat Oct 11 14:11:39 1997
     Copyright (C) 1997, Roy Longbottom
     Via Watcom Version 11, log file MMXres.txt

     Program will not run. MMX facilities not detected.
     Press any key to close.
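
This document does not say how MMXMAX itself checks for MMX. As background only, the usual check on Intel processors is the CPUID instruction: function 1 returns feature flags in register EDX, and bit 23 indicates MMX support. The following is a minimal sketch in Python of that bit test alone. The function name and the sample flag words are hypothetical, and a real program would obtain the EDX value by executing CPUID rather than from constants.

```python
# Bit test on a CPUID function 1 EDX feature word (illustrative sketch).
# Assumption: the value has already been obtained by executing CPUID;
# this sketch does not execute the instruction itself.

MMX_FEATURE_BIT = 23  # bit 23 of EDX indicates MMX support


def edx_reports_mmx(edx_flags):
    """Return True if the MMX bit is set in the given EDX feature word."""
    return bool((edx_flags >> MMX_FEATURE_BIT) & 1)


if __name__ == "__main__":
    # Hypothetical flag words, chosen only to show both outcomes
    with_mmx = (1 << MMX_FEATURE_BIT) | 0x1BF
    without_mmx = 0x1BF
    for flags in (with_mmx, without_mmx):
        if edx_reports_mmx(flags):
            print(f"{flags:#010x}: MMX facilities detected")
        else:
            print(f"{flags:#010x}: MMX facilities not detected")
```

The sketch covers only the flag test. It does not address processors where CPUID is unavailable or disabled.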

The program's detection facility appears to work with Intel
processors, but an invalid operation code indication is likely when
the program is run on a Cyrix CPU.


7. OPTIONAL RUN TIME PARAMETERS

One to seven run time parameters can be included in the command that
executes the program, with the following format:

     MMXMAX CN LF Val1 Val2 Val3 Val4 Text

     CN         - C or Close (default) to close the window when
                  finished, N or NoClose to leave the window open.
                  This is only for use when running via Windows.

     LF         - Log file name, default is MMXres.txt

     Val1 to 4  - One to four memory sizes to use in Kbytes with
                  increments of 4 (adjusted by program by rounding
                  down). The minimum value is 4 and maximum 4096.

     Text       - A brief description of the test. Note that all
                  other parameters must be included before the text.
                  The text is included in the results file.

Example commands are:

     MMXMAX C Mylog.txt                  user defined log
     MMXMAX C Mylog.txt 8                1 test 8 KB
     MMXMAX C Mylog.txt 8 8              2 tests of 8 KB
     MMXMAX C Mylog.txt 4 12 256 4096    4 tests
     MMXMAX C Mylog.txt 4 4 4 4 My PC    4 tests + comment

     Error - MMXMAX Mylog.txt            should display the help file
     Error - MMXMAX C 4 4 4 4            will generate a log file
                                         named 4 and run three tests.
     Error - MMXMAX C out.log 8 8 My PC  should ignore the text and
                                         run an extra 4 KB test.


8. HELP FACILITY

If the program is executed with the wrong first parameter (e.g.
MMXMAX x, or -, ?, /h etc.), the parameter options are displayed, plus
an example set of results as above.


9. SUBMITTING RESULTS

Results should be sent to Roy_Longbottom@compuserve.com and the
following information on the system under test should be included.

     PC Supplier/model
     CPU chip
     Clock MHz
     Cache size
     Chipset & H/W Options (including RAM size and type)
     OS/DOS version
     Run by (name)
     Company/Location
     E-Mail address

The configuration details can be provided via program SYSTEMxx.EXE as
supplied with the Classic Benchmarks.
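
Before results are submitted, the MOPS and MB/Sec columns of a log can be cross-checked against the formulas given in section 4. The following is a minimal sketch in Python. It is not part of the benchmark suite, the function names are hypothetical, and the sample rows are copied from the example results in section 4.

```python
# Cross-check of MOPS and MB/Sec for a results line (illustrative sketch).
# Formulas are those stated in section 4; function names are hypothetical.

BYTES_PER_QUAD_WORD = 8  # each movq loads one 64 bit quad word


def mops(size, passes, secs):
    """Millions of operations per second: Size x Passes / Secs / 1000000."""
    return size * passes / secs / 1_000_000


def mbytes_per_sec(size, passes, secs):
    """Millions of bytes per second: MOPS x 8."""
    return mops(size, passes, secs) * BYTES_PER_QUAD_WORD


def binary_mbytes_per_sec(size, passes, secs):
    """Megabytes (units of 1048576 bytes) per second."""
    return size * passes * BYTES_PER_QUAD_WORD / secs / 1_048_576


if __name__ == "__main__":
    # (Size, Passes, Secs) from the 4 KB, 32 KB and 4096 KB example lines
    rows = [(512, 1904761, 4.99), (4096, 40404, 5.00), (524288, 168, 5.00)]
    for size, passes, secs in rows:
        print(f"{size:7d} {mops(size, passes, secs):7.1f} "
              f"{mbytes_per_sec(size, passes, secs):7.0f}")
```

Run as a script, the three sample rows give MOPS of 195.4, 33.1 and 17.6 with MB/Sec of 1564, 265 and 141, in agreement with the example results.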