--------------------------------------------------------------------- QLife 1.3 November 11, 1992 David Stafford --------------------------------------------------------------------- QLife is my entry in the PC-Technique magazine optimization challenge. QLife is written for Borland C 3.1 but it shouldn't be too difficult to convert it to work with Microsoft C. To build QLife run the BUILD.BAT batch file with the size of the life grid on the command line (see below). The command-line options are: WIDTH 32 - Sets the width of the life grid to 96 cells (divided by 3). HEIGHT 96 - Sets the height of the life grid to 96 cells. NOCOUNTER - Turns off the generation counter. (optional) NODRAW - Turns off drawing of the cell map. (optional) GEN 1000 - Calculates 1000 generations. (optional) These MUST be in UPPER CASE. For example, the minimum you really need is "WIDTH 40 HEIGHT 120". I used "WIDTH 46 HEIGHT 138 NOCOUNTER NODRAW GEN 7000" during testing. If you have selected the GEN option you will have to press a key to exit QLife when it is finished. This is so I could visually compare the result of N generations under QLife with N generations under Abrash's original life program. You should be aware that the program from the magazine contains a small bug which may make it appear that they do not generate identical results. The original program does not display a cell until it changes so if a cell is alive on the first generation and never dies then it will never be displayed. This bug is not present in QLife. You should have no touble running QLife with cell grids up to 210x200. You MUST have a VGA and AT LEAST a 386 to run QLife. The 386 features which it uses are not integral to the algorithm (they're a convenience for the code) so feel free to modify QLife to run on lesser CPUs if you wish. QLife works best if you have a large CPU cache (256k is recommended). The subdirectories contain compiled versions of both the original life program from PC-Techniques and QLife. The programs have been modified slightly to display the generation count (and check for a keypress) only once per 100 generations. This significantly reduces time spent in I/O. Each directory contains a program for a different grid size. There is only one program in the 201x200 directory because the original program cannot work with a grid of that size. --------------------------------------------------------------------- How it works --------------------------------------------------------------------- Each three cells in the life grid are packed into two bytes: OABCabcX XXYYYZZZ O : 0 if an internal cell, 1 if on an edge (must handle wrapping) ABC : the life/death cell states for the next generation abc : the life/death cell states for the current generation XXX : neighbor count for cell 'A' YYY : neighbor count for cell 'B' ZZZ : neighbor count for cell 'C' So, it is convenient if the width of the cell array is an even multiple of three. There's nothing in the algorithm which prevents it from supporting any arbitrary size but the code is a bit simpler this way. So if you want a 200x200 grid I recommend just using a 201x200 grid and be happy with the extra free column. Otherwise the edge wrapping code gets more complex. Since every cell has from zero to eight neighbors you may be wondering how I can manage to keep track of them with only three bits. Each cell really has only a maximum of seven neighbors since we only need to keep track of neighbors OUTSIDE of the current cell word. That is- if cell 'B' changes state then we don't need to reflect this in the neighbor counts of cells 'A' and 'C'. Updating is made a little faster. The basic idea is to maintain a "change list". This is an array of pointers into the cell array. Each element points to a word which changes in the next generation. This way we don't have to waste time scanning every cell since most of them do not change. For a 201x200 cell array the change list will be, at maximum, 26,800 bytes (plus one more word to terminate the list). That's only if EVERY cell changes in one generation. In practice it is much less. Two passes are made through the change list. The first pass updates the cell display on the screen, corrects the life/death status of each cell and updates the neighbor counts for the adjacent cells. There are some efficiencies gained by using cell triples rather than individual cells since we usually don't need to examine all eight neighbors. The second pass checks (and possibly resets) the life/death status for the cells and their neighbors to build the change list for the next generation. Processing each word is a little complex but very fast. A 64k block of code exists with routines on each 256-byte boundary. Generally speaking, the entry point corresponds to the high byte of the cell word. This byte contains the life/death values and a bit to indicate if this is an edge condition. During the first pass we take the high byte of the cell, AND it with 0xFE and jump to that address. During the second pass we take the high byte, OR it with 0x01 and jump to that address. Determining which changes must be made to a cell triple is easy and surprisingly quick. There's no counting! Instead, I use a 64k lookup table indexed by the cell triple itself. The value is equal to what the high byte should be in the next generation. If this value is equal to the current high byte then no changes are necessary to the cell. Otherwise it is placed in the change list. Look at the code in the Test() and Fix() functions to see how this is done. How each segment is used ------------------------ CS : 64k code (table). DS : DGROUP (1st pass). 64k cell life/death classification table (2nd pass). ES : Both change lists (so we can use STOSW). SS : DGROUP. The life cell grid and row/column table. Indexed via BP. FS : Video segment. GS : Unused. This code can be taken a little further but I've reached the point of diminishing returns (and time). Plus, I've already bumped up against the 400 line limit and have had to squeeze things into an uncomfortable state. One obvious improvement would be to replace the calculation to determine the screen offset with a table lookup. The contest isn't timing the I/O so I've left this optimization out.