CSE 30 -- Assignment 6 -- Dec 8


The very-fast implementation of life has the following performance:
$ timing circuit5.asm

xterm-size

circuit5.asm 2.77071

palmpilot

circuit5.asm 2.25902

WinCE

circuit5.asm 1.96242
The C code from which the fast assembly code was derived is also now available. The technique used is sometimes called ``bit-slicing'', and essentially what is going on is that 32-pixels are worked on at once in parallel.

Stephane's solution is also available. He was asked to restrain himself and use only those tricks that we expect a good student in the class would know to use, and it achieves the following performance:

$ timing cs30f1/xform.asm

xterm-size

cs30f1/xform.asm 20.06285

palmpilot

cs30f1/xform.asm 17.42992

WinCE

cs30f1/xform.asm 15.45578
Stephane used table-lookup not for counting the neighbors, but instead used a 9-bit index, including the current state of the center pixel, to compute the future state of the pixel (its ``fate'') in one lookup. For some of his code, he found that there is a simple repetitive pattern, and he used an auxilliary C program to ``automatically'' generate it the MIPS assembly language code.
[ CSE home | CSE talks | bsy's home page | webster i/f | yahoo | lycos | altavista | pgp key svr | spam | commerce ]
picture of bsy

bsy@cse.ucsd.edu, last updated Mon Dec 8 21:56:51 PST 1997.

email bsy


Don't make me hand over my privacy keys!