This is a patched version of zlib, modified to use
Pentium-Pro-optimized assembly code in the deflation algorithm. The
files changed/added by this patch are:

README.686
match.S

The speedup that this patch provides varies, depending on whether the
compiler used to build the original version of zlib falls afoul of the
PPro's speed traps. My own tests show a speedup of around 10-20% at
the default compression level, and 20-30% using -9, against a version
compiled using gcc 2.7.2.3. Your mileage may vary.
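
One way to check a figure like this on your own machine is to time
deflate at the default level and at level 9. What follows is only a
minimal sketch, not the method behind the numbers above: it uses
Python's zlib module, which calls whichever zlib library the
interpreter is linked against, so it exercises match.S only if that
library was built with it. The time_deflate() helper and the sample
input are made up for illustration.

```python
import time
import zlib


def time_deflate(data, level, repeats=5):
    """Return (best_seconds, compressed_size) for zlib.compress at `level`.

    Hypothetical helper: takes the fastest of `repeats` runs to reduce noise.
    """
    best = float("inf")
    size = 0
    for _ in range(repeats):
        start = time.perf_counter()
        out = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        size = len(out)
    return best, size


if __name__ == "__main__":
    # Made-up sample input; substitute a file that resembles your real data.
    sample = b"The quick brown fox jumps over the lazy dog. " * 20000
    for level in (6, 9):  # 6 is zlib's default level; 9 corresponds to -9
        seconds, size = time_deflate(sample, level)
        print(f"level {level}: {seconds:.4f} s, {size} bytes")
```

Comparing the best-of-N times between an unpatched and a patched build
gives a rough speedup figure. Results depend heavily on the input.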

Note that this code has been tailored for the PPro/PII in particular,
and will not perform particularly well on a Pentium.

If you are using an assembler other than GNU as, you will have to
translate match.S to use your assembler's syntax. (Have fun.)

Brian Raiter
breadbox@muppetlabs.com
April, 1998


Added for zlib 1.1.3:

The patches come from
http://www.muppetlabs.com/~breadbox/software/assembly.html

To compile zlib with this asm file, copy match.S to the zlib directory,
then run:

CFLAGS="-O3 -DASMV" ./configure
make OBJA=match.o


Update:

I've been ignoring these assembly routines for years, believing that
gcc's generated code had caught up with them sometime around gcc 2.95
and the major rearchitecting of the Pentium 4. However, I recently
learned that, despite what I believed, this code still has some life
in it. On the Pentium 4 and AMD64 chips, it continues to run about 8%
faster than the code produced by gcc 4.1.

In acknowledgement of its continuing usefulness, I've altered the
license to match that of the rest of zlib. Share and Enjoy!

Brian Raiter
breadbox@muppetlabs.com
April, 2007