|
|
 | | From: | cr88192 | | Subject: | RLE+LZ77 speed | | Date: | Thu, 20 Jan 2005 18:11:29 GMT |
|
|
 | I did a simple encoder, and got it working after a while (lots of minor bugs were being a hassle...).
I did an RLE+LZ77 encoder, which uses the basic format:
where compressed data is one of:
Compressor currently compresses at a rate of around 3MB/s on my computer (no big deal, the decompressor is what I am worried about).
The decompressor decompresses at about 250MB/s on my computer, which is imo quite decent. at first I thought I was getting a lot slower speeds, but then found that I was not syncing the timer close enough to the start of the function (printf is supprisingly expensive...).
Computer: Athlon 64 3200+, 1GB DDR400 RAM
Given the lack of entropy coding or similar, the general compression ration is lame. I am able to compress source code to about 30% original size (dunno how that compares). actually, my algo assumes raw data at present. I could do a different version of the algo for plaintext (exploiting the general abscence of codes in the 127..255 range), which could save some on space (about 1 byte/run or similar). I am, however, unsure if this would make a signifigant difference.
maybe I should keep in mind that the whole point is decompression speed rather than compressor efficiency...
maybe later I can mess with an lz78 variant.
anyways, how does this compare with what is "typical" for these set of algos?...
at present no real obvious way stands out to either speed it up or increase compressor efficiency. I am using memcpy and memset for the filling and copying operations as they came out generally faster than things like: while(l--)*ct++=k; and: while(l--)*ct++=*s2++;
the single char case, however, only copies 1 byte at a time.
I had noticed by looking at hexdumps of the data, something seems fammiliar: apparently I have seen data like this before. I wonder though. my memory seems to point to some basic dos stuff, eg, either command.com or some other basic dos files. quite possibly dos was using raw lz77 in some fundamental way, or maybe I am just wrong. it would make sense though, presumably dos was designed on the assumption of being short on memory/disk space.
comments?...
(oh well, going to sleep...).
|
|
 | | From: | Matt Mahoney | | Subject: | Re: RLE+LZ77 speed | | Date: | 20 Jan 2005 17:02:53 -0800 |
|
|
 | cr88192 wrote: > at present no real obvious way stands out to either speed it up or increase > compressor efficiency. > I am using memcpy and memset for the filling and copying operations as they > came out generally faster than things like: > while(l--)*ct++=k; > and: > while(l--)*ct++=*s2++; > > the single char case, however, only copies 1 byte at a time.
On my old PC (750 MHz Duron) memcpy() and memmove() are about the fastest way to copy arrays, at least large ones. They are about the same speed as REP MOVSD and only a little slower than unrolled loops using MOVQ (MMX), which copies 8 bytes at a time. On a newer processor you could use MOVDQ (SSE, 16 bytes), but I found that on newer machines that the memory bus is the bottleneck. Even MOVSB is just as fast, although on older PCs MOVSB is 4 times slower than MOVSD. All of these are still faster than writing loops in C.
I didn't run any tests with small array copies, as would be common in LZ77. You might gain some speed using custom assembler by avoiding the overhead of a function call to memcpy()/memset().
-- Matt Mahoney
|
|
 | | From: | cr88192 | | Subject: | Re: RLE+LZ77 speed | | Date: | Fri, 21 Jan 2005 02:32:56 GMT |
|
|
 | "Matt Mahoney" wrote in message news:1106269373.893897.163440@c13g2000cwb.googlegroups.com... > cr88192 wrote: >> at present no real obvious way stands out to either speed it up or > increase >> compressor efficiency. >> I am using memcpy and memset for the filling and copying operations > as they >> came out generally faster than things like: >> while(l--)*ct++=k; >> and: >> while(l--)*ct++=*s2++; >> >> the single char case, however, only copies 1 byte at a time. > > On my old PC (750 MHz Duron) memcpy() and memmove() are about the > fastest way to copy arrays, at least large ones. They are about the > same speed as REP MOVSD and only a little slower than unrolled loops > using MOVQ (MMX), which copies 8 bytes at a time. On a newer processor > you could use MOVDQ (SSE, 16 bytes), but I found that on newer machines > that the memory bus is the bottleneck. Even MOVSB is just as fast, > although on older PCs MOVSB is 4 times slower than MOVSD. All of these > are still faster than writing loops in C. > yes, this matches with observation.
> I didn't run any tests with small array copies, as would be common in > LZ77. You might gain some speed using custom assembler by avoiding the > overhead of a function call to memcpy()/memset(). > yes.
somehow though I am suspecting that it is now probably "fast enough". it is possibly faster than reading it from disk would be anyways. maybe, maybe not, afaik the drives themselves use compression so more compressed data might physically take more space on the platter, but I am unsure.
assembler could speed it up some, but I am not sure it is necessary or worth the cost. of course, even if I did use assembler, I could allways fall back to the c version if I can't use the assembled version for whatever reason...
I am mostly just curious how fast others can get their decompressors though...
|
|
 | | From: | Ojala Pasi 'Albert' | | Subject: | Re: RLE+LZ77 speed | | Date: | Fri, 21 Jan 2005 09:31:41 +0000 (UTC) |
|
|
 | In article , cr88192 wrote: >I did a simple encoder, and got it working after a while (lots of minor bugs >were being a hassle...). > >I did an RLE+LZ77 encoder, which uses the basic format: > [..] >Given the lack of entropy coding or similar, the general compression ration >is lame. I am able to compress source code to about 30% original size (dunno >how that compares). [..] >at present no real obvious way stands out to either speed it up or increase >compressor efficiency.
As you are interested in RLE and LZ77, have you taken a look at http://www.iki.fi/a1bert/Dev/pucrunch/ ? An Optimizing Hybrid LZ77 RLE Data Compression Program, aka Improving Compression Ratio for Low-Resource Decompression
You can compare your compression ratio and decompression speed to Pucrunch. I don't expect Pucrunch to decompress 250MBps though..
-Pasi -- "And you must know we do not really change over time; we are as flowers unfolding; we merely become more nearly ourselves." -- Khayman in "The Queen of the Damned"
|
|
|