Current group: comp.compression

Lossless compression of existing JPEG files

From:Darryl Lovato
Subject:Lossless compression of existing JPEG files
Date:Fri, 07 Jan 2005 19:36:23 GMT
Yesterday, Allume Systems, a division of IMSI (and creators of the popular
"StuffIt" compression technology) announced a new technology which allows
users and developers to losslessly recompress JPEG files an average of 30%
smaller than the original JPEG file (as well as other Compressed data
types/files), WITHOUT additional data loss.

While the "compression" of existing compressed files has thus far been
viewed as "impossible", the company has acquired, further developed, and
filed patents on a technology which allows JPEG files to be further
compressed. The method can also be applied to losslessly recompress other
compressed data types (Zip, MPEG, MP3 and others).

This technology results in a smaller file than the original compressed data
with no data loss.

Working Pre-release test tools have been sent to (and verified by)
independent compression test sites, including:




The new technology does NOT break any Information Theory Laws, and will be
shipped later this qtr in commercial products as well as be available for
licensing. The new technology does NOT compress "random files", but rather
previously "compressed files" and "compressed parts" of files. The
technology IS NOT recursive.
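
The "no random files" caveat follows from a counting argument, which can be sketched in a few lines of Python. This is only an illustration of the pigeonhole principle, not anything taken from the Allume tools; the function name is made up for the example.

```python
def count_shorter_bitstrings(n: int) -> int:
    """Number of distinct bit strings of length 0 .. n-1."""
    return sum(2 ** k for k in range(n))

n = 16
inputs = 2 ** n                                # all possible n-bit files
shorter_outputs = count_shorter_bitstrings(n)  # all strictly shorter files

# There is always one fewer shorter string than there are inputs, so no
# lossless coder can shrink every n-bit file. It can only shrink the
# files that have structure its model understands (such as JPEG data).
print(inputs, shorter_outputs)
```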

The company has filed patents on the new technologies.

The press releases regarding the technology can be found here:



Additionally, a white paper has been posted which details the company's
expansion into image compression from its traditional lossless
archiving/text compression focus, along with results of the technology.

http://www.stuffit.com/imagecompression/

These technologies will be included in future versions of the StuffIt
product line as well as new products and services, and technology licenses
available from Allume and IMSI.

The core technology will also be licensed to companies in the Medical,
Camera, Camera Phone, Image management, internet acceleration, and many
other product areas.

- Darryl
From:Aslan Kral
Subject:Re: Lossless compression of existing JPEG files
Date:Mon, 17 Jan 2005 12:36:02 +0200

"Darryl Lovato" wrote in news message
BE0424B5.15860%dlovato@allume.com...
> The new technology does NOT break any Information Theory Laws, and will be
> shipped later this qtr in commercial products as well as be available for
> licensing. The new technology does NOT compress "random files", but rather
> previously "compressed files" and "compressed parts" of files. The
> technology IS NOT recursive.
>
>

Guys, I am not a JPEG expert, but may I ask a simple question?

I guess this is a special transformation that creates some redundant data
out of some special data (lossy compressed data), which you would then
compress further. But what kind of transformation would create more
redundant data out of lossy compressed data?

My guess is they do not decode it but reorganize the data in a special way.
From:Jeff Gilchrist
Subject:Re: Lossless compression of existing JPEG files
Date:17 Jan 2005 12:27:02 -0800
SuperFly wrote:

> If you compress a jpeg file, decompress it and end up with an
> identical copy of the original file. The method is obviously
> lossless and you can be sure the steganographed message is still
> in there because it's the same file.

Yes, but they have to transform the data from the original JPEG into
the compressed form somehow. In doing so, can you make some subtle
changes (like the ones suggested by Matt Mahoney) that might not be
picked up by their compressor? So far I have not found anything but I
haven't tried what Matt was talking about either.
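
The "identical copy" test SuperFly describes can be sketched as a byte-for-byte round-trip check. This is a minimal sketch: zlib stands in for the real compressor (which is not public), and the "JPEG" below is just filler between the SOI and EOI markers, so the function and variable names are assumptions for illustration only.

```python
import hashlib
import zlib

def sha256_hex(data: bytes) -> str:
    """Hex digest used to compare two byte strings."""
    return hashlib.sha256(data).hexdigest()

def is_lossless_roundtrip(original: bytes, compress, decompress) -> bool:
    """True if decompress(compress(x)) reproduces x bit for bit."""
    restored = decompress(compress(original))
    return sha256_hex(restored) == sha256_hex(original)

# Stand-in data: SOI marker, filler bytes, EOI marker (not a real image).
fake_jpeg = b"\xff\xd8" + bytes(range(256)) * 16 + b"\xff\xd9"

print(is_lossless_roundtrip(fake_jpeg, zlib.compress, zlib.decompress))
```

Any subtle change made during the transform, including to hidden steganographic bits, would show up as a digest mismatch.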

> I don't know if your testing license allows it. But like several
> other people suggested, it might be a good idea to test the
> performance of the compressor with an arithmetic encoded jpeg file.

I'm only speculating, but I think the software they gave me only works
with recognizable JPEG formats. It works fine with DCT/Huffman but
will not process an arithmetic encoded JPEG; it gives an error and
quits.

Regards,
Jeff.
From:Matt Mahoney
Subject:Re: Lossless compression of existing JPEG files
Date:19 Jan 2005 13:06:20 -0800
SuperFly wrote:
> On 18 Jan 2005 18:30:33 -0800, "Matt Mahoney"
> wrote:
>
> >Malcolm Taylor wrote:
> >> Also you have to be careful that the decoder is
> >> not filtering the output with a deblocking filter or similar as this
> >> adds information (in an entropy sense at least).
> >
> >Yeah that could be a problem, but I'd be surprised if PSP did this on a
> >high quality image. If I understand it right, it would fill in the
> >first 5 AC coefficients by interpolating from the DC coefficients of
> >the adjacent blocks. But I think deblocking would not be needed here.
> >Here are the quantization tables in a10.jpg (zigzag order). It's
> >about 4-5 bits/pixel more than you need for average quality. Usually
> >these would be around 16 to 100.
>
> I think almost all image software uses some kind of lowpass
> (unblock/smooth) filter to make jpeg images somewhat more presentable.
> This adds data and essentially un-quantises the image. Which makes it
> harder to compress and would also explain the bad raw-bmp compression
> results from a10.bmp/a10.raw (in both YUV and RGB model).

I think a deblocking filter should make the image more compressible,
not less. Any deterministic function must either decrease the entropy
of its input or maintain it, but never increase it (the opposite of
thermodynamic entropy).
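
A small sketch of that entropy claim, under a simplifying assumption: the "filter" here is a per-symbol map applied to a uniform source, which is much simpler than a real deblocking filter that looks at neighbouring pixels. It also says nothing about how a practical compressor, whose model may not match the filtered data, will actually perform. All names are illustrative.

```python
import math
from collections import Counter

def shannon_entropy(symbols) -> float:
    """Empirical entropy in bits per symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

pixels = list(range(256))                 # every value once: 8 bits/symbol
mapped = [p // 4 * 4 for p in pixels]     # deterministic map, 64 outputs

h_in = shannon_entropy(pixels)
h_out = shannon_entropy(mapped)
print(h_in, h_out)                        # h_out cannot exceed h_in
```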

-- Matt Mahoney
From:Matt Mahoney
Subject:Re: Lossless compression of existing JPEG files
Date:18 Jan 2005 07:50:29 -0800
Fabio Buffoni wrote:
> > Results for my a10.jpg file:
> > On my AMD Athlon 1800+ a10.jpg 842468 -> 643403 76.37% in about 4 sec
>
> A couple of tests recompressing a10.jpg using jpegcrop:
>
> 856024 (huffman default)
> 824441 (huffman optimized)
> 758340 (order-0 arithmetic)
>
> 780872 (progressive - huffman default)
> 780872 (progressive - huffman optimized)
> 737555 (progressive - arithmetic)
>
> 643403 seem to be a quite good work.

I did some more experiments with this file. First I converted it to a
.bmp file (using Paint Shop Pro 4.0, although this probably doesn't
matter), and then I tried compressing it with the top 2 programs for
.bmp files on the maximumcompression.com benchmark. These are WinRK
2.0.1 (first) and BMF 2.0 (second). For WinRK I used PWCM mode, 128 MB
(192 MB thrashes on my 256 MB PC). Here are the results.

File    Format  Size (bytes)
A10     JPG          842,468
A10     BMP        2,986,038
A10     BMF        1,540,116
A10     RK         1,117,540

I am rather surprised at the poor results. BMF does quite well on
rafale.bmp, which was also converted from a JPEG file. I confirmed the
benchmark result.

File    Format  Size (bytes)
RAFALE  BMP        4,149,414
RAFALE  BMF          669,016

I didn't test PAQAR because I assume that WinRK PWCM is based on the
same algorithm (and it's slower than WinRK, which was 25 minutes at 750
MHz). Also, PAQAR does not have a specific model for BMP files. I
don't know if WinRK does.

I think these results show that there is something more than just
better image modeling going on. I think there is modeling specific to
JPEG (i.e. modeling the DCT coefficients), which is lost in the
conversion to raw pixels.
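
To put the figures quoted in this thread side by side, here is a small sketch that computes "compressed size as a percentage of the input". The sizes are the ones reported above; the dictionary keys and the helper name are only labels chosen for this example.

```python
def percent_of_original(compressed: int, original: int) -> float:
    """Compressed size as a percentage of the original size."""
    return 100.0 * compressed / original

sizes = {
    "a10.jpg": 842_468,
    "a10.jpg (StuffIt pre-release)": 643_403,
    "a10.bmp": 2_986_038,
    "a10.bmp (BMF)": 1_540_116,
    "a10.bmp (WinRK)": 1_117_540,
    "rafale.bmp": 4_149_414,
    "rafale.bmp (BMF)": 669_016,
}

jpeg_ratio = percent_of_original(sizes["a10.jpg (StuffIt pre-release)"],
                                 sizes["a10.jpg"])
print(f"JPEG recompression: {jpeg_ratio:.2f}% of the original JPEG")

# The best BMP result is still larger than the original JPEG itself.
print(sizes["a10.bmp (WinRK)"] > sizes["a10.jpg"])
```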

-- Matt Mahoney
From:Fabio Buffoni
Subject:Re: Lossless compression of existing JPEG files
Date:Tue, 18 Jan 2005 17:16:37 +0100


> I am rather surprised at the poor results. BMF does quite well on
> rafale.bmp, which was also converted from a JPEG file. I confirmed the
> benchmark result.

I did these tests too. Of course you lose the information on the DCT
coefficients and their distribution.
I think you can get the same results by modifying the original BMP.
Just take A10.BMP and reset the least significant bit of all pixels. The
visual quality won't change significantly and WinRK will probably
compress it to approx. 750k.
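
A rough sketch of the "reset the least significant bit" idea, assuming the pixel data has already been separated from the BMP header. The test data is synthetic (a slow gradient with random low bits) and zlib is only a stand-in for WinRK, so the sizes it prints say nothing about a10.bmp itself. Note that this step is lossy: it changes the image.

```python
import random
import zlib

def clear_lsb(pixels: bytes) -> bytes:
    """Zero the least significant bit of every sample byte."""
    return bytes(b & 0xFE for b in pixels)

# Synthetic samples: an even-valued gradient plus one random low bit each.
rng = random.Random(10)
noisy = bytes((((i // 512) * 2) % 256) | rng.getrandbits(1)
              for i in range(65536))
cleared = clear_lsb(noisy)

print(len(zlib.compress(noisy, 9)), len(zlib.compress(cleared, 9)))
```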

> Also, PAQAR does not have a specific model for BMP files. I
> don't know if WinRK does.

PAQ's BMP compression algo can be greatly improved simply by keeping the
recordmodel fixed to the width (multiplied by 3 if it's a 24-bit image)
of the BMP. I remember that using this simple trick my paq607fb
compressed rafale to approx. 615k.
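
For reference, the record length in question can be read straight from the BMP header. The sketch below assumes the common 14-byte file header followed by a 40-byte BITMAPINFOHEADER, and it rounds each row up to a multiple of 4 bytes as the BMP format requires. Whether paq607fb accounted for that padding is not stated above, and the function names here are made up for the example.

```python
import struct

def bmp_row_stride(header: bytes) -> int:
    """Bytes per pixel row, including padding to a 4-byte boundary."""
    width = struct.unpack_from("<i", header, 18)[0]           # biWidth
    bits_per_pixel = struct.unpack_from("<H", header, 28)[0]  # biBitCount
    return ((width * bits_per_pixel + 31) // 32) * 4

def make_header(width: int, height: int, bpp: int) -> bytes:
    """Build a minimal 54-byte header for testing (no pixel data)."""
    file_header = struct.pack("<2sIHHI", b"BM", 54, 0, 0, 54)
    info_header = struct.pack("<IiiHHIIiiII",
                              40, width, height, 1, bpp, 0, 0, 0, 0, 0, 0)
    return file_header + info_header

print(bmp_row_stride(make_header(1000, 750, 24)))  # width * 3, no padding needed
print(bmp_row_stride(make_header(5, 5, 24)))       # 15 bytes padded up to 16
```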

FB
From:Matt Mahoney
Subject:Re: Lossless compression of existing JPEG files
Date:14 Jan 2005 06:24:54 -0800
SuperFly wrote:
> And if we use the maximum compression example the question should be:
> if we decompress the a10.jpg back to raw data, is there any image
> codec out there that can losslessly compress it to +/- 643.403 bytes
> and decompress it back to raw data. If there isn't we're dealing with
> something special, otherwise not.

I think there is more to it than that. The decompressor also has to
produce a bit for bit identical jpeg file. Just recompressing the
image with jpeg isn't guaranteed to do that. Also, I don't think that
switching from Huffman codes to arithmetic would improve the
compression by 24% unless there were a lot of 1 bit codes.

I'm not an expert on jpeg, but the way I understand it is this: 8x8 DCT
-> quantization (lossy) -> Huffman coding. Correct me if I'm wrong,
but jpeg doesn't seem to exploit any redundancy between adjacent 8x8
blocks. I think this is where the big gains can be made.
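
The first two stages of that pipeline can be sketched in plain Python. This is a simplified illustration, not the libjpeg code path: it uses a single quantization step for every coefficient (real JPEG uses an 8x8 table), skips the level shift, and stops before entropy coding. As far as I know, baseline JPEG does code each block's DC term as a difference from the previous block's DC term, while the AC terms are coded block by block.

```python
import math

N = 8

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an 8x8 block (list of 8 rows of 8 numbers)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

def quantize(coeffs, step):
    """Lossy step: divide by a (here uniform) step size and round."""
    return [[round(c / step) for c in row] for row in coeffs]

flat_block = [[100] * N for _ in range(N)]   # a perfectly flat 8x8 patch
quantized = quantize(dct_2d(flat_block), 16)
nonzero = sum(1 for row in quantized for c in row if c != 0)
print(quantized[0][0], nonzero)              # only the DC term survives
```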
-- Matt Mahoney
From:Fulcrum
Subject:Re: Lossless compression of existing JPEG files
Date:9 Jan 2005 07:38:44 -0800
> Agreed, the only test/benchmarking site that includes a JPEG as
> a test file for lossless compression (www.maximumcompression).
> It did so (it appears) in order to test worst case performance
> of programs, because this has previously been viewed as
> "impossible" - nobody has pulled it off.
> - Darryl

No, I didn't add it only to test worst-case behaviour (I would have
tested the 1 million random digit file instead). But I agree it's a
nice side effect to see some compressors expand the file by 23% after
compressing :)

A long time before I started my site I had already noticed that some simple
compressors like szip and arj were able to (slightly) compress
jpg files, while other more advanced compressors like ACE had much more
difficulty doing so. That made it an interesting test case to add to
the site.
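
The worst-case behaviour is easy to reproduce with a general-purpose coder. The sketch below uses zlib on pseudo-random bytes as a stand-in for "already compressed" data. It is not one of the benchmarked programs, and zlib's expansion is only a few bytes rather than anything like 23%.

```python
import random
import zlib

def size_after(data: bytes) -> int:
    """Size in bytes after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))

rng = random.Random(2005)
random_bytes = bytes(rng.getrandbits(8) for _ in range(100_000))
text_bytes = b"Lossless compression of existing JPEG files\n" * 2000

print(len(random_bytes), size_after(random_bytes))  # slightly larger
print(len(text_bytes), size_after(text_bytes))      # far smaller
```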


Regards,
Werner Bergmans
From:Matt Mahoney
Subject:Re: Lossless compression of existing JPEG files
Date:14 Jan 2005 09:55:43 -0800
newstome@comcast.net wrote:
> "Knowing it is possible" isn't new knowledge in the least.

Well, I have to disagree. People will not try to solve a problem
unless they believe they can succeed. The jpeg benchmark at
maximumcompression.com has been posted for some time, but nobody ever
got very far with it because we all "know" that compressed data can't
be compressed again, so nobody bothered to try. But now that we know
it is possible I think it won't be long before others implement the
"obvious" solution. Now the only question is can you do better than
643,403 bytes?

-- Matt Mahoney
From:Jeff Gilchrist
Subject:Re: Lossless compression of existing JPEG files
Date:8 Jan 2005 03:48:35 -0800
Hi Severian,

Actually I do mention processor types and speeds. If you re-read my
post you will find:

"Test Machine: P4 1.8GHz, 512MB RAM, Win2000"

The sample files are special cases. I saw around 25% compression with
my own files. They only claim 30% on average. I was just pointing out
what the algorithm can do. The sample images do not look like "shit"
already, but even if they did not look that great, you would not want
to lose any more quality.

From what the company has said, they will be including the algorithm in
their Stuffit archiving software so it will be used for what you
suggest (archival storage, etc...).

Regards,
Jeff.
From:Severian
Subject:Re: Lossless compression of existing JPEG files
Date:Sun, 09 Jan 2005 00:51:39 GMT
On 8 Jan 2005 03:48:35 -0800, "Jeff Gilchrist"
wrote:

>Hi Severian,
>
>Actually I do mention processor types and speeds. If you re-read my
>post you will find:
>
>"Test Machine: P4 1.8GHz, 512MB RAM, Win2000"

Sorry, I missed it, even though I looked for it.

>The sample files are special cases. I saw around 25% compression with
>my own files. They only claim 30% on average. I was just pointing out
>what the algorithm can do. The sample images do not look like "shit"
>already, but even if they did not look that great, you would not want
>to lose any more quality.

I haven't seen them, but I find it hard to believe the second example
is useful; perhaps it's an inappropriate file for JPEG compression in
the first place.

As far as losing more quality, that is true.

>From what the company has said, they will be including the algorithm in
>their Stuffit archiving software so it will be used for what you
>suggest (archival storage, etc...).

Yes, but it also makes their announcement a bit less amazing than they
seem to want everyone to believe!

Anyone seriously dealing with images would archive the originals
(losslessly compressed). I'm not sure I understand the market for this
new compression. Porn collectors? Google.com and archive.org?

It just doesn't seem to be as big a deal as they make it out to be.

--
Sev
From:Phil Carmody
Subject:Re: Lossless compression of existing JPEG files
Date:09 Jan 2005 15:51:03 +0200
Severian writes:
> Anyone seriously dealing with images would archive the originals
> (losslessly compressed). I'm not sure I understand the market for this
> new compression. Porn collectors? Google.com and archive.org?
>
> It just doesn't seem to be as big a deal as they make it out to be.

I think it's more a geek thing. One-upmanship.

I remember people who used LHA, or ARJ or LZH or whatever it was that
was better than PKZip back in the late 80s or early 90s, waltzing around
the place pretending to be oh-so-superior to the lamo's that used the
utterly mediocre PKZip.

Never underestimate the gadget-affinity of geeks.

Phil
--
The gun is good. The penis is evil... Go forth and kill.
From:code_wrong
Subject:Re: Lossless compression of existing JPEG files
Date:Sun, 9 Jan 2005 16:21:56 -0000

"Phil Carmody" wrote in message
news:878y7290e0.fsf@nonospaz.fatphil.org...
> Severian writes:
>> Anyone seriously dealing with images would archive the originals
>> (losslessly compressed). I'm not sure I understand the market for this
>> new compression. Porn collectors? Google.com and archive.org?
>>
>> It just doesn't seem to be as big a deal as they make it out to be.
>
> I think it's more a geek thing. One-upmanship.
>
> I remember people who used LHA, or ARJ or LZH or whatever it was that
> was better than PKZip back in the late 80s or early 90s, waltzing around
> the place pretending to be oh-so-superior to the lamo's that used the
> utterly mediocre PKZip.
>
> Never underestimate the gadget-affinity of geeks.

eh? What are you talking about?
This is a million dollar breakthrough
From:Phil Carmody
Subject:Re: Lossless compression of existing JPEG files
Date:10 Jan 2005 15:16:33 +0200
"code_wrong" writes:

> "Phil Carmody" wrote in message
> news:878y7290e0.fsf@nonospaz.fatphil.org...
> > Severian writes:
> >> Anyone seriously dealing with images would archive the originals
> >> (losslessly compressed). I'm not sure I understand the market for this
> >> new compression. Porn collectors? Google.com and archive.org?
> >>
> >> It just doesn't seem to be as big a deal as they make it out to be.
> >
> > I think it's more a geek thing. One-upmanship.
> >
> > I remember people who used LHA, or ARJ or LZH or whatever it was that
> > was better than PKZip back in the late 80s or early 90s, waltzing around
> > the place pretending to be oh-so-superior to the lamo's that used the
> > utterly mediocre PKZip.
> >
> > Never underestimate the gadget-affinity of geeks.
>
> eh? What are you talking about?
> This is a million dollar breakthrough

Yes, you're right, no gadget has ever made a company anything like
a million dollars. Thank you so much for pointing out this undisputable
fact to me and the rest of comp.compression.

Phil
--
The gun is good. The penis is evil... Go forth and kill.
From:Darryl Lovato
Subject:Re: Lossless compression of existing JPEG files
Date:Mon, 10 Jan 2005 16:02:10 GMT
On 1/10/05 5:16 AM, in article 87hdlp5sr2.fsf@nonospaz.fatphil.org, "Phil
Carmody" wrote:

> "code_wrong" writes:
>
>> "Phil Carmody" wrote in message
>> news:878y7290e0.fsf@nonospaz.fatphil.org...
>>> Severian writes:
>>>> Anyone seriously dealing with images would archive the originals
>>>> (losslessly compressed). I'm not sure I understand the market for this
>>>> new compression. Porn collectors? Google.com and archive.org?
>>>>
>>>> It just doesn't seem to be as big a deal as they make it out to be.
>>>
>>> I think it's more a geek thing. One-upmanship.
>>>
>>> I remember people who used LHA, or ARJ or LZH or whatever it was that
>>> was better than PKZip back in the late 80s or early 90s, waltzing around
>>> the place pretending to be oh-so-superior to the lamo's that used the
>>> utterly mediocre PKZip.
>>>
>>> Never underestimate the gadget-affinity of geeks.
>>
>> eh? What are you talking about?
>> This is a million dollar breakthrough
>
> Yes, you're right, no gadget has ever made a company anything like
> a million dollars. Thank you so much for pointing out this undisputable
> fact to me and the rest of comp.compression.

I personally don't mind if what we invented and filed patents for, is called
a "gadget". I'm sure there were people that called the first light bulb a
gadget, the first jet engine a gadget, the first telephone a gadget, the
first personal computer, etc.

We have something that compresses possibly "billions" of files that were
previously "uncompressible" (1-3% isn't significant compression).

These files in question are generally "larger" than the average file size,
they are commonly stored on hard drives, flash cards, backed up, sent via
e-mail, and downloaded/viewed from the web. Now they can be compressed
20-40% on average, and in some cases up to 90%.

This is a very useful, and valuable, "gadget". :-)

And yes, a driving force is "one-upmanship"; it's called commercial
competition. There is nothing wrong with trying to make your product better
than the competition. It benefits users, and has happened from the
beginning of time - since "Commerce" was first invented.

- Darryl

> Phil
From:Phil Carmody
Subject:Re: Lossless compression of existing JPEG files
Date:11 Jan 2005 03:12:23 +0200
Darryl Lovato writes:
> On 1/10/05 5:16 AM, in article 87hdlp5sr2.fsf@nonospaz.fatphil.org, "Phil
> Carmody" wrote:

My message id should be in your references header, there's no need to have it
in the body text too.

> > "code_wrong" writes:
> >> "Phil Carmody" wrote in message
> >> news:878y7290e0.fsf@nonospaz.fatphil.org...
> >>> Never underestimate the gadget-affinity of geeks.
> >>
> >> eh? What are you talking about?
> >> This is a million dollar breakthrough
> >
> > Yes, you're right, no gadget has ever made a company anything like
> > a million dollars. Thank you so much for pointing out this undisputable
> > fact to me and the rest of comp.compression.

I forgot possibly the biggest gadget fad of them all in recent years -
cameras on mobile phones.

> I personally don't mind if what we invented and filed patents for, is called
> a "gadget". I'm sure there were people that called the first light bulb a
> gadget, the first jet engine a gadget, the first telephone a gadget, the
> first personal computer, etc.

Indeed.

> This is a very useful, and valuable, "gadget". :-)
>
> And yes, a driving force is "one-upmanship" it's called commercial
> competition. There is nothing wrong with trying to make your product better
> than the competition. It benefits users, and has happened from the
> beginning of time - since "Commerce" was first invented.

Absolutely. Geek pockets can be fairly deep, and there are certainly large
numbers of them. However, they can be exceptionally fickle too, so even
the best gadget can fail in the market. That's what VC was invented for.

Phil
--
The gun is good. The penis is evil... Go forth and kill.
From:newstome at comcast.net
Subject:Re: Lossless compression of existing JPEG files
Date:Wed, 12 Jan 2005 09:25:21 -0600
Phil Carmody wrote:
> Darryl Lovato writes:
>> On 1/10/05 5:16 AM, in article 87hdlp5sr2.fsf@nonospaz.fatphil.org, "Phil
>> Carmody" wrote:
>
> My message id should be in your references header, there's no need to have it
> in the body text too.
>
>> > "code_wrong" writes:
>> >> "Phil Carmody" wrote in message
>> >> news:878y7290e0.fsf@nonospaz.fatphil.org...
>> >>> Never underestimate the gadget-affinity of geeks.
>> >>
>> >> eh? What are you talking about?
>> >> This is a million dollar breakthrough
>> >
>> > Yes, you're right, no gadget has ever made a company anything like
>> > a million dollars. Thank you so much for pointing out this undisputable
>> > fact to me and the rest of comp.compression.
>
> I forgot possibly the biggest gadget fad of them all in recent years -
> cameras on mobile phones.

Unfortunately, taking over 5 seconds on a desktop processor puts it
out of the reasonable range for embedded processors like in phones.

If they can get the time down by an order of magnitude, then things
might start getting more interesting. Under half a second on a
desktop system would mean that you could browse compressed images on a
desktop system fairly painlessly (imagine -- burning 30% more images
on a CD -- great for people who take a lot of digital pictures). But
I'm not going to deal with a lag in image rendering just to fit a
little more on a CD. If you could get it under a second or two on an
embedded processor, then maybe it could be used to increase capacity
directly in things like cameras and phones -- but processor-intensive
algorithms take too long and burn too much power/battery on embedded
devices.
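
As a side note on the arithmetic: files that are 30% smaller let you fit somewhat more than 30% extra images in the same space. A quick sketch, where the function name is just for illustration:

```python
def extra_capacity(size_reduction: float) -> float:
    """Fractional gain in file count when each file shrinks by size_reduction."""
    return 1.0 / (1.0 - size_reduction) - 1.0

gain = extra_capacity(0.30)
print(f"{gain:.1%} more images in the same space")
```

So a 30% reduction works out to roughly 43% more pictures per CD, and the 20% floor mentioned elsewhere in the thread gives about 25% more.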

Anyway, I think this is a great development. It's just that they have
too much hype in their original press release (impossible to compress
already compressed data? Pshaw), and they need to improve efficiency
a bit....

--

That's News To Me!
newstome@comcast.net
From:Darryl Lovato
Subject:Re: Lossless compression of existing JPEG files
Date:Sun, 09 Jan 2005 01:59:45 GMT
On 1/8/05 4:51 PM, in article dlv0u09m1jv8t7r2fqmn0107lceau8mve3@4ax.com,
"Severian" wrote:

> On 8 Jan 2005 03:48:35 -0800, "Jeff Gilchrist"
> wrote:
>
>> Hi Severian,
>>
>> Actually I do mention processor types and speeds. If you re-read my
>> post you will find:
>>
>> "Test Machine: P4 1.8GHz, 512MB RAM, Win2000"
>
> Sorry, I missed it, even though I looked for it.
>
>> The sample files are special cases. I saw around 25% compression with
>> my own files. They only claim 30% on average. I was just pointing out
>> what the algorithm can do. The sample images do not look like "shit"
>> already, but even if they did not look that great, you would not want
>> to lose any more quality.
>
> I haven't seen them, but I find it hard to believe the second example
> is useful; perhaps it's an inappropriate file for JPEG compression in
> the first place.

The two sample files I sent Jeff were:

A portrait of a pretty young lady. (fully clothed) :-)
A NASA/Space Image of a star cluster.

The JPEG compression was not high enough to make noticeable artifacts in
either image. As stated elsewhere, the more compression jpeg gets, the
further % reduction we can get, so higher jpeg settings do make a
difference, but the nature of the original pre-jpeg'd image also makes a
difference.

> As far as losing more quality, that is true.
>
>> From what the company has said, they will be including the algorithm in
>> their Stuffit archiving software so it will be used for what you
>> suggest (archival storage, etc...).
>
> Yes, but it also makes their announcement a bit less amazing than they
> seem to want everyone to believe!

Tons of users send and store jpegs. Being able to compress them, even 20%
(about the least we get), without creating additional image loss adds up.

Many times, the user no longer has the original image (camera's storing
directly into jpeg on the compact flash card, etc). You can always take an
image, and recompress it with a higher jpeg compression setting, but that
makes the original JPEG "loss" permanent, AND adds more loss. Our
technology allows you to reduce the size without messing with the quality at
all.

> Anyone seriously dealing with images would archive the originals
> (losslessly compressed). I'm not sure I understand the market for this
> new compression. Porn collectors? Google.com and archive.org?

I'm sure it will be used for all the above :-)

> It just doesn't seem to be as big a deal as they make it out to be.

I suppose it depends on the user. Going from 1-3% compression of JPEGs to
20-40% (in some fringe cases much more) is a pretty significant advancement
to the field of lossless compression IMHO. Especially given the average
"large size" and "large popularity" of these files.

You have no idea how many people, in the past, put a JPEG (or many jpeg
files) into a StuffIt archive, then complain to us that we barely compressed
it at all. Even "reviewers", that should know better, do this more often
than you would think. The technology we announced will be applied to other
compressed file types as well.

- Darryl

> --
> Sev
From:code_wrong
Subject:Re: Lossless compression of existing JPEG files
Date:Sun, 9 Jan 2005 02:24:58 -0000

"Darryl Lovato" wrote in message
news:BE05CF2A.158FB%dlovato@allume.com...
> On 1/8/05 4:51 PM, in article dlv0u09m1jv8t7r2fqmn0107lceau8mve3@4ax.com,
> "Severian" wrote:
>
>> On 8 Jan 2005 03:48:35 -0800, "Jeff Gilchrist"
>> wrote:
>>
>>> Hi Severian,
>>>
>>> Actually I do mention processor types and speeds. If you re-read my
>>> post you will find:
>>>
>>> "Test Machine: P4 1.8GHz, 512MB RAM, Win2000"
>>
>> Sorry, I missed it, even though I looked for it.
>>
>>> The sample files are special cases. I saw around 25% compression with
>>> my own files. They only claim 30% on average. I was just pointing out
>>> what the algorithm can do. The sample images do not look like "shit"
>>> already, but even if they did not look that great, you would not want
>>> to lose any more quality.
>>
>> I haven't seen them, but I find it hard to believe the second example
>> is useful; perhaps it's an inappropriate file for JPEG compression in
>> the first place.
>
> The two sample files I sent Jeff were:
>
> A portrait of a pretty young lady. (fully clothed) :-)
> A NASA/Space Image of a star cluster.
>
> The JPEG compression was not high enough to make noticeable artifacts in
> either image. As stated elsewhere, the more compression jpeg gets, the
> further % reduction we can get, so higher jpeg settings do make a
> difference, but the nature of the original pre-jpeg'd image also makes a
> difference.
>
>> As far as losing more quality, that is true.
>>
>>>> From what the company has said, they will be including the algorithm in
>>> their Stuffit archiving software so it will be used for what you
>>> suggest (archival storage, etc...).
>>
>> Yes, but it also makes their announcement a bit less amazing than they
>> seem to want everyone to believe!
>
> Tons of users send and store jpegs. Being able to compress them, even 20%
> (about the least we get), without creating additional image loss adds up.
>
> Many times, the user no longer has the original image (camera's storing
> directly into jpeg on the compact flash card, etc). You can always take an
> image, and recompress it with a higher jpeg compression setting, but that
> makes the original JPEG "loss" permanent, AND adds more loss. Our
> technology allows you to reduce the size without messing with the quality at
> all.
>
>> Anyone seriously dealing with images would archive the originals
>> (losslessly compressed). I'm not sure I understand the market for this
>> new compression. Porn collectors? Google.com and archive.org?
>
> I'm sure it will be used for all the above :-)
>
>> It just doesn't seem to be as big a deal as they make it out to be.
>
> I suppose it depends on the user. Going from 1-3% compression of JPEGs to
> 20-40% (in some fringe cases much more) is a pretty significant advancement
> to the field of lossless compression IMHO. Especially given the average
> "large size" and "large popularity" of these files.
>
> You have no idea how many people, in the past, put a JPEG (or many jpeg
> files) into a StuffIt archive, then complain to us that we barely compressed
> it at all. Even "reviewers", that should know better, do this more often
> than you would think. The technology we announced will be applied to other
> compressed file types as well.

It's brilliant .. when can we see the algorithm ;-)
or the patents even
From:Darryl Lovato
Subject:Re: Lossless compression of existing JPEG files
Date:Sun, 09 Jan 2005 02:53:25 GMT
On 1/8/05 6:24 PM, in article 1105237477.60604.0@demeter.uk.clara.net,
"code_wrong" wrote:



>>> It just doesn't seem to be as big a deal as they make it out to be.
>>
>> I suppose it depends on the user. Going from 1-3% compression of JPEGs to
>> 20-40% (in some fringe cases much more) is a pretty significant advancement
>> to the field of lossless compression IMHO. Especially given the average
>> "large size" and "large popularity" of these files.
>>
>> You have no idea how many people, in the past, put a JPEG (or many jpeg
>> files) into a StuffIt archive, then complain to us that we barely compressed
>> it at all. Even "reviewers", that should know better, do this more often
>> than you would think. The technology we announced will be applied to other
>> compressed file types as well.
>
> It's brilliant .. when can we see the algorithm ;-)
> or the patents even

Our patent lawyer has advised us to not give out details of the inventions
at this time. There are actually 2 patents.

I assume you can all see the patents when the patent office posts them to
its site - they describe the processes involved. I'm not really sure when
they will be posted; the full utility patent application was
submitted prior to our announcement (and a provisional application a few
months ago as well).

The first consumer versions of software (StuffIt, etc) that include the new
technologies/inventions will be shipped late this qtr. Licensed versions
will be available shortly thereafter.

You'll have to take our word for it (and, more importantly, Werner Bergmans'
and Jeff Gilchrist's independent verification) for now.

It works.

- Darryl
From:Jeff Gilchrist
Subject:Re: Lossless compression of existing JPEG files
Date:8 Jan 2005 03:52:56 -0800
Phil,

Their JPEG algorithm will be part of a general purpose compressor so it
seems like a good comparison to me. Also, many people are not
"experts" in compression so do not even realize that programs such as
ZIP and RAR get almost no compression from image data like JPEG. I was
also posting the details so people could see time-wise how the
algorithm compares in speed.

Regards,
Jeff.
From:Matt Mahoney
Subject:Re: Lossless compression of existing JPEG files
Date:Sun, 09 Jan 2005 00:39:50 GMT
"Darryl Lovato" wrote in message
news:BE0424B5.15860%dlovato@allume.com...
> Yesterday, Allume Systems, a division of IMSI (and creators of the popular
> "StuffIt" compression technology) announced a new technology which allows
> users and developers to losslessly recompress JPEG files an average of 30%
> smaller than the original JPEG file (as well as other Compressed data
> types/files), WITHOUT additional data loss.
>
> While the "Compression" of existing compressed files has thus far been
> viewed as "impossible", the company has acquired and further developed, and
> submitted patents, on a technology which allows for Jpeg to be further
> compressed. The method is applicable to other compressed data types (Zip,
> MPEG, MP3 and others) to be losslessly re-compressed.

Interesting. I guess the idea might be to uncompress the data, then
compress it with a better model. This is probably trickier for lossy formats
like JPEG than lossless ones like Zip.

> Working Pre-release test tools have been sent to (and verified by)
> independent compression test sites, including:
>
>

The benchmark has one JPEG file, so far not yet updated. The best
compression currently is 3.5% (WinRK 2.0). (There is also a .bmp file that
was converted from a JPEG so it has JPEG like artifacts that make it easier
to compress losslessly.)

>

This benchmark hasn't been updated since 2002 and doesn't have any JPEG
files. However I'm not aware of any other benchmarks that do.

-- Matt Mahoney
From:Darryl Lovato
Subject:Re: Lossless compression of existing JPEG files
Date:Sun, 09 Jan 2005 04:11:02 GMT
On 1/8/05 4:39 PM, in article
q3%Dd.1638$KJ2.170@newsread3.news.atl.earthlink.net, "Matt Mahoney"
wrote:



>> Working Pre-release test tools have been sent to (and verified by)
>> independent compression test sites, including:
>>
>>
>
> The benchmark has one JPEG file, so far not yet updated. The best
> compression currently is 3.5% (WinRK 2.0). (There is also a .bmp file that
> was converted from a JPEG so it has JPEG like artifacts that make it easier
> to compress losslessly.)
>
>>
>
> This benchmark hasn't been updated since 2002 and doesn't have any JPEG
> files.

They posted "verification" as a reply directly to the newsgroup as a
follow-up to my original message - I'm not sure when their web sites will be
updated.

> However I'm not aware of any other benchmarks that do.

Agreed, there is only one test/benchmarking site that includes a JPEG as a
test file for lossless compression (www.maximumcompression). It did so (it
appears) in order to test worst case performance of programs, because this has
previously been viewed as "impossible" - nobody has pulled it off.

That view is just an understandable, but unfortunate, conflation of lossless
compression of "compressed data" with lossless compression of "random data".

Part of my job is to look into "compression claims" - no matter how crazy
they might seem (I've talked to a lot of people in the past because of this
- I won't mention names, but we all know who they are) - on the odd chance
that someone did have something.

It's my responsibility to make sure we don't "miss something" important.
Anyway, I understand the questions, and pessimism, which is why we felt it
was important to send working test tools to Jeff and Werner for independent
verification. Having an open mind is a good thing - without it, I would
have passed on this invention - which, it turned out, works! :-)

- Darryl

> -- Matt Mahoney
>
>
From:Uwe Herklotz
Subject:Re: Lossless compression of existing JPEG files
Date:Sun, 9 Jan 2005 09:52:20 +0100
"Matt Mahoney" wrote:

> Interesting. I guess the idea might be to uncompress the data, then
> compress it with a better model. This is probably trickier for lossy formats
> like JPEG than lossless ones like Zip.

I also thought about such an idea. But even for Zip files it seems
to be very difficult. Uncompressing the data and recompressing with
a better model is easy. But how to ensure that this can be reversed
i.e. it must be possible to recover the original Zip file. Without
knowing the parameters of the original Zip compression this is very
difficult, maybe impossible at all.

Regards
Uwe
From:Alexis Gallet
Subject:Re: Lossless compression of existing JPEG files
Date:Sun, 9 Jan 2005 12:10:42 +0100
"Uwe Herklotz" <_no_spam_@yahoo.com> wrote:

> I also thought about such an idea. But even for Zip files it seems
> to be very difficult. Uncompressing the data and recompressing with
> a better model is easy. But how to ensure that this can be reversed
> i.e. it must be possible to recover the original Zip file. Without
> knowing the parameters of the original Zip compression this is very
> difficult, maybe impossible at all.

Indeed ! Even with given parameters (eg size of the sliding window), I think
there are (in general) several valid zip files that correspond to the same
original file.

On the other hand, the JPEG case looks much easier to me, because I don't
think there is this uniqueness issue in the JPEG format.

On the compression side, you would have to :
1) entropy decode the JPEG file (ie, Huffman decode all the coeffs, then
DPCM decode the DC coeffs and RLE decode the AC coeffs) - but neither
dequantize the coeffs nor perform an inverse DCT
2) reencode the coeffs using a context-adaptive arithmetic coder, with
contexts tuned for 8x8 DCT data. Prior to that, one could further
decorrelate the DC coeffs by applying once or twice a reversible wavelet
filter (eg the 5/3 integer filter) on them. But the major compression gain
would imo be obtained by using the correlation between the AC coeffs of
adjacent blocks (same frequency & adjacent spatial locations), which is
completely ignored in JPEG.
3) don't forget to copy in the archive the huffman tables from the header of
the original JPEG file. Although they aren't needed to recover losslessly
the image itself, they are needed to recover a JPEG file which is
bit-identical to the original JPEG.

On the decompression side :
1) entropy decode the archive
2) reencode it with DPCM/RLE followed by Huffman, using the original JPEG
file's tables
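The reversible DPCM/RLE stage that both lists rely on can be sketched as below. This is a simplified illustration, not Allume's method and not a full JPEG codec: the function names are made up for this example, and the run-length scheme omits details of real JPEG such as the 16-zero ZRL symbol and the size/amplitude split. The point is only that the stage round-trips exactly, so nothing is lost by undoing it and recoding the coefficients with a better entropy coder.

```python
def dpcm_encode(dc_values):
    """Replace each DC coefficient with its difference from the previous block's DC."""
    prev = 0
    diffs = []
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs


def dpcm_decode(diffs):
    """Invert dpcm_encode by accumulating the differences."""
    prev = 0
    dc_values = []
    for d in diffs:
        prev += d
        dc_values.append(prev)
    return dc_values


def rle_encode(ac):
    """Encode AC coefficients as (zero_run, value) pairs; (0, 0) means 'rest is zero'."""
    pairs = []
    run = 0
    for v in ac:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:  # trailing zeros collapse into one end-of-block marker
        pairs.append((0, 0))
    return pairs


def rle_decode(pairs, length=63):
    """Invert rle_encode for a block with `length` AC coefficients."""
    ac = []
    for run, v in pairs:
        if (run, v) == (0, 0):
            break
        ac.extend([0] * run)
        ac.append(v)
    ac.extend([0] * (length - len(ac)))
    return ac


# Quantized DC values of consecutive blocks, and one block's AC coefficients.
dc = [52, 55, 61, 59, 70, 61, 76, 61]
ac = [5, 0, 0, -3, 1] + [0] * 58

assert dpcm_decode(dpcm_encode(dc)) == dc
assert rle_decode(rle_encode(ac)) == ac
```

A bit-identical JPEG additionally requires the original Huffman tables and header bytes to be stored verbatim, as step 3 on the compression side points out.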

Still, I'm not sure this scheme would yield a 20% improvement on the
high-quality JPEGs (quality factor >= 80)... I think that's the part where
Allume's results are the most impressive. And the slowness of their
algorithm (according to Jeff Gilchrist's figures) suggests that they are
doing something more sophisticated than what I'm guessing... Anyone got an
idea ?

Regards,
Alexis Gallet
From:Matt Mahoney
Subject:Re: Lossless compression of existing JPEG files
Date:Sun, 09 Jan 2005 18:01:35 GMT
"Uwe Herklotz" <_no_spam_@yahoo.com> wrote in message
news:34c9mhF49196rU1@individual.net...
> "Matt Mahoney" wrote:
>
> > Interesting. I guess the idea might be to uncompress the data, then
> > compress it with a better model. This is probably trickier for lossy formats
> > like JPEG than lossless ones like Zip.
>
> I also thought about such an idea. But even for Zip files it seems
> to be very difficult. Uncompressing the data and recompressing with
> a better model is easy. But how to ensure that this can be reversed
> i.e. it must be possible to recover the original Zip file. Without
> knowing the parameters of the original Zip compression this is very
> difficult, maybe impossible at all.
>
> Regards
> Uwe

I agree it would be difficult. Perhaps from the zip file you could tell
which program and version and options produced it, then regenerate the zip
file using the exact same algorithm. In theory you could do this by
unzipping and rezipping for each case until you find a match.

The real problem occurs when you come across an unknown format. Given a
string like

..ABC...ABC...ABC

LZ77 allows the third ABC to be coded as a pointer to the first or second
copy, or as literals, or as a mix of literals and pointers. All of these
would unzip correctly. The general solution would be to record which choice
was made, but this would negate most of the compression savings.
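The ambiguity is easy to demonstrate with a toy decoder. The sketch below is not the DEFLATE format; it assumes a made-up token stream in which a string is a run of literals and an (offset, length) tuple is a back-reference. Four different token streams decode to the same text, so the text alone cannot tell you which stream the original compressor emitted.

```python
def lz77_decode(tokens):
    """Decode a toy LZ77 stream: str = literals, (offset, length) = back-reference."""
    out = []
    for tok in tokens:
        if isinstance(tok, str):
            out.extend(tok)
        else:
            offset, length = tok
            for _ in range(length):
                out.append(out[-offset])  # copy one symbol from `offset` back
    return "".join(out)


text = "xABCyABCzABC"

parses = [
    ["xABCy", (4, 3), "z", (4, 3)],   # third ABC points at the second copy
    ["xABCy", (4, 3), "z", (8, 3)],   # third ABC points at the first copy
    ["xABCy", (4, 3), "zABC"],        # third ABC sent as literals
    ["xABCy", (4, 3), "zA", (8, 2)],  # mix of a literal and a pointer
]

for p in parses:
    assert lz77_decode(p) == text
```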

I recall a paper which proposed steganographic zip files using the choice of
coding to hide information. I doubt that such files could be compressed
further using this method.

-- Matt Mahoney
From:Konsta Karsisto
Subject:Re: Lossless compression of existing JPEG files
Date:Sun, 09 Jan 2005 23:59:20 GMT
Matt Mahoney wrote:
> ..ABC...ABC...ABC
>
> LZ77 allows the third ABC to be coded as a pointer to the first or second
> copy, or as literals, or as a mix of literals and pointers. ...

> I recall a paper which proposed steganographic zip files using the choice of
> coding to hide information.

There was a paper in DCC in 2003 or earlier where they used
this redundancy for, I think, error correction. Or, it could
have been something else, too. ;-) Unfortunately, I couldn't
find the reference.


--
KKK
From:John Reiser
Subject:LZ77 choice of codes to convey more information
Date:Wed, 12 Jan 2005 10:05:10 -0800
Matt Mahoney wrote:
> [snip] Given a string like
>
> ..ABC...ABC...ABC
>
> LZ77 allows the third ABC to be coded as a pointer to the first or second
> copy, or as literals, or as a mix of literals and pointers. All of these
> would unzip correctly. The general solution would be to record which choice
> was made, but this would negate most of the compression savings.
>
> I recall a paper which proposed steganographic zip files using the choice of
> coding to hide information. [snip]

Please provide more info about that paper.

For a fixed encoder, even if you require that the coded output for the whole
string be the shortest possible, then there still are choices. The choices
form a lattice, the paths through the lattice can be enumerated, and choosing
a specific path conveys log2(total_paths) additional bits of information.
In practice the amount can be 0.1% to a few percent. But it's hardly hidden,
because there are a few obvious canonical paths: always choose the smallest
offset, favor a literal over a match of the same cost (or vice versa), etc.
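A small dynamic program makes the log2(total_paths) figure concrete. The cost model here is an assumption chosen for simplicity, not that of any real encoder: every token (one literal, or one match of at least min_match symbols) costs the same, so "shortest" means "fewest tokens". The function counts how many distinct parses achieve that minimum.

```python
import math


def count_min_token_parses(text, min_match=3):
    """Return (fewest tokens needed, number of distinct parses achieving it)."""
    n = len(text)
    best = [0] * (n + 1)  # best[i]: fewest tokens to encode text[i:]
    ways = [0] * (n + 1)  # ways[i]: number of parses of text[i:] using best[i] tokens
    ways[n] = 1
    for i in range(n - 1, -1, -1):
        options = [i + 1]  # emit one literal
        for offset in range(1, i + 1):
            length = 0
            while i + length < n and text[i + length] == text[i + length - offset]:
                length += 1
                if length >= min_match:
                    options.append(i + length)  # emit a match (offset, length)
        best[i] = 1 + min(best[j] for j in options)
        ways[i] = sum(ways[j] for j in options if best[j] + 1 == best[i])
    return best[0], ways[0]


tokens, paths = count_min_token_parses("xABCyABCzABC")
hidden_bits = math.log2(paths)  # information carried by the choice of path
```

For the short string above there are two minimum-token parses (the last ABC can point at either earlier copy), so the choice carries one bit. With a real bit-cost model the counts differ, but the counting idea is the same.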

--
From:Jeff Gilchrist
Subject:Re: Lossless compression of existing JPEG files
Date:16 Jan 2005 08:40:21 -0800
Matt Mahoney wrote:

> They claim it's lossless. Steganography would be a good test of
> this.

Good idea. To test this out, I grabbed OutGuess 2.0
(http://www.outguess.org/) and steganographically hid a message within
one of the test JPG files:

../outguess -k "willthiswork" -d input.txt DSCN3974.jpg stegged.jpg

The original DSCN3974.jpg test image is 1114198 bytes. The stegged
image stegged.jpg became much smaller at 329258 bytes.

Compressing the stegged image with the Allume JPEG algorithm, brought
that size down to 229783 bytes (30% smaller). As in previous tests,
when I uncompressed the JPEG I got back an identical file (SHA-1
confirmed).
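A round-trip check of this kind can be sketched in a few lines. The Allume tool is not public, so zlib stands in for the compressor purely as a placeholder; the helper names are invented for this example.

```python
import hashlib
import zlib


def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 digest of a byte string."""
    return hashlib.sha1(data).hexdigest()


def is_lossless_roundtrip(original: bytes, compress, decompress) -> bool:
    """True if decompress(compress(x)) reproduces x with an identical SHA-1."""
    restored = decompress(compress(original))
    return sha1_hex(restored) == sha1_hex(original)


# Placeholder data and codec; in the real test this would be the JPEG file's
# bytes and the compressor under evaluation.
sample = bytes(range(256)) * 64
ok = is_lossless_roundtrip(sample, zlib.compress, zlib.decompress)
```

Comparing digests of the whole file, rather than decoded pixels, is what distinguishes "bit-identical file" from "same image".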

I used outguess again to try and retrieve the encoded message and I was
successful:

../outguess -k "willthiswork" -r stegged.jpg output.txt

The decoded message matched the encoded one, bit for bit (SHA-1).

e3b723bc1571d1a432978c705d8ec7a38e868faa *input.txt
e3b723bc1571d1a432978c705d8ec7a38e868faa *output.txt

So it looks like their algorithm truly is lossless.

Jeff.
From:SuperFly
Subject:Re: Lossless compression of existing JPEG files
Date:Mon, 17 Jan 2005 20:33:31 +0100
On 16 Jan 2005 08:40:21 -0800, "Jeff Gilchrist"
wrote:

>Matt Mahoney wrote:
>
>> They claim it's lossless. Steganography would be a good test of
>> this.
>
>Good idea. To test this out, I grabbed OutGuess 2.0
>(http://www.outguess.org/) and steganographically hid a message within
>one of the test JPG files:

[snip]

>The decoded message matched the encoded one, bit for bit (SHA-1).
>
>e3b723bc1571d1a432978c705d8ec7a38e868faa *input.txt
>e3b723bc1571d1a432978c705d8ec7a38e868faa *output.txt
>So it looks like their algorithm truly is lossless.

If you compress a jpeg file, decompress it, and end up with an
identical copy of the original file, the method is obviously lossless,
and you can be sure the steganographically hidden message is still in there
because it's the same file.

I don't know if your testing license allows it, but as several other
people suggested, it might be a good idea to test the performance of
the compressor with an arithmetic-encoded jpeg file, and then see how
it handles "broken" jpeg files by systematically injecting random
data into the jpeg header, huffman data and huffman tables.

Just to see what the compressor is, and is not capable of.
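A corruption test along these lines might look like the sketch below. It is only an outline under stated assumptions: zlib again stands in for the unavailable compressor, the helper names are hypothetical, and the corruption is applied to arbitrary byte positions rather than being targeted at specific JPEG segments.

```python
import random
import zlib


def corrupt(data: bytes, n_bytes: int, seed: int = 0) -> bytes:
    """Return a copy of data with n_bytes distinct positions changed at random."""
    rng = random.Random(seed)
    buf = bytearray(data)
    for pos in rng.sample(range(len(buf)), n_bytes):
        buf[pos] ^= rng.randrange(1, 256)  # XOR with a nonzero value always changes the byte
    return bytes(buf)


def survives(data: bytes, compress, decompress) -> bool:
    """True if the codec round-trips the (possibly broken) input without error."""
    try:
        return decompress(compress(data)) == data
    except Exception:
        return False


original = bytes(range(256)) * 16          # placeholder for a JPEG file's bytes
broken = corrupt(original, n_bytes=10)
changed = sum(a != b for a, b in zip(original, broken))
result = survives(broken, zlib.compress, zlib.decompress)
```

Recording the compressed size for increasing n_bytes would also show how quickly the gain degrades once the input stops being a valid JPEG.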

-SF-
From:Matt Mahoney
Subject:Re: Lossless compression of existing JPEG files
Date:14 Jan 2005 20:30:05 -0800
Fabio Buffoni wrote:
> > Results for my a10.jpg file:
> > On my AMD Athlon 1800+ a10.jpg 842468 -> 643403 76.37% in about 4 sec
>
> A couple of tests recompressing a10.jpg using jpegcrop:
>
> 856024 (huffman default)
> 824441 (huffman optimized)
> 758340 (order-0 arithmetic)
>
> 780872 (progressive - huffman default)
> 780872 (progressive - huffman optimized)
> 737555 (progressive - arithmetic)

But can you reverse it and get an identical file?

> 643403 seem to be a quite good work.

Especially since the image was apparently compressed with high quality
setting, which makes it harder to compress losslessly because not much
noise was removed. I looked at the quantization tables and the values
are all in the range 1-3 (DC = 1). Also the chroma is not downsampled.
(Also it seems to be nonstandard if I read the JPEG standard right -
the 4 Huffman and 2 quantization tables are concatenated without
separate headers for each table).

Still, you could improve over the JPEG version of arithmetic coding.
It uses a multiplication free algorithm, essentially rounding the
binary decision probability of the least probable symbol to the nearest
power of 2. That's got to cost a few percent.
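The size of that penalty can be estimated with a toy model. The sketch takes the description above at face value (probability of the less probable symbol rounded to the nearest power of 2, here nearest in the log domain) and is not a model of the actual JPEG arithmetic coder, which uses a table-driven probability estimator. It compares the ideal cost, the binary entropy H(p), with the cost of coding with the rounded probability q.

```python
import math


def entropy(p):
    """Ideal cost in bits per binary decision when the less probable symbol has probability p."""
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def round_to_power_of_two(p):
    """Round p (0 < p <= 0.5) to the nearest 2**-k with k >= 1, nearest in the log domain."""
    k = max(1, round(-math.log2(p)))
    return 2.0 ** -k


def coding_cost(p, q):
    """Average bits per decision when the true probability is p but the coder assumes q."""
    return -(p * math.log2(q) + (1 - p) * math.log2(1 - q))


def overhead(p):
    """Relative excess cost caused by rounding p to a power of two."""
    return coding_cost(p, round_to_power_of_two(p)) / entropy(p) - 1.0


for p in (0.05, 0.1, 0.2, 0.3, 0.35, 0.45):
    print(f"p={p:.2f}  q={round_to_power_of_two(p):.4f}  overhead={100 * overhead(p):.2f}%")
```

The overhead is zero when p is already a power of two and grows toward the midpoints between them, which is consistent with a loss of a few percent overall.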

> I'm wondering if the algorithm is completely lossless or if it preserves
> only pixels colors. How does it work if the image has steganography
> information hidden in it?
>
> FB

They claim it's lossless. Steganography would be a good test of this.
From:Fabio Buffoni
Subject:Re: Lossless compression of existing JPEG files
Date:Tue, 18 Jan 2005 15:52:05 +0100
> But can you reverse it and get an identical file?

I think you can rebuild the original jpeg. Here is my idea.
If the jpeg is a valid one, you can pack it before the huffman coding
step. Instead of rebuilding the huffman tree(s) you can save them directly
(in a compressed form, of course) so as to be able to recover the original
file. The overhead should be insignificant. This method will of course
also be able to keep information hidden with redundant huffman code steganography.

FB
From:Fabio Buffoni
Subject:Re: Lossless compression of existing JPEG files
Date:Mon, 17 Jan 2005 19:30:44 +0100

> But can you reverse it and get an identical file?

Of course I can't.

Jeff said the images were all bit for bit identical. This means that the
compressor should also survive redundant huffman code steganography.
Their work really impresses me.

I'd like to test the program using invalid jpegs, and see how the
compression varies when 1..n bits/bytes are randomly changed in the data stream.

A second test could also be done by removing the huffman coder from a jpeg
compression program and trying to pack the result with paq or winrk.
This has the problem that you can't reverse it, but you can figure out how
good their work is.
And I think it's very good, both technically and commercially.

FB
From:SuperFly
Subject:Re: Lossless compression of existing JPEG files
Date:Sat, 15 Jan 2005 11:59:19 +0100
On 14 Jan 2005 20:30:05 -0800, "Matt Mahoney"
wrote:

>> 856024 (huffman default)
>> 824441 (huffman optimized)
>> 758340 (order-0 arithmetic)
>>
>> 780872 (progressive - huffman default)
>> 780872 (progressive - huffman optimized)
>> 737555 (progressive - arithmetic)
>
>But can you reverse it and get an identical file?
>
>> 643403 seem to be a quite good work.
>
>Especially since the image was apparently compressed with high quality
>setting, which makes it harder to compress losslessly because not much
>noise was removed. I looked at the quantization tables and the values
>are all in the range 1-3 (DC = 1). Also the chroma is not downsampled.
>(Also it seems to be nonstandard if I read the JPEG standard right -
>the 4 Huffman and 2 quantization tables are concatenated without
>separate headers for each table).

I tested a raw bmp version of the a10.jpg file using several image
suites and image formats (including jpeg2000) and noticed that nothing
I used could get it smaller than roughly 720,000 bytes losslessly.

But I did notice that jpeg2000 could get it down to roughly 600,000 bytes
when using a 98% quality setting, and much smaller still with just a few
extra percent of quality loss.

I also noticed jpeg2000 produced far better visual results than
regular jpeg. I think the jpeg2000 wavelet function just preserves
more data than a jpeg dct given the same input data. And I couldn't
distinguish the 98% jpeg2000 compressed image from the original with
the naked eye. So I wouldn't be surprised if the original jpg could
theoretically be built back from a 95%/98% jpeg2000 file. But to be sure
I think one would need to know how much actual data is
stored/preserved in a jpeg/jpeg2000 file at a given quality
level, which is a question for the (jpeg) image experts ..

However, if a 98% quality-level jpeg2000 file hasn't preserved enough
data to build back the ??% quality-level jpeg a10.jpg file, I agree
they have outperformed everything that's out there by roughly 10%, which is
pretty amazing, especially if you consider they claim they can do the
same with audio and video formats. And I'd certainly like to know how
they did that, and what else is possible with their method.

-SF-
From:Fulcrum
Subject:Re: Lossless compression of existing JPEG files
Date:9 Jan 2005 07:22:24 -0800
> > Working Pre-release test tools have been sent to (and verified by)
> > independent compression test sites, including:
> >
> >
>
> The benchmark has one JPEG file, so far not yet updated. The best
> compression currently is 3.5% (WinRK 2.0). (There is also a .bmp file that
> was converted from a JPEG so it has JPEG like artifacts that make it easier
> to compress losslessly.)
>
> -- Matt Mahoney

I will update my website when the new Stuffit with this technology is
shipped. The application I tested is just an experimental testbed to
compress jpeg's.

Results for my a10.jpg file:
On my AMD Athlon 1800+ a10.jpg 842468 -> 643403 76.37% in about 4 sec
---
Regards,
Werner Bergmans
From:Fabio Buffoni
Subject:Re: Lossless compression of existing JPEG files
Date:Fri, 14 Jan 2005 19:05:24 +0100
> Results for my a10.jpg file:
> On my AMD Athlon 1800+ a10.jpg 842468 -> 643403 76.37% in about 4 sec

A couple of tests recompressing a10.jpg using jpegcrop:

856024 (huffman default)
824441 (huffman optimized)
758340 (order-0 arithmetic)

780872 (progressive - huffman default)
780872 (progressive - huffman optimized)
737555 (progressive - arithmetic)

643403 seems to be quite good work.

I'm wondering if the algorithm is completely lossless or if it preserves
only pixels colors. How does it work if the image has steganography
information hidden in it?

FB
From:Guido Vollbeding
Subject:Re: Lossless compression of existing JPEG files
Date:Mon, 17 Jan 2005 11:10:48 +0100
Fabio Buffoni wrote:
>
> > Results for my a10.jpg file:
> > On my AMD Athlon 1800+ a10.jpg 842468 -> 643403 76.37% in about 4 sec
>
> A couple of tests recompressing a10.jpg using jpegcrop:

Yes, that's easy to use and test :).

> 856024 (huffman default)
> 824441 (huffman optimized)
> 758340 (order-0 arithmetic)
>
> 780872 (progressive - huffman default)
> 780872 (progressive - huffman optimized)

Progressive Huffman always includes "optimized", so they must be the same.

> 737555 (progressive - arithmetic)
>
> 643403 seem to be a quite good work.

That's about 13% less than the "state of the art" (progressive arithmetic).
Nice, but that's about the same amount as the advantage of arithmetic
over Huffman. Now see the history: *Despite* that advantage of
arithmetic over Huffman coding, the arithmetic coding variant is
virtually unused to this day - due to the patent encumbrance!
Again: You *could* reduce your JPEG image space requirements by about
15% *immediately* by using arithmetic instead of Huffman coding, but
*nobody* is doing so.
So why should one assume that people will use another method under these
circumstances?
It would be another story if a new *unencumbered* method were presented,
but as it stands it's not very impressive.
(BTW, the JPEG arithmetic coding patents are to expire in about 5 or so
years - we will see what this changes in usage...)
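For reference, the percentages for a10.jpg can be recomputed from the byte counts quoted in this thread. The helper below is only a convenience for that arithmetic; the labels are descriptive and not part of any tool's output.

```python
def percent_smaller(before: int, after: int) -> float:
    """Size reduction of `after` relative to `before`, in percent."""
    return 100.0 * (before - after) / before


# Byte counts for a10.jpg as posted earlier in the thread.
comparisons = {
    "Allume vs original file": (842468, 643403),
    "Allume vs progressive arithmetic": (737555, 643403),
    "sequential arithmetic vs default Huffman": (856024, 758340),
    "sequential arithmetic vs optimized Huffman": (824441, 758340),
    "progressive arithmetic vs progressive Huffman": (780872, 737555),
}

for label, (before, after) in comparisons.items():
    print(f"{label}: {percent_smaller(before, after):.1f}% smaller")
```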

Regards
Guido
From:Michel Bardiaux
Subject:Re: Lossless compression of existing JPEG files
Date:Mon, 17 Jan 2005 13:44:59 +0100
Guido Vollbeding wrote:
> Fabio Buffoni wrote:
>
>>>Results for my a10.jpg file:
>>>On my AMD Athlon 1800+ a10.jpg 842468 -> 643403 76.37% in about 4 sec
>>
>>A couple of tests recompressing a10.jpg using jpegcrop:
>
>
> Yes, that's easy to use and test :).
>
>
>>856024 (huffman default)
>>824441 (huffman optimized)
>>758340 (order-0 arithmetic)
>>
>>780872 (progressive - huffman default)
>>780872 (progressive - huffman optimized)
>
>
> Progressive Huffman always includes "optimized", so they must be the same.
>
>
>>737555 (progressive - arithmetic)
>>
>>643403 seem to be a quite good work.
>
>
> That's about 13% less than the "state of the art" (progressive arithmetic).
> Nice, but that's about the same amount as the advantage of arithmetic
> over Huffman. Now see the history: *Despite* that advantage of
> arithmetic over Huffman coding, the arithmetic coding variant is
> until today virtually unused - due to the patent encumbering!
> Again: You *could* reduce your JPEG image space requirements by about
> 15% *immediately* by using arithmetic instead of Huffman coding, but
> *nobody* is doing so.
> So why should one assume that people will use another method under these
> circumstances?
> Another story would be if a new *unencumbered* method would be presented,
> but so it's not very impressive.

IIRC both the arithmetic coder *and* decoder are patented. If the touted
new recoder is patented the same way, it is likely to go the same way as
GIF-LZ, RSA, and the flip-top toothpaste tube: you can do without at an
acceptable cost, so you do without.

The developers of this new codec should license the decoder for free,
and allow use of the encoder for development and demonstration purposes,
so there would be an incentive to include the decoder in many software
products.

> (BTW, the JPEG arithmetic coding patents are to expire in about 5 or so
> years - we will see what this changes in usage...)

If my take on arithmetic coding is right, probably not, since every
software piece dealing with JPEG would have to be upgraded to include
the arithmetic decoder.

>
> Regards
> Guido


--
Michel Bardiaux
Peaktime Belgium S.A. Bd. du Souverain, 191 B-1160 Bruxelles
Tel : +32 2 790.29.41
From:Guido Vollbeding
Subject:Re: Lossless compression of existing JPEG files
Date:Mon, 17 Jan 2005 14:02:58 +0100
Michel Bardiaux wrote:
>
> > (BTW, the JPEG arithmetic coding patents are to expire in about 5 or so
> > years - we will see what this changes in usage...)
>
> If my take on arithmetic coding is right, probably not, since every
> software piece dealing with JPEG would have to be upgraded to include
> the arithmetic decoder.

Which isn't difficult at all - almost all applications use the IJG codec
as a "black-box" anyway, and so you just grab a new version with enabled
arithmetic coding support and you are done - no other adaptation necessary!
(It's available today in my Jpegcrop program and enhanced library source.)

Regards
Guido
From:Thomas Richter
Subject:Re: Lossless compression of existing JPEG files
Date:17 Jan 2005 13:37:37 GMT
Hi Guido,

> Which isn't difficult at all - almost all applications use the IJG codec
> as a "black-box" anyway,

Where do you take your numbers from? Clearly, IJG is *popular*, and it
might be used for a lot of low-cost end-user applications, but it is
not that heavily used in the B-B market segment for various
reasons. Companies like Kodak or Siemens don't use IJG, for
example. Besides, JPEG is also part of a lot of other standards
(PDF,DICOM to name two) and these standards might also demand various
restrictions on the layout of the JPEG stream, or may require updating
the corresponding "host software". In other words, it is not at all as
easy as you may wish.

The market works a bit differently: If people can "compress" their JPEGs
using a custom compressor like "Zip" or "StuffIt", things might turn
out easier for them - even though from a scientific p.o.v. this usage
pattern is pretty pointless, and this feels quite "wrong" from my side
as well. However, such is life.

So long,
Thomas
From:Guido Vollbeding
Subject:Re: Lossless compression of existing JPEG files
Date:Mon, 17 Jan 2005 15:21:24 +0100
Thomas Richter wrote:
>
> > Which isn't difficult at all - almost all applications use the IJG codec
> > as a "black-box" anyway,
>
> Where do you take your numbers from? Clearly, IJG is *popular*, and it
> might be used for a lot of low-cost end-user applications, but it is
> not that heavily used in the B-B market segment for various
> reasons. Companies like Kodak or Siemens don't use IJG, for
> example. Besides, JPEG is also part of a lot of other standards
> (PDF,DICOM to name two) and these standards might also demand various
> restrictions on the layout of the JPEG stream, or may require updating
> the corresponding "host software". In other words, it is not at all as
> easy as you may wish.

OK, I don't speak for the B-B market, I rather speak for the free and
open source segment.
But, contrary to your position (and perhaps other than in the J2K area),
I think that IJG is a *superior* JPEG implementation compared to many
commercial ones. If you ask Tom Lane (organizer of the IJG) or me, we could
tell you dozens of stories of erroneous proprietary JPEG implementations
which we had to deal with for interoperability issues.
The IJG implementation is only inferior in my eyes compared to my
enhanced version, and given the new features it brings and future
potential, people will be well advised to prefer such an implementation
over a proprietary one.
It was, it is, and it will be the open source IJG JPEG implementation
which brings the most advanced JPEG features to the user. Why do you
think that people today are able to rotate or crop their digicam JPEGs
losslessly? This is only due to the IJG jpegtran features introduced
in 1998! I'm not aware of any commercial entity which ever presented
such features independent from or before IJG. And more features are
to come...
I tell you this: The commercial market is not particularly interested
in practical JPEG improvements, because THEY CAN'T MAKE MONEY FROM IT!
So they rather pursue other "technologies" such as Wavelets and JPEG2000
in particular, which are unsuited and inferior, but which have a much
greater potential TO MAKE MONEY FROM.
If you are on this latter train, beware! You won't be able, now or in
the foreseeable future, to compete with widely used JPEG and IJG
features in particular. Your only chance will be to convince naive
people in other obsolete businesses ("digital cinema" as I've seen on
the commercial JPEG [JPEG2000] site, or established technical "medicine"
as you mentioned earlier, which is undoubtedly the greatest [and
obsolete] business on earth).

Regards
Guido
From:Severian
Subject:Re: Lossless compression of existing JPEG files
Date:Tue, 18 Jan 2005 14:30:51 GMT
On Mon, 17 Jan 2005 15:21:24 +0100, Guido Vollbeding
wrote:

>Thomas Richter wrote:
>>
>> > Which isn't difficult at all - almost all applications us