Current group: comp.compression

Re: too much information!

From: John Baez
Subject: Re: too much information!
Date: Tue, 11 Jan 2005 06:31:19 +0000 (UTC)

In article <87u0po39mw.fsf@nonospaz.fatphil.org>,
Phil Carmody wrote:

>Be warned. RAND's data is complete crap. It's trivially
>compressible by a non-negligible fraction of a percent due
>to incompetent unbiasing. See a comp.compression archive
>for estimates of how non-random it is.

Interesting. I haven't found those estimates yet, and would like
a more specific reference, but this reminds me of Stanislaw Lem's
SF comedy "His Master's Voice", where alien radio transmissions
go unnoticed at first and the punch cards holding the data get
thrown out... then used to compile a table of random numbers...
but then, by analyzing the data in the table, they realize it's
a fascinating communication! I don't want to give the rest of the
story away, but it's a wacky reflection on the concepts of randomness
and information.

....................................................................

"Generation of random numbers is too important to be left
to chance..."
Donald Knuth

How many exabytes of information are there in a raindrop?
See http://math.ucr.edu/home/baez/information.html

From: Vic Drastik
Subject: Re: too much information!
Date: Wed, 12 Jan 2005 08:57:53 +1100

"John Baez" wrote in message
news:crvrrn$rov$1@glue.ucr.edu...
> In article <87u0po39mw.fsf@nonospaz.fatphil.org>,
> Phil Carmody wrote:
>
> >Be warned. RAND's data is complete crap. It's trivially
> >compressible by a non-negligible fraction of a percent due
> >to incompetent unbiasing. See a comp.compression archive
> >for estimates of how non-random it is.
>
> Interesting. I haven't found those estimates yet, and would like
> a more specific reference, but this reminds me of Stanislaw Lem's
> SF comedy "His Master's Voice", where alien radio transmissions
> go unnoticed at first and the punch cards holding the data get
> thrown out... then used to compile a table of random numbers...
> but then, by analyzing the data in the table, they realize it's
> a fascinating communication! I don't want to give the rest of the
> story away, but it's a wacky reflection on the concepts of randomness
> and information.
>
> ...................................................................
>
> "Generation of random numbers is too important to be left
> to chance..."
> Donald Knuth


This quote is from Robert Coveyou, not Knuth. See the following, or
search for it:
http://www.quotationspage.com/quote/461.html


What Knuth *did* say is:

"Random numbers should not be generated with a method chosen at random."
—Donald E. Knuth


Vic

From: Phil Carmody
Subject: Re: too much information!
Date: 11 Jan 2005 10:09:47 +0200

baez@galaxy.ucr.edu (John Baez) writes:

> In article <87u0po39mw.fsf@nonospaz.fatphil.org>,
> Phil Carmody wrote:
>
> >Be warned. RAND's data is complete crap. It's trivially
> >compressible by a non-negligible fraction of a percent due
> >to incompetent unbiasing. See a comp.compression archive
> >for estimates of how non-random it is.
>
> Interesting. I haven't found those estimates yet, and would like
> a more specific reference,

Matt was the mastermind behind the analysis, and I pulled out some
explicit entropy-rate figures based on his 50-columns theory in
Message-ID: <87vfhyv1wc.fsf@nonospaz.fatphil.org>

It appears that there are maybe only ~3.3213 bits per digit, perhaps
fewer, rather than the ideal log2(10) = ~3.3219. Over the whole
million-digit file that is a shortfall of about 600 bits, which
corresponds to about 180 digits of redundancy. I consider that to be
quite a lot.
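
The arithmetic behind the 180 figure can be checked in a few lines of Python. This is only a sketch: the function name `redundancy_digits` is made up for illustration, and the 3.3213 bits-per-digit estimate is taken from the post as given, not recomputed from the RAND file.

```python
import math

def redundancy_digits(ideal_bits, estimated_bits, n_digits):
    """Convert a per-digit entropy shortfall into an equivalent number
    of redundant decimal digits over a file of n_digits digits."""
    shortfall_bits = (ideal_bits - estimated_bits) * n_digits
    # Each ideal decimal digit carries ideal_bits bits of information.
    return shortfall_bits / ideal_bits

N = 1_000_000       # digits in the RAND table
ESTIMATED = 3.3213  # bits per digit, as quoted in the post (not verified here)

# Using the rounded ideal value from the post.
print(redundancy_digits(3.3219, ESTIMATED, N))
# Using the unrounded ideal value log2(10).
print(redundancy_digits(math.log2(10), ESTIMATED, N))
```

With the rounded 3.3219 the result is a little over 180 digits; with the unrounded log2(10) it is slightly higher, so the fourth decimal place of the estimate matters a good deal here.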

The analysis was not taken particularly deep, but the thread contains
enough clues on how to progress (I was hoping Matt would continue the
investigation, as the idea was his baby).
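
For anyone wanting to pick this up, here is a minimal sketch of one way to estimate a per-digit entropy that depends on column position. It is not the analysis from the referenced thread: it assumes the "50 columns" refers to a table laid out 50 digits per line, it uses a simple order-0 frequency (plug-in) estimate per column, and the names `plug_in_entropy` and `per_column_entropy` are hypothetical.

```python
import math
from collections import Counter

def plug_in_entropy(symbols):
    """Order-0 entropy in bits per symbol, from observed frequencies."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def per_column_entropy(digits, width=50):
    """Average bits per digit when each column position gets its own
    frequency table. `digits` is one string with line breaks removed;
    width=50 is an assumption about the table layout."""
    total = len(digits)
    # digits[i::width] collects every digit that falls in column i.
    columns = [digits[i::width] for i in range(width)]
    # Weight each column's entropy by its share of the data.
    return sum(len(col) / total * plug_in_entropy(col)
               for col in columns if col)

# Toy input: every line is "01234", so each column is constant.
print(per_column_entropy("01234" * 4, width=5))
```

One caveat on reading the result: a plug-in estimate is biased low on finite samples, by roughly (k-1)/(2N ln 2) bits for k symbols and N samples. With 10 symbols and 20,000 digits per column that is a few ten-thousandths of a bit, the same order as the gap under discussion, so a raw figure from this sketch would need a bias correction or a comparison against known-good random digits before drawing conclusions.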


Phil
--
The gun is good. The penis is evil... Go forth and kill.
   
