From: John Baez
Subject: Re: too much information!
Date: Tue, 11 Jan 2005 06:31:19 +0000 (UTC)

In article <87u0po39mw.fsf@nonospaz.fatphil.org>, Phil Carmody wrote:
>Be warned. RAND's data is complete crap. It's trivially
>compressible by a non-negligible fraction of a percent due
>to incompetant unbiassing. See a comp.compression archive
>for estimates how non-random it is.
Interesting. I haven't found those estimates yet, and would like a more specific reference, but this reminds me of Stanislaw Lem's SF comedy "His Master's Voice", where alien radio transmissions go unnoticed at first and the punch cards holding the data get thrown out... then used to compile a table of random numbers... but then, by analyzing the data in the table, they realize it's a fascinating communication! I don't want to give the rest of the story away, but it's a wacky reflection on the concepts of randomness and information.
....................................................................
"Generation of random numbers is too important to be left to chance..." Donald Knuth
How many exabytes of information are there in a raindrop? See http://math.ucr.edu/home/baez/information.html

From: Vic Drastik
Subject: Re: too much information!
Date: Wed, 12 Jan 2005 08:57:53 +1100

"John Baez" wrote in message news:crvrrn$rov$1@glue.ucr.edu...
> In article <87u0po39mw.fsf@nonospaz.fatphil.org>,
> Phil Carmody wrote:
>
> >Be warned. RAND's data is complete crap. It's trivially
> >compressible by a non-negligible fraction of a percent due
> >to incompetant unbiassing. See a comp.compression archive
> >for estimates how non-random it is.
>
> Interesting. I haven't found those estimates yet, and would like
> a more specific reference, but this reminds me of Stanislaw Lem's
> SF comedy "His Master's Voice", where alien radio transmissions
> go unnoticed at first and the punch cards holding the data get
> thrown out... then used to compile a table of random numbers...
> but then, by analyzing the data in the table, they realize it's
> a fascinating communication! I don't want to give the rest of the
> story away, but it's a wacky reflection on the concepts of randomness
> and information.
>
> ...................................................................
>
> "Generation of random numbers is too important to be left
> to chance..."
> Donald Knuth
This is a quote from Coveyou, not Knuth. See the following or google http://www.quotationspage.com/quote/461.html
What Knuth *did* say is
"Random numbers should not be generated with a method chosen at random." —Donald E. Knuth
Vic

From: Phil Carmody
Subject: Re: too much information!
Date: 11 Jan 2005 10:09:47 +0200

baez@galaxy.ucr.edu (John Baez) writes:
> In article <87u0po39mw.fsf@nonospaz.fatphil.org>,
> Phil Carmody wrote:
>
> >Be warned. RAND's data is complete crap. It's trivially
> >compressible by a non-negligible fraction of a percent due
> >to incompetant unbiassing. See a comp.compression archive
> >for estimates how non-random it is.
>
> Interesting. I haven't found those estimates yet, and would like
> a more specific reference,
Matt was the mastermind behind the analysis and I pulled out some explicit entropy rate figures based on his 50-columns theory in Message-ID: <87vfhyv1wc.fsf@nonospaz.fatphil.org>
It appears that there are maybe only ~3.3213 bits per digit rather than 3.3219, perhaps fewer. That could correspond to about 180 characters of redundancy in the whole million-digit file. I consider that to be quite a lot.
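As a rough check of that arithmetic, the sketch below converts an entropy-rate deficit into total redundancy. It assumes that "characters" means decimal digits and that the comparison uses the rounded figure 3.3219 for log2(10); the function name `redundancy` and its parameters are illustrative and do not come from the thread.

```python
import math

def redundancy(bits_per_digit, n_digits=1_000_000, ideal=math.log2(10)):
    """Return (bits, decimal digits, bytes) of redundancy versus ideal digits.

    `ideal` is the entropy of a uniformly random decimal digit, log2(10).
    The interpretation of the result in digits and bytes is an assumption
    made for illustration.
    """
    deficit_bits = (ideal - bits_per_digit) * n_digits
    return deficit_bits, deficit_bits / ideal, deficit_bits / 8

# Rounded figures as quoted in the post (assumed: 3.3219 vs. 3.3213).
bits, digits, nbytes = redundancy(3.3213, ideal=3.3219)
print(f"rounded ideal: {bits:.0f} bits, {digits:.0f} digits, {nbytes:.0f} bytes")

# Same estimate against the unrounded value of log2(10).
bits, digits, nbytes = redundancy(3.3213)
print(f"exact ideal:   {bits:.0f} bits, {digits:.0f} digits, {nbytes:.0f} bytes")
```

With the rounded figures the deficit comes to about 600 bits, which is roughly 180 decimal digits. Against the unrounded log2(10) the same estimated rate gives a slightly larger deficit, so the result is sensitive to the last quoted decimal place.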
The analysis was not taken particularly deep; the thread contains enough clues on how to progress (I was hoping Matt would continue the investigation, as the idea was his baby).
Phil
--
The gun is good. The penis is evil... Go forth and kill.