Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #58077 > unrolled thread

Algorithm that makes maximum compression of completly diffused data.

Started byjonas.thornvall@gmail.com
First post2013-10-30 11:21 -0700
Last post2013-11-03 04:50 -0500
Articles 20 on this page of 72 — 22 participants

Back to article view | Back to comp.lang.python


Contents

  Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 11:21 -0700
    Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 18:53 +0000
      Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:01 -0700
        Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 19:18 +0000
          Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:22 -0700
            Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 19:31 +0000
          Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:23 -0700
            Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 19:35 +0000
            Re: Algorithm that makes maximum compression of completly diffused data. Ethan Furman <ethan@stoneleaf.us> - 2013-11-02 21:26 -0700
        Re: Algorithm that makes maximum compression of completly diffused data. Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-10-30 20:28 +0100
        Re: Algorithm that makes maximum compression of completly diffused data. Joshua Landau <joshua@landau.ws> - 2013-10-30 21:30 +0000
          Re: Algorithm that makes maximum compression of completly diffused data. rusi <rustompmody@gmail.com> - 2013-10-31 05:54 -0700
        Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 21:52 +0000
        Re: Algorithm that makes maximum compression of completly diffused data. Tim Chase <python.list@tim.thechases.com> - 2013-10-30 18:01 -0500
        Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-10-31 10:41 +1100
    Re: Algorithm that makes maximum compression of completly diffused data. Dan Stromberg <drsalists@gmail.com> - 2013-10-30 12:29 -0700
    Re: Algorithm that makes maximum compression of completly diffused data. Tim Delaney <timothy.c.delaney@gmail.com> - 2013-10-31 06:35 +1100
      Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:47 -0700
    Re: Algorithm that makes maximum compression of completly diffused data. Modulok <modulok@gmail.com> - 2013-10-30 13:46 -0600
      Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:47 -0700
        Re: Algorithm that makes maximum compression of completly diffused data. Gene Heskett <gheskett@wdtv.com> - 2013-10-30 16:32 -0400
        Re: Algorithm that makes maximum compression of completly diffused data. Tim Roberts <timr@probo.com> - 2013-11-02 14:31 -0700
          Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-02 14:37 -0700
          Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-03 03:17 +0000
            Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-03 15:10 +1100
            Re: Algorithm that makes maximum compression of completly diffused data. Joshua Landau <joshua@landau.ws> - 2013-11-03 15:34 +0000
            Re: Algorithm that makes maximum compression of completly diffused data. Joshua Landau <joshua@landau.ws> - 2013-11-03 15:51 +0000
            Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-03 19:40 -0800
            Re: Algorithm that makes maximum compression of completly diffused data. Tim Chase <python.list@tim.thechases.com> - 2013-11-04 07:08 -0600
          Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-04 05:53 -0800
            Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-04 06:00 -0800
            Re: Algorithm that makes maximum compression of completly diffused
 data. Dave Angel <davea@davea.name> - 2013-11-04 08:27 -0600
              Re: Algorithm that makes maximum compression of completly diffused data. rusi <rustompmody@gmail.com> - 2013-11-04 06:46 -0800
              Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-04 14:34 -0800
                Re: Algorithm that makes maximum compression of completly diffused
 data. Dave Angel <davea@davea.name> - 2013-11-04 19:29 -0600
                Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-05 04:33 +0000
                  Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-05 04:36 +0000
            Re: Algorithm that makes maximum compression of completly diffused data. Tim Roberts <timr@probo.com> - 2013-11-07 00:05 -0800
              Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-07 10:59 -0800
              Re: Algorithm that makes maximum compression of completly diffused data. Tim Roberts <timr@probo.com> - 2013-11-07 11:22 -0800
              Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 09:26 +1100
                Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-07 18:05 -0800
                  Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 13:17 +1100
                    Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-07 18:25 -0800
                      Re: Algorithm that makes maximum compression of completly diffused data. rusi <rustompmody@gmail.com> - 2013-11-07 18:36 -0800
                      Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 13:36 +1100
                      Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-07 18:43 -0800
                        Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-08 04:47 +0000
                          Re: Algorithm that makes maximum compression of completly diffused   data. Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2013-11-08 20:09 +1300
                            Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 18:21 +1100
                        Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-08 07:48 -0800
                          Re: Algorithm that makes maximum compression of completly diffused data. rusi <rustompmody@gmail.com> - 2013-11-08 07:57 -0800
                          Re: Algorithm that makes maximum compression of completly diffused data. Ian Kelly <ian.g.kelly@gmail.com> - 2013-11-08 11:48 -0700
                      Re: Algorithm that makes maximum compression of completly diffused data. "R. Michael Weylandt <michael.weylandt@gmail.com>" <michael.weylandt@gmail.com> - 2013-11-07 21:43 -0500
                      Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 14:05 +1100
                        Re: Algorithm that makes maximum compression of completly diffused data. Roy Smith <roy@panix.com> - 2013-11-07 22:08 -0500
                      Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 14:24 +1100
                      Re: Algorithm that makes maximum compression of completly diffused data. "R. Michael Weylandt <michael.weylandt@gmail.com>" <michael.weylandt@gmail.com> - 2013-11-07 23:05 -0500
                      Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 15:06 +1100
                      Re: Algorithm that makes maximum compression of completly diffused
 data. Dave Angel <davea@davea.name> - 2013-11-07 22:12 -0600
                      Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-08 05:32 +0000
                  Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-07 18:24 -0800
                    Re: Algorithm that makes maximum compression of completly diffused   data. Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2013-11-08 20:16 +1300
                  Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 13:27 +1100
        Re: Algorithm that makes maximum compression of completly diffused data. Ethan Furman <ethan@stoneleaf.us> - 2013-11-02 21:26 -0700
        Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-02 23:09 -0700
        Re: Algorithm that makes maximum compression of completly diffused data. Michael Torrie <torriem@gmail.com> - 2013-11-03 08:14 -0700
      Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:49 -0700
    Re: Algorithm that makes maximum compression of completly diffused data. Grant Edwards <invalid@invalid.invalid> - 2013-10-30 21:18 +0000
    Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-10-30 14:26 -0700
    Re: Algorithm that makes maximum compression of completly diffused data. Dave Angel <davea@davea.name> - 2013-10-31 03:22 +0000
    Re: Algorithm that makes maximum compression of completly diffused data. Gene Heskett <gheskett@wdtv.com> - 2013-11-03 04:50 -0500

Page 2 of 4 — ← Prev page 1 [2] 3 4  Next page →


#58117

FromGene Heskett <gheskett@wdtv.com>
Date2013-10-30 16:32 -0400
Message-ID<mailman.1847.1383166927.18130.python-list@python.org>
In reply to#58103
On Wednesday 30 October 2013 16:29:12 jonas.thornvall@gmail.com did opine:

> Den onsdagen den 30:e oktober 2013 kl. 20:46:57 UTC+1 skrev Modulok:
> > On Wed, Oct 30, 2013 at 12:21 PM,  <jonas.t...@gmail.com> wrote:
> > 
> > 
> > 
> > I am searching for the program or algorithm that makes the best
> > possible of completly (diffused data/random noise) and wonder what
> > the state of art compression is.
> > 
> > 
> > 
> > 
> > I understand this is not the correct forum but since i think i have an
> > algorithm that can do this very good, and do not know where to turn
> > for such question i was thinking to start here.
> > 
> > 
> > 
> > It is of course lossless compression i am speaking of.
> > 
> > --
> > 
> > https://mail.python.org/mailman/listinfo/python-list
> > 
> > 
> >  
> > 
> > >> I am searching for the program or algorithm that makes the best
> > >> possible of completly (diffused data/random noise) and wonder what
> > >> the state of art
> > >> 
> > >> compression is.
> > 
> > None. If the data to be compressed is truly homogeneous, random noise
> > as you describe (for example a 100mb file read from cryptographically
> > secure random
> > 
> > bit generator such as /dev/random on *nix systems), the
> > state-of-the-art lossless compression is zero and will remain that
> > way for the foreseeable
> > 
> > future.
> > 
> > 
> > There is no lossless algorithm that will reduce truly random (high
> > entropy) data by any significant margin. In classical information
> > theory, such an
> > 
> > algorithm can never be invented. See: Kolmogorov complexity
> > 
> > 
> > Real world data is rarely completely random. You would have to test
> > various
> > 
> > algorithms on the data set in question. Small things such as
> > non-obvious statistical clumping can make a big difference in the
> > compression ratio from
> > 
> > one algorithm to another. Data that might look "random", might not
> > actually be random in the entropy sense of the word.
> > 
> > >> I understand this is not the correct forum but since i think i have
> > >> an algorithm that can do this very good, and do not know where to
> > >> turn for such
> > >> 
> > >> question i was thinking to start here.
> > 
> > Not to sound like a downer, but I would wager that the data you're
> > testing your
> > 
> > algorithm on is not as truly random as you imply or is not a large
> > enough body of test data to draw such conclusions from. It's akin to
> > inventing a perpetual
> > 
> > motion machine or an inertial propulsion engine or any other
> > classically impossible solutions. (This only applies to truly random
> > data.)
> > 
> > 
> > 
> > -Modulok-
> 
> Well then i have news for you.

Congratulations Jonas.  My kill file for this list used to have only one 
name, but now has 2.  Unfortunately I will still see the backscatter of 
others trying to tell you how to post and interact with this list.  But for 
now, this person with, since we are quoting IQ's here, a tested 147 says 
good by.

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)

BOFH excuse #275:

Bit rot
A pen in the hand of this president is far more
dangerous than 200 million guns in the hands of
         law-abiding citizens.

[toc] | [prev] | [next] | [standalone]


#58347

FromTim Roberts <timr@probo.com>
Date2013-11-02 14:31 -0700
Message-ID<s9ra79dvvnd01rpuqg9hfh6bija8u2q49h@4ax.com>
In reply to#58103
jonas.thornvall@gmail.com wrote:
>
>Well then i have news for you.

Well, then, why don't you share it?

Let me try to get you to understand WHY what you say is impossible.  Let's
say you do have a function f(x) that can produce a compressed output y for
any given x, such that y is always smaller than x.  If that were true, then
I could call f() recursively:
    f(f(...f(f(f(f(f(x)))))...))
and eventually the result get down to a single bit.  I hope it is clear
that there's no way to restore a single bit back into different source
texts.

Here's another way to look at it.  If f(x) is smaller than x for every x,
that means there MUST me multiple values of x that produce the same f(x).
Do you see?  If x is three bits and f(x) is two bits, that means there are
8 possible values for x but only 4 values for f(x).  So, given an f(x), you
cannot tell which value of x it came from.  You have lost information.
-- 
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

[toc] | [prev] | [next] | [standalone]


#58349

FromMark Janssen <dreamingforward@gmail.com>
Date2013-11-02 14:37 -0700
Message-ID<mailman.1958.1383428272.18130.python-list@python.org>
In reply to#58347
> Let me try to get you to understand WHY what you say is impossible.  Let's
> say you do have a function f(x) that can produce a compressed output y for
> any given x, such that y is always smaller than x.  If that were true, then
> I could call f() recursively:
>     f(f(...f(f(f(f(f(x)))))...))
> and eventually the result get down to a single bit.  I hope it is clear
> that there's no way to restore a single bit back into different source
> texts.

Hey, that's a nice proof!

Cheers,

Mark Janssen

[toc] | [prev] | [next] | [standalone]


#58358

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2013-11-03 03:17 +0000
Message-ID<5275c065$0$29972$c3e8da3$5496439d@news.astraweb.com>
In reply to#58347
On Sat, 02 Nov 2013 14:31:09 -0700, Tim Roberts wrote:

> jonas.thornvall@gmail.com wrote:
>>
>>Well then i have news for you.
> 
> Well, then, why don't you share it?
> 
> Let me try to get you to understand WHY what you say is impossible.
[snip reasons]

Expanding on Tim's post... the first scenario Tim gives, applying the 
compression repeatedly to its own output until you eventually get a 
single byte, can be overcome if there are data sets that are unchanged by 
compression. That is, if f() is the compression function:

f(f(f(data))) = f(f(data)) == f(data) == data

for some values of data. So if you start with some other data, and 
compress it, then compress it again, and again, eventually you end up 
with one of these attractors, after which repeated compression doesn't 
give you any more benefit.

[Aside: the attractors aren't necessarily a single point. The attractor 
could be a cycle of two or more points, f(x) == y and f(y) == x. Or you 
could even have a "strange attractor", which brings us to chaos theory.] 

But your second reason, better known as the pigeonhole principle, 
demonstrates that for any lossless compression method, there must be data 
sets that actually expand the data. It doesn't matter how cleverly you 
compress the data, you can't fit 20kg of potatoes in a 10kg bag, so to 
speak. Suppose your compression algorithm compresses a single byte into a 
nybble (four bits). There are 256 different input data sets (0x00, 
0x01, ... 0xFF) and only 16 different outputs (0x0, 0x1, ... 0xF). There 
is no way for 256 pigeons to fit in 16 pigeon holes unless you put two or 
more pigeons in at least one hole. Ergo, if the compression algorithm is 
lossless, *some* data must be expanded rather than compressed.

Alternatively, the compression may be lossy. Or both!

Obviously data isn't compressed a byte at a time, that would be silly. 
But the principle still applies.

There is a way to apparently get around these limits: store data 
externally, perhaps inside the compression application itself. Then, if 
you just look at the compressed file (the "data.zip" equivalent, although 
I stress that zip compression is *not* like this), you might think it has 
shrunk quite a lot. But when you include the hidden data, the compression 
is not quite so impressive...


-- 
Steven

[toc] | [prev] | [next] | [standalone]


#58360

FromChris Angelico <rosuav@gmail.com>
Date2013-11-03 15:10 +1100
Message-ID<mailman.1965.1383451840.18130.python-list@python.org>
In reply to#58358
On Sun, Nov 3, 2013 at 2:17 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> There is a way to apparently get around these limits: store data
> externally, perhaps inside the compression application itself. Then, if
> you just look at the compressed file (the "data.zip" equivalent, although
> I stress that zip compression is *not* like this), you might think it has
> shrunk quite a lot. But when you include the hidden data, the compression
> is not quite so impressive...

Storing externally is, of course, a very useful thing - it's just not
compression. For instance, a git repository (and quite likely a
Mercurial one too) keeps track of files as blobs of data referenced by
their cryptographic hashes. The current state of a directory can be
described by listing file names with their object hashes, which is a
very compact notation; but it doesn't have _all_ the information, and
the "decompression" process involves fetching file contents from the
library. It's a tool in the arsenal, but it's not compression in the
same way that PK-ZIP is.

With real compression algorithms, there's usually an "out" clause that
detects that the algorithm's doing a bad job. The file format for
PK-ZIP and, I expect, every other archiving+compression file format,
has allowances for some of the files to be stored uncompressed. That
way, there's no need for any file to be enlarged by more than some
signature - say, a one-byte "compression type" marker, or even a
one-bit mark somewhere. But enlarged they must be, for the reasons
Steven explained.

ChrisA

[toc] | [prev] | [next] | [standalone]


#58399

FromJoshua Landau <joshua@landau.ws>
Date2013-11-03 15:34 +0000
Message-ID<mailman.1986.1383492895.18130.python-list@python.org>
In reply to#58358
On 3 November 2013 03:17, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> On Sat, 02 Nov 2013 14:31:09 -0700, Tim Roberts wrote:
>
>> jonas.thornvall@gmail.com wrote:
>>>
>>>Well then i have news for you.
>>
>> Well, then, why don't you share it?
>>
>> Let me try to get you to understand WHY what you say is impossible.
> [snip reasons]
>
> But your second reason, better known as the pigeonhole principle,
> demonstrates that for any lossless compression method, there must be data
> sets that actually expand the data. It doesn't matter how cleverly you
> compress the data, you can't fit 20kg of potatoes in a 10kg bag, so to
> speak. Suppose your compression algorithm compresses a single byte into a
> nybble (four bits). There are 256 different input data sets (0x00,
> 0x01, ... 0xFF) and only 16 different outputs (0x0, 0x1, ... 0xF). There
> is no way for 256 pigeons to fit in 16 pigeon holes unless you put two or
> more pigeons in at least one hole. Ergo, if the compression algorithm is
> lossless, *some* data must be expanded rather than compressed.

You have to be careful to make this totally rigorous, too.

Note that I *can* make a "compression" algorithm that takes any
length-n sequence and compresses all but one sequence by at least one
bit, and does not ever expand the data.

"00" -> ""
"01" -> "0"
"10" -> "1"
"11" -> "00"

This, obviously, is just 'cause the length is an extra piece of data,
but sometimes you have to store that anyway ;). So if I have a list of
N length-Y lists containing only 1s or 0s, I can genuinely compress
the whole structure by N log2 Y items.

[toc] | [prev] | [next] | [standalone]


#58400

FromJoshua Landau <joshua@landau.ws>
Date2013-11-03 15:51 +0000
Message-ID<mailman.1987.1383493918.18130.python-list@python.org>
In reply to#58358
On 3 November 2013 15:34, Joshua Landau <joshua@landau.ws> wrote:
>I can genuinely compress
> the whole structure by N log2 Y items.

By which I mean 2N items.

[toc] | [prev] | [next] | [standalone]


#58426

FromMark Janssen <dreamingforward@gmail.com>
Date2013-11-03 19:40 -0800
Message-ID<mailman.2004.1383536444.18130.python-list@python.org>
In reply to#58358
> Note that I *can* make a "compression" algorithm that takes any
> length-n sequence and compresses all but one sequence by at least one
> bit, and does not ever expand the data.
>
> "00" -> ""
> "01" -> "0"
> "10" -> "1"
> "11" -> "00"
>
> This, obviously, is just 'cause the length is an extra piece of data,
> but sometimes you have to store that anyway ;).

And how many bits will you use to store the length?

> So if I have a list of
> N length-Y lists containing only 1s or 0s, I can genuinely compress
> the whole structure by N log2 Y items.

But you cheated by using a piece of information from "outside the
system": length.  A generic compression algorithm doesn't have this
information beforehand.
-- 
MarkJ
Tacoma, Washington

[toc] | [prev] | [next] | [standalone]


#58442

FromTim Chase <python.list@tim.thechases.com>
Date2013-11-04 07:08 -0600
Message-ID<mailman.2021.1383570396.18130.python-list@python.org>
In reply to#58358
On 2013-11-03 19:40, Mark Janssen wrote:
> But you cheated by using a piece of information from "outside the
> system": length.  A generic compression algorithm doesn't have this
> information beforehand.

By cheating with outside information, you can perfectly compress any
one data-set down to 1 bit.  Just store the data in the program, then
store 1 bit of "is this file the data we have stored in the
program?".  Granted, in modern OSes, you waste 7 extra bits since
they require you to write an entire byte at a time. :-)

And with that, you could even have an empty file and test for a file
extension. ;-)

-tkc


[toc] | [prev] | [next] | [standalone]


#58443

Fromjonas.thornvall@gmail.com
Date2013-11-04 05:53 -0800
Message-ID<9d998707-a6e9-4911-a585-c2310e4a2b31@googlegroups.com>
In reply to#58347
Den lördagen den 2:e november 2013 kl. 22:31:09 UTC+1 skrev Tim Roberts:
> jonas.thornvall@gmail.com wrote:
> 
> >
> 
> >Well then i have news for you.
> 
> 
> 
> Well, then, why don't you share it?
> 
> 
> 
> Let me try to get you to understand WHY what you say is impossible.  Let's
> 
> say you do have a function f(x) that can produce a compressed output y for
> 
> any given x, such that y is always smaller than x.  If that were true, then
> 
> I could call f() recursively:
> 
>     f(f(...f(f(f(f(f(x)))))...))
> 
> and eventually the result get down to a single bit.  I hope it is clear
> 
> that there's no way to restore a single bit back into different source
> 
> texts.
> 
> 
> 
> Here's another way to look at it.  If f(x) is smaller than x for every x,
> 
> that means there MUST me multiple values of x that produce the same f(x).
> 
> Do you see?  If x is three bits and f(x) is two bits, that means there are
> 
> 8 possible values for x but only 4 values for f(x).  So, given an f(x), you
> 
> cannot tell which value of x it came from.  You have lost information.
> 
> -- 
> 
> Tim Roberts, timr@probo.com
> 
> Providenza & Boekelheide, Inc.

Well let me try to explain why it is working and i have implemented one.
I only need to refresh my memory it was almost 15 years ago.
This is not the solution but this is why it is working.
65536=256^2=16^4=***4^8***=2^16

Yes i am aware that 256 is a single byte 8 bits, but the approach is valid anyway.

[toc] | [prev] | [next] | [standalone]


#58444

Fromjonas.thornvall@gmail.com
Date2013-11-04 06:00 -0800
Message-ID<c24273c6-b6d5-44c4-a04e-c01c9ac9a7cc@googlegroups.com>
In reply to#58443
Den måndagen den 4:e november 2013 kl. 14:53:28 UTC+1 skrev jonas.t...@gmail.com:
> Den lördagen den 2:e november 2013 kl. 22:31:09 UTC+1 skrev Tim Roberts:
> 
> > jonas.thornvall@gmail.com wrote:
> 
> > 
> 
> > >
> 
> > 
> 
> > >Well then i have news for you.
> 
> > 
> 
> > 
> 
> > 
> 
> > Well, then, why don't you share it?
> 
> > 
> 
> > 
> 
> > 
> 
> > Let me try to get you to understand WHY what you say is impossible.  Let's
> 
> > 
> 
> > say you do have a function f(x) that can produce a compressed output y for
> 
> > 
> 
> > any given x, such that y is always smaller than x.  If that were true, then
> 
> > 
> 
> > I could call f() recursively:
> 
> > 
> 
> >     f(f(...f(f(f(f(f(x)))))...))
> 
> > 
> 
> > and eventually the result get down to a single bit.  I hope it is clear
> 
> > 
> 
> > that there's no way to restore a single bit back into different source
> 
> > 
> 
> > texts.
> 
> > 
> 
> > 
> 
> > 
> 
> > Here's another way to look at it.  If f(x) is smaller than x for every x,
> 
> > 
> 
> > that means there MUST me multiple values of x that produce the same f(x).
> 
> > 
> 
> > Do you see?  If x is three bits and f(x) is two bits, that means there are
> 
> > 
> 
> > 8 possible values for x but only 4 values for f(x).  So, given an f(x), you
> 
> > 
> 
> > cannot tell which value of x it came from.  You have lost information.
> 
> > 
> 
> > -- 
> 
> > 
> 
> > Tim Roberts, timr@probo.com
> 
> > 
> 
> > Providenza & Boekelheide, Inc.
> 
> 
> 
> Well let me try to explain why it is working and i have implemented one.
> 
> I only need to refresh my memory it was almost 15 years ago.
> 
> This is not the solution but this is why it is working.
> 
> 65536=256^2=16^4=***4^8***=2^16
> 
> 
> 
> Yes i am aware that 256 is a single byte 8 bits, but the approach is valid anyway.

You see if the program behave intelligent some small operations can perform wonder on a number.

[toc] | [prev] | [next] | [standalone]


#58445 — Re: Algorithm that makes maximum compression of completly diffused data.

FromDave Angel <davea@davea.name>
Date2013-11-04 08:27 -0600
SubjectRe: Algorithm that makes maximum compression of completly diffused data.
Message-ID<mailman.2022.1383575253.18130.python-list@python.org>
In reply to#58443
On Mon, 4 Nov 2013 05:53:28 -0800 (PST), jonas.thornvall@gmail.com 
wrote:
> Den lördagen den 2:e november 2013 kl. 22:31:09 UTC+1 skrev Tim 
Roberts:
> > Here's another way to look at it.  If f(x) is smaller than x for 
every x,




> > that means there MUST me multiple values of x that produce the 
same f(x).




> > Do you see?  If x is three bits and f(x) is two bits, that means 
there are




> > 8 possible values for x but only 4 values for f(x).  So, given an 
f(x), y=




> > cannot tell which value of x it came from.  You have lost 
information.




> Well let me try to explain why it is working and i have implemented 
one.
> I only need to refresh my memory it was almost 15 years ago.
> This is not the solution but this is why it is working.
> 65536=256^2=16^4=***4^8***=2^16


> Yes i am aware that 256 is a single byte 8 bits, but the approach 
is valid =
> anyway.

And e ^ (I * pi) == -1
So what. ?

Better file that patent, before the patent office realizes the 
analogy to the perpetual motion machine.

-- 
DaveA

[toc] | [prev] | [next] | [standalone]


#58446

Fromrusi <rustompmody@gmail.com>
Date2013-11-04 06:46 -0800
Message-ID<71d2cbde-a660-4a2e-8ca1-cd373e1e9aad@googlegroups.com>
In reply to#58445
On Monday, November 4, 2013 7:57:19 PM UTC+5:30, Dave Angel wrote:
> On Mon, 4 Nov 2013 05:53:28 -0800 (PST), Jonas wrote:
> > Well let me try to explain why it is working and i have implemented one.
> > I only need to refresh my memory it was almost 15 years ago.
> > This is not the solution but this is why it is working.
> > 65536=256^2=16^4=***4^8***=2^16

> > Yes i am aware that 256 is a single byte 8 bits, but the approach 
> > is valid anyway.

> And e ^ (I * pi) == -1
> So what. ?

> Better file that patent, before the patent office realizes the 
> analogy to the perpetual motion machine.

Now I am too much of a dim-wit to comprehend all this compressified
profundification but I think I have a (dim) clue how such a patent
will be obtained:

Just make a submission in such an unreadable double (then triple,
quadruple...) spaced format that the patent office will grant it in
exasperation.

[toc] | [prev] | [next] | [standalone]


#58464

Fromjonas.thornvall@gmail.com
Date2013-11-04 14:34 -0800
Message-ID<52206c20-b779-42e2-9d6e-41189fe7c4c9@googlegroups.com>
In reply to#58445
Den måndagen den 4:e november 2013 kl. 15:27:19 UTC+1 skrev Dave Angel:
> On Mon, 4 Nov 2013 05:53:28 -0800 (PST), jonas.thornvall@gmail.com 
> 
> wrote:
> 
> > Den lördagen den 2:e november 2013 kl. 22:31:09 UTC+1 skrev Tim 
> 
> Roberts:
> 
> > > Here's another way to look at it.  If f(x) is smaller than x for 
> 
> every x,
> 
> 
> 
> 
> 
> 
> 
> 
> 
> > > that means there MUST me multiple values of x that produce the 
> 
> same f(x).
> 
> 
> 
> 
> 
> 
> 
> 
> 
> > > Do you see?  If x is three bits and f(x) is two bits, that means 
> 
> there are
> 
> 
> 
> 
> 
> 
> 
> 
> 
> > > 8 possible values for x but only 4 values for f(x).  So, given an 
> 
> f(x), y=
> 
> 
> 
> 
> 
> 
> 
> 
> 
> > > cannot tell which value of x it came from.  You have lost 
> 
> information.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> > Well let me try to explain why it is working and i have implemented 
> 
> one.
> 
> > I only need to refresh my memory it was almost 15 years ago.
> 
> > This is not the solution but this is why it is working.
> 
> > 65536=256^2=16^4=***4^8***=2^16
> 
> 
> 
> 
> 
> > Yes i am aware that 256 is a single byte 8 bits, but the approach 
> 
> is valid =
> 
> > anyway.
> 
> 
> 
> And e ^ (I * pi) == -1
> 
> So what. ?
> 

e is an approximation... and your idea is not general for any n.
 
> 
> Better file that patent, before the patent office realizes the 
> 
> analogy to the perpetual motion machine.
> 
> 
> 
> -- 
> 
> DaveA

[toc] | [prev] | [next] | [standalone]


#58474 — Re: Algorithm that makes maximum compression of completly diffused data.

FromDave Angel <davea@davea.name>
Date2013-11-04 19:29 -0600
SubjectRe: Algorithm that makes maximum compression of completly diffused data.
Message-ID<mailman.2038.1383615000.18130.python-list@python.org>
In reply to#58464
On Mon, 4 Nov 2013 14:34:23 -0800 (PST), jonas.thornvall@gmail.com 
wrote:
> e is an approximation... and your idea is not general for any n.

e is certainly not an approximation,  and I never mentioned n.

-- 
DaveA

[toc] | [prev] | [next] | [standalone]


#58477

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2013-11-05 04:33 +0000
Message-ID<5278752a$0$29972$c3e8da3$5496439d@news.astraweb.com>
In reply to#58464
On Mon, 04 Nov 2013 14:34:23 -0800, jonas.thornvall wrote:

> Den måndagen den 4:e november 2013 kl. 15:27:19 UTC+1 skrev Dave Angel:
>> On Mon, 4 Nov 2013 05:53:28 -0800 (PST), jonas.thornvall@gmail.com
>> wrote:
[...]
>> > This is not the solution but this is why it is working.
>> 
>> > 65536=256^2=16^4=***4^8***=2^16

"this" being Jonas' alleged lossless compression method capable of 
compressing random data.


>> > Yes i am aware that 256 is a single byte 8 bits, but the approach
>> is valid anyway.

I must say, I cannot see the connection between the fact that 256**2 == 
2**16 and compression of random data. I might as well state that I have 
squared the circle, and offer as proof that 3+4 == 5+2.



>> And e ^ (I * pi) == -1
>> 
>> So what. ?
>> 
>> 
> e is an approximation... and your idea is not general for any n.

e is not an approximation, it is a symbolic name for an exact  
transcendental number which cannot be specified exactly by any finite 
number of decimal places.

Your comment about "n" is irrelevant, since Euler's Identity e**(i*pi)=-1 
has nothing to do with "n". But in case you actually meant "i", again you 
are mistaken. i is a symbolic name for an exact number.



-- 
Steven

[toc] | [prev] | [next] | [standalone]


#58478

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2013-11-05 04:36 +0000
Message-ID<527875dc$0$29972$c3e8da3$5496439d@news.astraweb.com>
In reply to#58477
On Tue, 05 Nov 2013 04:33:46 +0000, Steven D'Aprano wrote:

> On Mon, 04 Nov 2013 14:34:23 -0800, jonas.thornvall wrote:
> 
>> Den måndagen den 4:e november 2013 kl. 15:27:19 UTC+1 skrev Dave Angel:
>>> On Mon, 4 Nov 2013 05:53:28 -0800 (PST), jonas.thornvall@gmail.com
>>> wrote:
> [...]
>>> > This is not the solution but this is why it is working.
>>> 
>>> > 65536=256^2=16^4=***4^8***=2^16
> 
> "this" being Jonas' alleged lossless compression method capable of
> compressing random data.

/facepalm

I meant "it", not "this", as in "why it (the compression method) is 
working". Sorry for any confusion.



-- 
Steven

[toc] | [prev] | [next] | [standalone]


#58634

FromTim Roberts <timr@probo.com>
Date2013-11-07 00:05 -0800
Message-ID<q7im799jqj9ap3jb94lnqka1m0hnn9bnpv@4ax.com>
In reply to#58443
jonas.thornvall@gmail.com wrote:
>
>Well let me try to explain why it is working and i have implemented one.
>I only need to refresh my memory it was almost 15 years ago.
>This is not the solution but this is why it is working.
>65536=256^2=16^4=***4^8***=2^16

All of those values are indeed the same, and yet that is completely
unrelated to compression.  Did you honestly believe this was actually
explaining anything?

>Yes i am aware that 256 is a single byte 8 bits, but the approach is valid anyway.

This whole post is a most amazing collection of non sequiturs.  If you
would like to describe your compression scheme, there really are people
here who would be interested in reading it (although that number gets less
and less as this thread goes on).
-- 
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

[toc] | [prev] | [next] | [standalone]


#58690

FromMark Janssen <dreamingforward@gmail.com>
Date2013-11-07 10:59 -0800
Message-ID<mailman.2151.1383850790.18130.python-list@python.org>
In reply to#58634
>>Well let me try to explain why it is working and i have implemented one.
>>I only need to refresh my memory it was almost 15 years ago.
>>This is not the solution but this is why it is working.
>>65536=256^2=16^4=***4^8***=2^16
>
> All of those values are indeed the same, and yet that is completely
> unrelated to compression.  Did you honestly believe this was actually
> explaining anything?

I think the idea is that you could take any arbitrary input sequence,
view it as a large number, and then find what exponential equation can
produce that result.  The equation becomes the "compression".

MarkJ
Tacoma, Washington

[toc] | [prev] | [next] | [standalone]


#58694

FromTim Roberts <timr@probo.com>
Date2013-11-07 11:22 -0800
Message-ID<mailman.2152.1383852240.18130.python-list@python.org>
In reply to#58634
Mark Janssen wrote:
>>> Well let me try to explain why it is working and i have implemented one.
>>> I only need to refresh my memory it was almost 15 years ago.
>>> This is not the solution but this is why it is working.
>>> 65536=256^2=16^4=***4^8***=2^16
>>
>> I think the idea is that you could take any arbitrary input sequence,
>> view it as a large number, and then find what exponential equation can
>> produce that result.  The equation becomes the "compression".

Interesting -- I hadn't noticed that.  Of course, now you have the
problem of expressing 4^8 in a compact way.

-- 
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

[toc] | [prev] | [next] | [standalone]


Page 2 of 4 — ← Prev page 1 [2] 3 4  Next page →

Back to top | Article view | comp.lang.python


csiph-web