Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #58077 > unrolled thread

Algorithm that makes maximum compression of completly diffused data.

Started byjonas.thornvall@gmail.com
First post2013-10-30 11:21 -0700
Last post2013-11-03 04:50 -0500
Articles 20 on this page of 72 — 22 participants

Back to article view | Back to comp.lang.python


Contents

  Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 11:21 -0700
    Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 18:53 +0000
      Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:01 -0700
        Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 19:18 +0000
          Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:22 -0700
            Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 19:31 +0000
          Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:23 -0700
            Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 19:35 +0000
            Re: Algorithm that makes maximum compression of completly diffused data. Ethan Furman <ethan@stoneleaf.us> - 2013-11-02 21:26 -0700
        Re: Algorithm that makes maximum compression of completly diffused data. Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-10-30 20:28 +0100
        Re: Algorithm that makes maximum compression of completly diffused data. Joshua Landau <joshua@landau.ws> - 2013-10-30 21:30 +0000
          Re: Algorithm that makes maximum compression of completly diffused data. rusi <rustompmody@gmail.com> - 2013-10-31 05:54 -0700
        Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 21:52 +0000
        Re: Algorithm that makes maximum compression of completly diffused data. Tim Chase <python.list@tim.thechases.com> - 2013-10-30 18:01 -0500
        Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-10-31 10:41 +1100
    Re: Algorithm that makes maximum compression of completly diffused data. Dan Stromberg <drsalists@gmail.com> - 2013-10-30 12:29 -0700
    Re: Algorithm that makes maximum compression of completly diffused data. Tim Delaney <timothy.c.delaney@gmail.com> - 2013-10-31 06:35 +1100
      Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:47 -0700
    Re: Algorithm that makes maximum compression of completly diffused data. Modulok <modulok@gmail.com> - 2013-10-30 13:46 -0600
      Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:47 -0700
        Re: Algorithm that makes maximum compression of completly diffused data. Gene Heskett <gheskett@wdtv.com> - 2013-10-30 16:32 -0400
        Re: Algorithm that makes maximum compression of completly diffused data. Tim Roberts <timr@probo.com> - 2013-11-02 14:31 -0700
          Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-02 14:37 -0700
          Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-03 03:17 +0000
            Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-03 15:10 +1100
            Re: Algorithm that makes maximum compression of completly diffused data. Joshua Landau <joshua@landau.ws> - 2013-11-03 15:34 +0000
            Re: Algorithm that makes maximum compression of completly diffused data. Joshua Landau <joshua@landau.ws> - 2013-11-03 15:51 +0000
            Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-03 19:40 -0800
            Re: Algorithm that makes maximum compression of completly diffused data. Tim Chase <python.list@tim.thechases.com> - 2013-11-04 07:08 -0600
          Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-04 05:53 -0800
            Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-04 06:00 -0800
            Re: Algorithm that makes maximum compression of completly diffused
 data. Dave Angel <davea@davea.name> - 2013-11-04 08:27 -0600
              Re: Algorithm that makes maximum compression of completly diffused data. rusi <rustompmody@gmail.com> - 2013-11-04 06:46 -0800
              Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-04 14:34 -0800
                Re: Algorithm that makes maximum compression of completly diffused
 data. Dave Angel <davea@davea.name> - 2013-11-04 19:29 -0600
                Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-05 04:33 +0000
                  Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-05 04:36 +0000
            Re: Algorithm that makes maximum compression of completly diffused data. Tim Roberts <timr@probo.com> - 2013-11-07 00:05 -0800
              Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-07 10:59 -0800
              Re: Algorithm that makes maximum compression of completly diffused data. Tim Roberts <timr@probo.com> - 2013-11-07 11:22 -0800
              Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 09:26 +1100
                Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-07 18:05 -0800
                  Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 13:17 +1100
                    Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-07 18:25 -0800
                      Re: Algorithm that makes maximum compression of completly diffused data. rusi <rustompmody@gmail.com> - 2013-11-07 18:36 -0800
                      Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 13:36 +1100
                      Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-07 18:43 -0800
                        Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-08 04:47 +0000
                          Re: Algorithm that makes maximum compression of completly diffused   data. Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2013-11-08 20:09 +1300
                            Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 18:21 +1100
                        Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-08 07:48 -0800
                          Re: Algorithm that makes maximum compression of completly diffused data. rusi <rustompmody@gmail.com> - 2013-11-08 07:57 -0800
                          Re: Algorithm that makes maximum compression of completly diffused data. Ian Kelly <ian.g.kelly@gmail.com> - 2013-11-08 11:48 -0700
                      Re: Algorithm that makes maximum compression of completly diffused data. "R. Michael Weylandt <michael.weylandt@gmail.com>" <michael.weylandt@gmail.com> - 2013-11-07 21:43 -0500
                      Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 14:05 +1100
                        Re: Algorithm that makes maximum compression of completly diffused data. Roy Smith <roy@panix.com> - 2013-11-07 22:08 -0500
                      Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 14:24 +1100
                      Re: Algorithm that makes maximum compression of completly diffused data. "R. Michael Weylandt <michael.weylandt@gmail.com>" <michael.weylandt@gmail.com> - 2013-11-07 23:05 -0500
                      Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 15:06 +1100
                      Re: Algorithm that makes maximum compression of completly diffused
 data. Dave Angel <davea@davea.name> - 2013-11-07 22:12 -0600
                      Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-08 05:32 +0000
                  Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-07 18:24 -0800
                    Re: Algorithm that makes maximum compression of completly diffused   data. Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2013-11-08 20:16 +1300
                  Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 13:27 +1100
        Re: Algorithm that makes maximum compression of completly diffused data. Ethan Furman <ethan@stoneleaf.us> - 2013-11-02 21:26 -0700
        Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-02 23:09 -0700
        Re: Algorithm that makes maximum compression of completly diffused data. Michael Torrie <torriem@gmail.com> - 2013-11-03 08:14 -0700
      Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:49 -0700
    Re: Algorithm that makes maximum compression of completly diffused data. Grant Edwards <invalid@invalid.invalid> - 2013-10-30 21:18 +0000
    Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-10-30 14:26 -0700
    Re: Algorithm that makes maximum compression of completly diffused data. Dave Angel <davea@davea.name> - 2013-10-31 03:22 +0000
    Re: Algorithm that makes maximum compression of completly diffused data. Gene Heskett <gheskett@wdtv.com> - 2013-11-03 04:50 -0500

Page 3 of 4 — ← Prev page 1 2 [3] 4  Next page →


#58708

FromChris Angelico <rosuav@gmail.com>
Date2013-11-08 09:26 +1100
Message-ID<mailman.2161.1383863216.18130.python-list@python.org>
In reply to#58634
On Fri, Nov 8, 2013 at 5:59 AM, Mark Janssen <dreamingforward@gmail.com> wrote:
> I think the idea is that you could take any arbitrary input sequence,
> view it as a large number, and then find what exponential equation can
> produce that result.  The equation becomes the "compression".

Interesting idea, but I don't see how this has anything to do with
compressing arbitrary data. We already have a compact notation for
representing a large number as 2^24*x+2^16*y+2^8*z+q, so I don't see
how this will be any better, other than eliding NULs.

ChrisA

[toc] | [prev] | [next] | [standalone]


#58727

Fromjonas.thornvall@gmail.com
Date2013-11-07 18:05 -0800
Message-ID<9b62770c-7ca1-4a4d-81a5-bf7251bac957@googlegroups.com>
In reply to#58708
Den torsdagen den 7:e november 2013 kl. 23:26:45 UTC+1 skrev Chris Angelico:
> On Fri, Nov 8, 2013 at 5:59 AM, Mark Janssen <dreamingforward@gmail.com> wrote:
> 
> > I think the idea is that you could take any arbitrary input sequence,
> 
> > view it as a large number, and then find what exponential equation can
> 
> > produce that result.  The equation becomes the "compression".
> 
> 
> 
> Interesting idea, but I don't see how this has anything to do with
> 
> compressing arbitrary data. We already have a compact notation for
> 
> representing a large number as 2^24*x+2^16*y+2^8*z+q, so I don't see
> 
> how this will be any better, other than eliding NULs.
> 
> 
> 
> ChrisA

I guess what matter is how fast an algorithm can encode and decode a big number, at least if you want to use it for very big sets of random data, or losless video compression?

[toc] | [prev] | [next] | [standalone]


#58730

FromChris Angelico <rosuav@gmail.com>
Date2013-11-08 13:17 +1100
Message-ID<mailman.2171.1383877064.18130.python-list@python.org>
In reply to#58727
On Fri, Nov 8, 2013 at 1:05 PM,  <jonas.thornvall@gmail.com> wrote:
> I guess what matter is how fast an algorithm can encode and decode a big number, at least if you want to use it for very big sets of random data, or losless video compression?

I don't care how fast. I care about the laws of physics :) You can't
stuff more data into less space without losing some of it.

Also, please lose Google Groups, or check out what other people have
said about making it less obnoxious.

ChrisA

[toc] | [prev] | [next] | [standalone]


#58733

Fromjonas.thornvall@gmail.com
Date2013-11-07 18:25 -0800
Message-ID<13c04f06-f1f2-4f67-b975-3cff28714641@googlegroups.com>
In reply to#58730
Den fredagen den 8:e november 2013 kl. 03:17:36 UTC+1 skrev Chris Angelico:
> On Fri, Nov 8, 2013 at 1:05 PM,  <jonas.thornvall@gmail.com> wrote:
> 
> > I guess what matter is how fast an algorithm can encode and decode a big number, at least if you want to use it for very big sets of random data, or losless video compression?
> 
> 
> 
> I don't care how fast. I care about the laws of physics :) You can't
> 
> stuff more data into less space without losing some of it.
> 
> 
> 
> Also, please lose Google Groups, or check out what other people have
> 
> said about making it less obnoxious.
> 
> 
> 
> ChrisA

Please, you are he obnoxious, so fuck off or go learn about reformulation of problems. Every number has an infinite number of arithmetical solutions. So every number do has a shortest arithmetical encoding. And that is not the hard part to figure out, the hard part is to find a generic arithmetic encoding.

I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536.

[toc] | [prev] | [next] | [standalone]


#58736

Fromrusi <rustompmody@gmail.com>
Date2013-11-07 18:36 -0800
Message-ID<cf822290-556c-44a2-bd41-93c18b688d8b@googlegroups.com>
In reply to#58733
On Friday, November 8, 2013 7:55:18 AM UTC+5:30, jonas wrote:
> Den fredagen den 8:e november 2013 kl. 03:17:36 UTC+1 skrev Chris Angelico:
> > On Fri, Nov 8, 2013 at 1:05 PM,  jonas.thornvall wrote:
> > > I guess what matter is how fast an algorithm can encode and decode a big number, at least if you want to use it for very big sets of random data, or losless video compression?

> > I don't care how fast. I care about the laws of physics :) You can't
> > stuff more data into less space without losing some of it.
> > Also, please lose Google Groups, or check out what other people have
> > said about making it less obnoxious.
> > ChrisA

> Please, you are he obnoxious, so fuck off or go learn about
> reformulation of problems. Every number has an infinite number of
> arithmetical solutions. So every number do has a shortest
> arithmetical encoding. And that is not the hard part to figure out,
> the hard part is to find a generic arithmetic encoding.

Since we are discussing profound matters such as cosmic laws,
here's my 'all-universal' law:

When you engage with a troll you get trolled.

[toc] | [prev] | [next] | [standalone]


#58737

FromChris Angelico <rosuav@gmail.com>
Date2013-11-08 13:36 +1100
Message-ID<mailman.2175.1383878218.18130.python-list@python.org>
In reply to#58733
On Fri, Nov 8, 2013 at 1:25 PM,  <jonas.thornvall@gmail.com> wrote:
> Please, you are he obnoxious, so fuck off or go learn about reformulation of problems. Every number has an infinite number of arithmetical solutions. So every number do has a shortest arithmetical encoding. And that is not the hard part to figure out, the hard part is to find a generic arithmetic encoding.
>
> I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536.

I can see that 4^8 = 65536. Now how are you going to render 65537? You
claimed that you could render *any* number efficiently. What you've
proven is that a small subset of numbers can be rendered efficiently.

Maybe what you can do is render a large number of small subsets of
numbers efficiently. (Let's say all instances of integer^integer, and
all cases of Fibonacci numbers.) You'll still have some holes in
between which you can't render as tidily, and these are the ones where
the best representation is the raw one, plus a small tag saying that
it's raw. That's how most compression algorithms work.

Also, please don't swear. When I described Google Groups posts as
"obnoxious", I was using the word in a strictly correct way, and that
is not an invitation for profanity. Of course, if you want to be seen
as *yourself* obnoxious, that's one of the easier ways to accomplish
that.

ChrisA

[toc] | [prev] | [next] | [standalone]


#58739

FromMark Janssen <dreamingforward@gmail.com>
Date2013-11-07 18:43 -0800
Message-ID<mailman.2178.1383878604.18130.python-list@python.org>
In reply to#58733
>> I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536.
>
> I can see that 4^8 = 65536. Now how are you going to render 65537? You
> claimed that you could render *any* number efficiently. What you've
> proven is that a small subset of numbers can be rendered efficiently.

I think the idea would be to find the prime factorization for a given
number, which has been proven to be available (and unique) for any and
every number.  Most numbers can compress given this technique.  Prime
numbers, of course, would not be compressed.

-- 
MarkJ
Tacoma, Washington

[toc] | [prev] | [next] | [standalone]


#58751

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2013-11-08 04:47 +0000
Message-ID<527c6cf1$0$29983$c3e8da3$5496439d@news.astraweb.com>
In reply to#58739
On Thu, 07 Nov 2013 18:43:17 -0800, Mark Janssen wrote:

>>> I am not sure if it is just stupidness or laziness that prevent you
>>> from seeing that 4^8=65536.
>>
>> I can see that 4^8 = 65536. Now how are you going to render 65537? You
>> claimed that you could render *any* number efficiently. What you've
>> proven is that a small subset of numbers can be rendered efficiently.
> 
> I think the idea would be to find the prime factorization for a given
> number, which has been proven to be available (and unique) for any and
> every number.  Most numbers can compress given this technique.

Sadly, they would not.

Let's suppose you wish to compress the number 143. So you find the prime 
factorization, 11*13=143. Have you compressed anything? No you have not 
-- the original number, 143, fits in a single byte, while it requires 
*two* bytes to deal with 11 and 13 (one byte for 11, and a second byte 
for 13).

We can try a different strategy: while 143 requires a full eight bits 
(one byte), both 11 and 13 require only four bits (one nybble) each. So 
by hard coding our expectation that the result will be two nybbles, we 
can avoid expanding the data and end up with exactly zero compression:

143 = binary 10001110 or eight bits in total
11, 13 = binary 1011 1101 or eight bits in total

But while that trick works for 143, it doesn't work for 154 (one byte, 
binary 10011010) which has prime factorization 2*7*11 (three nybbles 0010 
0111 1011). Even if you can somehow drop the leading zeroes, it requires 
nine bits versus eight.

Suppose instead of using bytes, you use decimal digits. 11 and 13 use 
four digits, while 143 uses only three. Likewise, 154 has three decimal 
digits while 2*7*11 has four digits. In both cases, there is no 
compression.

How about recording which prime number it is, instead of the actual value 
of the prime? So 154=2*7*11 expands to the 1st, 4th and 5th prime, so 
maybe we can record 1 4 5 instead of 2 7 11 and save some bits. (We'll 
save more bits the larger the prime.) Of course, to reverse the 
compression you need to keep a table of the prime numbers somewhere, and 
that's going to be a lot bigger than the space you save...

Now, I accept that, possibly, with sufficient work we might be able to 
find some cunning mechanism by which we can compress *any* integer value. 
But it won't be the same mechanism for every value! For example, we might 
compress (2**1000+1) using a run-length encoding of the binary bits, 
while compressing 14629839399572435720381495231 as its prime 
factorization (the 319th prime number, the 479th prime, the 499th prime 
six times), and 10**1000 as an encoding of the decimal digits. But we 
still need some sort of tag to distinguish between these different 
compression schemes, and once you take into account the extra information 
in the tag, there will be cases where some numbers aren't compressed at 
all.

The ability to losslessly compress *everything* is sheer pseudo-
mathematics, up there with finding an exact rational value for pi, or 
squaring the circle using only a compass and straight-edge. But the 
ability to losslessly compress *useful data* is real, and such algorithms 
operate by finding non-random patterns and redundant information in the 
data.



-- 
Steven

[toc] | [prev] | [next] | [standalone]


#58756 — Re: Algorithm that makes maximum compression of completly diffused data.

FromGregory Ewing <greg.ewing@canterbury.ac.nz>
Date2013-11-08 20:09 +1300
SubjectRe: Algorithm that makes maximum compression of completly diffused data.
Message-ID<be3h0tFr5c2U1@mid.individual.net>
In reply to#58751
Steven D'Aprano wrote:
> Of course, to reverse the 
> compression you need to keep a table of the prime numbers somewhere,

That's not strictly necessary -- you could calculate them
as needed. It wouldn't be *fast*, but it would work...

You've got me thinking now about how viable a compression
scheme this would be, efficiency issues aside. I suppose
it would depend on things like the average density of primes
and the average number of prime factors a number has.
Any number theorists here?

-- 
Greg

[toc] | [prev] | [next] | [standalone]


#58768

FromChris Angelico <rosuav@gmail.com>
Date2013-11-08 18:21 +1100
Message-ID<mailman.2199.1383903345.18130.python-list@python.org>
In reply to#58756
On Fri, Nov 8, 2013 at 6:09 PM, Gregory Ewing
<greg.ewing@canterbury.ac.nz> wrote:
> You've got me thinking now about how viable a compression
> scheme this would be, efficiency issues aside. I suppose
> it would depend on things like the average density of primes
> and the average number of prime factors a number has.
> Any number theorists here?

Start by coming up with an efficient storage representation for the
primes. Don't forget that you can't use a bitfield, because eg 98 =
7*7*2, so you somehow need to represent that the 7 comes up twice. And
you can't limit the size of your primes (eg by representing 98 as
"\x07\x07\x02").

And that's without dealing with the major issue of _finding_ prime
factors for a large number.

ChrisA

[toc] | [prev] | [next] | [standalone]


#58806

Fromjonas.thornvall@gmail.com
Date2013-11-08 07:48 -0800
Message-ID<992d347f-8e90-4c6b-b9e0-e539272f9721@googlegroups.com>
In reply to#58739
Den fredagen den 8:e november 2013 kl. 03:43:17 UTC+1 skrev zipher:
> >> I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536.
> 
> >
> 
> > I can see that 4^8 = 65536. Now how are you going to render 65537? You
> 
> > claimed that you could render *any* number efficiently. What you've
> 
> > proven is that a small subset of numbers can be rendered efficiently.
> 
> 
> 
> I think the idea would be to find the prime factorization for a given
> 
> number, which has been proven to be available (and unique) for any and
> 
> every number.  Most numbers can compress given this technique.  Prime
> 
> numbers, of course, would not be compressed.
> 
> 
> 
> -- 
> 
> MarkJ
> 
> Tacoma, Washington

3^2-2^2=5 

[toc] | [prev] | [next] | [standalone]


#58807

Fromrusi <rustompmody@gmail.com>
Date2013-11-08 07:57 -0800
Message-ID<4a7e5f20-f0f1-44e3-9087-ce3d919c1d44@googlegroups.com>
In reply to#58806
On Friday, November 8, 2013 9:18:05 PM UTC+5:30, jonas.t...@gmail.com wrote:
> Den fredagen den 8:e november 2013 kl. 03:43:17 UTC+1 skrev zipher:
> > >> I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536.
> > > I can see that 4^8 = 65536. Now how are you going to render 65537? You
> > > claimed that you could render *any* number efficiently. What you've
> > > proven is that a small subset of numbers can be rendered efficiently.
> > I think the idea would be to find the prime factorization for a given
> > number, which has been proven to be available (and unique) for any and
> > every number.  Most numbers can compress given this technique.  Prime
> > numbers, of course, would not be compressed.


> 3^2-2^2=5

Oh my! What a profound insight!!

[toc] | [prev] | [next] | [standalone]


#58837

FromIan Kelly <ian.g.kelly@gmail.com>
Date2013-11-08 11:48 -0700
Message-ID<mailman.2242.1383936569.18130.python-list@python.org>
In reply to#58806
On Fri, Nov 8, 2013 at 8:48 AM,  <jonas.thornvall@gmail.com> wrote:
> 3^2-2^2=5

How do you intend to encode 3**2 - 2**2 in such a way that it is more
compact than simply encoding 5?  If you actually have an algorithm,
you should share it instead of dropping these cryptic one-line
non-explanations and leaving us guessing about the details.  But I'm
starting to think that you don't actually have an algorithm at all,
whereas my initial impression was that you did and were simply
mistaken about its effectiveness.

[toc] | [prev] | [next] | [standalone]


#58740

From"R. Michael Weylandt <michael.weylandt@gmail.com>" <michael.weylandt@gmail.com>
Date2013-11-07 21:43 -0500
Message-ID<mailman.2177.1383878602.18130.python-list@python.org>
In reply to#58733

On Nov 7, 2013, at 21:25, jonas.thornvall@gmail.com wrote:

> Den fredagen den 8:e november 2013 kl. 03:17:36 UTC+1 skrev Chris Angelico:
>> On Fri, Nov 8, 2013 at 1:05 PM,  <jonas.thornvall@gmail.com> wrote:
>> 
>>> I guess what matter is how fast an algorithm can encode and decode a big number, at least if you want to use it for very big sets of random data, or losless video compression?
>> 
>> 
>> 
>> I don't care how fast. I care about the laws of physics :) You can't
>> 
>> stuff more data into less space without losing some of it.
>> 
>> 
>> 
>> Also, please lose Google Groups, or check out what other people have
>> 
>> said about making it less obnoxious.
>> 
>> 
>> 
>> ChrisA
> 
> Please, you are he obnoxious, so fuck off or go learn about reformulation of problems. Every number has an infinite number of arithmetical solutions. So every number do has a shortest arithmetical encoding. And that is not the hard part to figure out, the hard part is to find a generic arithmetic encoding.
> 
> I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536.

Chris's point is more subtle: the typical computer will store the number 65536 in a single byte, but it will also store 4 and 8 in one byte. So if your choice is between sending "65536" and "(4,8)", you actually loose efficiency in your scheme. Don't think in decimal, but in terms of information needing transfer. 

You might reply that you don't need a whole byte for 4 or 8 -- that's true. You could, e.g., just encode the fourth and eight bits of a single byte and send that: certainly gives some compression but at the cost of generality-- what would you do with 65^65? It's sort of like the mean-variance tradeoff in statistics; it's much easier to encode certain data sets (e.g. powers of two as you noted) but only if you concede your algorithm will only work for those values. 

More generally, check out the work of Claud Shannon; a very accessible and worthwhile author. 
> -- 
> https://mail.python.org/mailman/listinfo/python-list

[toc] | [prev] | [next] | [standalone]


#58741

FromChris Angelico <rosuav@gmail.com>
Date2013-11-08 14:05 +1100
Message-ID<mailman.2179.1383879958.18130.python-list@python.org>
In reply to#58733
On Fri, Nov 8, 2013 at 1:43 PM, Mark Janssen <dreamingforward@gmail.com> wrote:
>>> I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536.
>>
>> I can see that 4^8 = 65536. Now how are you going to render 65537? You
>> claimed that you could render *any* number efficiently. What you've
>> proven is that a small subset of numbers can be rendered efficiently.
>
> I think the idea would be to find the prime factorization for a given
> number, which has been proven to be available (and unique) for any and
> every number.  Most numbers can compress given this technique.  Prime
> numbers, of course, would not be compressed.

Also an interesting theory.

1) How do you represent the prime factors?
2) How do you factor large numbers efficiently? Trust me, if you've
solved this one, there are a *LOT* of people who want to know. People
with money, like Visa.
3) Still doesn't deal with all possible numbers.

ChrisA

[toc] | [prev] | [next] | [standalone]


#58742

FromRoy Smith <roy@panix.com>
Date2013-11-07 22:08 -0500
Message-ID<roy-3E9D1A.22083007112013@news.panix.com>
In reply to#58741
In article <mailman.2179.1383879958.18130.python-list@python.org>,
 Chris Angelico <rosuav@gmail.com> wrote:

> 2) How do you factor large numbers efficiently? Trust me, if you've
> solved this one, there are a *LOT* of people who want to know. People
> with money, like Visa.

Not to mention the NSA.

[toc] | [prev] | [next] | [standalone]


#58744

FromChris Angelico <rosuav@gmail.com>
Date2013-11-08 14:24 +1100
Message-ID<mailman.2181.1383881070.18130.python-list@python.org>
In reply to#58733
On Fri, Nov 8, 2013 at 1:43 PM, R. Michael Weylandt
<michael.weylandt@gmail.com> <michael.weylandt@gmail.com> wrote:
> Chris's point is more subtle: the typical computer will store the number 65536 in a single byte, but it will also store 4 and 8 in one byte. So if your choice is between sending "65536" and "(4,8)", you actually loose efficiency in your scheme. Don't think in decimal, but in terms of information needing transfer.

Well, 65536 won't fit in a single byte, nor even in two (only just). A
typical binary representation of 65536 would take 3 bytes, or 4 for a
common integer type: 0x00 0x01 0x00 0x00 (big-endian). So it would be
possible to represent that more compactly, if you deem one byte each
for the base and the exponent: 0x04 0x08. However, that system allows
you to represent just 62,041 possible numbers:

>>> decomp={}
>>> for base in range(256):
    for exp in range(256):
        decomp[base**exp]=base,exp

>>> len(decomp)
62041

The trouble is, these numbers range all the way up from 0**0 == 0 to
255**255 == uhh...
>>> 255**255
46531388344983681457769984555620005635274427815488751368772861643065273360461098097690597702647394229975161523887729348709679192202790820272357752329882392140552515610822058736740145045150003072264722464746837070302159356661765043244993104360887623976285955058200326531849137668562738184397385361179287309286327712528995820702180594566008294593820621769951491324907014215176509758404760451335847252744697820515292329680698271481385779516652518207263143889034764775414387732372812840456880885163361037485452406176311868267428358492408075197688911053603714883403374930891951109790394269793978310190141201019287109375

which would decompress to (obviously) 255 bytes. So you can represent
just over sixty thousand possible 255-byte strings in two bytes with
this system.

To generalize it, you'd need to devote a bit somewhere to saying
"There are more to add in". Let's say you do this on the exponent byte
(high bit for convenience), so you have 0x04 0x08 means 65536, and
0x04 0x88 0x01 0x01 means 65536+1 = 65537. Now we have something that
generalizes; but the efficiency is gone - and there are too many ways
to encode the same value. (Bear in mind, for instance, that 0x01 0xNN
for any NN will still just represent 1, and 0x00 0xNN will represent
0. That's wasting a lot of bits.)

The application I can see for this sort of thing is not data
compression, but puzzles. There are lots of puzzles that humans find
enjoyable that represent an entire grid with less data than it
contains - for one popular example, look at Sudoku. You have a 9x9
grid where each cell could contain any of nine digits, which means a
theoretical information density of 9**81; but the entire grid can be
described by a handful of digits and heaps of blank space. This could
be a similarly-fun mathematical puzzle: 3359232 can be represented as
B1**E1 + B2**E2, where all numbers are single-digit. Find B1, B2, E1,
and E2. In this case, you're guaranteed that the end result is shorter
(four digits), but it's hardly useful for general work.

ChrisA

[toc] | [prev] | [next] | [standalone]


#58747

From"R. Michael Weylandt <michael.weylandt@gmail.com>" <michael.weylandt@gmail.com>
Date2013-11-07 23:05 -0500
Message-ID<mailman.2183.1383883528.18130.python-list@python.org>
In reply to#58733

On Nov 7, 2013, at 22:24, Chris Angelico <rosuav@gmail.com> wrote:

> On Fri, Nov 8, 2013 at 1:43 PM, R. Michael Weylandt
> <michael.weylandt@gmail.com> <michael.weylandt@gmail.com> wrote:
>> Chris's point is more subtle: the typical computer will store the number 65536 in a single byte, but it will also store 4 and 8 in one byte. 
> 
> Well, 65536 won't fit in a single byte, nor even in two (only just). A
> typical binary representation of 65536 would take 3 bytes, or 4 for a
> common integer type: 0x00 0x01 0x00 0x00 (big-endian). 

Quite right -- meant C's int type, (machine word) not char. My mistake!

Michael

[toc] | [prev] | [next] | [standalone]


#58748

FromChris Angelico <rosuav@gmail.com>
Date2013-11-08 15:06 +1100
Message-ID<mailman.2184.1383883619.18130.python-list@python.org>
In reply to#58733
On Fri, Nov 8, 2013 at 3:05 PM, R. Michael Weylandt
<michael.weylandt@gmail.com> <michael.weylandt@gmail.com> wrote:
>
>
> On Nov 7, 2013, at 22:24, Chris Angelico <rosuav@gmail.com> wrote:
>
>> On Fri, Nov 8, 2013 at 1:43 PM, R. Michael Weylandt
>> <michael.weylandt@gmail.com> <michael.weylandt@gmail.com> wrote:
>>> Chris's point is more subtle: the typical computer will store the number 65536 in a single byte, but it will also store 4 and 8 in one byte.
>>
>> Well, 65536 won't fit in a single byte, nor even in two (only just). A
>> typical binary representation of 65536 would take 3 bytes, or 4 for a
>> common integer type: 0x00 0x01 0x00 0x00 (big-endian).
>
> Quite right -- meant C's int type, (machine word) not char. My mistake!

Sure. Assuming at least 32-bit words, yes, that's correct. Of course,
this is still just proving that it's {possible, not possible} to
compress specific values, while the OP claimed to compress anything.

ChrisA

[toc] | [prev] | [next] | [standalone]


#58749 — Re: Algorithm that makes maximum compression of completly diffused data.

FromDave Angel <davea@davea.name>
Date2013-11-07 22:12 -0600
SubjectRe: Algorithm that makes maximum compression of completly diffused data.
Message-ID<mailman.2185.1383883932.18130.python-list@python.org>
In reply to#58733
On Thu, 7 Nov 2013 18:43:17 -0800, Mark Janssen 
<dreamingforward@gmail.com> wrote:
> I think the idea would be to find the prime factorization for a 
given
> number, which has been proven to be available (and unique) for any 
and
> every number.  Most numbers can compress given this technique.  
Prime
> numbers, of course, would not be compressed.

No, very few numbers can be compressed using this technique. And 
primes of course expand greatly.

-- 
DaveA

[toc] | [prev] | [next] | [standalone]


Page 3 of 4 — ← Prev page 1 2 [3] 4  Next page →

Back to top | Article view | comp.lang.python


csiph-web