Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #58077 > unrolled thread
| Started by | jonas.thornvall@gmail.com |
|---|---|
| First post | 2013-10-30 11:21 -0700 |
| Last post | 2013-11-03 04:50 -0500 |
| Articles | 20 on this page of 72 — 22 participants |
Back to article view | Back to comp.lang.python
Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 11:21 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 18:53 +0000
Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:01 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 19:18 +0000
Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:22 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 19:31 +0000
Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:23 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 19:35 +0000
Re: Algorithm that makes maximum compression of completly diffused data. Ethan Furman <ethan@stoneleaf.us> - 2013-11-02 21:26 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-10-30 20:28 +0100
Re: Algorithm that makes maximum compression of completly diffused data. Joshua Landau <joshua@landau.ws> - 2013-10-30 21:30 +0000
Re: Algorithm that makes maximum compression of completly diffused data. rusi <rustompmody@gmail.com> - 2013-10-31 05:54 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-10-30 21:52 +0000
Re: Algorithm that makes maximum compression of completly diffused data. Tim Chase <python.list@tim.thechases.com> - 2013-10-30 18:01 -0500
Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-10-31 10:41 +1100
Re: Algorithm that makes maximum compression of completly diffused data. Dan Stromberg <drsalists@gmail.com> - 2013-10-30 12:29 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Tim Delaney <timothy.c.delaney@gmail.com> - 2013-10-31 06:35 +1100
Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:47 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Modulok <modulok@gmail.com> - 2013-10-30 13:46 -0600
Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:47 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Gene Heskett <gheskett@wdtv.com> - 2013-10-30 16:32 -0400
Re: Algorithm that makes maximum compression of completly diffused data. Tim Roberts <timr@probo.com> - 2013-11-02 14:31 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-02 14:37 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-03 03:17 +0000
Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-03 15:10 +1100
Re: Algorithm that makes maximum compression of completly diffused data. Joshua Landau <joshua@landau.ws> - 2013-11-03 15:34 +0000
Re: Algorithm that makes maximum compression of completly diffused data. Joshua Landau <joshua@landau.ws> - 2013-11-03 15:51 +0000
Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-03 19:40 -0800
Re: Algorithm that makes maximum compression of completly diffused data. Tim Chase <python.list@tim.thechases.com> - 2013-11-04 07:08 -0600
Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-04 05:53 -0800
Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-04 06:00 -0800
Re: Algorithm that makes maximum compression of completly diffused
data. Dave Angel <davea@davea.name> - 2013-11-04 08:27 -0600
Re: Algorithm that makes maximum compression of completly diffused data. rusi <rustompmody@gmail.com> - 2013-11-04 06:46 -0800
Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-04 14:34 -0800
Re: Algorithm that makes maximum compression of completly diffused
data. Dave Angel <davea@davea.name> - 2013-11-04 19:29 -0600
Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-05 04:33 +0000
Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-05 04:36 +0000
Re: Algorithm that makes maximum compression of completly diffused data. Tim Roberts <timr@probo.com> - 2013-11-07 00:05 -0800
Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-07 10:59 -0800
Re: Algorithm that makes maximum compression of completly diffused data. Tim Roberts <timr@probo.com> - 2013-11-07 11:22 -0800
Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 09:26 +1100
Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-07 18:05 -0800
Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 13:17 +1100
Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-07 18:25 -0800
Re: Algorithm that makes maximum compression of completly diffused data. rusi <rustompmody@gmail.com> - 2013-11-07 18:36 -0800
Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 13:36 +1100
Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-07 18:43 -0800
Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-08 04:47 +0000
Re: Algorithm that makes maximum compression of completly diffused data. Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2013-11-08 20:09 +1300
Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 18:21 +1100
Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-11-08 07:48 -0800
Re: Algorithm that makes maximum compression of completly diffused data. rusi <rustompmody@gmail.com> - 2013-11-08 07:57 -0800
Re: Algorithm that makes maximum compression of completly diffused data. Ian Kelly <ian.g.kelly@gmail.com> - 2013-11-08 11:48 -0700
Re: Algorithm that makes maximum compression of completly diffused data. "R. Michael Weylandt <michael.weylandt@gmail.com>" <michael.weylandt@gmail.com> - 2013-11-07 21:43 -0500
Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 14:05 +1100
Re: Algorithm that makes maximum compression of completly diffused data. Roy Smith <roy@panix.com> - 2013-11-07 22:08 -0500
Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 14:24 +1100
Re: Algorithm that makes maximum compression of completly diffused data. "R. Michael Weylandt <michael.weylandt@gmail.com>" <michael.weylandt@gmail.com> - 2013-11-07 23:05 -0500
Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 15:06 +1100
Re: Algorithm that makes maximum compression of completly diffused
data. Dave Angel <davea@davea.name> - 2013-11-07 22:12 -0600
Re: Algorithm that makes maximum compression of completly diffused data. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-08 05:32 +0000
Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-07 18:24 -0800
Re: Algorithm that makes maximum compression of completly diffused data. Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2013-11-08 20:16 +1300
Re: Algorithm that makes maximum compression of completly diffused data. Chris Angelico <rosuav@gmail.com> - 2013-11-08 13:27 +1100
Re: Algorithm that makes maximum compression of completly diffused data. Ethan Furman <ethan@stoneleaf.us> - 2013-11-02 21:26 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-11-02 23:09 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Michael Torrie <torriem@gmail.com> - 2013-11-03 08:14 -0700
Re: Algorithm that makes maximum compression of completly diffused data. jonas.thornvall@gmail.com - 2013-10-30 12:49 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Grant Edwards <invalid@invalid.invalid> - 2013-10-30 21:18 +0000
Re: Algorithm that makes maximum compression of completly diffused data. Mark Janssen <dreamingforward@gmail.com> - 2013-10-30 14:26 -0700
Re: Algorithm that makes maximum compression of completly diffused data. Dave Angel <davea@davea.name> - 2013-10-31 03:22 +0000
Re: Algorithm that makes maximum compression of completly diffused data. Gene Heskett <gheskett@wdtv.com> - 2013-11-03 04:50 -0500
Page 3 of 4 — ← Prev page 1 2 [3] 4 Next page →
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-11-08 09:26 +1100 |
| Message-ID | <mailman.2161.1383863216.18130.python-list@python.org> |
| In reply to | #58634 |
On Fri, Nov 8, 2013 at 5:59 AM, Mark Janssen <dreamingforward@gmail.com> wrote: > I think the idea is that you could take any arbitrary input sequence, > view it as a large number, and then find what exponential equation can > produce that result. The equation becomes the "compression". Interesting idea, but I don't see how this has anything to do with compressing arbitrary data. We already have a compact notation for representing a large number as 2^24*x+2^16*y+2^8*z+q, so I don't see how this will be any better, other than eliding NULs. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | jonas.thornvall@gmail.com |
|---|---|
| Date | 2013-11-07 18:05 -0800 |
| Message-ID | <9b62770c-7ca1-4a4d-81a5-bf7251bac957@googlegroups.com> |
| In reply to | #58708 |
Den torsdagen den 7:e november 2013 kl. 23:26:45 UTC+1 skrev Chris Angelico: > On Fri, Nov 8, 2013 at 5:59 AM, Mark Janssen <dreamingforward@gmail.com> wrote: > > > I think the idea is that you could take any arbitrary input sequence, > > > view it as a large number, and then find what exponential equation can > > > produce that result. The equation becomes the "compression". > > > > Interesting idea, but I don't see how this has anything to do with > > compressing arbitrary data. We already have a compact notation for > > representing a large number as 2^24*x+2^16*y+2^8*z+q, so I don't see > > how this will be any better, other than eliding NULs. > > > > ChrisA I guess what matter is how fast an algorithm can encode and decode a big number, at least if you want to use it for very big sets of random data, or losless video compression?
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-11-08 13:17 +1100 |
| Message-ID | <mailman.2171.1383877064.18130.python-list@python.org> |
| In reply to | #58727 |
On Fri, Nov 8, 2013 at 1:05 PM, <jonas.thornvall@gmail.com> wrote: > I guess what matter is how fast an algorithm can encode and decode a big number, at least if you want to use it for very big sets of random data, or losless video compression? I don't care how fast. I care about the laws of physics :) You can't stuff more data into less space without losing some of it. Also, please lose Google Groups, or check out what other people have said about making it less obnoxious. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | jonas.thornvall@gmail.com |
|---|---|
| Date | 2013-11-07 18:25 -0800 |
| Message-ID | <13c04f06-f1f2-4f67-b975-3cff28714641@googlegroups.com> |
| In reply to | #58730 |
Den fredagen den 8:e november 2013 kl. 03:17:36 UTC+1 skrev Chris Angelico: > On Fri, Nov 8, 2013 at 1:05 PM, <jonas.thornvall@gmail.com> wrote: > > > I guess what matter is how fast an algorithm can encode and decode a big number, at least if you want to use it for very big sets of random data, or losless video compression? > > > > I don't care how fast. I care about the laws of physics :) You can't > > stuff more data into less space without losing some of it. > > > > Also, please lose Google Groups, or check out what other people have > > said about making it less obnoxious. > > > > ChrisA Please, you are he obnoxious, so fuck off or go learn about reformulation of problems. Every number has an infinite number of arithmetical solutions. So every number do has a shortest arithmetical encoding. And that is not the hard part to figure out, the hard part is to find a generic arithmetic encoding. I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536.
[toc] | [prev] | [next] | [standalone]
| From | rusi <rustompmody@gmail.com> |
|---|---|
| Date | 2013-11-07 18:36 -0800 |
| Message-ID | <cf822290-556c-44a2-bd41-93c18b688d8b@googlegroups.com> |
| In reply to | #58733 |
On Friday, November 8, 2013 7:55:18 AM UTC+5:30, jonas wrote: > Den fredagen den 8:e november 2013 kl. 03:17:36 UTC+1 skrev Chris Angelico: > > On Fri, Nov 8, 2013 at 1:05 PM, jonas.thornvall wrote: > > > I guess what matter is how fast an algorithm can encode and decode a big number, at least if you want to use it for very big sets of random data, or losless video compression? > > I don't care how fast. I care about the laws of physics :) You can't > > stuff more data into less space without losing some of it. > > Also, please lose Google Groups, or check out what other people have > > said about making it less obnoxious. > > ChrisA > Please, you are he obnoxious, so fuck off or go learn about > reformulation of problems. Every number has an infinite number of > arithmetical solutions. So every number do has a shortest > arithmetical encoding. And that is not the hard part to figure out, > the hard part is to find a generic arithmetic encoding. Since we are discussing profound matters such as cosmic laws, here's my 'all-universal' law: When you engage with a troll you get trolled.
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-11-08 13:36 +1100 |
| Message-ID | <mailman.2175.1383878218.18130.python-list@python.org> |
| In reply to | #58733 |
On Fri, Nov 8, 2013 at 1:25 PM, <jonas.thornvall@gmail.com> wrote: > Please, you are he obnoxious, so fuck off or go learn about reformulation of problems. Every number has an infinite number of arithmetical solutions. So every number do has a shortest arithmetical encoding. And that is not the hard part to figure out, the hard part is to find a generic arithmetic encoding. > > I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536. I can see that 4^8 = 65536. Now how are you going to render 65537? You claimed that you could render *any* number efficiently. What you've proven is that a small subset of numbers can be rendered efficiently. Maybe what you can do is render a large number of small subsets of numbers efficiently. (Let's say all instances of integer^integer, and all cases of Fibonacci numbers.) You'll still have some holes in between which you can't render as tidily, and these are the ones where the best representation is the raw one, plus a small tag saying that it's raw. That's how most compression algorithms work. Also, please don't swear. When I described Google Groups posts as "obnoxious", I was using the word in a strictly correct way, and that is not an invitation for profanity. Of course, if you want to be seen as *yourself* obnoxious, that's one of the easier ways to accomplish that. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Mark Janssen <dreamingforward@gmail.com> |
|---|---|
| Date | 2013-11-07 18:43 -0800 |
| Message-ID | <mailman.2178.1383878604.18130.python-list@python.org> |
| In reply to | #58733 |
>> I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536. > > I can see that 4^8 = 65536. Now how are you going to render 65537? You > claimed that you could render *any* number efficiently. What you've > proven is that a small subset of numbers can be rendered efficiently. I think the idea would be to find the prime factorization for a given number, which has been proven to be available (and unique) for any and every number. Most numbers can compress given this technique. Prime numbers, of course, would not be compressed. -- MarkJ Tacoma, Washington
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-11-08 04:47 +0000 |
| Message-ID | <527c6cf1$0$29983$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #58739 |
On Thu, 07 Nov 2013 18:43:17 -0800, Mark Janssen wrote: >>> I am not sure if it is just stupidness or laziness that prevent you >>> from seeing that 4^8=65536. >> >> I can see that 4^8 = 65536. Now how are you going to render 65537? You >> claimed that you could render *any* number efficiently. What you've >> proven is that a small subset of numbers can be rendered efficiently. > > I think the idea would be to find the prime factorization for a given > number, which has been proven to be available (and unique) for any and > every number. Most numbers can compress given this technique. Sadly, they would not. Let's suppose you wish to compress the number 143. So you find the prime factorization, 11*13=143. Have you compressed anything? No you have not -- the original number, 143, fits in a single byte, while it requires *two* bytes to deal with 11 and 13 (one byte for 11, and a second byte for 13). We can try a different strategy: while 143 requires a full eight bits (one byte), both 11 and 13 require only four bits (one nybble) each. So by hard coding our expectation that the result will be two nybbles, we can avoid expanding the data and end up with exactly zero compression: 143 = binary 10001110 or eight bits in total 11, 13 = binary 1011 1101 or eight bits in total But while that trick works for 143, it doesn't work for 154 (one byte, binary 10011010) which has prime factorization 2*7*11 (three nybbles 0010 0111 1011). Even if you can somehow drop the leading zeroes, it requires nine bits versus eight. Suppose instead of using bytes, you use decimal digits. 11 and 13 use four digits, while 143 uses only three. Likewise, 154 has three decimal digits while 2*7*11 has four digits. In both cases, there is no compression. How about recording which prime number it is, instead of the actual value of the prime? So 154=2*7*11 expands to the 1st, 4th and 5th prime, so maybe we can record 1 4 5 instead of 2 7 11 and save some bits. (We'll save more bits the larger the prime.) Of course, to reverse the compression you need to keep a table of the prime numbers somewhere, and that's going to be a lot bigger than the space you save... Now, I accept that, possibly, with sufficient work we might be able to find some cunning mechanism by which we can compress *any* integer value. But it won't be the same mechanism for every value! For example, we might compress (2**1000+1) using a run-length encoding of the binary bits, while compressing 14629839399572435720381495231 as its prime factorization (the 319th prime number, the 479th prime, the 499th prime six times), and 10**1000 as an encoding of the decimal digits. But we still need some sort of tag to distinguish between these different compression schemes, and once you take into account the extra information in the tag, there will be cases where some numbers aren't compressed at all. The ability to losslessly compress *everything* is sheer pseudo- mathematics, up there with finding an exact rational value for pi, or squaring the circle using only a compass and straight-edge. But the ability to losslessly compress *useful data* is real, and such algorithms operate by finding non-random patterns and redundant information in the data. -- Steven
[toc] | [prev] | [next] | [standalone]
| From | Gregory Ewing <greg.ewing@canterbury.ac.nz> |
|---|---|
| Date | 2013-11-08 20:09 +1300 |
| Subject | Re: Algorithm that makes maximum compression of completly diffused data. |
| Message-ID | <be3h0tFr5c2U1@mid.individual.net> |
| In reply to | #58751 |
Steven D'Aprano wrote: > Of course, to reverse the > compression you need to keep a table of the prime numbers somewhere, That's not strictly necessary -- you could calculate them as needed. It wouldn't be *fast*, but it would work... You've got me thinking now about how viable a compression scheme this would be, efficiency issues aside. I suppose it would depend on things like the average density of primes and the average number of prime factors a number has. Any number theorists here? -- Greg
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-11-08 18:21 +1100 |
| Message-ID | <mailman.2199.1383903345.18130.python-list@python.org> |
| In reply to | #58756 |
On Fri, Nov 8, 2013 at 6:09 PM, Gregory Ewing <greg.ewing@canterbury.ac.nz> wrote: > You've got me thinking now about how viable a compression > scheme this would be, efficiency issues aside. I suppose > it would depend on things like the average density of primes > and the average number of prime factors a number has. > Any number theorists here? Start by coming up with an efficient storage representation for the primes. Don't forget that you can't use a bitfield, because eg 98 = 7*7*2, so you somehow need to represent that the 7 comes up twice. And you can't limit the size of your primes (eg by representing 98 as "\x07\x07\x02"). And that's without dealing with the major issue of _finding_ prime factors for a large number. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | jonas.thornvall@gmail.com |
|---|---|
| Date | 2013-11-08 07:48 -0800 |
| Message-ID | <992d347f-8e90-4c6b-b9e0-e539272f9721@googlegroups.com> |
| In reply to | #58739 |
Den fredagen den 8:e november 2013 kl. 03:43:17 UTC+1 skrev zipher: > >> I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536. > > > > > > I can see that 4^8 = 65536. Now how are you going to render 65537? You > > > claimed that you could render *any* number efficiently. What you've > > > proven is that a small subset of numbers can be rendered efficiently. > > > > I think the idea would be to find the prime factorization for a given > > number, which has been proven to be available (and unique) for any and > > every number. Most numbers can compress given this technique. Prime > > numbers, of course, would not be compressed. > > > > -- > > MarkJ > > Tacoma, Washington 3^2-2^2=5
[toc] | [prev] | [next] | [standalone]
| From | rusi <rustompmody@gmail.com> |
|---|---|
| Date | 2013-11-08 07:57 -0800 |
| Message-ID | <4a7e5f20-f0f1-44e3-9087-ce3d919c1d44@googlegroups.com> |
| In reply to | #58806 |
On Friday, November 8, 2013 9:18:05 PM UTC+5:30, jonas.t...@gmail.com wrote: > Den fredagen den 8:e november 2013 kl. 03:43:17 UTC+1 skrev zipher: > > >> I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536. > > > I can see that 4^8 = 65536. Now how are you going to render 65537? You > > > claimed that you could render *any* number efficiently. What you've > > > proven is that a small subset of numbers can be rendered efficiently. > > I think the idea would be to find the prime factorization for a given > > number, which has been proven to be available (and unique) for any and > > every number. Most numbers can compress given this technique. Prime > > numbers, of course, would not be compressed. > 3^2-2^2=5 Oh my! What a profound insight!!
[toc] | [prev] | [next] | [standalone]
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2013-11-08 11:48 -0700 |
| Message-ID | <mailman.2242.1383936569.18130.python-list@python.org> |
| In reply to | #58806 |
On Fri, Nov 8, 2013 at 8:48 AM, <jonas.thornvall@gmail.com> wrote: > 3^2-2^2=5 How do you intend to encode 3**2 - 2**2 in such a way that it is more compact than simply encoding 5? If you actually have an algorithm, you should share it instead of dropping these cryptic one-line non-explanations and leaving us guessing about the details. But I'm starting to think that you don't actually have an algorithm at all, whereas my initial impression was that you did and were simply mistaken about its effectiveness.
[toc] | [prev] | [next] | [standalone]
| From | "R. Michael Weylandt <michael.weylandt@gmail.com>" <michael.weylandt@gmail.com> |
|---|---|
| Date | 2013-11-07 21:43 -0500 |
| Message-ID | <mailman.2177.1383878602.18130.python-list@python.org> |
| In reply to | #58733 |
On Nov 7, 2013, at 21:25, jonas.thornvall@gmail.com wrote: > Den fredagen den 8:e november 2013 kl. 03:17:36 UTC+1 skrev Chris Angelico: >> On Fri, Nov 8, 2013 at 1:05 PM, <jonas.thornvall@gmail.com> wrote: >> >>> I guess what matter is how fast an algorithm can encode and decode a big number, at least if you want to use it for very big sets of random data, or losless video compression? >> >> >> >> I don't care how fast. I care about the laws of physics :) You can't >> >> stuff more data into less space without losing some of it. >> >> >> >> Also, please lose Google Groups, or check out what other people have >> >> said about making it less obnoxious. >> >> >> >> ChrisA > > Please, you are he obnoxious, so fuck off or go learn about reformulation of problems. Every number has an infinite number of arithmetical solutions. So every number do has a shortest arithmetical encoding. And that is not the hard part to figure out, the hard part is to find a generic arithmetic encoding. > > I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536. Chris's point is more subtle: the typical computer will store the number 65536 in a single byte, but it will also store 4 and 8 in one byte. So if your choice is between sending "65536" and "(4,8)", you actually loose efficiency in your scheme. Don't think in decimal, but in terms of information needing transfer. You might reply that you don't need a whole byte for 4 or 8 -- that's true. You could, e.g., just encode the fourth and eight bits of a single byte and send that: certainly gives some compression but at the cost of generality-- what would you do with 65^65? It's sort of like the mean-variance tradeoff in statistics; it's much easier to encode certain data sets (e.g. powers of two as you noted) but only if you concede your algorithm will only work for those values. More generally, check out the work of Claud Shannon; a very accessible and worthwhile author. > -- > https://mail.python.org/mailman/listinfo/python-list
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-11-08 14:05 +1100 |
| Message-ID | <mailman.2179.1383879958.18130.python-list@python.org> |
| In reply to | #58733 |
On Fri, Nov 8, 2013 at 1:43 PM, Mark Janssen <dreamingforward@gmail.com> wrote: >>> I am not sure if it is just stupidness or laziness that prevent you from seeing that 4^8=65536. >> >> I can see that 4^8 = 65536. Now how are you going to render 65537? You >> claimed that you could render *any* number efficiently. What you've >> proven is that a small subset of numbers can be rendered efficiently. > > I think the idea would be to find the prime factorization for a given > number, which has been proven to be available (and unique) for any and > every number. Most numbers can compress given this technique. Prime > numbers, of course, would not be compressed. Also an interesting theory. 1) How do you represent the prime factors? 2) How do you factor large numbers efficiently? Trust me, if you've solved this one, there are a *LOT* of people who want to know. People with money, like Visa. 3) Still doesn't deal with all possible numbers. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2013-11-07 22:08 -0500 |
| Message-ID | <roy-3E9D1A.22083007112013@news.panix.com> |
| In reply to | #58741 |
In article <mailman.2179.1383879958.18130.python-list@python.org>, Chris Angelico <rosuav@gmail.com> wrote: > 2) How do you factor large numbers efficiently? Trust me, if you've > solved this one, there are a *LOT* of people who want to know. People > with money, like Visa. Not to mention the NSA.
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-11-08 14:24 +1100 |
| Message-ID | <mailman.2181.1383881070.18130.python-list@python.org> |
| In reply to | #58733 |
On Fri, Nov 8, 2013 at 1:43 PM, R. Michael Weylandt
<michael.weylandt@gmail.com> <michael.weylandt@gmail.com> wrote:
> Chris's point is more subtle: the typical computer will store the number 65536 in a single byte, but it will also store 4 and 8 in one byte. So if your choice is between sending "65536" and "(4,8)", you actually loose efficiency in your scheme. Don't think in decimal, but in terms of information needing transfer.
Well, 65536 won't fit in a single byte, nor even in two (only just). A
typical binary representation of 65536 would take 3 bytes, or 4 for a
common integer type: 0x00 0x01 0x00 0x00 (big-endian). So it would be
possible to represent that more compactly, if you deem one byte each
for the base and the exponent: 0x04 0x08. However, that system allows
you to represent just 62,041 possible numbers:
>>> decomp={}
>>> for base in range(256):
for exp in range(256):
decomp[base**exp]=base,exp
>>> len(decomp)
62041
The trouble is, these numbers range all the way up from 0**0 == 0 to
255**255 == uhh...
>>> 255**255
46531388344983681457769984555620005635274427815488751368772861643065273360461098097690597702647394229975161523887729348709679192202790820272357752329882392140552515610822058736740145045150003072264722464746837070302159356661765043244993104360887623976285955058200326531849137668562738184397385361179287309286327712528995820702180594566008294593820621769951491324907014215176509758404760451335847252744697820515292329680698271481385779516652518207263143889034764775414387732372812840456880885163361037485452406176311868267428358492408075197688911053603714883403374930891951109790394269793978310190141201019287109375
which would decompress to (obviously) 255 bytes. So you can represent
just over sixty thousand possible 255-byte strings in two bytes with
this system.
To generalize it, you'd need to devote a bit somewhere to saying
"There are more to add in". Let's say you do this on the exponent byte
(high bit for convenience), so you have 0x04 0x08 means 65536, and
0x04 0x88 0x01 0x01 means 65536+1 = 65537. Now we have something that
generalizes; but the efficiency is gone - and there are too many ways
to encode the same value. (Bear in mind, for instance, that 0x01 0xNN
for any NN will still just represent 1, and 0x00 0xNN will represent
0. That's wasting a lot of bits.)
The application I can see for this sort of thing is not data
compression, but puzzles. There are lots of puzzles that humans find
enjoyable that represent an entire grid with less data than it
contains - for one popular example, look at Sudoku. You have a 9x9
grid where each cell could contain any of nine digits, which means a
theoretical information density of 9**81; but the entire grid can be
described by a handful of digits and heaps of blank space. This could
be a similarly-fun mathematical puzzle: 3359232 can be represented as
B1**E1 + B2**E2, where all numbers are single-digit. Find B1, B2, E1,
and E2. In this case, you're guaranteed that the end result is shorter
(four digits), but it's hardly useful for general work.
ChrisA
[toc] | [prev] | [next] | [standalone]
| From | "R. Michael Weylandt <michael.weylandt@gmail.com>" <michael.weylandt@gmail.com> |
|---|---|
| Date | 2013-11-07 23:05 -0500 |
| Message-ID | <mailman.2183.1383883528.18130.python-list@python.org> |
| In reply to | #58733 |
On Nov 7, 2013, at 22:24, Chris Angelico <rosuav@gmail.com> wrote: > On Fri, Nov 8, 2013 at 1:43 PM, R. Michael Weylandt > <michael.weylandt@gmail.com> <michael.weylandt@gmail.com> wrote: >> Chris's point is more subtle: the typical computer will store the number 65536 in a single byte, but it will also store 4 and 8 in one byte. > > Well, 65536 won't fit in a single byte, nor even in two (only just). A > typical binary representation of 65536 would take 3 bytes, or 4 for a > common integer type: 0x00 0x01 0x00 0x00 (big-endian). Quite right -- meant C's int type, (machine word) not char. My mistake! Michael
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-11-08 15:06 +1100 |
| Message-ID | <mailman.2184.1383883619.18130.python-list@python.org> |
| In reply to | #58733 |
On Fri, Nov 8, 2013 at 3:05 PM, R. Michael Weylandt
<michael.weylandt@gmail.com> <michael.weylandt@gmail.com> wrote:
>
>
> On Nov 7, 2013, at 22:24, Chris Angelico <rosuav@gmail.com> wrote:
>
>> On Fri, Nov 8, 2013 at 1:43 PM, R. Michael Weylandt
>> <michael.weylandt@gmail.com> <michael.weylandt@gmail.com> wrote:
>>> Chris's point is more subtle: the typical computer will store the number 65536 in a single byte, but it will also store 4 and 8 in one byte.
>>
>> Well, 65536 won't fit in a single byte, nor even in two (only just). A
>> typical binary representation of 65536 would take 3 bytes, or 4 for a
>> common integer type: 0x00 0x01 0x00 0x00 (big-endian).
>
> Quite right -- meant C's int type, (machine word) not char. My mistake!
Sure. Assuming at least 32-bit words, yes, that's correct. Of course,
this is still just proving that it's {possible, not possible} to
compress specific values, while the OP claimed to compress anything.
ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2013-11-07 22:12 -0600 |
| Subject | Re: Algorithm that makes maximum compression of completly diffused data. |
| Message-ID | <mailman.2185.1383883932.18130.python-list@python.org> |
| In reply to | #58733 |
On Thu, 7 Nov 2013 18:43:17 -0800, Mark Janssen <dreamingforward@gmail.com> wrote: > I think the idea would be to find the prime factorization for a given > number, which has been proven to be available (and unique) for any and > every number. Most numbers can compress given this technique. Prime > numbers, of course, would not be compressed. No, very few numbers can be compressed using this technique. And primes of course expand greatly. -- DaveA
[toc] | [prev] | [next] | [standalone]
Page 3 of 4 — ← Prev page 1 2 [3] 4 Next page →
Back to top | Article view | comp.lang.python
csiph-web