


Re: Algorithm that makes maximum compression of completly diffused data.

References (7 earlier) <mailman.2161.1383863216.18130.python-list@python.org> <9b62770c-7ca1-4a4d-81a5-bf7251bac957@googlegroups.com> <mailman.2171.1383877064.18130.python-list@python.org> <13c04f06-f1f2-4f67-b975-3cff28714641@googlegroups.com> <B07C8854-073B-4568-86E2-64374D33163A@gmail.com>
Date 2013-11-08 14:24 +1100
Subject Re: Algorithm that makes maximum compression of completly diffused data.
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.2181.1383881070.18130.python-list@python.org> (permalink)



On Fri, Nov 8, 2013 at 1:43 PM, R. Michael Weylandt
<michael.weylandt@gmail.com> wrote:
> Chris's point is more subtle: the typical computer will store the number 65536 in a single byte, but it will also store 4 and 8 in one byte. So if your choice is between sending "65536" and "(4,8)", you actually lose efficiency in your scheme. Don't think in decimal, but in terms of information needing transfer.

Well, 65536 won't fit in a single byte, nor even in two (it only just
misses: two bytes top out at 65535). A typical binary representation of
65536 would take 3 bytes, or 4 for a common integer type: 0x00 0x01 0x00
0x00 (big-endian). So it would be possible to represent that more
compactly, if you allow one byte each for the base and the exponent:
0x04 0x08. However, that system allows you to represent just 62,041
possible numbers:

>>> decomp = {}
>>> for base in range(256):
...     for exp in range(256):
...         decomp[base**exp] = base, exp
...
>>> len(decomp)
62041

The trouble is, these numbers range all the way up from 0**1 == 0 to
255**255 == uhh...
>>> 255**255
46531388344983681457769984555620005635274427815488751368772861643065273360461098097690597702647394229975161523887729348709679192202790820272357752329882392140552515610822058736740145045150003072264722464746837070302159356661765043244993104360887623976285955058200326531849137668562738184397385361179287309286327712528995820702180594566008294593820621769951491324907014215176509758404760451335847252744697820515292329680698271481385779516652518207263143889034764775414387732372812840456880885163361037485452406176311868267428358492408075197688911053603714883403374930891951109790394269793978310190141201019287109375

which would decompress to (obviously) 255 bytes. So with this system,
two bytes can stand for just over sixty thousand of the possible values
of up to 255 bytes, and nothing else.
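
As a rough check of the byte counts above, here is a minimal sketch using
Python's built-in int.bit_length and int.to_bytes. The helper name
byte_length is made up for this illustration.

```python
def byte_length(n):
    """Smallest number of whole bytes that can hold the non-negative int n."""
    # Round the bit count up to a whole number of bytes; 0 still needs one byte.
    return max(1, (n.bit_length() + 7) // 8)


print(byte_length(65535))                 # largest value that fits in two bytes
print(byte_length(65536))                 # one past it
print((65536).to_bytes(4, "big").hex())   # the 4-byte big-endian form
print(byte_length(255**255))              # size of the biggest base**exp value
```

Two bytes can distinguish at most 256**2 == 65536 different values, so any
two-byte scheme can cover only a tiny fraction of the values that need up to
255 bytes.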

To generalize it, you'd need to devote a bit somewhere to saying
"There is more to add in". Let's say you do this on the exponent byte
(high bit for convenience, leaving seven bits for the exponent itself),
so 0x04 0x08 means 4**8 = 65536, and 0x04 0x88 0x01 0x01 means
4**8 + 1**1 = 65537. Now we have something that generalizes; but the
efficiency is gone - and there are too many ways to encode the same
value. (Bear in mind, for instance, that 0x01 0xNN for any NN will
still just represent 1, and 0x00 0xNN will represent 0 for any nonzero
exponent. That's wasting a lot of bits.)
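
The generalized scheme can be sketched as a small decoder. This is only one
reading of the description above: it assumes the input is a sequence of
(base, exponent) byte pairs, that the low seven bits of the exponent byte are
the exponent, and that the high bit means another pair follows. The function
name decode is invented for the example, and truncated input is not handled.

```python
def decode(data):
    """Sum base**exp over (base, exp) byte pairs.

    Assumed layout: the high bit of each exponent byte flags that another
    pair follows; the low seven bits are the exponent itself.
    """
    total = 0
    i = 0
    while True:
        base, exp = data[i], data[i + 1]
        total += base ** (exp & 0x7F)
        i += 2
        if not exp & 0x80:      # high bit clear: this was the last pair
            return total


print(decode(bytes([0x04, 0x08])))               # 4**8
print(decode(bytes([0x04, 0x88, 0x01, 0x01])))   # 4**8 + 1**1

# Redundancy: several different byte strings decode to the same number.
print(decode(bytes([0x02, 0x10])) == decode(bytes([0x04, 0x08])))
print(decode(bytes([0x01, 0x05])) == decode(bytes([0x01, 0x7F])))
```

Every pair of encodings that decodes to the same value is code space that
could have named a different number, which is where the efficiency goes.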

The application I can see for this sort of thing is not data
compression, but puzzles. There are lots of puzzles that humans find
enjoyable that represent an entire grid with less data than it
contains - for one popular example, look at Sudoku. You have a 9x9
grid where each cell could contain any of nine digits, which means
9**81 theoretically possible grids; but the entire grid can be
described by a handful of digits and heaps of blank space. This could
be a similarly-fun mathematical puzzle: 3359232 can be represented as
B1**E1 + B2**E2, where all numbers are single-digit. Find B1, B2, E1,
and E2. In this case, you're guaranteed that the end result is shorter
(four digits), but it's hardly useful for general work.
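
For anyone who would rather let the computer do the puzzle, here is a minimal
brute-force sketch. It assumes "single-digit" means each of B1, E1, B2, E2 is
in range(10), and it treats (B1, E1) and (B2, E2) as interchangeable. The
function name solve is made up for the example.

```python
from itertools import product


def solve(target):
    """All (b1, e1, b2, e2) of single digits with b1**e1 + b2**e2 == target."""
    digits = range(10)
    return [
        (b1, e1, b2, e2)
        for b1, e1, b2, e2 in product(digits, repeat=4)
        # Keep only one ordering of each pair of terms.
        if (b1, e1) <= (b2, e2) and b1**e1 + b2**e2 == target
    ]


print(solve(3359232))
```

There are only 10**4 candidate tuples, so the search is instant; the puzzle
is only a puzzle for humans.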

ChrisA



