From: Tim Watts
Newsgroups: comp.databases.mysql
Subject: Re: Can MySql database store images?
Date: Wed, 27 Apr 2011 16:32:01 +0100

Peter H. Coffin wrote:

> On Wed, 27 Apr 2011 13:34:31 +0100, Tim Watts wrote:
>> Peter H. Coffin wrote:
>>
>>> On Wed, 27 Apr 2011 08:24:16 +0100, The Natural Philosopher wrote:
>>>> Norman Peelman wrote:
>>>>> Doug Miller wrote:
>>>>>> In article , Norman Peelman
>>>>>>> 460 * 5.3kb
>>>>>>>
>>>>>> You wrote above "460 images ... with an average size of 5393kb".
>>>>>> 5393kb is 5.4 MEGAbytes, not 5.3kb.
>>>>>
>>>>> Yes, my fingers were going faster than my brain.
>>>>>
>>>>> Average of 5393 bytes (5kb)
>>>>> Max of 10240 bytes (10kb)
>>>>>
>>>>> ...these are small images.
>>>>>
>>>>> Dump file (w/images) = 29.8MB
>>>>> Dump file zipped = 1.7MB
>>>>>
>>>>>
>>>> It is seldom possible to compress images more than they are already
>>>> compressed.
>>>>
>>>> So I still think you have made a mistake.
>>>
>>> Dump files can (intentionally and with malice aforethought) export
>>> binary columns in hexadecimal text, which is rather compressible. It's
>>> also very safe from things like people fussing with it with text
>>> editors, being copied and pasted into emails for demonstrative purposes,
>>> and other kinds of mistreatment.
>>>
>>
>> I think the point that is being overlooked is that the text, whilst
>> compressible, is itself a re-encoding of an already highly compressed bit
>> of data.
>
> Some parts of the file are highly-compressed data. Well,
> fairly-highly-compressed, anyway. The actual compression used in, for
> example, JFIF/.jpeg is Huffman. Most of the initial 'compression' in
> those comes not from actual compression but from lossy tricks that make
> an image look about the same as the original to the eye. That isn't
> 'compression' in the sense that the data inside is necessarily
> incompressible any further. What this means is that fairly small
> images don't have a lot of "compressed data" in them in the first place,
> and for graphics with small fields of image data the overhead might
> easily be half the file.
>
>> If anything, the intermediate text encoding should make things worse
>> overall, not better.
>
> One would think so at first glance, but text is really easy to compress.
>
>> We are still talking about 2.2MB compressed image data mixed up with
>> other stuff in an exploded ASCII form being, somehow, recompressed down
>> to a total which is less than the sum of the original images alone.
>
> That's the key to what I think is happening here. See, one image may be
> compressible for some small gains. But many images, especially with very
> similar information in the overhead portions of the formats (say they're
> mostly all the same size, or use similar color palettes), end up
> being compressible because the compressor can squeeze out duplicate
> information *between* the images as well as within each image itself.
>
>> It would make me want to double check the dumps to see they really had
>> everything...
>
> Always a worthwhile step. But if the dump restores okay, the size alone
> isn't necessarily a warning that something else is wrong.
>

Nice explanation Peter. That makes sense (in particular the commonality
between images).

Cheers

Tim

--
Tim Watts
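
For anyone who wants to poke at Peter's explanation, here is a rough Python sketch of the effect. It is only a model under stated assumptions, not a reconstruction of Norman's dump: random bytes stand in for the entropy-coded part of each image, a made-up 2500-byte block stands in for the overhead the images share (tables, palettes and the like), zlib stands in for the zip step, and the function names and sizes are invented for illustration.

```python
import random
import zlib


def make_blobs(count, header, payload_size, seed=0):
    """Build `count` fake image blobs: a fixed header plus random payload.

    Random bytes are used as a stand-in for already-compressed image data
    (an assumption; real image files are more structured than this).
    """
    rng = random.Random(seed)
    return [
        header + rng.getrandbits(8 * payload_size).to_bytes(payload_size, "big")
        for _ in range(count)
    ]


def compressed_hex_dump_size(blobs):
    """Hex-encode every blob, one per line, and return the zlib'd size in bytes."""
    dump = "\n".join(blob.hex() for blob in blobs).encode("ascii")
    return len(zlib.compress(dump))


if __name__ == "__main__":
    # Hypothetical shared overhead: the same 2500 bytes in every image.
    shared_header = random.Random(1).getrandbits(8 * 2500).to_bytes(2500, "big")

    # 460 blobs of 5000 bytes each, half of it shared overhead.
    similar = make_blobs(460, shared_header, 2500, seed=2)
    # 460 blobs of 5000 bytes each with nothing in common.
    unrelated = make_blobs(460, b"", 5000, seed=3)

    for name, blobs in (("shared overhead", similar), ("nothing shared", unrelated)):
        raw = sum(len(b) for b in blobs)
        print(f"{name}: raw {raw} bytes, "
              f"hex dump {2 * raw} bytes, "
              f"compressed dump {compressed_hex_dump_size(blobs)} bytes")
```

With nothing shared, the compressed hex dump should come out a little larger than the raw total, which is Tim's intuition about re-encoding compressed data. With a shared block in every blob, the compressor can refer back to the previous copy, so the compressed dump can end up smaller than the sum of the raw blobs. Note that this only works here because each blob is small enough for the previous one to still be inside DEFLATE's 32 KB window.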