Finding MIME type for a data stream

Started by: Tobiah <toby@tobiah.org>
First post: 2012-03-08 13:55 -0800
Last post: 2012-03-09 11:50 +0100
Articles: 13 (6 participants)


Contents

  Finding MIME type for a data stream Tobiah <toby@tobiah.org> - 2012-03-08 13:55 -0800
    Re: Finding MIME type for a data stream Dave Angel <d@davea.name> - 2012-03-08 17:11 -0500
      Re: Finding MIME type for a data stream Tobiah <toby@tobiah.org> - 2012-03-08 14:28 -0800
        Re: Finding MIME type for a data stream Dave Angel <d@davea.name> - 2012-03-08 17:56 -0500
          Re: Finding MIME type for a data stream Tobiah <toby@tobiah.org> - 2012-03-08 15:40 -0800
            Re: Finding MIME type for a data stream Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-03-08 21:04 -0500
              Re: Finding MIME type for a data stream Tobiah <toby@tobiah.org> - 2012-03-09 08:40 -0800
            Re: Finding MIME type for a data stream Jon Clements <joncle@googlemail.com> - 2012-03-08 18:31 -0800
              Re: Finding MIME type for a data stream Tobiah <toby@tobiah.org> - 2012-03-09 09:53 -0800
      Re: Finding MIME type for a data stream Tobiah <toby@tobiah.org> - 2012-03-08 14:34 -0800
        Re: Finding MIME type for a data stream Irmen de Jong <irmen.NOSPAM@xs4all.nl> - 2012-03-09 03:12 +0100
          Re: Finding MIME type for a data stream Tobiah <toby@tobiah.org> - 2012-03-09 08:35 -0800
    Re: Finding MIME type for a data stream Peter Otten <__peter__@web.de> - 2012-03-09 11:50 +0100

#21399 — Finding MIME type for a data stream

From: Tobiah <toby@tobiah.org>
Date: 2012-03-08 13:55 -0800
Subject: Finding MIME type for a data stream
Message-ID: <%U96r.32867$082.25394@newsfe04.iad>

I'm pulling image data from a database blob, and serving
it from a web2py app.  I have to send the correct
Content-Type header, so I need to detect the image type.

Everything that I've found on the web so far, needs a file
name on the disk, but I only have the data.

It looks like the 'magic' package might be of use, but
I can't find any documentation for it.

Also, it seems like image/png works for other types 
of image data, while image/foo does not, yet I'm afraid
that not every browser will play along as nicely.

Thanks!

Tobiah


#21400

From: Dave Angel <d@davea.name>
Date: 2012-03-08 17:11 -0500
Message-ID: <mailman.518.1331244722.3037.python-list@python.org>
In reply to: #21399

On 03/08/2012 04:55 PM, Tobiah wrote:
> I'm pulling image data from a database blob, and serving
> it from a web2py app.  I have to send the correct
> Content-Type header, so I need to detect the image type.
>
> Everything that I've found on the web so far, needs a file
> name on the disk, but I only have the data.
>
> It looks like the 'magic' package might be of use, but
> I can't find any documentation for it.
>
> Also, it seems like image/png works for other types
> of image data, while image/foo does not, yet I'm afraid
> that not every browser will play along as nicely.
>
> Thanks!
>
> Tobiah

First step, ask the authors of the database what format of data this 
blob is in.

Failing that, write the same data locally as a binary file, and see what 
application can open it.  Or if you're on a Linux system, run file on 
it.  "file" can identify most data formats (not just images) just by 
looking at the data.

That assumes, of course, that there's any consistency in the data coming 
out of the database.  What happens if next time this blob is an Excel 
spreadsheet?

-- 

DaveA


#21404

From: Tobiah <toby@tobiah.org>
Date: 2012-03-08 14:28 -0800
Message-ID: <3oa6r.31449$L12.8825@newsfe23.iad>
In reply to: #21400

On 03/08/2012 02:11 PM, Dave Angel wrote:
> On 03/08/2012 04:55 PM, Tobiah wrote:
>> I'm pulling image data from a database blob, and serving
>> it from a web2py app.  I have to send the correct
>> Content-Type header, so I need to detect the image type.
>>
>> Everything that I've found on the web so far, needs a file
>> name on the disk, but I only have the data.
>>
>> It looks like the 'magic' package might be of use, but
>> I can't find any documentation for it.
>>
>> Also, it seems like image/png works for other types
>> of image data, while image/foo does not, yet I'm afraid
>> that not every browser will play along as nicely.
>>
>> Thanks!
>>
>> Tobiah
> 
> First step, ask the authors of the database what format of data this 
> blob is in.
> 
> Failing that, write the same data locally as a binary file, and see what 
> application can open it.  Or if you're on a Linux system, run file on 
> it.  "file" can identify most data formats (not just images) just by 
> looking at the data.
> 
> That assumes, of course, that there's any consistency in the data coming 
> out of the database.  What happens if next time this blob is an Excel 
> spreadsheet?
> 


I should simplify my question.  Let's say I have a string
that contains image data called 'mystring'.

I want to do

mime_type = some_magic(mystring)

and get back 'image/jpeg' or 'image/png' or whatever is
appropriate for the image data.

Thanks!

Tobiah


#21406

From: Dave Angel <d@davea.name>
Date: 2012-03-08 17:56 -0500
Message-ID: <mailman.521.1331247434.3037.python-list@python.org>
In reply to: #21404

On 03/08/2012 05:28 PM, Tobiah wrote:
> <snip>
>
>
> I should simplify my question.  Let's say I have a string
> that contains image data called 'mystring'.
>
> I want to do
>
> mime_type = some_magic(mystring)
>
> and get back 'image/jpeg' or 'image/png' or whatever is
> appropriate for the image data.
>
> Thanks!
>
> Tobiah

I have to assume you're talking python 2, since in python 3, strings 
cannot generally contain image data.  In python 2, characters are pretty 
much interchangeable with bytes.

Anyway, I don't know any way in the standard lib to distinguish 
arbitrary image formats.  (There very well could be one.)  The file 
program I referred to was an external utility, which you could run with 
the subprocess module.
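
A rough sketch of that approach, written for Python 3 (the thread itself is about Python 2). The function name `mime_from_file_utility` is made up for illustration, and the sketch assumes the `file` command is installed and accepts `--brief --mime-type -` to read data from standard input, as the common Unix implementation does. It pipes the blob to `file` so no temporary file is needed:

```python
import shutil
import subprocess


def mime_from_file_utility(data):
    """Ask the external `file` utility for the MIME type of `data` (bytes).

    Returns None when `file` is not installed. The function name is
    hypothetical; only the `file` command itself comes from the thread.
    """
    if shutil.which("file") is None:
        return None
    # "-" makes `file` read the data from standard input instead of a path.
    result = subprocess.run(
        ["file", "--brief", "--mime-type", "-"],
        input=data,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("ascii", "replace").strip()
```

Starting a process per image is relatively slow, so this probably suits occasional checks better than a request handler that runs on every page view.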

If you're looking for a specific, small list of file formats, you could 
make yourself a signature list.  Most (not all) formats distinguish 
themselves in the first few bytes.  For example, a standard zip file 
starts with "PK" for Phil Katz.  A Windows exe starts with "MZ" for 
Mark Zbikowski.  And I believe a jpeg file starts with the hex bytes 
ff d8 ff (typically followed by e0 for JFIF files).
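
A minimal sketch of such a signature list, assuming only PNG, GIF and JPEG matter (Python 3, with the data held as `bytes`). The names `SIGNATURES` and `sniff_image_mime` are invented for this example, and the fallback type is an arbitrary choice:

```python
# Leading "magic" bytes for the three formats discussed in this thread.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def sniff_image_mime(data, default="application/octet-stream"):
    """Return a MIME type guessed from the first bytes of `data`."""
    for prefix, mime in SIGNATURES:
        if data.startswith(prefix):
            return mime
    # Unknown signature: fall back to a generic binary type.
    return default
```

This only inspects the prefix, so it says nothing about whether the rest of the data is a valid image.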

If you'd like to see a list of available modules, help() is your 
friend.  You can start with help("modules") to see quite a long list.  
And I was surprised how many image related things already are there.  So 
maybe there's something I don't know about that could help.

-- 

DaveA


#21412

From: Tobiah <toby@tobiah.org>
Date: 2012-03-08 15:40 -0800
Message-ID: <yrb6r.35195$zD5.24208@newsfe12.iad>
In reply to: #21406

> I have to assume you're talking python 2, since in python 3, strings 
> cannot generally contain image data.  In python 2, characters are pretty 
> much interchangeable with bytes.

Yeah, python 2


> if you're looking for a specific, small list of file formats, you could 
> make yourself a signature list.  Most (not all) formats distinguish 
> themselves in the first few bytes. 

Yeah, maybe I'll just do that.  I'm allowing users to paste
images into a rich-text editor, so I'm pretty much looking 
at .png, .gif, or .jpg.  Those should be pretty easy to 
distinguish by looking at the first few bytes.  

Pasting images may sound weird, but I'm using a jquery
widget called cleditor that takes image data from the
clipboard and replaces it with inline base64 data.  
The html from the editor ends up as an email, and the
inline images cause the emails to be tossed in the
spam folder for most people.  So I'm parsing the
emails, storing the image data, and replacing the
inline images with an img tag that points to a 
web2py app that takes arguments that tell it which 
image to pull from the database.  

Now that I think of it, I could use php to detect the
image type, and store that in the database.  Not quite
as clean, but that would work.

Tobiah


#21413

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2012-03-08 21:04 -0500
Message-ID: <mailman.527.1331258659.3037.python-list@python.org>
In reply to: #21412

On Thu, 08 Mar 2012 15:40:13 -0800, Tobiah <toby@tobiah.org> declaimed
the following in gmane.comp.python.general:


> Pasting images may sound weird, but I'm using a jquery
> widget called cleditor that takes image data from the
> clipboard and replaces it with inline base64 data.  

	In Windows, I'd expect "device independent bitmap" to be the result
of a clipboard image...
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/


#21429

From: Tobiah <toby@tobiah.org>
Date: 2012-03-09 08:40 -0800
Message-ID: <coq6r.13531$wd1.176@newsfe13.iad>
In reply to: #21413

On 03/08/2012 06:04 PM, Dennis Lee Bieber wrote:
> On Thu, 08 Mar 2012 15:40:13 -0800, Tobiah <toby@tobiah.org> declaimed
> the following in gmane.comp.python.general:
> 
> 
>> Pasting images may sound weird, but I'm using a jquery
>> widget called cleditor that takes image data from the
>> clipboard and replaces it with inline base64 data.  
> 
> 	In Windows, I'd expect "device independent bitmap" to be the result
> of a clipboard image...

This jquery editor seems to detect the image data and
translate it into an inline image like:

<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAALU...

I'm parsing those out with regular expressions and decoding
the base64, and putting the resulting image data into a blob.
Hmm... there's the mime type right there.
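
A sketch of pulling both the declared MIME type and the decoded bytes out of such markup, assuming Python 3 and that the editor always emits the `data:<type>;base64,<payload>` form with no whitespace inside the payload. The names `DATA_URI` and `extract_inline_images` are made up for illustration:

```python
import base64
import re

# Matches e.g. data:image/png;base64,<payload> inside an attribute value.
DATA_URI = re.compile(
    r"data:(?P<mime>[\w.+-]+/[\w.+-]+);base64,(?P<payload>[A-Za-z0-9+/=]+)"
)


def extract_inline_images(html):
    """Return a list of (mime_type, raw_bytes) pairs for inline base64 images."""
    images = []
    for match in DATA_URI.finditer(html):
        raw = base64.b64decode(match.group("payload"))
        images.append((match.group("mime"), raw))
    return images
```

The declared type could then be stored next to the blob and sent back as the Content-Type header, although it is only as trustworthy as whatever produced the markup.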


#21415

From: Jon Clements <joncle@googlemail.com>
Date: 2012-03-08 18:31 -0800
Message-ID: <30829846.74.1331260282202.JavaMail.geo-discussion-forums@vbai14>
In reply to: #21412

On Thursday, 8 March 2012 23:40:13 UTC, Tobiah  wrote:
> > I have to assume you're talking python 2, since in python 3, strings 
> > cannot generally contain image data.  In python 2, characters are pretty 
> > much interchangeable with bytes.
> 
> Yeah, python 2
> 
> 
> > if you're looking for a specific, small list of file formats, you could 
> > make yourself a signature list.  Most (not all) formats distinguish 
> > themselves in the first few bytes. 
> 
> Yeah, maybe I'll just do that.  I'm allowing users to paste
> images into a rich-text editor, so I'm pretty much looking 
> at .png, .gif, or .jpg.  Those should be pretty easy to 
> distinguish by looking at the first few bytes.  
> 
> Pasting images may sound weird, but I'm using a jquery
> widget called cleditor that takes image data from the
> clipboard and replaces it with inline base64 data.  
> The html from the editor ends up as an email, and the
> inline images cause the emails to be tossed in the
> spam folder for most people.  So I'm parsing the
> emails, storing the image data, and replacing the
> inline images with an img tag that points to a 
> web2py app that takes arguments that tell it which 
> image to pull from the database.  
> 
> Now that I think of it, I could use php to detect the
> image type, and store that in the database.  Not quite
> as clean, but that would work.
> 
> Tobiah

Something like the following might be worth a go:
(untested)

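from StringIO import StringIO  # Python 2 module; on Python 3 use io.BytesIO
from PIL import Image
# Wrap the blob in a file-like object so PIL needs no file on disk
img = Image.open(StringIO(blob))
print img.format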

HTH
Jon.

PIL: http://www.pythonware.com/library/pil/handbook/image.htm


#21433

From: Tobiah <toby@tobiah.org>
Date: 2012-03-09 09:53 -0800
Message-ID: <vsr6r.13535$wd1.2850@newsfe13.iad>
In reply to: #21415

> Something like the following might be worth a go:
> (untested)
> 
> from PIL import Image
> img = Image.open(StringIO(blob))
> print img.format
> 

This worked quite nicely.  I didn't
see a list of all returned formats though
in the docs.  The one image I had returned

PNG

So I'm doing:

mime_type = "image/%s" % img.format.lower()

I'm hoping that will work for any image type.
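
Lowercasing the format name happens to give a registered type for PNG, GIF and JPEG, but probably not for every format PIL can report (ICO, for example, is registered as image/vnd.microsoft.icon). A more defensive sketch uses an explicit table. The table contents and the names `FORMAT_TO_MIME` and `mime_for_format` are assumptions for illustration, not something taken from the PIL documentation:

```python
# Hypothetical lookup table from PIL format names to registered MIME types.
FORMAT_TO_MIME = {
    "PNG": "image/png",
    "GIF": "image/gif",
    "JPEG": "image/jpeg",
    "BMP": "image/bmp",
    "TIFF": "image/tiff",
    "ICO": "image/vnd.microsoft.icon",
}


def mime_for_format(format_name, default="application/octet-stream"):
    """Map a format name such as 'PNG' to a MIME type, with a safe fallback."""
    if not format_name:
        # The format may be missing (None) if the image was not read from data.
        return default
    return FORMAT_TO_MIME.get(format_name.upper(), default)
```

With this, `mime_for_format(img.format)` would replace the string formatting above, and unknown formats fall back to a generic binary type instead of an invented `image/...` value.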

Thanks,

Tobiah


#21405

From: Tobiah <toby@tobiah.org>
Date: 2012-03-08 14:34 -0800
Message-ID: <3ua6r.9176$fT5.6230@newsfe02.iad>
In reply to: #21400

Also, I realize that I could write the data to a file
and then use one of the modules that want a file path.
I would prefer not to do that.

Thanks


#21414

From: Irmen de Jong <irmen.NOSPAM@xs4all.nl>
Date: 2012-03-09 03:12 +0100
Message-ID: <4f596718$0$6847$e4fe514c@news2.news.xs4all.nl>
In reply to: #21405

On 8-3-2012 23:34, Tobiah wrote:
> Also, I realize that I could write the data to a file
> and then use one of the modules that want a file path.
> I would prefer not to do that.
> 
> Thanks
> 

Use StringIO then, instead of a file on disk

Irmen
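
To illustrate the idea in Python 3 terms, where `io.BytesIO` plays the role that `StringIO` plays for binary data in Python 2: code written against a file object can be handed an in-memory stream instead. The helper `first_bytes` below is a made-up stand-in for any function that expects an open binary file:

```python
import io


def first_bytes(fileobj, count=8):
    """Read the first `count` bytes from a binary file-like object, then rewind."""
    fileobj.seek(0)
    head = fileobj.read(count)
    # Rewind so the caller can still read the stream from the start.
    fileobj.seek(0)
    return head
```

Calling `first_bytes(io.BytesIO(blob))` works the same way as passing a file opened with `open(path, "rb")`, so nothing has to be written to disk. This only helps with libraries that accept file objects, not with those that insist on a path.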


#21428

From: Tobiah <toby@tobiah.org>
Date: 2012-03-09 08:35 -0800
Message-ID: <yjq6r.31950$L12.22536@newsfe23.iad>
In reply to: #21414

On 03/08/2012 06:12 PM, Irmen de Jong wrote:
> On 8-3-2012 23:34, Tobiah wrote:
>> Also, I realize that I could write the data to a file
>> and then use one of the modules that want a file path.
>> I would prefer not to do that.
>>
>> Thanks
>>
> 
> Use StringIO then, instead of a file on disk
> 
> Irmen
> 

Nice.  Thanks.


#21422

From: Peter Otten <__peter__@web.de>
Date: 2012-03-09 11:50 +0100
Message-ID: <mailman.531.1331290226.3037.python-list@python.org>
In reply to: #21399

Tobiah wrote:

> I'm pulling image data from a database blob, and serving
> it from a web2py app.  I have to send the correct
> Content-Type header, so I need to detect the image type.
> 
> Everything that I've found on the web so far, needs a file
> name on the disk, but I only have the data.
> 
> It looks like the 'magic' package might be of use, but
> I can't find any documentation for it.

After some trial and error and a look into example.py:

>>> import magic
>>> m = magic.open(magic.MAGIC_MIME_TYPE)
>>> m.load()
0
>>> sample = open("tmp.png", "rb").read()  # binary mode for image data
>>> m.buffer(sample)
'image/png'
