| Path | csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!feeder.erje.net!eu.feeder.erje.net!news.szaf.org!fu-berlin.de!uni-berlin.de!not-for-mail |
|---|---|
| From | jt@toerring.de (Jens Thoms Toerring) |
| Newsgroups | comp.lang.python |
| Subject | Re: File Read issue by using module binascii |
| Date | 28 Apr 2013 12:04:04 GMT |
| Organization | Freie Universitaet Berlin |
| Lines | 104 |
| Message-ID | <au4hhkFdnqiU1@mid.uni-berlin.de> (permalink) |
| References | <9b5795ab-baec-4c0a-a3b4-1075cffc8744@googlegroups.com> <319pn8hgkc0pbj0099heq2nq3h5sv68g7s@4ax.com> |
| X-Trace | news.uni-berlin.de j5j7ldgeL9TRyPnxHQVZNwUm6FlGkNb1l7cuyWtX8uWpuX |
| X-Orig-Path | not-for-mail |
| User-Agent | tin/1.9.3-20080506 ("Dalintober") (UNIX) (Linux/2.6.30-1-amd64 (x86_64)) |
| Xref | csiph.com comp.lang.python:44468 |
Tim Roberts <timr@probo.com> wrote:
> Jimmie He <jimmie.he@gmail.com> wrote:
> >When I run read_bmp on an example.bmp (about 100k), the shell stops responding ("No response"). When I change f.read() to f.read(1000), it is OK. Could someone tell me the exact reason for this?
> >Thank you in advance!
> >
> >Python Code as below!!
> >
> >import binascii
> >
> >def read_bmp():
> >    f = open('example.bmp','rb')
> >    rawdata = f.read() #f.read(1000) is ok
> >    hexstr = binascii.b2a_hex(rawdata) #Get an HEX number
> >    bsstr = bin (int(hexstr,16))[2:]
> I suspect the root of the problem here is that you don't understand what
> this is actually doing. You should run this code in the command-line
> interpreter, one line at a time, and print the results.
> The "read" instruction produces a string with 100k bytes. The b2a_hex then
> produces a string with 200k bytes. Then, int(hexstr,16) takes that 200,000
> byte hex string and converts it to an integer, roughly equal to 10 to the
> 240,000 power, a number with some 240,000 decimal digits. You then convert
> that integer to a binary string. That string will contain 800,000 bytes.
> You then drop the first two characters and print the other 799,998 bytes,
> each of which will be either '0' or '1'.
> I am absolutely, positively convinced that's not what you wanted to do.
> What point is there in printing out the binary equivalent of a bitmap?
> Even if you did, it would be much quicker for you to do the conversion one
> byte at a time, completely skipping the conversion to hex and then the
> creation of a massive multi-precision number. Example:
>     f = open('example.bmp','rb')
>     rawdata = f.read()
>     bsstr = []
>     for b in rawdata:
>         bsstr.append( bin(ord(b))[2:] )
>     bsstr = ''.join(bsstr)
> or even:
>     f = open('example.bmp','rb')
>     bsstr = ''.join( bin(ord(b))[2:] for b in f.read() )
Exactly my idea at first. But then I started to time it (using
the timeit module) by comparing the following functions:
import binascii

# Original version
def c1( rawdata ) :
    h = binascii.b2a_hex( rawdata )
    z = bin( int( h, 16 ) )[ 2 : ]
    # bin() drops leading zero bits, so pad back to 8 bits per byte
    return '0' * ( 8 * len( rawdata ) - len( z ) ) + z

# Convert each byte directly
def c2( rawdata ) :
    return ''.join( bin( ord( x ) )[ 2 : ].rjust( 8, '0' ) for x in rawdata )

# Convert each byte using a list for table look-up
def c3( rawdata ) :
    h = [ bin( i )[ 2 : ].rjust( 8, '0' ) for i in range( 256 ) ]
    return ''.join( h[ ord( x ) ] for x in rawdata )

# Convert each byte using a dictionary for table look-up (avoids
# lots of ord() calls)
def c4( rawdata ) :
    h = { chr( i ) : bin( i )[ 2 : ].rjust( 8, '0' ) for i in range( 256 ) }
    return ''.join( h[ x ] for x in rawdata )
As you can see, in c3() and c4() I even tried to speed things up
further by using a table look-up instead of calling bin() etc.
on each byte. But the result was that c2() is nearly 15 times
slower than c1(), c3() about 3 times and c4() still more than 2
times slower! So the method the OP uses seems to be quite a bit
more efficient than one might be tempted to assume.
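For anyone who wants to repeat the comparison, here is a minimal sketch of a timeit harness. It assumes Python 3, whereas the functions above are written for Python 2: under Python 3, iterating over a bytes object yields integers, so the ord() calls are dropped. The names c1_py3, c2_py3, c3_py3 and compare() are made up for this sketch, and random bytes stand in for example.bmp. It makes no claim about what ratios you will see on your machine.

```python
import binascii
import os
import timeit


def c1_py3(rawdata):
    # Whole-buffer conversion: hex string -> one big int -> binary string.
    # (Like the original, this fails on empty input because int(b'', 16) raises.)
    h = binascii.b2a_hex(rawdata)
    z = bin(int(h, 16))[2:]
    # bin() drops leading zero bits, so pad back to 8 bits per byte.
    return z.rjust(8 * len(rawdata), '0')


def c2_py3(rawdata):
    # Per-byte conversion; iterating over bytes yields ints in Python 3.
    return ''.join(bin(x)[2:].rjust(8, '0') for x in rawdata)


def c3_py3(rawdata):
    # Per-byte table look-up; the byte value indexes the table directly.
    table = [bin(i)[2:].rjust(8, '0') for i in range(256)]
    return ''.join(table[x] for x in rawdata)


def compare(rawdata, number=5):
    # Returns {function name: total seconds for `number` calls}.
    funcs = (c1_py3, c2_py3, c3_py3)
    return {f.__name__: timeit.timeit(lambda: f(rawdata), number=number)
            for f in funcs}


if __name__ == '__main__':
    data = os.urandom(20000)  # stand-in for the contents of a bitmap file
    for name, seconds in compare(data).items():
        print(name, seconds)
```

The dictionary variant c4() is left out because in Python 3 the list look-up in c3_py3 already avoids ord().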
I would guess that the reason is that c1() does just a small
number of calls of functions that probably aren't implemented
in Python but in C and thus can be a lot faster than anything
you could achieve with Python, while the other functions use a
for loop in Python, which seems to account for a good part of
the CPU time used. To test for that I split the 'rawdata' string
into a list of characters (i.e. single letter strings) and
reassembled it using join() and a for loop:
r = list( rawdata )
z = ''.join( x for x in r )
The second line alone took about 1.7 times longer than the
whole, seemingly convoluted c1() function!
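The same check can be sketched for Python 3, with the caveat that list() on a bytes object would give integers there. The sketch therefore decodes the data as latin-1 first to get single-character strings. The helper names to_bits(), rejoin() and loop_overhead_ratio() are invented for illustration, and the ratio you get will depend on interpreter version and hardware.

```python
import binascii
import os
import timeit


def to_bits(rawdata):
    # Same approach as c1(): hex string -> one big int -> padded binary string.
    h = binascii.b2a_hex(rawdata)
    return bin(int(h, 16))[2:].rjust(8 * len(rawdata), '0')


def rejoin(chars):
    # Does no conversion at all: only a Python-level loop feeding join().
    return ''.join(x for x in chars)


def loop_overhead_ratio(rawdata, number=20):
    # latin-1 maps every byte value to exactly one character.
    chars = list(rawdata.decode('latin-1'))
    t_join = timeit.timeit(lambda: rejoin(chars), number=number)
    t_bits = timeit.timeit(lambda: to_bits(rawdata), number=number)
    # A value above 1 means the bare loop was slower than the full conversion.
    return t_join / t_bits


if __name__ == '__main__':
    print(loop_overhead_ratio(os.urandom(20000)))
```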
What I take away from this is that a lot of the assumptions one
is prone to make when coming from e.g. a C/C++ background can
be quite misleading when extrapolating to Python (or other
interpreted languages)...
Best regards, Jens
--
\ Jens Thoms Toerring ___ jt@toerring.de
\__________________________ http://toerring.de