

Groups > comp.lang.python > #34347 > unrolled thread

Re: Confused compare function :)

Started by: Bruno Dupuis <python.ml.bruno.dupuis@lisael.org>
First post: 2012-12-06 01:19 +0100
Last post: 2012-12-06 19:25 +0000
Articles 20 on this page of 41 — 15 participants


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: Confused compare function :) Bruno Dupuis <python.ml.bruno.dupuis@lisael.org> - 2012-12-06 01:19 +0100
    Re: Confused compare function :) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-12-06 00:42 +0000
      Re: Confused compare function :) Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-12-06 13:41 -0500
      Re: Confused compare function :) Anatoli Hristov <tolidtm@gmail.com> - 2012-12-06 19:55 +0100
    Re: Confused compare function :) Rotwang <sg552@hotmail.co.uk> - 2012-12-06 03:22 +0000
      Re: Confused compare function :) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-12-06 04:32 +0000
        Re: Confused compare function :) Bruno Dupuis <python.ml.bruno.dupuis@lisael.org> - 2012-12-06 09:49 +0100
          Re: Confused compare function :) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-12-06 11:47 +0000
            Re: Confused compare function :) peter <pjmakey2@gmail.com> - 2012-12-06 08:55 -0300
              Re: Confused compare function :) Hans Mulder <hansmu@xs4all.nl> - 2012-12-06 14:32 +0100
                Re: Confused compare function :) Chris Angelico <rosuav@gmail.com> - 2012-12-07 00:47 +1100
            Re: Confused compare function :) Chris Angelico <rosuav@gmail.com> - 2012-12-06 23:14 +1100
              Re: Confused compare function :) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-12-07 22:16 +0000
                Re: Confused compare function :) Terry Reedy <tjreedy@udel.edu> - 2012-12-08 02:01 -0500
                Re: Confused compare function :) Chris Angelico <rosuav@gmail.com> - 2012-12-08 18:17 +1100
                Re: Confused compare function :) MRAB <python@mrabarnett.plus.com> - 2012-12-08 17:50 +0000
              Re: Confused compare function :) Ramchandra Apte <maniandram01@gmail.com> - 2012-12-08 19:07 -0800
                Re: Confused compare function :) Chris Angelico <rosuav@gmail.com> - 2012-12-09 14:22 +1100
                  Re: Confused compare function :) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-12-09 07:39 +0000
              Re: Confused compare function :) Ramchandra Apte <maniandram01@gmail.com> - 2012-12-08 19:07 -0800
            Re: Confused compare function :) Neil Cerutti <neilc@norwich.edu> - 2012-12-06 13:51 +0000
              Re: Confused compare function :) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-12-07 02:55 +0000
                Re: Confused compare function :) Neil Cerutti <neilc@norwich.edu> - 2012-12-07 16:40 +0000
          Re: Confused compare function :) Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> - 2012-12-06 14:33 +0100
            Re: Confused compare function :) Chris Angelico <rosuav@gmail.com> - 2012-12-07 00:58 +1100
              Re: Confused compare function :) Hans Mulder <hansmu@xs4all.nl> - 2012-12-06 15:21 +0100
                Re: Confused compare function :) Chris Angelico <rosuav@gmail.com> - 2012-12-07 01:28 +1100
            Re: Confused compare function :) Anatoli Hristov <tolidtm@gmail.com> - 2012-12-06 15:22 +0100
            Re: Confused compare function :) Dave Angel <d@davea.name> - 2012-12-06 09:40 -0500
            Re: Confused compare function :) peter <pjmakey2@gmail.com> - 2012-12-06 11:46 -0300
            Re: Confused compare function :) Anatoli Hristov <tolidtm@gmail.com> - 2012-12-06 17:16 +0100
            Re: Confused compare function :) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-12-06 16:52 +0000
            Re: Confused compare function :) Anatoli Hristov <tolidtm@gmail.com> - 2012-12-06 18:08 +0100
            Re: Confused compare function :) MRAB <python@mrabarnett.plus.com> - 2012-12-06 17:10 +0000
            Re: Confused compare function :) Anatoli Hristov <tolidtm@gmail.com> - 2012-12-06 18:31 +0100
            Re: Confused compare function :) MRAB <python@mrabarnett.plus.com> - 2012-12-06 17:52 +0000
            Re: Confused compare function :) Anatoli Hristov <tolidtm@gmail.com> - 2012-12-06 19:25 +0100
            Re: Confused compare function :) Anatoli Hristov <tolidtm@gmail.com> - 2012-12-07 14:36 +0100
          Re: Confused compare function :) Rotwang <sg552@hotmail.co.uk> - 2012-12-06 19:24 +0000
        Re: Confused compare function :) Chris Angelico <rosuav@gmail.com> - 2012-12-06 20:34 +1100
        Re: Confused compare function :) Rotwang <sg552@hotmail.co.uk> - 2012-12-06 19:25 +0000



#34347 — Re: Confused compare function :)

From: Bruno Dupuis <python.ml.bruno.dupuis@lisael.org>
Date: 2012-12-06 01:19 +0100
Subject: Re: Confused compare function :)
Message-ID: <mailman.538.1354753193.29569.python-list@python.org>
On Wed, Dec 05, 2012 at 11:50:49PM +0100, Anatoli Hristov wrote:
> I'm confused again with a compare update function. The problem is that
> my function does not work at all and I don't get it where it comes
> from.
> 
> in my DB I have total of 754 products. when I run the function is says:
> Total updated: 754
> Total not found with in the distributor: 747
> I just don't get it, can you find my  mistake ?
> 
> Thanks in advance
> 
> def Change_price():
>     total = 0
>     tnf = 0
>     for row in DB: # DB is mySQL DB, logically I get out 1 SKU and I
> compare it with next loop
>         isku = row["sku"]
>         isku = isku.lower()
>         iprice = row["price"]
>         iprice = int(iprice)
>         found = 0
>         try:
>             for x in PRICELIST:# here is my next loop in a CSV file
> which is allready in a list PRICELIST
>                 try:
>                     dprice = x[6]
>                     dprice = dprice.replace(",",".") # As in the
> PRICELIST the prices are with commas I replace the comma as python
> request it
>                     dprice = float(dprice)
>                     newprice = round(dprice)*1.10
>                     dsku = x[4]
>                     dsku = dsku.lower()
>                     stock = int(x[7])
>                     if isku == dsku and newprice < int(iprice):# If
> found the coresponded SKU and the price is higher than the one in the
> CSV I update the price
>                         print dsku, x[6], dprice, newprice
>                         Update_SQL(newprice, isku)# goes to the SQL Update
>                         print isku, newprice
>                         if isku == dsku:# Just a check to see if it works
>                             print "Found %s" %dsku
>                             found = 1
>                         else:
>                             found = 0
>                 except IndexError:
>                     pass
>                 except ValueError:
>                     pass
>                 except TypeError:
>                     pass
>         except IndexError:
>             pass
>         if found == 1:
>             print "%s This is match" % isku
>         if found == 0:
>             print "%s Not found" % isku
>             tnf = tnf +1
>         total = total +1
>     print "Total updated: %s" % total
>     print"Total not found with in the distributor: %s" % tnf

I tried, I swear I did try, but I couldn't follow the whole algorithm of the
function. At first sight, though, I find it way too deeply nested:
def ... for ... try ... for ... if ... if. Can't you split it into several
functions, or into methods of a callable class? Sometimes the result is much
clearer and the errors become obvious. Another piece of advice: never ever

except XXXError:
    pass

at least log, or count, or warn, or anything, but don't pass. I bet your
missing products have disappeared into those black holes.

Hmm, now I see that you set found to 1 only if newprice < int(iprice)...
newprice is a float (newprice = round(dprice)*1.10) that you compare with
an int? Is that correct? It seems strange to me.
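
A minimal sketch of the "count, don't pass" advice, written for Python 3 rather than the Python 2 of the original post. The column positions (SKU in x[4], price in x[6]) are taken from the original function; the names parse_row, parse_pricelist and skipped are hypothetical and only there for illustration.

```python
from collections import Counter


def parse_row(row):
    """Return (sku, price) from one price-list row.

    Raises IndexError for rows that are too short and ValueError for
    prices that cannot be parsed.
    """
    sku = row[4].lower()
    price = float(row[6].replace(",", "."))
    return sku, price


def parse_pricelist(rows):
    """Parse every row, counting failures by exception type instead of hiding them."""
    parsed = {}
    skipped = Counter()
    for row in rows:
        try:
            sku, price = parse_row(row)
        except (IndexError, ValueError) as exc:
            skipped[type(exc).__name__] += 1
            continue
        parsed[sku] = price
    return parsed, skipped


# Made-up sample rows: one good, one with a bad price, one too short.
rows = [
    ["", "", "", "", "E2409HDS-B1", "IIYAMA LCD", "130", "9"],
    ["", "", "", "", "BADPRICE", "desc", "n/a", "1"],
    ["too", "short"],
]
parsed, skipped = parse_pricelist(rows)
print(parsed)
print(dict(skipped))
```

With the counts printed at the end of a run, a total like "747 not found" can be checked against how many rows were actually skipped and why.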

-- 
Bruno Dupuis



#34349

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-12-06 00:42 +0000
Message-ID: <50bfe9d8$0$29994$c3e8da3$5496439d@news.astraweb.com>
In reply to: #34347
On Thu, 06 Dec 2012 01:19:58 +0100, Bruno Dupuis wrote:

> I tried, I swear I did try, I didn't understand the whole algorithm of
> the function. However, in a first sight, I find it way to deeply nested.

Yes!

But basically, the code seems to run a pair of nested for-loops:

for SKU in database:
    for SKU in csv file:
        if the two SKUs match:
            compare their prices and update the database
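
One way to flatten the nested loops in the outline above is to index the CSV rows by SKU once and then walk the database rows a single time. This is only a sketch in Python 3 under assumptions: the column position x[4] comes from the original post, while index_by_sku, split_found and the sample data are hypothetical. It also keeps "was the SKU found?" separate from "is the new price lower?", which the original function mixes together.

```python
def index_by_sku(pricelist):
    """Map lower-cased SKU -> CSV row, ignoring rows too short to hold a price."""
    index = {}
    for row in pricelist:
        if len(row) > 6:
            index[row[4].lower()] = row
    return index


def split_found(db_rows, pricelist):
    """Return (found, missing) lists of SKUs, independent of any price comparison."""
    index = index_by_sku(pricelist)
    found, missing = [], []
    for row in db_rows:
        sku = row["sku"].lower()
        if sku in index:
            found.append(sku)
        else:
            missing.append(sku)
    return found, missing


# Made-up sample data for illustration only.
db_rows = [{"sku": "E2409HDS-B1", "price": 150}, {"sku": "ZZ-1", "price": 10}]
pricelist = [["", "", "", "", "e2409hds-b1", "IIYAMA LCD", "130", "9"]]
found, missing = split_found(db_rows, pricelist)
print(found, missing)
```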



> def ... for ... try ... for ... if ... if. 

You missed a second try.

> Can't you split it in several
> function, or in methods of a callable class? Somtimes it's finally much
> more clear and the errors become obvious. Another advice: never ever
> 
> except XXXError:
>     pass
> 
> at least log, or count, or warn, or anything, but don't pass. I bet your
> missing products have disapeared into those black holes.

I think that "never" is too strong, but otherwise I agree with you.


> mmmh, now, i see that you set found to 1 only if newprice
> <int(iprice)... new_price is a float (newprice = round(dprice)*1.10)
> that you compare with an int? is that correct? seems strangee to me.


There's nothing wrong with comparing floats to ints.

However, possibly iprice is not intended to be an int. The code is rather 
confused and the names are at best obscure and at worst actively 
misleading. I assumed that prices must be ints, (iprice means "integer 
price"?) but that might not be the case, since later on another price is 
multiplied by 1.10.

Also I wonder if the code is meant to calculate the new price:

newprice = round(dprice)*1.10  # dprice is the price in the CSV file!

or perhaps it is meant to be:

newprice = int(round(dprice*1.10))


That seems likely.
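
The two formulas give different results, which a small Python 3 sketch can show. The function names and the sample price "12,40" are made up for illustration; the comma-to-dot replacement mirrors the original function.

```python
def round_then_markup(dprice):
    """The formula as posted: round the CSV price first, then multiply by 1.10."""
    return round(dprice) * 1.10


def markup_then_round(dprice):
    """The suggested alternative: multiply by 1.10, then round to a whole number."""
    return int(round(dprice * 1.10))


# A CSV price written with a decimal comma, as in the price list.
dprice = float("12,40".replace(",", "."))
print(round_then_markup(dprice))   # a float close to 13.2
print(markup_then_round(dprice))   # the int 14
```

The first form throws away the cents before the 1.10 factor is applied and leaves a float with rounding noise; the second applies the factor to the real price and yields a clean int.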


-- 
Steven



#34416

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2012-12-06 13:41 -0500
Message-ID: <mailman.580.1354819258.29569.python-list@python.org>
In reply to: #34349
On 06 Dec 2012 00:42:00 GMT, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> declaimed the following in
gmane.comp.python.general:

> But basically, the code seems to run a pair of nested for-loops:
> 
> for SKU in database:
>     for SKU in csv file:
>         if the two SKUs match:
>             compare their prices and update the database
>
	OUCH...

	I'm presuming the CSV is restarted each time the database record
changes...

	That would seem better reformulated as:

	for (SKU, price) in CSV:
		DB.execute("update SKUtable set price = %s where SKU = %s",
						(price, SKU)	)
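
A runnable sketch of the same single-pass idea in Python 3, using the standard-library sqlite3 module with an in-memory database as a stand-in for the MySQL database of the original post. The table layout and sample rows are assumptions; note that sqlite3 uses ? placeholders where MySQLdb uses %s.

```python
import sqlite3


def update_prices(conn, csv_rows):
    """Run one UPDATE per (sku, price) pair; return how many table rows changed."""
    changed = 0
    with conn:  # commits on success, rolls back if an exception escapes
        for sku, price in csv_rows:
            cur = conn.execute(
                "UPDATE SKUtable SET price = ? WHERE SKU = ?", (price, sku)
            )
            changed += cur.rowcount
    return changed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SKUtable (SKU TEXT PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO SKUtable VALUES (?, ?)", [("a1", 150.0), ("b2", 20.0)])
conn.commit()

# "zz" is not in the table, so only one row is touched.
changed = update_prices(conn, [("a1", 143.0), ("zz", 5.0)])
prices = dict(conn.execute("SELECT SKU, price FROM SKUtable"))
print(changed, prices)
```

To update only when the new price is lower, as the original function intends, the WHERE clause could be extended with a condition such as "AND price > ?".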
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/



#34419

From: Anatoli Hristov <tolidtm@gmail.com>
Date: 2012-12-06 19:55 +0100
Message-ID: <mailman.583.1354820111.29569.python-list@python.org>
In reply to: #34349
> gmane.comp.python.general:
>
>> But basically, the code seems to run a pair of nested for-loops:
>>
>> for SKU in database:
>>     for SKU in csv file:
>>         if the two SKUs match:
>>             compare their prices and update the database
>>
>         OUCH...
>
>         I'm presuming the CSV is restarted each time the database record
> changes...
>
>         That would seem better reformulated as:
>
>         for (SKU, price) in CSV:
>                 DB.execute("update SKUtable set price = %s where SKU = %s",
>                                                 (price, SKU)    )
> --
I don't know if I understood you correctly, but
no, on each pass of this loop I'm getting one row:
for x in CSV:
MONIIE2409HDS-B1;MON;II;E2409HDS-B1;E2409HDS-B1;IIYAMA LCD 24" Wide
1920x1080 TN Panel Speakers 2MS Black;130;9;RECTD0.41;0,41;;;;;;;;;

So x[4] is my SKU, x[6] is the price, and so on. Every row looks like this.



#34356

From: Rotwang <sg552@hotmail.co.uk>
Date: 2012-12-06 03:22 +0000
Message-ID: <k9p32p$nek$1@dont-email.me>
In reply to: #34347
On 06/12/2012 00:19, Bruno Dupuis wrote:
> [...]
>
> Another advice: never ever
>
> except XXXError:
>      pass
>
> at least log, or count, or warn, or anything, but don't pass.

Really? I've used that kind of thing several times in my code. For 
example, there's a point where I have a list of strings and I want to 
create a list of those ints that are represented in string form in my 
list, so I do this:

listofints = []
for k in listofstrings:
	try:
		listofints.append(int(k))
	except ValueError:
		pass

Another example: I have a dialog box with an entry field where the user 
can specify a colour by entering a string, and a preview box showing the 
colour. I want the preview to automatically update when the user has 
finished entering a valid colour string, so whenever the entry field is 
modified I call this:

def preview(*args):
	try:
		previewbox.config(bg = str(entryfield.get()))
	except tk.TclError:
		pass

Is there a problem with either of the above? If so, what should I do 
instead?


-- 
I have made a thing that superficially resembles music:

http://soundcloud.com/eroneity/we-berated-our-own-crapiness



#34358

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-12-06 04:32 +0000
Message-ID: <50c01fe2$0$21853$c3e8da3$76491128@news.astraweb.com>
In reply to: #34356
On Thu, 06 Dec 2012 03:22:53 +0000, Rotwang wrote:

> On 06/12/2012 00:19, Bruno Dupuis wrote:
>> [...]
>>
>> Another advice: never ever
>>
>> except XXXError:
>>      pass
>>
>> at least log, or count, or warn, or anything, but don't pass.
> 
> Really? I've used that kind of thing several times in my code. For
> example, there's a point where I have a list of strings and I want to
> create a list of those ints that are represented in string form in my
> list, so I do this:
> 
> listofints = []
> for k in listofstrings:
> 	try:
> 		listofints.append(int(k))
> 	except ValueError:
> 		pass
> 
> Another example: I have a dialog box with an entry field where the user
> can specify a colour by entering a string, and a preview box showing the
> colour. I want the preview to automatically update when the user has
> finished entering a valid colour string, so whenever the entry field is
> modified I call this:
> 
> def preview(*args):
> 	try:
> 		previewbox.config(bg = str(entryfield.get()))
> 	except tk.TclError:
> 		pass
> 
> Is there a problem with either of the above? If so, what should I do
> instead?

They're fine.

Never, ever say that people should never, ever do something.


*cough*


-- 
Steven



#34369

From: Bruno Dupuis <python.ml.bruno.dupuis@lisael.org>
Date: 2012-12-06 09:49 +0100
Message-ID: <mailman.549.1354783761.29569.python-list@python.org>
In reply to: #34358
On Thu, Dec 06, 2012 at 04:32:34AM +0000, Steven D'Aprano wrote:
> On Thu, 06 Dec 2012 03:22:53 +0000, Rotwang wrote:
> 
> > On 06/12/2012 00:19, Bruno Dupuis wrote:
> >> [...]
> >>
> >> Another advice: never ever
> >>
> >> except XXXError:
> >>      pass
> >>
> >> at least log, or count, or warn, or anything, but don't pass.
> > 
> > Really? I've used that kind of thing several times in my code. For
> > example, there's a point where I have a list of strings and I want to
> > create a list of those ints that are represented in string form in my
> > list, so I do this:
> > 
> > listofints = []
> > for k in listofstrings:
> > 	try:
> > 		listofints.append(int(k))
> > 	except ValueError:
> > 		pass
> > 
> > Another example: I have a dialog box with an entry field where the user
> > can specify a colour by entering a string, and a preview box showing the
> > colour. I want the preview to automatically update when the user has
> > finished entering a valid colour string, so whenever the entry field is
> > modified I call this:
> > 
> > def preview(*args):
> > 	try:
> > 		previewbox.config(bg = str(entryfield.get()))
> > 	except tk.TclError:
> > 		pass
> > 
> > Is there a problem with either of the above? If so, what should I do
> > instead?
> 
> They're fine.
> 
> Never, ever say that people should never, ever do something.
> 
> 
> *cough*
> 

Well, depending on the context (who provides listofstrings?) I would
log or count errors in the first one... or not.

In the second one, I would split the expression, because (I'm not sure on
this point, I haven't imported tk for years) previewbox.config and
entryfield.get may raise a tk.TclError for different reasons.

The point is that exceptions are made for error handling, not for normal
workflow. I hate it when I read this, for example:

    try:
        do_stuff(mydict[k])
    except KeyError:
        pass

(loads of them in many libraries and frameworks)
instead of:

    if k in mydict:
        do_stuff(mydict[k])

Note that performance is better with the latter.

There are some exceptions to this, though, like StopIteration.

For me, it's a rule of thumb: except: pass is acceptable in situations
where I control all the input data and I know, deeply and exactly, all the
code interactions. If figuring all this out takes longer (it almost always
does) than typing:

log.warning('oops:\n %s' % traceback.format_exc())

I log. 
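
A small Python 3 sketch of this "log instead of pass" habit, using only the standard logging module. Passing exc_info=True attaches the traceback to the record, which does the same job as formatting traceback.format_exc() by hand. The logger name "pricelist" and the helper to_int_or_none are hypothetical.

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("pricelist")


def to_int_or_none(text):
    """Convert text to int; on failure, log a warning with the traceback."""
    try:
        return int(text)
    except ValueError:
        # exc_info=True makes the handler append the current traceback.
        log.warning("could not parse %r as an int", text, exc_info=True)
        return None


values = [to_int_or_none(s) for s in ["12", "abc", "7"]]
print(values)
```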

It also depends on the context: I'd be more 'permissive' in a short
script than in a large program, framework, or lib, for the
very reason that it's easy to know all the code interactions.

In my coding life, I have spent more time debugging silently swallowed
exceptions than logging abnormal behaviours.

-- 
Bruno Dupuis



#34377

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-12-06 11:47 +0000
Message-ID: <50c085e5$0$29994$c3e8da3$5496439d@news.astraweb.com>
In reply to: #34369
On Thu, 06 Dec 2012 09:49:26 +0100, Bruno Dupuis wrote:

> The point is Exceptions are made for error handling, not for normal
> workflow. 

That's certainly not the case in Python. Using exceptions for flow 
control is a standard part of the language.

IndexError and StopIteration are used to detect the end of lists or 
iterators in for loops.

GeneratorExit is used to request that generators exit.

SystemExit is used to exit the interpreter.


There is nothing wrong with using exceptions for flow control in 
moderation.


> I hate when i read that for example:
> 
>     try:
>         do_stuff(mydict[k])
>     except KeyError:
>         pass
> 
> (loads of them in many libraries and frameworks) instead of:
> 
>     if k in mydict:
>         do_stuff(mydict[k])
> 
> Note that the performances are better with the latter.

Not so. Which one is faster will depend on how often you expect to fail. 
If the keys are nearly always present, then:

try:
    do_stuff(mydict[k])
except KeyError:
    pass

will be faster. Setting up a try block is very fast, about as fast as 
"pass", and faster than "if k in mydict".

But if the key is often missing, then catching the exception will be 
slow, and the "if k in mydict" version may be faster. It depends on how 
often the key is missing.
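
The claim is easy to measure. Below is a rough Python 3 sketch using the standard timeit module; the function name, dictionary size and hit ratios are arbitrary choices, and the absolute numbers will differ between machines and Python versions, so no expected output is given.

```python
import timeit


def time_lookup_styles(hit_ratio, n_keys=1000, repeats=200):
    """Return (try_except_seconds, if_in_seconds) for a given share of present keys."""
    mydict = {i: i for i in range(n_keys)}
    n_hits = int(n_keys * hit_ratio)
    # The first n_hits keys exist in mydict, the remaining ones do not.
    keys = list(range(n_hits)) + list(range(n_keys, 2 * n_keys - n_hits))

    def try_except():
        for k in keys:
            try:
                mydict[k]
            except KeyError:
                pass

    def if_in():
        for k in keys:
            if k in mydict:
                mydict[k]

    return (
        timeit.timeit(try_except, number=repeats),
        timeit.timeit(if_in, number=repeats),
    )


for ratio in (1.0, 0.5, 0.0):
    t_try, t_if = time_lookup_styles(ratio)
    print("hit ratio %.1f: try/except %.4fs, if-in %.4fs" % (ratio, t_try, t_if))
```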


[...]
> It depends also on the context, I'd be more 'permissive' a short script
> than into a large program, framework, or lib, for the very reason it's
> easy to know all code interactions.
> 
> In my coder life, i spent more time debugging silently swallowed
> exceptions than logging abnormal behaviours.


That's fine. I agree with you about not silently swallowing errors. Where 
I disagree is that you said "never ever", which is an exaggeration. 
Remember that exceptions are not always errors.

Problem: take a list of strings, and add up the ones which are integers, 
ignoring everything else.

Solution:

total = 0
for s in list_of_strings:
    try:
        total += int(s)
    except ValueError:
        pass  # Not a number, ignore it.


Why would you want to log that? It's not an error, it is working as 
designed. I hate software that logs every little thing that happens, so 
that you cannot tell what's important and what isn't.



-- 
Steven



#34378

From: peter <pjmakey2@gmail.com>
Date: 2012-12-06 08:55 -0300
Message-ID: <mailman.554.1354795380.29569.python-list@python.org>
In reply to: #34377
On 12/06/2012 08:47 AM, Steven D'Aprano wrote:
> On Thu, 06 Dec 2012 09:49:26 +0100, Bruno Dupuis wrote:
>
>> The point is Exceptions are made for error handling, not for normal
>> workflow.
> That's certainly not the case in Python. Using exceptions for flow
> control is a standard part of the language.
>
> IndexError and StopIteration are used to detect the end of lists or
> iterators in for loops.
>
> GeneratorExit is used to request that generators exit.
>
> SysExit is used to exit the interpreter.
>
>
> There is nothing wrong with using exceptions for flow control in
> moderation.
>
>
>> I hate when i read that for example:
>>
>>      try:
>>          do_stuff(mydict[k])
>>      except KeyError:
>>          pass
>>
>> (loads of them in many libraries and frameworks) instead of:
>>
>>      if k in mydict:
>>          do_stuff(mydict[k])
>>
>> Note that the performances are better with the latter.
> Not so. Which one is faster will depend on how often you expect to fail.
> If the keys are nearly always present, then:
>
> try:
>      do_stuff(mydict[k])
> except KeyError:
>      pass
>
> will be faster. Setting up a try block is very fast, about as fast as
> "pass", and faster than "if k in mydict".
>
> But if the key is often missing, then catching the exception will be
> slow, and the "if k in mydict" version may be faster. It depends on how
> often the key is missing.
>
>
> [...]
>> It depends also on the context, I'd be more 'permissive' a short script
>> than into a large program, framework, or lib, for the very reason it's
>> easy to know all code interactions.
>>
>> In my coder life, i spent more time debugging silently swallowed
>> exceptions than logging abnormal behaviours.
>
> That's fine. I agree with you about not silently swallowing errors. Where
> I disagree is that you said "never ever", which is an exaggeration.
> Remember that exceptions are not always errors.
>
> Problem: take a list of strings, and add up the ones which are integers,
> ignoring everything else.
>
> Solution:
>
> total = 0
> for s in list_of_strings:
>      try:
>          total += int(s)
>      except ValueError:
>          pass  # Not a number, ignore it.
>
>
> Why would you want to log that? It's not an error, it is working as
> designed. I hate software that logs every little thing that happens, so
> that you cannot tell what's important and what isn't.
>
>
>
It is perfectly fine to use try/except for flow control.
Just think of something more complex, like this.

    try:
        self._conn = MySQLdb.connect(host=host,
            user=user,
            passwd=passwd,
            db=db)
    except:
        # The Spanish message reads "Database connection error".
        logging.info("Error de conexion con la base de datos")
        inform(subject = 'Db down on app %s' % app, body=sbody)

Or maybe something like this.

    try:
        cursor.execute(sqli, data)
        self._conn.commit()
    except:
        try:
            self._conn.rollback()
            cursor.execute(sqli, data)
            self._conn.commit()
        except Exception, e:
            pass
            # print e
            # logging.info('ERROR en la insercion %s' % e)

This is pretty dumb, but it is a valid example of what you can do with
try/except.



#34381

From: Hans Mulder <hansmu@xs4all.nl>
Date: 2012-12-06 14:32 +0100
Message-ID: <50c09e80$0$6897$e4fe514c@news2.news.xs4all.nl>
In reply to: #34378
On 6/12/12 12:55:16, peter wrote:
> Is perfectly right to use try catch for a flow control.
> Just think in something more complex like this.
> 
>    try:
>             self._conn = MySQLdb.connect(host=host,
>                 user=user,
>                 passwd=passwd,
>                 db=db)
>     except:
>             logging.info("Error de conexion con la base de datos")
>             inform(subject = 'Db down on app %s' % app, body=sbody)

This is an example of the sort of incorrect code you
should try to avoid.  An improved version is:

   self._conn = MySQLdb.connect(host=host,
                user=user,
                passwd=passwd,
                db=db)

By not catching the exception, you're allowing the
Python interpreter to report what the problem was,
for example "Keyboard interrupt" or "Access denied".

By reporting "DB down" when there is no reason to assume
that that is the problem, you're confusing the user.

> Or maybe something like this.
> 
> try:
>             cursor.execute(sqli, data)
>             self._conn.commit()
> except:
>             try:
>                 self._conn.rollback()
>                 cursor.execute(sqli, data)
>                 self._conn.commit()
>             except Exception, e:
>                 pass
>                 # print e
>                 # logging.info('ERROR en la insercion %s' % e)

This is another example of what not to do.  Even the
commented-out print statement loses information, viz.
the traceback.

If you leave out the try/except, then more accurate
information will be printed, and a programmer who needs
to fix the problem, can run the code under the debugger
and it will automatically stop at the point where the
uncaught exception is raised.  That's much easier than
having to set breakpoints at all the "except Exception:"
clauses in a typical chunk of hard-to-maintain code.

Context managers were invented to make it easier to do
this sort of thing correctly.  For example:

    with sqlite3.connect(dbpath) as connection:
        connection.cursor().execute(sqli, data)

If the flow reaches the end of the "with" statement,
the connection object will commit() automatically.
If an exception is raised, the connection object will
rollback() automatically.  No try/except required.

This is shorter, and much easier to get right.
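
A self-contained Python 3 sketch of that behaviour, using an in-memory sqlite3 database. The table and values are made up. Note that the exception still propagates out of the "with" block after the rollback, and that the block ends the transaction without closing the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, price REAL)")

# Normal exit from the block: the INSERT is committed.
with conn:
    conn.execute("INSERT INTO products VALUES (?, ?)", ("a1", 10.0))

# An exception inside the block: everything done in it is rolled back.
try:
    with conn:
        conn.execute("INSERT INTO products VALUES (?, ?)", ("b2", 20.0))
        conn.execute("INSERT INTO products VALUES (?, ?)", ("a1", 99.0))  # duplicate key
except sqlite3.IntegrityError as exc:
    print("rolled back:", exc)

skus = [row[0] for row in conn.execute("SELECT sku FROM products ORDER BY sku")]
print(skus)
```

Only "a1" remains: the "b2" row inserted before the failing statement was undone together with it.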

> This is pretty dumb, but is a valid example, on what you can
> do with try catch

It is an unfortunate fact of life that you can write code
that is hard to maintain.  The fact that you *can* do this,
does not mean that you should.


Hope this helps,

-- HansM



#34384

From: Chris Angelico <rosuav@gmail.com>
Date: 2012-12-07 00:47 +1100
Message-ID: <mailman.556.1354801639.29569.python-list@python.org>
In reply to: #34381
On Fri, Dec 7, 2012 at 12:32 AM, Hans Mulder <hansmu@xs4all.nl> wrote:
> On 6/12/12 12:55:16, peter wrote:
>> Is perfectly right to use try catch for a flow control.
>> Just think in something more complex like this.
>>
>>    try:
>>             self._conn = MySQLdb.connect(host=host,
>>                 user=user,
>>                 passwd=passwd,
>>                 db=db)
>>     except:
>>             logging.info("Error de conexion con la base de datos")
>>             inform(subject = 'Db down on app %s' % app, body=sbody)
>
> This is an example of the sort of incorrect code you
> should try to avoid.  An improved version is:
>
>    self._conn = MySQLdb.connect(host=host,
>                 user=user,
>                 passwd=passwd,
>                 db=db)
>
> By not catching the exception, you're allowing the
> Python interpreter to report what the problem was,
> for example "Keyboard interrupt" or "Access denied".
>
> By report "DB down" when there is no reason to assume
> that that is the problem, you're confusing the user.

The problem with the original example is that it has a bare except,
which will catch too much. Call it an oversimplified example, perhaps
:) By not catching the exception, you doom your entire script to
abort. What Steven called felicide is catching exceptions *and
immediately exiting*, which offers little benefit over just letting
the exception propagate up.

>> Or maybe something like this.
>>
>> try:
>>             cursor.execute(sqli, data)
>>             self._conn.commit()
>> except:
>>             try:
>>                 self._conn.rollback()
>>                 cursor.execute(sqli, data)
>>                 self._conn.commit()
>>             except Exception, e:
>>                 pass
>>                 # print e
>>                 # logging.info('ERROR en la insercion %s' % e)
>
> This is another example of what not to do.  Even the
> commented-out print statement loses information, viz.
> the traceback.
>
> If you leave out the try/except, then more accurate
> information will be printed, and a programmer who needs
> to fix the problem, can run the code under the debugger
> and it will automatically stop at the point where the
> uncaught exception is raised.  That's much easier than
> having to set breakpoints at all the "except Exception:"
> clauses in a typical chunk of hard-to-maintain code.

Again, oversimplified example. If this were real code, I'd criticize
the bare except (and the "except Exception", which has the same
problem).

> Context managers were invented to make it easier to do
> this sort of thing correctly.  For example:
>
>     with sqlite3.connect(dbpath) as connection:
>         connection.cursor().execute(sqli, data)
>
> If the flow reaches the end of the "with" command,
> the connection object will self.commit() automatically.
> If an exception is raised, the connection object will
> self.rollback() automatically.  No try/except required.
>
> This is shorter, and much easier to get right.

But it does something completely different. The original code's logic
is: Try the query, then commit. If that fails, roll back and have
another shot at it. This is dangerous if there could have been other
statements in the transaction (they won't be retried), but otherwise,
it's a reasonable way of dealing with serialization failures. It has
its risks, of course, but it's not meant to be a demo of database
code, it's a demo of try/except.
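
The retry-once logic described above can be sketched without any database at all, which also makes it easy to catch only the error that is worth retrying instead of using a bare except. Everything named here (run_with_one_retry, SerializationFailure, the flaky stand-in functions) is hypothetical; a real driver has its own exception classes.

```python
class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver's retryable error."""


def run_with_one_retry(action, rollback, retry_on):
    """Call action(); if it raises retry_on, roll back and try exactly once more."""
    try:
        return action()
    except retry_on:
        rollback()
        return action()  # a second failure propagates to the caller


calls = {"action": 0, "rollback": 0}


def flaky_insert():
    """Fail on the first call, succeed on the second."""
    calls["action"] += 1
    if calls["action"] == 1:
        raise SerializationFailure("try again")
    return "committed"


def rollback():
    calls["rollback"] += 1


result = run_with_one_retry(flaky_insert, rollback, SerializationFailure)
print(result, calls)
```

Any other exception type passes straight through without a rollback or a retry, so genuine bugs are not hidden.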

>> This is pretty dumb, but is a valid example, on what you can
>> do with try catch
>
> It is an unfortunate fact of life that you can write code
> that is hard to maintain.  The fact that you *can* do this,
> does not mean that you should.

Agreed. However, the mere presence of try/except does not make code
unmaintainable, nor is it a strong indication that the code already
was.

ChrisA



#34379

From: Chris Angelico <rosuav@gmail.com>
Date: 2012-12-06 23:14 +1100
Message-ID: <mailman.555.1354796060.29569.python-list@python.org>
In reply to: #34377
On Thu, Dec 6, 2012 at 10:47 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> Not so. Which one is faster will depend on how often you expect to fail.
> If the keys are nearly always present, then:
>
> try:
>     do_stuff(mydict[k])
> except KeyError:
>     pass
>
> will be faster. Setting up a try block is very fast, about as fast as
> "pass", and faster than "if k in mydict".
>
> But if the key is often missing, then catching the exception will be
> slow, and the "if k in mydict" version may be faster. It depends on how
> often the key is missing.
>

Setting up the try/except is a constant time cost, while the
duplicated search for k inside the dictionary might depend on various
other factors. In the specific case of a Python dictionary, the
membership check is fairly cheap (assuming you're not the subject of a
hash collision attack - Py3.3 makes that a safe assumption), but if
you were about to execute a program and wanted to first find out if it
existed, that extra check could be ridiculously expensive, eg if the
path takes you on a network drive - or, worse, on multiple network
drives, which I have had occasion to do!
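
A minimal sketch of the "just try it" approach for the program-launch case, assuming Python 3 and the standard subprocess module. The helper name run_if_available and the made-up program name are hypothetical, not something from this thread. Rather than searching the path first, it attempts the launch and treats FileNotFoundError as "the program is not there":

```python
import subprocess
import sys


def run_if_available(argv):
    """Try to run argv; return the CompletedProcess, or None if the program is missing."""
    try:
        # No separate "does it exist?" lookup: the launch itself is the test.
        return subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except FileNotFoundError:
        # Raised when the executable cannot be found.
        return None


# Usage: the running interpreter certainly exists; the second name is assumed not to.
ok = run_if_available([sys.executable, "-c", "pass"])
missing = run_if_available(["no-such-program-for-this-demo"])
```

The path is searched only once, by the launch itself, which matters when that search crosses slow network drives.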

ChrisA


#34486

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-12-07 22:16 +0000
Message-ID: <50c26aa6$0$29994$c3e8da3$5496439d@news.astraweb.com>
In reply to: #34379
On Thu, 06 Dec 2012 23:14:17 +1100, Chris Angelico wrote:

> Setting up the try/except is a constant time cost, 

It's not just constant time, it's constant time and *cheap*. Doing 
nothing inside a try block takes about twice as long as doing nothing:

[steve@ando ~]$ python2.7 -m timeit "try: pass
> except: pass"
10000000 loops, best of 3: 0.062 usec per loop

[steve@ando ~]$ python2.7 -m timeit "pass"
10000000 loops, best of 3: 0.0317 usec per loop


> while the duplicated
> search for k inside the dictionary might depend on various other
> factors. 

It depends on the type, size and even the history of the dict, as well as 
the number, type and values of the keys. Assuming a built-in dict, we can 
say that in the absence of many collisions, key lookup can be amortized 
over many lookups as constant time.


> In the specific case of a Python dictionary, the membership
> check is fairly cheap (assuming you're not the subject of a hash
> collision attack - Py3.3 makes that a safe assumption), 

Don't be so sure -- the hash randomization algorithm for Python 3.3 is 
trivially beaten by an attacker.

http://bugs.python.org/issue14621#msg173455

but in general, yes, key lookup in dicts is fast. But not as fast as 
setting up a try block.

Keep in mind too that the "Look Before You Leap" strategy is 
fundamentally unsound if you are using threads:

# in main thread:
if key in mydict:  # returns True
    x = mydict[key]  # fails with KeyError

How could this happen? In the fraction of a second between checking 
whether the key exists and actually looking up the key, another thread 
could delete it! This is a classic race condition, also known as a Time 
Of Check To Time Of Use bug.
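
The interleaving described above can be reproduced on demand. This is only a sketch: the function name forced_race and the two threading.Event objects are illustrative additions that deliberately widen the window between check and use, which in real code is tiny and unpredictable.

```python
import threading


def forced_race():
    """Force the check / delete / use ordering and report what the lookup did."""
    mydict = {"key": 1}
    checked = threading.Event()   # set once the main thread has done its check
    deleted = threading.Event()   # set once the other thread has deleted the key

    def deleter():
        checked.wait()            # wait for the main thread's membership test
        del mydict["key"]
        deleted.set()

    worker = threading.Thread(target=deleter)
    worker.start()

    outcome = "key was never present"
    if "key" in mydict:           # time of check: True
        checked.set()             # let the other thread run...
        deleted.wait()            # ...and wait until it has deleted the key
        try:
            mydict["key"]         # time of use
            outcome = "lookup succeeded"
        except KeyError:
            outcome = "KeyError after a successful check"
    worker.join()
    return outcome
```

Calling forced_race() takes the KeyError branch every time, because the events guarantee that the deletion lands between the check and the lookup.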


-- 
Steven


#34489

From: Terry Reedy <tjreedy@udel.edu>
Date: 2012-12-08 02:01 -0500
Message-ID: <mailman.617.1354950105.29569.python-list@python.org>
In reply to: #34486
On 12/7/2012 5:16 PM, Steven D'Aprano wrote:
> On Thu, 06 Dec 2012 23:14:17 +1100, Chris Angelico wrote:
>
>> Setting up the try/except is a constant time cost,
>
> It's not just constant time, it's constant time and *cheap*. Doing
> nothing inside a try block takes about twice as long as doing nothing:
>
> [steve@ando ~]$ python2.7 -m timeit "try: pass
>> except: pass"
> 10000000 loops, best of 3: 0.062 usec per loop
>
> [steve@ando ~]$ python2.7 -m timeit "pass"
> 10000000 loops, best of 3: 0.0317 usec per loop
>
>
>> while the duplicated
>> search for k inside the dictionary might depend on various other
>> factors.
>
> It depends on the type, size and even the history of the dict, as well as
> the number, type and values of the keys. Assuming a built-in dict, we can
> say that in the absence of many collisions, key lookup can be amortized
> over many lookups as constant time.
>
>
>> In the specific case of a Python dictionary, the membership
>> check is fairly cheap (assuming you're not the subject of a hash
>> collision attack - Py3.3 makes that a safe assumption),
>
> Don't be so sure -- the hash randomization algorithm for Python 3.3 is
> trivially beaten by an attacker.
>
> http://bugs.python.org/issue14621#msg173455
>
> but in general, yes, key lookup in dicts is fast. But not as fast as
> setting up a try block.
>
> Keep in mind too that the "Look Before You Leap" strategy is
> fundamentally unsound if you are using threads:
>
> # in main thread:
> if key in mydict:  # returns True
>      x = mydict[key]  # fails with KeyError
>
> How could this happen? In the fraction of a second between checking
> whether the key exists and actually looking up the key, another thread
> could delete it! This is a classic race condition, also known as a Time
> Of Check To Time Of Use bug.

I generally agree with everything Steven has said here and in previous 
responses and add the following.

There are two reasons to not execute a block of code.

1. It could and would run, but we do not want it to run because a) we do 
not want an answer, even if correct; b) it would return a wrong answer 
(which of course we do not want); or c) it would run forever and never 
give any answer. To not run code, for any of these reasons, requires an 
if statement.

2. It will not run but will raise an exception instead. In this case, we 
can always use try-except. Sometimes we can detect that it would not run 
before running it, and can use an if statement instead. (But as Steven 
points out, this is sometimes trickier than it might seem.) However, 
even if we can reliably detect that code would either run or raise an 
exception, this often or even usually requires doing redundant calculation.

For example, 'key in mydict' must hash the key, mod the hash according 
to the size of the dict, find the corresponding slot in the dict, and do 
an equality comparison with the existing key in the dict. If not equal, 
repeat according to the collision algorithm for inserting keys.

In other words, 'key in mydict' does everything done by 'mydict[key]' 
except to actually fetch the value when the right slot is found or raise 
an exception if there is no right slot.
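
To make the duplicated work visible, here is a toy hash table. It is a sketch under simplifying assumptions (fixed size, linear probing, no deletion), and the class ToyDict and its probes counter are hypothetical; CPython's real dict is considerably more elaborate. The point is only that the membership test and the subscript share the same search:

```python
class ToyDict:
    """Fixed-size open-addressing table with linear probing (no resizing, no deletion)."""

    def __init__(self, size=8):
        self._slots = [None] * size   # each slot is None or a (key, value) pair
        self.probes = 0               # counts slot inspections, for illustration

    def _find_slot(self, key):
        """Return the index holding key, or of the empty slot where it would go."""
        size = len(self._slots)
        index = hash(key) % size      # hash the key, reduce modulo the table size
        for _ in range(size):
            self.probes += 1
            entry = self._slots[index]
            if entry is None or entry[0] == key:
                return index
            index = (index + 1) % size   # collision: try the next slot
        raise RuntimeError("table is full")

    def __setitem__(self, key, value):
        self._slots[self._find_slot(key)] = (key, value)

    def __contains__(self, key):
        return self._slots[self._find_slot(key)] is not None

    def __getitem__(self, key):
        entry = self._slots[self._find_slot(key)]
        if entry is None:
            raise KeyError(key)
        return entry[1]


def count_probes():
    """Return (probes used by check-then-fetch, probes used by fetch alone)."""
    d = ToyDict()
    d["spam"] = 1

    d.probes = 0
    if "spam" in d:       # first search
        d["spam"]         # second, identical search
    lbyl = d.probes

    d.probes = 0
    try:
        d["spam"]         # single search
    except KeyError:
        pass
    eafp = d.probes
    return lbyl, eafp
```

With a single key in the table, count_probes() returns (2, 1): the check-then-fetch version inspects the slot twice.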

So why ever use a redundant condition check? A. esthetics. B. 
practicality. Unfortunately, catching exceptions may be and often is as 
slow as the redundant check and even multiple redundant checks.

-- 
Terry Jan Reedy


#34490

From: Chris Angelico <rosuav@gmail.com>
Date: 2012-12-08 18:17 +1100
Message-ID: <mailman.618.1354951042.29569.python-list@python.org>
In reply to: #34486
On Sat, Dec 8, 2012 at 6:01 PM, Terry Reedy <tjreedy@udel.edu> wrote:
> Unfortunately, catching exceptions may be and often is as slow as the
> redundant check and even multiple redundant checks.

It depends on how often you're going to catch and how often just flow
through. In Python, as in most other modern languages, exceptions only
cost you when they get thrown. The extra check, though, costs you in
the normal case.

ChrisA


#34498

From: MRAB <python@mrabarnett.plus.com>
Date: 2012-12-08 17:50 +0000
Message-ID: <mailman.628.1354989048.29569.python-list@python.org>
In reply to: #34486
On 2012-12-08 07:17, Chris Angelico wrote:
> On Sat, Dec 8, 2012 at 6:01 PM, Terry Reedy <tjreedy@udel.edu> wrote:
>> Unfortunately, catching exceptions may be and often is as slow as the
>> redundant check and even multiple redundant checks.
>
> It depends on how often you're going to catch and how often just flow
> through. In Python, as in most other modern languages, exceptions only
> cost you when they get thrown. The extra check, though, costs you in
> the normal case.
>
That's where the .get method comes in handy:

MISSING = object()
...
value = my_dict.get(key, MISSING)
if value is not MISSING:
    ...

It could be faster if the dict often doesn't contain the key.
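
A complete version of the same idea, as a sketch; the function name describe and its return strings are made up for illustration. A private sentinel is used rather than None so that a key whose stored value is None is still reported as present:

```python
_MISSING = object()   # unique sentinel; no stored value can be this object


def describe(mapping, key):
    """Report whether key is in mapping, using a single lookup."""
    value = mapping.get(key, _MISSING)   # one search, no exception either way
    if value is _MISSING:
        return "absent"
    return "present: %r" % (value,)
```

For example, describe({"a": None}, "a") gives "present: None", whereas a plain mapping.get(key) could not tell that case apart from a missing key.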


#34516

From: Ramchandra Apte <maniandram01@gmail.com>
Date: 2012-12-08 19:07 -0800
Message-ID: <b32a0bad-4772-4cb8-b371-869f706a2ac8@googlegroups.com>
In reply to: #34379
On Thursday, 6 December 2012 17:44:17 UTC+5:30, Chris Angelico  wrote:
> On Thu, Dec 6, 2012 at 10:47 PM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
> > Not so. Which one is faster will depend on how often you expect to fail.
> > If the keys are nearly always present, then:
> >
> > try:
> >     do_stuff(mydict[k])
> > except KeyError:
> >     pass
> >
> > will be faster. Setting up a try block is very fast, about as fast as
> > "pass", and faster than "if k in mydict".
> >
> > But if the key is often missing, then catching the exception will be
> > slow, and the "if k in mydict" version may be faster. It depends on how
> > often the key is missing.
> >
>
> Setting up the try/except is a constant time cost, while the
> duplicated search for k inside the dictionary might depend on various
> other factors. In the specific case of a Python dictionary, the
> membership check is fairly cheap (assuming you're not the subject of a
> hash collision attack - Py3.3 makes that a safe assumption), but if
> you were about to execute a program and wanted to first find out if it
> existed, that extra check could be ridiculously expensive, eg if the
> path takes you on a network drive - or, worse, on multiple network
> drives, which I have had occasion to do!
>
> ChrisA

Not really. I remember a bug saying that only 256 hashes were required of known texts and then the randomization becomes useless.


#34517

From: Chris Angelico <rosuav@gmail.com>
Date: 2012-12-09 14:22 +1100
Message-ID: <mailman.643.1355023344.29569.python-list@python.org>
In reply to: #34516
On Sun, Dec 9, 2012 at 2:07 PM, Ramchandra Apte <maniandram01@gmail.com> wrote:
> Not really. I remember a bug saying that only 256 hashes were required of known texts and then the randomization becomes useless.

That requires that someone be able to get you to hash some text and
give back the hash. In any case, even if you _are_ dealing with the
worst-case hash collision attack, all it does is stop a Python
dictionary from being an exception to the general principle. If you're
doing a lookup in, say, a tree, then checking if the element exists
and then retrieving it means walking the tree twice - O(log n) if the
tree's perfectly balanced, though a splay tree would be potentially
quite efficient at that particular case. But there's still extra cost
to the check.

ChrisA


#34519

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-12-09 07:39 +0000
Message-ID: <50c44027$0$29994$c3e8da3$5496439d@news.astraweb.com>
In reply to: #34517
On Sun, 09 Dec 2012 14:22:21 +1100, Chris Angelico wrote:

> On Sun, Dec 9, 2012 at 2:07 PM, Ramchandra Apte <maniandram01@gmail.com>
> wrote:
>> Not really. I remember a bug saying that only 256 hashes were required
>> of known texts and then the randomization becomes useless.
> 
> That requires that someone be able to get you to hash some text and give
> back the hash. In any case, even if you _are_ dealing with the
> worst-case hash collision attack, all it does is stop a Python
> dictionary from being an exception to the general principle.


Dictionaries never were an exception to the general principle.

Regardless of how cheap or expensive it is, whether it is a constant cost
or a potentially unbounded expense or somewhere in between, the "look" in
"look before you leap" tests has some cost.

# Look Before You Leap (LBYL)
if condition():
   do_this()
else:
   do_that()


# Easier to Ask Forgiveness than Permission (EAFP)
try:
    do_this()
except ConditionFailed:
    do_that()


Depending on the cost, and how often you have to pay it, either LBYL or 
EAFP will be cheaper. Sometimes you can predict which one will be cheaper 
-- "most of the time, the key will be present" -- sometimes you can't. 
But there's always a choice, and the choice is not always the right 
choice.
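
One way to find out which is cheaper for a given workload is to measure it. The following is a rough benchmark sketch, not a definitive methodology: the function names, the 100-key dict and the miss percentages are all arbitrary assumptions, and the timings will differ between machines and Python versions.

```python
import timeit


def lbyl(mapping, key, default=None):
    """Look Before You Leap: membership test, then fetch."""
    if key in mapping:
        return mapping[key]
    return default


def eafp(mapping, key, default=None):
    """Easier to Ask Forgiveness than Permission: fetch, handle the failure."""
    try:
        return mapping[key]
    except KeyError:
        return default


def compare(miss_percent, loops=200):
    """Time both strategies over 100 keys, miss_percent of which are absent."""
    mapping = {i: i for i in range(100)}
    # Keys 0..99 are present; keys 100 and above are missing.
    keys = [i + 100 if i < miss_percent else i for i in range(100)]

    def run(func):
        def loop():
            for k in keys:
                func(mapping, k)
        return timeit.timeit(loop, number=loops)

    return run(lbyl), run(eafp)


if __name__ == "__main__":
    for miss_percent in (0, 1, 10, 50, 100):
        t_lbyl, t_eafp = compare(miss_percent)
        print(f"{miss_percent:3d}% missing: LBYL {t_lbyl:.4f}s  EAFP {t_eafp:.4f}s")
```

Run it against the dict sizes and miss rates you actually expect, rather than relying on the arbitrary ones chosen here.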


But... why are we focusing only on the cost? What about *correctness*?

LBYL suffers from an even more critical problem, never mind the cost of 
the check. That problem is the gap in time between the check and when you 
go to do the actual work. In the fraction of a second between checking 
condition and calling do_this(), the situation might have changed. The 
file that did exist has now been deleted by another program. The key was 
in the dict, and now another thread has deleted it. The network was up, 
and now it's down.

These sorts of things are extremely hard to diagnose, extremely hard to 
prevent, and extremely hard to replicate. They lead to race conditions 
and "Time Of Check to Time Of Use" bugs. Many of the more tricky, subtle 
security vulnerabilities are due to TOCTOU bugs.

So, very often, it isn't enough to Look Before You Leap. You *still* have 
to be prepared to Ask Forgiveness:

# not good enough
if os.path.exists(filename):
    f = open(filename)
else:
    handle_missing_file()

# should be
if os.path.exists(filename):
    try:
        f = open(filename)
    except (IOError, OSError):
        handle_missing_file()
else:
    handle_missing_file()


But in that case, what's the point of the initial check?
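
Dropping the check leaves plain EAFP. A minimal sketch, assuming Python 3.3 or later, where IOError is merely an alias of OSError; the helper name read_text_or_none is hypothetical:

```python
import os
import tempfile


def read_text_or_none(filename):
    """Return the file's contents, or None if it cannot be opened or read."""
    try:
        with open(filename) as f:
            return f.read()
    except OSError:
        # Covers a missing file, but also permission errors and the like.
        return None


# Usage: try a path before and after the file is created.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    before = read_text_or_none(path)   # file does not exist yet
    with open(path, "w") as f:
        f.write("hello")
    after = read_text_or_none(path)    # file now exists
```

There is no gap between a check and the open for another program to exploit: the open call is the only test.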



-- 
Steven

