From: Chris Angelico
To: python-list@python.org
Newsgroups: comp.lang.python
Date: Fri, 25 Jan 2013 12:07:35 +1100
Subject: Re: using split for a string : error
In-Reply-To: <5101cfdb$0$29980$c3e8da3$5496439d@news.astraweb.com>
References: <51011822.3020702@tobix.eu> <5101cfdb$0$29980$c3e8da3$5496439d@news.astraweb.com>

On Fri, Jan 25, 2013 at 11:20 AM, Steven D'Aprano wrote:
> Chris Angelico wrote:
>
>> It's usually fine to have int() complain about any non-numerics in the
>> string, but I must confess, I do sometimes yearn for atoi() semantics:
>> atoi("123asd") == 123, and atoi("qqq") == 0. I've not seen a
>> convenient Python function for doing that. Usually it involves
>> manually getting the digits off the front. All I want is to suppress
>> the error on finding a non-digit. Oh well.
>
> It's easy enough to write your own. All you need do is decide what you
> mean by "suppress the error on finding a non-digit".
>
> Should atoi("123xyz456") return 123 or 123456?
>
> Should atoi("xyz123") return 0 or 123?
>
> And here's a good one:
>
> Should atoi("1OOl") return 1, 100, or 1001?

123, 0, and 1. That's standard atoi semantics.

> That last is a serious suggestion by the way. There are still many people
> who do not distinguish between 1 and l or 0 and O.

Sure. But I'm not trying to cater to people who get it wrong; that's a
job for a DWIM.

> def atoi(s):
>     from unicodedata import digit
>     i = 0
>     for c in s:
>         i *= 10
>         i += digit(c, 0)
>     return i
>
> Variations that stop on the first non-digit, instead of treating them as
> zero, are not much more difficult.

And yes, I'm fully aware that I can roll my own. Here's a shorter
version (ASCII digits only, feel free to expand to Unicode), not
necessarily better:

def atoi(s):
    # Keep only the leading run of ASCII digits; the "0" prefix covers
    # the case where there are none.
    return int("0" + s[:len(s) - len(s.lstrip("0123456789"))])

It just seems silly that this should have to be done separately, when
it's really just a tweak to the usual string-to-int conversion: when
you come to a non-digit, take one of three options (throw error, skip,
or terminate).

Anyway, not a big deal.

ChrisA
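
The three options named above (throw error, skip, or terminate) could be
sketched as a single function. This is only an illustration: the
on_nondigit parameter and its three values are made-up names, not part of
int() or any standard library API, and like the short version above it
handles ASCII digits only, with no sign or leading-whitespace handling.

```python
def atoi(s, on_nondigit="terminate"):
    """Convert the ASCII digits in s to an int.

    on_nondigit is a hypothetical parameter choosing what happens at a
    non-digit character:
      "error"     - raise ValueError
      "skip"      - ignore the character and keep going
      "terminate" - stop and use the digits collected so far
    """
    if on_nondigit not in ("error", "skip", "terminate"):
        raise ValueError("unknown on_nondigit option: %r" % (on_nondigit,))
    digits = []
    for c in s:
        if c in "0123456789":
            digits.append(c)
        elif on_nondigit == "error":
            raise ValueError("non-digit %r in %r" % (c, s))
        elif on_nondigit == "terminate":
            break
        # "skip": fall through and look at the next character
    # No digits collected means zero, as with atoi("qqq") == 0.
    return int("".join(digits) or "0")
```

With the default option, atoi("123asd") gives 123 and atoi("qqq") gives 0,
matching the examples earlier in the thread; with on_nondigit="skip",
atoi("123xyz456") gives 123456.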