Groups > comp.lang.python > #10012 > unrolled thread

Convert '165.0' to int

Started by: "Frank Millman" <frank@chagford.com>
First post: 2011-07-21 11:31 +0200
Last post: 2011-07-25 13:11 -0400
Articles 7 — 6 participants


Contents

  Convert '165.0' to int "Frank Millman" <frank@chagford.com> - 2011-07-21 11:31 +0200
    Re: Convert '165.0' to int SigmundV <sigmundv@gmail.com> - 2011-07-24 11:27 -0700
      Re: Convert '165.0' to int Billy Mays <noway@nohow.com> - 2011-07-24 20:07 -0400
        Re: Convert '165.0' to int Chris Angelico <rosuav@gmail.com> - 2011-07-25 15:46 +1000
        Re: Convert '165.0' to int Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-07-25 19:48 +1000
          Re: Convert '165.0' to int SigmundV <sigmundv@gmail.com> - 2011-07-25 09:39 -0700
          Re: Convert '165.0' to int Billy Mays <81282ed9a88799d21e77957df2d84bd6514d9af6@myhashismyemail.com> - 2011-07-25 13:11 -0400

#10012 — Convert '165.0' to int

From: "Frank Millman" <frank@chagford.com>
Date: 2011-07-21 11:31 +0200
Subject: Convert '165.0' to int
Message-ID: <mailman.1315.1311240764.1164.python-list@python.org>

Hi all

I want to convert '165.0' to an integer.

The obvious method does not work -

>>> x = '165.0'
>>> int(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '165.0'

If I convert to a float first, it does work -

>>> int(float(x))
165
>>>

Is there a shortcut, or must I do this every time (I have lots of them!)?
I know I can write a function to do this, but is there anything built-in?

Thanks

Frank Millman


#10221

From: SigmundV <sigmundv@gmail.com>
Date: 2011-07-24 11:27 -0700
Message-ID: <d62232ce-0600-48f6-8eaf-a2bda993d7c9@cq10g2000vbb.googlegroups.com>
In reply to: #10012

On Jul 21, 10:31 am, "Frank Millman" <fr...@chagford.com> wrote:
> Is there a short cut, or must I do this every time (I have lots of them!) ?
> I know I can write a function to do this, but is there anything built-in?

I'd say that we have established that there is no shortcut, no built-in
for this. You write your own function:

string_to_int = lambda s: int(float(s))

Then you apply it to your list of strings:

list_of_integers = map(string_to_int, list_of_strings)

Of course, this will be horribly slow if you have thousands of
strings. In such a case you should use an iterator (assuming you use
Python 2.7):

import itertools as it
iterator = it.imap(string_to_int, list_of_strings)
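
A minimal sketch of the same helper written as a named function, assuming Python 3 (where itertools.imap no longer exists) and assuming that a string with a fractional part such as '165.5' should be rejected rather than silently truncated. The name to_int and the sample data are hypothetical, not from the thread.

```python
def to_int(text):
    """Convert a string such as '165.0' to an int.

    Assumption: strings with a non-zero fractional part are errors,
    so they raise ValueError instead of being truncated.
    """
    value = float(text)
    if not value.is_integer():
        raise ValueError("not an integral value: %r" % (text,))
    return int(value)


# Hypothetical sample data; float() tolerates surrounding whitespace.
strings = ['165.0', '42', ' 7.0 ']
integers = [to_int(s) for s in strings]
```

If truncation is actually what you want, plain int(float(s)) as in the lambda above does that.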


Regards,

Sigmund


#10234

From: Billy Mays <noway@nohow.com>
Date: 2011-07-24 20:07 -0400
Message-ID: <j0ic3o$b5k$1@speranza.aioe.org>
In reply to: #10221

On 7/24/2011 2:27 PM, SigmundV wrote:
> On Jul 21, 10:31 am, "Frank Millman"<fr...@chagford.com>  wrote:
>> Is there a short cut, or must I do this every time (I have lots of them!) ?
>> I know I can write a function to do this, but is there anything built-in?
>
> I'd say that we have established that there is no shortcut, no built-
> in for this. You write you own function:
>
> string_to_int = lambda s: int(float(s))
>
> Then you apply it to your list of strings:
>
> list_of_integers = map(string_to_int, list_of_strings)
>
> Of course, this will be horribly slow if you have thousands of
> strings. In such a case you should use an iterator (assuming you use
> python 2.7):
>
> import itertools as it
> iterator = it.imap(string_to_int, list_of_strings)
>
>
> Regards,
>
> Sigmund

If the goal is speed, then you should use generator expressions:

list_of_integers = (int(float(s)) for s in list_of_strings)


#10242

From: Chris Angelico <rosuav@gmail.com>
Date: 2011-07-25 15:46 +1000
Message-ID: <mailman.1443.1311572784.1164.python-list@python.org>
In reply to: #10234

On Mon, Jul 25, 2011 at 10:07 AM, Billy Mays <noway@nohow.com> wrote:
> if the goal is speed, then you should use generator expressions:
>
> list_of_integers = (int(float(s)) for s in list_of_strings)

Clarification: This is faster if and only if you don't actually need
it as a list. In spite of the variable name, it's NOT a list, and you
can't index it (e.g. you can't work with list_of_integers[7]). However,
you can iterate over it to work with the integers in sequence, and for
that specific (and very common) use, it will be faster and use less
memory than actually creating the list. It's also going to be a LOT
faster than creating the list, if you only need a few from the
beginning of it; the generator evaluates lazily.

Personally, I'd just create a tiny function and use that, as has been suggested.
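
A small sketch of the point about indexing and laziness, assuming Python 3. The sample data and variable names are made up for illustration; itertools.islice is one way to take only the first few items.

```python
from itertools import islice

strings = ['165.0', '12.0', '7.0', '99.0']  # hypothetical sample data

# A generator expression: nothing is converted yet.
lazy_integers = (int(float(s)) for s in strings)

# Generators cannot be indexed; the attempt raises TypeError
# and does not consume any items.
try:
    lazy_integers[1]
    indexable = True
except TypeError:
    indexable = False

# Take only the first two items; the rest remain unevaluated.
first_two = list(islice(lazy_integers, 2))

# Iterating again continues where the generator left off.
remaining = list(lazy_integers)
```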

ChrisA


#10253

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2011-07-25 19:48 +1000
Message-ID: <4e2d3bf7$0$29971$c3e8da3$5496439d@news.astraweb.com>
In reply to: #10234

On Mon, 25 Jul 2011 10:07 am Billy Mays wrote:

> On 7/24/2011 2:27 PM, SigmundV wrote:

>> list_of_integers = map(string_to_int, list_of_strings)
>>
>> Of course, this will be horribly slow if you have thousands of
>> strings. In such a case you should use an iterator (assuming you use
>> python 2.7):
>>
>> import itertools as it
>> iterator = it.imap(string_to_int, list_of_strings)

 
> if the goal is speed, then you should use generator expressions:
> 
> list_of_integers = (int(float(s)) for s in list_of_strings)


I'm not intending to pick on Billy or Sigmund here, but for the beginners
out there, there are a lot of myths about the relative speed of map, list
comprehensions, generator expressions, etc.

The usual optimization rules apply:

    We should forget about small efficiencies, say about 97% of 
    the time: premature optimization is the root of all evil.
    -- Donald Knuth

    More computing sins are committed in the name of efficiency 
    (without necessarily achieving it) than for any other single 
    reason - including blind stupidity. -- W.A. Wulf

and of course:

    If you haven't measured it, you're only guessing whether it is 
    faster or slower. 

(And unless you're named Raymond Hettinger, I give little or no credibility
to your guesses except for the most obvious cases. *wink*)

Generators (including itertools.imap) include some overhead which list
comprehensions don't have (at least in some versions of Python). So for
small sets of data, creating the generator may be more time consuming than
evaluating the generator all the way through.

For large sets of data, that overhead is insignificant, but in *total*
generators aren't any faster than creating the list up front. They can't
be. They end up doing the same amount of work: if you have to process one
million strings, then whether you use a list comp or a gen expression, you
still end up processing one million strings. The only advantage to the
generator expression (and it is a HUGE advantage, don't get me wrong!) is
that you can do the processing lazily, on demand, rather than all up front,
possibly bailing out early if necessary.

But if you end up pre-processing the entire data set, there is no advantage
to using a gen expression rather than a list comp, or map. So which is
faster depends on how you end up using the data.
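
To make the "same amount of work" argument concrete, here is a rough sketch that counts how many conversions actually happen, assuming Python 3. The counter, the convert helper and the sample data are all hypothetical.

```python
calls = 0  # counts how many strings were actually converted


def convert(text):
    global calls
    calls += 1
    return int(float(text))


strings = ['%d.0' % n for n in range(1000)]  # '0.0' .. '999.0'

# List comprehension: every string is converted up front.
eager = [convert(s) for s in strings]
calls_eager = calls

# Generator expression with an early exit: only the items
# consumed before the break are converted.
calls = 0
for number in (convert(s) for s in strings):
    if number == 9:
        break
calls_early_exit = calls

# Generator expression consumed to the end: same work as the list.
calls = 0
total = sum(convert(s) for s in strings)
calls_full = calls
```

The generator only wins on call count when the loop bails out early; run to completion, both forms make one call per string.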

One other important proviso: if your map function is a wrapper around a
Python expression:

map(lambda x: x+1, data)
[x+1 for x in data]

then the list comp will be much faster, due to the overhead of the function
call. List comps and gen exprs can inline the expression x+1, performing it
in fast C rather than slow Python.

But if you're calling a function in both cases:

map(int, data)
[int(x) for x in data]

then the overhead of the function call is identical for both the map and the
list comp, and they should be equally fast. Or slow, as the case may be.

But don't take my word on this! Measure, measure, measure! Performance is
subject to change without notice. I could be mistaken.
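
One way to do that measuring is the standard timeit module. This is only a sketch, assuming Python 3; the function names and the data size are arbitrary choices, and the numbers it prints will differ between machines and Python versions.

```python
import timeit

data = ['%d.0' % n for n in range(1000)]  # hypothetical sample data


def with_map():
    # list() forces evaluation, since map is lazy on Python 3
    return list(map(lambda s: int(float(s)), data))


def with_list_comp():
    return [int(float(s)) for s in data]


def with_genexp():
    return list(int(float(s)) for s in data)


for func in (with_map, with_list_comp, with_genexp):
    # Take the best of three runs of 100 calls each to reduce noise.
    seconds = min(timeit.repeat(func, number=100, repeat=3))
    print('%-15s %.4f s' % (func.__name__, seconds))
```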

(And don't forget that everything changes in Python 3. Whatever you think
you know about speed in Python 2, it will be different in Python 3.
Generator expressions become more efficient; itertools.imap disappears; the
built-in map becomes a lazy generator rather than returning a list.)
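
A quick sketch of two of those Python 3 differences, assuming it is run under Python 3; the variable names are made up.

```python
import itertools

data = ['165.0', '12.0']  # hypothetical sample data

result = map(float, data)

# On Python 3, map returns a lazy iterator, not a list.
is_list = isinstance(result, list)

# itertools.imap was removed in Python 3.
has_imap = hasattr(itertools, 'imap')

# Wrap in list() when an actual list is needed.
as_list = list(result)
```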


-- 
Steven


#10286

From: SigmundV <sigmundv@gmail.com>
Date: 2011-07-25 09:39 -0700
Message-ID: <8ba02eae-453a-464e-bcf7-5a2cc172744d@x10g2000vbl.googlegroups.com>
In reply to: #10253

On Jul 25, 10:48 am, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:
>
> One other important proviso: if your map function is a wrapper around a
> Python expression:
>
> map(lambda x: x+1, data)
> [x+1 for x in data]
>
> then the list comp will be much faster, due to the overhead of the function
> call. List comps and gen exprs can inline the expression x+1, performing it
> in fast C rather than slow Python.
>
> But if you're calling a function in both cases:
>
> map(int, data)
> [int(x) for x in data]
>
> then the overhead of the function call is identical for both the map and the
> list comp, and they should be equally as fast. Or slow, as the case may be.

I would like to thank Steven for his enlightening (at least for me)
post.

In the OP's case I'd keep everything as lists initially. If speed then
becomes an issue, other constructs can be considered. The use of map in the
example only reflects my inherently mathematical way of thinking.

Generally, I'd say

1) write code that works, i.e. does what it's intended to do in all
cases, and
2) if speed is an issue, try to sort out the main culprits.

Coding style is a different issue altogether, but in general I'd say
that one should use self-explanatory variable names.


Sigmund


#10291

From: Billy Mays <81282ed9a88799d21e77957df2d84bd6514d9af6@myhashismyemail.com>
Date: 2011-07-25 13:11 -0400
Message-ID: <j0k83u$qf6$1@speranza.aioe.org>
In reply to: #10253

On 07/25/2011 05:48 AM, Steven D'Aprano wrote:
> But if you're calling a function in both cases:
>
> map(int, data)
> [int(x) for x in data]
>

I am aware that premature optimization is a danger, but it's also
incorrect to ignore potential performance pitfalls.

I would favor a generator expression here, if only because I think it's
easier to read.  In addition, it properly handles large amounts of data
by not duplicating the list.  For very long input sequences, a genexp
would be the proper thing to do (assuming you don't need to index into
the results, in which case it's wrong).
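
A rough illustration of the "not duplicating the list" point, assuming Python 3. Note that sys.getsizeof reports only the size of the container object itself, not the integers it refers to, so treat the numbers as indicative only. The names and the data size are arbitrary.

```python
import sys

strings = ['%d.0' % n for n in range(100000)]  # hypothetical sample data

as_list = [int(float(s)) for s in strings]        # second full-size list
as_generator = (int(float(s)) for s in strings)   # small, fixed-size object

list_bytes = sys.getsizeof(as_list)
generator_bytes = sys.getsizeof(as_generator)

# The generator can still be consumed once, e.g. by sum().
total = sum(as_generator)
```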

I think the fastest way to solve the OP's problem is the following: ;)

def convert_165_0_to_int(arg):
    return 165


--
Bill
