Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #27155

Re: Strange behavior

Date 2012-08-16 13:18 +0200
From Virgil Stokes <vs@it.uu.se>
Subject Re: Strange behavior
References <1a1834ae-2b4a-473f-b626-f37a17588199@googlegroups.com> <mailman.3285.1344973216.4697.python-list@python.org> <502aeb2b$0$29978$c3e8da3$5496439d@news.astraweb.com>
Newsgroups comp.lang.python
Message-ID <mailman.3356.1345117482.4697.python-list@python.org> (permalink)

Show all headers | View raw


[Multipart message — attachments visible in raw view] - view raw

On 15-Aug-2012 02:19, Steven D'Aprano wrote:
> On Tue, 14 Aug 2012 21:40:10 +0200, Virgil Stokes wrote:
>
>> You might find the following useful:
>>
>> def testFunc(startingList):
>>       xOnlyList = []; j = -1
>>       for xl in startingList:
>>           if (xl[0] == 'x'):
> That's going to fail in the starting list contains an empty string. Use
> xl.startswith('x') instead.
Yes, but this was by design (tacitly assumed that startingList was both a list 
and non-empty).
>
>
>>               xOnlyList.append(xl)
>>           else:
>>               j += 1
>>               startingList[j] = xl
> Very cunning, but I have to say that your algorithm fails the "is this
> obviously correct without needing to study it?" test. Sometimes that is
> unavoidable, but for something like this, there are simpler ways to solve
> the same problem.
Sorry, but I do not sure what you mean here.
>
>
>>       if j == -1:
>>           startingList = []
>>       else:
>>           del startingList[j:-1]
>>       return(xOnlyList)
>
>> And here is another version using list comprehension that I prefer
>> def testFunc2(startingList):
>>       return([x for x in startingList if x[0] == 'x'], [x for x in
>> startingList if x[0] != 'x'])
> This walks over the starting list twice, doing essentially the same thing
> both times. It also fails to meet the stated requirement that
> startingList is modified in place, by returning a new list instead.
This can meet the requirement that startingList is modified in place via the 
call to this function (see the attached code).
> Here's an example of what I mean:
>
> py> mylist = mylist2 = ['a', 'x', 'b', 'xx', 'cx']  # two names for one
> list
> py> result, mylist = testFunc2(mylist)
> py> mylist
> ['a', 'b', 'cx']
> py> mylist2  # should be same as mylist
> ['a', 'x', 'b', 'xx', 'cx']
Yes, I had a typo in my original posting --- sorry about that!
>
> Here is the obvious algorithm for extracting and removing words starting
> with 'x'. It walks the starting list only once, and modifies it in place.
> The only trick needed is list slice assignment at the end.
>
> def extract_x_words(words):
>      words_with_x = []
>      words_without_x = []
>      for word in words:
>          if word.startswith('x'):
>              words_with_x.append(word)
>          else:
>              words_without_x.append(word)
>      words[:] = words_without_x  # slice assignment
>      return words_with_x
Suppose words was not a list --- you have tacitly assumed that words is a list.
>
> The only downside of this is that if the list of words is so enormous
> that you can fit it in memory *once* but not *twice*, this may fail. But
> the same applies to the list comprehension solution.
But, this is not the only downside if speed is important --- it is slower than 
the list comprehension method (see results that follows).

Here is a summary of three algorithms (algorithm-1, algorithm-2, algorithm-2A) 
that I tested (see attached code). Note, algorithm-2A was obtained by removing 
the slice assignment in the above code and modifying the return as follows

def extract_x_words(words):
     words_with_x = []
     words_without_x = []
     for word in words:
         if word.startswith('x'):
             words_with_x.append(word)
         else:
             words_without_x.append(word)
     #words[:] = words_without_x  # slice assignment
     return words_with_x, words_without_x

Of course, one needs to modify the call for "in-place" update of startingList as 
follows:

    xOnlyList,startingList = extract_x_words(startingList)

Here is a summary of my timing results obtained for 3 different algorithms for 
lists with 100,000 strings of length 4 in each list:

Method
	average (sd) time in seconds
algorithm-1 (list comprehension)
	0.11630 (0.0014)
algorithm-2 (S. D'Aprano)
	0.17594 (0.0014)
algorithm-2A (modified S. D'Aprano)
	0.18217 (0.0023)


These values  were obtained from 100 independent runs (MC simulations) on lists 
that contain 100,000 strings. Approximately 50% of these strings contained a 
leading 'x'. Note, that the results show that algorithm-2 (suggested by S. 
D'Aprano) is approximately 51% slower than algorithm-1 (list comprehensions) and 
algorithm-2A (simple modification of algorithm-2) is approximately 57% slower 
than algorithm-1. Why is algorithm-2A slower than algorithm-2?

I would be interested in seeing code that is faster than algorithm-1 --- any 
suggestions are welcomed.  And of course, if there are any errors in my attached 
code please inform me of them and I will try to correct them as soon as 
possible. Note, some of the code is actually irrelevant for the original 
"Strange behavior" post.

Have a good day!

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Strange behavior light1quark@gmail.com - 2012-08-14 08:38 -0700
  Re: Strange behavior Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-08-14 17:59 +0200
    Re: Strange behavior Terry Reedy <tjreedy@udel.edu> - 2012-08-14 15:05 -0400
  Re: Strange behavior Virgil Stokes <vs@it.uu.se> - 2012-08-14 21:40 +0200
    Re: Strange behavior Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-08-15 00:19 +0000
      Re: Strange behavior Virgil Stokes <vs@it.uu.se> - 2012-08-16 13:18 +0200
        Re: Strange behavior Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-08-16 17:40 +0000
      Re: Strange behavior Peter Otten <__peter__@web.de> - 2012-08-16 15:02 +0200
  Re: Strange behavior light1quark@gmail.com - 2012-08-14 12:20 -0700
    Re: Strange behavior Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-08-15 11:57 +0200
  Re: Strange behavior Chris Angelico <rosuav@gmail.com> - 2012-08-15 07:55 +1000
    Re: Strange behavior Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-08-15 11:50 +0200

csiph-web