Groups > comp.lang.python > #27048 > unrolled thread

Strange behavior

Started by: light1quark@gmail.com
First post: 2012-08-14 08:38 -0700
Last post: 2012-08-15 11:50 +0200
Articles: 12 (7 participants)

Contents

  Strange behavior light1quark@gmail.com - 2012-08-14 08:38 -0700
    Re: Strange behavior Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-08-14 17:59 +0200
      Re: Strange behavior Terry Reedy <tjreedy@udel.edu> - 2012-08-14 15:05 -0400
    Re: Strange behavior Virgil Stokes <vs@it.uu.se> - 2012-08-14 21:40 +0200
      Re: Strange behavior Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-08-15 00:19 +0000
        Re: Strange behavior Virgil Stokes <vs@it.uu.se> - 2012-08-16 13:18 +0200
          Re: Strange behavior Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-08-16 17:40 +0000
        Re: Strange behavior Peter Otten <__peter__@web.de> - 2012-08-16 15:02 +0200
    Re: Strange behavior light1quark@gmail.com - 2012-08-14 12:20 -0700
      Re: Strange behavior Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-08-15 11:57 +0200
    Re: Strange behavior Chris Angelico <rosuav@gmail.com> - 2012-08-15 07:55 +1000
      Re: Strange behavior Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-08-15 11:50 +0200

#27048 — Strange behavior

From: light1quark@gmail.com
Date: 2012-08-14 08:38 -0700
Subject: Strange behavior
Message-ID: <1a1834ae-2b4a-473f-b626-f37a17588199@googlegroups.com>

Hi, I am migrating from PHP to Python and I am slightly confused.

I am making a function that takes a startingList, finds all the strings in the list that begin with 'x', removes those strings and puts them into a xOnlyList.

However if you run the code you will notice only one of the strings beginning with 'x' is removed from the startingList.
If I comment out 'startingList.remove(str);' the code runs with both strings beginning with 'x' being put in the xOnlyList.
Using the print statement I noticed that the second string that begins with 'x' isn't even identified by the function. Why does this happen?

def testFunc(startingList):
	xOnlyList = [];
	for str in startingList:
		if (str[0] == 'x'):
			print str;
			xOnlyList.append(str)
			startingList.remove(str) #this seems to be the problem
	print xOnlyList;
	print startingList
testFunc(['xasd', 'xjkl', 'sefwr', 'dfsews'])

#Thanks for your help!

#27049

From: Alain Ketterlin <alain@dpt-info.u-strasbg.fr>
Date: 2012-08-14 17:59 +0200
Message-ID: <87lihhpiq9.fsf@dpt-info.u-strasbg.fr>
In reply to: #27048

light1quark@gmail.com writes:

> However if you run the code you will notice only one of the strings
> beginning with 'x' is removed from the startingList.

>
> def testFunc(startingList):
> 	xOnlyList = [];
> 	for str in startingList:
> 		if (str[0] == 'x'):
> 			print str;
> 			xOnlyList.append(str)
> 			startingList.remove(str) #this seems to be the problem
> 	print xOnlyList;
> 	print startingList
> testFunc(['xasd', 'xjkl', 'sefwr', 'dfsews'])
>
> #Thanks for your help!

Try with ['xasd', 'sefwr', 'xjkl', 'dfsews'] and you'll understand what
happens. Also, have a look at:

http://docs.python.org/reference/compound_stmts.html#the-for-statement

You can't modify the list you're iterating on, better use another list
to collect the result.
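
A minimal sketch of what goes wrong, written in Python 3 syntax (the helper name `extract_x_in_loop` and the `visited` list are made up for illustration). The for loop walks the list by an internal index; removing the current item shifts every later item one slot to the left, so the item that slides into the current slot is never looked at.

```python
def extract_x_in_loop(words):
    """Reproduce the original loop: remove items from the list being iterated."""
    x_only = []
    visited = []                    # records which items the loop actually saw
    for word in words:
        visited.append(word)
        if word.startswith('x'):
            x_only.append(word)
            words.remove(word)      # shifts all later items left by one slot
    return x_only, visited

# The two 'x' strings are adjacent: the second one is skipped.
adjacent = ['xasd', 'xjkl', 'sefwr', 'dfsews']
x_only, visited = extract_x_in_loop(adjacent)
print(visited)            # ['xasd', 'sefwr', 'dfsews']
print(x_only, adjacent)   # ['xasd'] ['xjkl', 'sefwr', 'dfsews']

# The 'x' strings are separated: items are still skipped, but not 'x' ones.
separated = ['xasd', 'sefwr', 'xjkl', 'dfsews']
x_only2, visited2 = extract_x_in_loop(separated)
print(visited2)             # ['xasd', 'xjkl']
print(x_only2, separated)   # ['xasd', 'xjkl'] ['sefwr', 'dfsews']
```

With the second list the result happens to look right, but only because the skipped items did not start with 'x'.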

-- Alain.

P/S: str is a builtin, you'd better avoid assigning to it.

#27055

From: Terry Reedy <tjreedy@udel.edu>
Date: 2012-08-14 15:05 -0400
Message-ID: <mailman.3284.1344971165.4697.python-list@python.org>
In reply to: #27049

On 8/14/2012 11:59 AM, Alain Ketterlin wrote:
> light1quark@gmail.com writes:
>
>> However if you run the code you will notice only one of the strings
>> beginning with 'x' is removed from the startingList.
>
>>
>> def testFunc(startingList):
>> 	xOnlyList = [];
>> 	for str in startingList:
>> 		if (str[0] == 'x'):
>> 			print str;
>> 			xOnlyList.append(str)
>> 			startingList.remove(str) #this seems to be the problem
>> 	print xOnlyList;
>> 	print startingList
>> testFunc(['xasd', 'xjkl', 'sefwr', 'dfsews'])
>>
>> #Thanks for your help!
>
> Try with ['xasd', 'sefwr', 'xjkl', 'dfsews'] and you'll understand what
> happens. Also, have a look at:
>
> http://docs.python.org/reference/compound_stmts.html#the-for-statement
>
> You can't modify the list you're iterating on,

Except he obviously did ;-).
(Modifying set or dict raises SomeError.)
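
The "SomeError" placeholder can be checked directly. In CPython, at least, changing the size of a dict or set while iterating over it raises RuntimeError. The sketch below uses Python 3 syntax; the helper name `try_shrinking` is made up for illustration.

```python
def try_shrinking(container):
    """Delete items from a dict or set while iterating over it.

    Returns the name of the exception raised, or None if none was raised.
    """
    try:
        for key in container:
            if isinstance(container, dict):
                del container[key]
            else:
                container.remove(key)
    except RuntimeError as exc:
        return type(exc).__name__
    return None

dict_result = try_shrinking({'a': 1, 'b': 2})
set_result = try_shrinking({'a', 'b'})
print(dict_result, set_result)
```

Lists have no such check, which is why the original loop ran to completion and silently skipped an item.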

Indeed, people routinely *replace* items while iterating.

def squarelist(lis):
     for i, n in enumerate(lis):
         lis[i] = n*n
     return lis

print(squarelist([0,1,2,3,4,5]))
# [0, 1, 4, 9, 16, 25]

Removals can be handled by iterating in reverse. This works even with 
duplicates because if the item removed is not the one tested, the one 
tested gets retested.

def removeodd(lis):
     for n in reversed(lis):
         if n % 2:
             lis.remove(n)
         print(n, lis)

ll = [0,1, 5, 5, 4, 5]
removeodd(ll)
 >>>
5 [0, 1, 5, 4, 5]
5 [0, 1, 4, 5]
5 [0, 1, 4]
4 [0, 1, 4]
1 [0, 4]
0 [0, 4]

> better use another list to collect the result.

If there are very many removals, a new list will be faster, even if one 
needs to copy the new list back into the original: k removals from a 
list of length n cost O(k*n), versus O(n) for a new list plus the copy.
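
A minimal sketch of the two approaches side by side, in Python 3 syntax. The function names are made up for illustration; the first repeats the reversed-iteration idea from above, and the second builds the survivors once and copies them back with slice assignment.

```python
def remove_odd_in_place(numbers):
    """Each remove() searches and shifts the list, so k removals cost O(k*n)."""
    for n in reversed(numbers):
        if n % 2:
            numbers.remove(n)

def remove_odd_via_copy(numbers):
    """One pass to build the new list, one copy back into the same object: O(n)."""
    numbers[:] = [n for n in numbers if n % 2 == 0]

a = [0, 1, 5, 5, 4, 5]
b = list(a)
remove_odd_in_place(a)
remove_odd_via_copy(b)
print(a, b)   # [0, 4] [0, 4]
```

Both leave the caller's list object modified in place; only the cost differs.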

> P/S: str is a builtin, you'd better avoid assigning to it.

Agreed. People have actually posted code doing something like

...
list = [1,2,3]
...
z = list(x)
...
and wondered and asked why it does not work.
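
A small sketch of that failure in Python 3 syntax (the function name `shadow_demo` is made up, and the shadowing is kept inside a function so it does not leak into the rest of a session):

```python
def shadow_demo():
    """Rebind the name `list`, then try to use it as the builtin type."""
    list = [1, 2, 3]            # shadows the builtin inside this function
    try:
        return list('abc')      # the name now refers to a list object
    except TypeError as exc:
        return str(exc)

message = shadow_demo()
print(message)   # 'list' object is not callable
```

The same applies to `str` in the original post: inside that loop, `str` names the current string rather than the built-in type.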

-- 
Terry Jan Reedy

#27056

From: Virgil Stokes <vs@it.uu.se>
Date: 2012-08-14 21:40 +0200
Message-ID: <mailman.3285.1344973216.4697.python-list@python.org>
In reply to: #27048

On 2012-08-14 17:38, light1quark@gmail.com wrote:
> Hi, I am migrating from PHP to Python and I am slightly confused.
>
> I am making a function that takes a startingList, finds all the strings in the list that begin with 'x', removes those strings and puts them into a xOnlyList.
>
> However if you run the code you will notice only one of the strings beginning with 'x' is removed from the startingList.
> If I comment out 'startingList.remove(str);' the code runs with both strings beginning with 'x' being put in the xOnlyList.
> Using the print statement I noticed that the second string that begins with 'x' isn't even identified by the function. Why does this happen?
>
> def testFunc(startingList):
> 	xOnlyList = [];
> 	for str in startingList:
> 		if (str[0] == 'x'):
> 			print str;
> 			xOnlyList.append(str)
> 			startingList.remove(str) #this seems to be the problem
> 	print xOnlyList;
> 	print startingList
> testFunc(['xasd', 'xjkl', 'sefwr', 'dfsews'])
>
> #Thanks for your help!

You might find the following useful:

def testFunc(startingList):
     xOnlyList = []; j = -1
     for xl in startingList:
         if (xl[0] == 'x'):
             xOnlyList.append(xl)
         else:
             j += 1
             startingList[j] = xl
     if j == -1:
         startingList = []
     else:
         del startingList[j:-1]

     return(xOnlyList)


testList1 = ['xasd', 'xjkl', 'sefwr', 'dfsews']
testList2 = ['xasd', 'xjkl', 'xsefwr', 'xdfsews']
testList3 = ['xasd', 'jkl', 'sefwr', 'dfsews']
testList4 = ['asd', 'jkl', 'sefwr', 'dfsews']

xOnlyList = testFunc(testList1)
print 'xOnlyList = ',xOnlyList
print 'testList = ',testList1
xOnlyList = testFunc(testList2)
print 'xOnlyList = ',xOnlyList
print 'testList = ',testList2
xOnlyList = testFunc(testList3)
print 'xOnlyList = ',xOnlyList
print 'testList = ',testList3
xOnlyList = testFunc(testList4)
print 'xOnlyList = ',xOnlyList
print 'testList = ',testList4
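
Two details of the function above appear not to do what is intended. `startingList = []` only rebinds the local name, so the caller's list is left untouched when every string starts with 'x' (testList2). And `del startingList[j:-1]` always keeps the last element, so a list such as ['a', 'xb'] ends up as ['xb']. A possible repair of the same compaction idea, in Python 3 syntax (the name `extract_x_compact` is made up for illustration):

```python
def extract_x_compact(words):
    """Collect words starting with 'x' and remove them from `words` in place."""
    x_only = []
    keep = 0                       # how many non-'x' words have been kept so far
    for word in words:
        if word.startswith('x'):
            x_only.append(word)
        else:
            words[keep] = word     # overwrite a slot already visited; length unchanged
            keep += 1
    del words[keep:]               # drop the leftover tail; empties the list if keep == 0
    return x_only

all_x = ['xasd', 'xjkl', 'xsefwr', 'xdfsews']
ends_with_x = ['a', 'xb']
found_all = extract_x_compact(all_x)
found_last = extract_x_compact(ends_with_x)
print(found_all, all_x)          # ['xasd', 'xjkl', 'xsefwr', 'xdfsews'] []
print(found_last, ends_with_x)   # ['xb'] ['a']
```

A single `del words[keep:]` covers the all-'x' case too, so no special branch is needed.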

And here is another version using list comprehension that I prefer

testList1 = ['xasd', 'xjkl', 'sefwr', 'dfsews']
testList2 = ['xasd', 'xjkl', 'xsefwr', 'xdfsews']
testList3 = ['xasd', 'jkl', 'sefwr', 'dfsews']
testList4 = ['asd', 'jkl', 'sefwr', 'dfsews']

def testFunc2(startingList):
     return([x for x in startingList if x[0] == 'x'],
            [x for x in startingList if x[0] != 'x'])

xOnlyList,testList = testFunc2(testList1)
print xOnlyList
print testList
xOnlyList,testList = testFunc2(testList2)
print xOnlyList
print testList
xOnlyList,testList = testFunc2(testList3)
print xOnlyList
print testList
xOnlyList,testList = testFunc2(testList4)
print xOnlyList
print testList

#27066

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-08-15 00:19 +0000
Message-ID: <502aeb2b$0$29978$c3e8da3$5496439d@news.astraweb.com>
In reply to: #27056

On Tue, 14 Aug 2012 21:40:10 +0200, Virgil Stokes wrote:

> You might find the following useful:
> 
> def testFunc(startingList):
>      xOnlyList = []; j = -1
>      for xl in startingList:
>          if (xl[0] == 'x'):

That's going to fail if the starting list contains an empty string. Use 
xl.startswith('x') instead.


>              xOnlyList.append(xl)
>          else:
>              j += 1
>              startingList[j] = xl

Very cunning, but I have to say that your algorithm fails the "is this 
obviously correct without needing to study it?" test. Sometimes that is 
unavoidable, but for something like this, there are simpler ways to solve 
the same problem.


>      if j == -1:
>          startingList = []
>      else:
>          del startingList[j:-1]
>      return(xOnlyList)


> And here is another version using list comprehension that I prefer

> def testFunc2(startingList):
>      return([x for x in startingList if x[0] == 'x'], [x for x in
> startingList if x[0] != 'x'])

This walks over the starting list twice, doing essentially the same thing 
both times. It also fails to meet the stated requirement that 
startingList is modified in place, by returning a new list instead. 
Here's an example of what I mean:

py> mylist = mylist2 = ['a', 'x', 'b', 'xx', 'cx']  # two names for one list
py> result, mylist = testFunc2(mylist)
py> mylist
['a', 'b', 'cx']
py> mylist2  # should be same as mylist
['a', 'x', 'b', 'xx', 'cx']

Here is the obvious algorithm for extracting and removing words starting 
with 'x'. It walks the starting list only once, and modifies it in place. 
The only trick needed is list slice assignment at the end.

def extract_x_words(words):
    words_with_x = []
    words_without_x = []
    for word in words:
        if word.startswith('x'):
            words_with_x.append(word)
        else:
            words_without_x.append(word)
    words[:] = words_without_x  # slice assignment
    return words_with_x
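
A short usage sketch, in Python 3 syntax, repeating the two-names test from above against this function. Because the slice assignment changes the existing list object rather than rebinding a name, both names see the result.

```python
def extract_x_words(words):
    words_with_x = []
    words_without_x = []
    for word in words:
        if word.startswith('x'):
            words_with_x.append(word)
        else:
            words_without_x.append(word)
    words[:] = words_without_x  # replaces the contents, keeps the same object
    return words_with_x

mylist = mylist2 = ['a', 'x', 'b', 'xx', 'cx']   # two names for one list
result = extract_x_words(mylist)
print(result)              # ['x', 'xx']
print(mylist is mylist2)   # True
print(mylist2)             # ['a', 'b', 'cx']
```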


The only downside of this is that if the list of words is so enormous 
that you can fit it in memory *once* but not *twice*, this may fail. But 
the same applies to the list comprehension solution.



-- 
Steven

#27155

From: Virgil Stokes <vs@it.uu.se>
Date: 2012-08-16 13:18 +0200
Message-ID: <mailman.3356.1345117482.4697.python-list@python.org>
In reply to: #27066

[Multipart message; attachments visible in raw view]

On 15-Aug-2012 02:19, Steven D'Aprano wrote:
> On Tue, 14 Aug 2012 21:40:10 +0200, Virgil Stokes wrote:
>
>> You might find the following useful:
>>
>> def testFunc(startingList):
>>       xOnlyList = []; j = -1
>>       for xl in startingList:
>>           if (xl[0] == 'x'):
> That's going to fail if the starting list contains an empty string. Use
> xl.startswith('x') instead.
Yes, but this was by design (tacitly assumed that startingList was both a list 
and non-empty).
>
>
>>               xOnlyList.append(xl)
>>           else:
>>               j += 1
>>               startingList[j] = xl
> Very cunning, but I have to say that your algorithm fails the "is this
> obviously correct without needing to study it?" test. Sometimes that is
> unavoidable, but for something like this, there are simpler ways to solve
> the same problem.
Sorry, but I am not sure what you mean here.
>
>
>>       if j == -1:
>>           startingList = []
>>       else:
>>           del startingList[j:-1]
>>       return(xOnlyList)
>
>> And here is another version using list comprehension that I prefer
>> def testFunc2(startingList):
>>       return([x for x in startingList if x[0] == 'x'], [x for x in
>> startingList if x[0] != 'x'])
> This walks over the starting list twice, doing essentially the same thing
> both times. It also fails to meet the stated requirement that
> startingList is modified in place, by returning a new list instead.
This can meet the requirement that startingList is modified in place via the 
call to this function (see the attached code).
> Here's an example of what I mean:
>
> py> mylist = mylist2 = ['a', 'x', 'b', 'xx', 'cx']  # two names for one list
> py> result, mylist = testFunc2(mylist)
> py> mylist
> ['a', 'b', 'cx']
> py> mylist2  # should be same as mylist
> ['a', 'x', 'b', 'xx', 'cx']
Yes, I had a typo in my original posting --- sorry about that!
>
> Here is the obvious algorithm for extracting and removing words starting
> with 'x'. It walks the starting list only once, and modifies it in place.
> The only trick needed is list slice assignment at the end.
>
> def extract_x_words(words):
>      words_with_x = []
>      words_without_x = []
>      for word in words:
>          if word.startswith('x'):
>              words_with_x.append(word)
>          else:
>              words_without_x.append(word)
>      words[:] = words_without_x  # slice assignment
>      return words_with_x
Suppose words was not a list --- you have tacitly assumed that words is a list.
>
> The only downside of this is that if the list of words is so enormous
> that you can fit it in memory *once* but not *twice*, this may fail. But
> the same applies to the list comprehension solution.
But this is not the only downside if speed is important --- it is slower than 
the list comprehension method (see the results that follow).

Here is a summary of three algorithms (algorithm-1, algorithm-2, algorithm-2A) 
that I tested (see attached code). Note, algorithm-2A was obtained by removing 
the slice assignment in the above code and modifying the return as follows

def extract_x_words(words):
     words_with_x = []
     words_without_x = []
     for word in words:
         if word.startswith('x'):
             words_with_x.append(word)
         else:
             words_without_x.append(word)
     #words[:] = words_without_x  # slice assignment
     return words_with_x, words_without_x

Of course, one needs to modify the call for "in-place" update of startingList as 
follows:

    xOnlyList,startingList = extract_x_words(startingList)

Here is a summary of my timing results obtained for 3 different algorithms for 
lists with 100,000 strings of length 4 in each list:

Method                                 average (sd) time in seconds
algorithm-1 (list comprehension)       0.11630 (0.0014)
algorithm-2 (S. D'Aprano)              0.17594 (0.0014)
algorithm-2A (modified S. D'Aprano)    0.18217 (0.0023)


These values were obtained from 100 independent runs (MC simulations) on lists 
that contain 100,000 strings. Approximately 50% of these strings had a 
leading 'x'. Note that the results show that algorithm-2 (suggested by S. 
D'Aprano) is approximately 51% slower than algorithm-1 (list comprehensions), and 
algorithm-2A (a simple modification of algorithm-2) is approximately 57% slower 
than algorithm-1. Why is algorithm-2A slower than algorithm-2?

I would be interested in seeing code that is faster than algorithm-1 --- any 
suggestions are welcomed.  And of course, if there are any errors in my attached 
code please inform me of them and I will try to correct them as soon as 
possible. Note, some of the code is actually irrelevant for the original 
"Strange behavior" post.

Have a good day!

#27187

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-08-16 17:40 +0000
Message-ID: <502d309a$0$29978$c3e8da3$5496439d@news.astraweb.com>
In reply to: #27155

On Thu, 16 Aug 2012 13:18:59 +0200, Virgil Stokes wrote:

> On 15-Aug-2012 02:19, Steven D'Aprano wrote:
>> On Tue, 14 Aug 2012 21:40:10 +0200, Virgil Stokes wrote:
>>
>>> You might find the following useful:
>>>
>>> def testFunc(startingList):
>>>       xOnlyList = []; j = -1
>>>       for xl in startingList:
>>>           if (xl[0] == 'x'):
>> That's going to fail if the starting list contains an empty string. Use
>> xl.startswith('x') instead.
>
> Yes, but this was by design (tacitly assumed that startingList was both
> a list and non-empty).

As Peter already pointed out, I said it would fail if the list contains 
an empty string, not if the list was empty.


>>>               xOnlyList.append(xl)
>>>           else:
>>>               j += 1
>>>               startingList[j] = xl
>>
>> Very cunning, but I have to say that your algorithm fails the "is this
>> obviously correct without needing to study it?" test. Sometimes that is
>> unavoidable, but for something like this, there are simpler ways to
>> solve the same problem.
>
> Sorry, but I am not sure what you mean here.

In a perfect world, you should be able to look at a piece of code, read 
it once, and see whether or not it is correct. That is what I mean by 
"obviously correct". For example, if I have a function that takes an 
argument, doubles it, and prints the result:

def f1(x):
    print(2*x)


that is obviously correct. Whereas this is not:

import sys

def f2(x):
    y = (x + 5)**2 - (x + 4)**2
    sys.stdout.write(str(y - 9) + '\n')


because you have to study it to see whether or not it works correctly.

Not all programs are simple enough to be obviously correct. Sometimes you 
have no choice but to write something which requires cleverness to get 
the right result. But this is not one of those cases. You should almost 
always prefer simple code over clever code, because the greatest expense 
in programming (time, effort and money) is to make code correct.

Most code does not need to be fast. But all code needs to be correct.


[...]
> This can meet the requirement that startingList is modified in place via
> the call to this function (see the attached code).

Good grief! See, that's exactly the sort of thing I'm talking about. 
Without *detailed* study of your attached code, how can I possibly know 
what it does or whether it does it correctly?

Your timing code calculates the mean using a recursive algorithm. Why 
don't you calculate the mean the standard way: add the numbers and divide 
by the total? What benefit do you gain from a more complicated algorithm 
when a simple one will do the job just as well?
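
For comparison, a sketch of the two ways of computing a mean, in Python 3 syntax. The attached timing code is not shown in this thread, so `running_mean` is only a guess at what a "recursive" mean might look like, and the timing values are made-up numbers for illustration.

```python
import math

def plain_mean(values):
    """The standard way: add the numbers and divide by how many there are."""
    return sum(values) / len(values)

def running_mean(values):
    """Incremental update: mean_k = mean_(k-1) + (x_k - mean_(k-1)) / k.

    Hypothetical stand-in for the recursive algorithm mentioned above.
    """
    mean = 0.0
    for k, x in enumerate(values, start=1):
        mean += (x - mean) / k
    return mean

timings = [0.116, 0.118, 0.115, 0.117]   # made-up sample values
print(plain_mean(timings), running_mean(timings))
```

The two agree up to floating-point rounding; the first is the one a reader can verify at a glance.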

You have spent a lot of effort creating a complicated, non-obvious piece 
of timing code, with different random seeds for each run, and complicated 
ways of calculating timing statistics... but unfortunately the most 
important part of any timing test, the actual *timing*, is not done 
correctly. Consequently, your code is not correct.

With an average time of a fraction of a second, none of those timing 
results are trustworthy, because they are vulnerable to interference from 
other processes, the operating system, and other random noise. You spend 
a lot of time processing the timing results, but it is Garbage In, 
Garbage Out -- the results are not trustworthy, and if they are correct, 
it is only by accident.

Later in your post, you run some tests, and are surprised by the result:

> Why is algorithm-2A slower than algorithm-2?

It isn't slower. It is physically impossible, since 2A does *less* work 
than 2. This demonstrates that you are actually taking a noisy 
measurement: the values you get have random noise, and you don't make any 
effort to minimise that noise. Hence GIGO.

The right way to test small code snippets is with the timeit module. It 
is carefully written to overcome as much random noise as possible. But 
even there, the authors of the timeit module are very clear that you 
should not try to calculate means, let alone higher order statistics like 
standard deviation. The only statistic which is trustworthy is to run as 
many trials as you can afford, and select the minimum value.

So here is my timing code, which is much shorter and simpler and doesn't 
try to do too much. You do need to understand the timeit.Timer class:

timeit.Timer creates a timer object; timer.repeat does the actual timing. 
The specific arguments to them are not vital to understand, but you can 
read the documentation if you wish to find out what they mean.

First, I define the two functions. I compare similar functions that have 
the same effect. Neither modifies the input argument in place. Copy and 
paste the following block into an interactive interpreter:

# Start block

def f1(startingList):
    return ([x for x in startingList if x[0] == 'x'],
            [x for x in startingList if x[0] != 'x'])

# Note that the above function is INCORRECT, it will fail if a string is
# empty; nevertheless I will use it for timing purposes anyway.


def f2(startingList):
    words_without_x = []
    words_with_x = []
    for word in startingList:
        if word.startswith('x'):
            words_with_x.append(word)
        else:
            words_without_x.append(word)
    return (words_with_x, words_without_x)

# Set up some test data.  There's no point being too clever about this.
# Keep it simple.

import random
data = ['aa', 'bb', 'cb', 'xa', 'xb', 'xc']*1000000
random.shuffle(data)

# Set up two timers.
from timeit import Timer
setup = "from __main__ import data, f1, f2"
t1 = Timer("a, b = f1(data)", setup)
t2 = Timer("a, b = f2(data)", setup)

# and run the timers
best1 = min(t1.repeat(number=1, repeat=10))
best2 = min(t2.repeat(number=1, repeat=10))

# End block


On my computer, here are the results. Yours may differ.

best1: 3.5199968814849854
best2: 3.515479803085327

No significant difference. And that is to be expected: the bulk of the 
time is spent building up two lists of three million items each.

So let's run it again with less data:

data = data[:10000]
best1 = min(t1.repeat(number=200, repeat=10))/200
best2 = min(t2.repeat(number=200, repeat=10))/200

which gives results:

best1: 0.0037816047668457033
best2: 0.005841898918151856

The double list comp solution is faster, but it's also incorrect -- it 
fails if there is an empty string in the list. What happens if we replace 
it with a version that doesn't have the empty string bug?

def f1(startingList):
    return ([x for x in startingList if x.startswith('x')],
            [x for x in startingList if not x.startswith('x')])

best1 = min(t1.repeat(number=200, repeat=10))/200
best2 = min(t2.repeat(number=200, repeat=10))/200


which gives these results:

best1: 0.008604295253753662
best2: 0.005863149166107178


So there's the first lesson: it's easy to be fast if you don't mind 
writing buggy code.

Can we do better? Try this:


def f3(startingList):
    words_with_x = []
    words_without_x = []
    append_with = words_with_x.append
    append_without = words_without_x.append
    for word in iter(startingList):
        if word[:1] == 'x':
            append_with(word)
        else:
            append_without(word)
    return (words_with_x, words_without_x)

t3 = Timer('a, b = f3(data)', 'from __main__ import f3, data')
best3 = min(t3.repeat(number=200, repeat=10))/200

And the result:

best3: 0.0033271098136901855


which is even faster than your original version.

Or is it? No, I can't conclude that. The difference between the original 
f1 function (0.00378s) and my f3 function (0.00332s) is too small to be 
sure it is real from just ten trials of each. A better statistician than 
me could probably estimate the number of trials needed to be confident 
that one is better than the other.

But then, with a difference that small, who cares? In the real world, a 
difference that small is lost in the noise. Because of the noise, 
probably 50% of the time the slower code will finish first.


[...]
> Suppose words was not a list --- you have tacitly assumed that words is
> a list.

Actually, no I have not. I have assumed it is an iterable object, such as 
a list, a tuple, or an iterator. So what? You have done the same thing. 
Doing an isinstance type check at the beginning of both functions will 
just slow them both down by the same amount.
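
A small sketch of the iterable point, in Python 3 syntax (the names `split_on_x` and `double_pass` are made up for illustration). A single-pass function accepts a tuple or a generator as readily as a list. One difference worth noting, as far as I can tell: a function that walks its argument twice, like the double list comprehension, finds a one-shot iterator already exhausted on the second pass.

```python
def split_on_x(words):
    """Single pass over any iterable of strings."""
    with_x, without_x = [], []
    for word in words:
        if word.startswith('x'):
            with_x.append(word)
        else:
            without_x.append(word)
    return with_x, without_x

def double_pass(words):
    """Two comprehensions over the same argument, as in the list-comp version."""
    return ([w for w in words if w.startswith('x')],
            [w for w in words if not w.startswith('x')])

from_tuple = split_on_x(('xa', 'b', 'xc'))
from_generator = split_on_x(w for w in ['xa', 'b', 'xc'])
from_iter_twice = double_pass(iter(['xa', 'b', 'xc']))

print(from_tuple)        # (['xa', 'xc'], ['b'])
print(from_generator)    # (['xa', 'xc'], ['b'])
print(from_iter_twice)   # (['xa', 'xc'], [])  second pass sees nothing
```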



-- 
Steven

#27159

From: Peter Otten <__peter__@web.de>
Date: 2012-08-16 15:02 +0200
Message-ID: <mailman.3359.1345122171.4697.python-list@python.org>
In reply to: #27066

Virgil Stokes wrote:

>>> def testFunc(startingList):
>>>       xOnlyList = []; j = -1
>>>       for xl in startingList:
>>>           if (xl[0] == 'x'):
>> That's going to fail if the starting list contains an empty string. Use
>> xl.startswith('x') instead.
> Yes, but this was by design (tacitly assumed that startingList was both a
> list and non-empty).

You misunderstood: it will fail if the list contains an empty string, not if 
the list itself is empty: 

>>> words = ["alpha", "", "xgamma"]
>>> [word for word in words if word[0] == "x"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range

The startswith() version:

>>> [word for word in words if word.startswith("x")]
['xgamma']

Also possible:

>>> [word for word in words if word[:1] == "x"]
['xgamma']

> def testFunc1(startingList): 
>      ''' 
>        Algorithm-1 
>        Note: 
>          One should check for an empty startingList before 
>          calling testFunc1 -- If this possibility exists! 
>      ''' 
>      return([x for x in startingList if x[0] == 'x'], 
>             [x for x in startingList if x[0] != 'x']) 
>  
> 
> I would be interested in seeing code that is faster than algorithm-1

In pure Python? Perhaps the messy variant:

def test_func(words):
    nox = []
    append = nox.append
    withx = [x for x in words if x[0] == 'x' or append(x)]
    return withx, nox

#27057

From: light1quark@gmail.com
Date: 2012-08-14 12:20 -0700
Message-ID: <03e72fb4-ca09-403f-b742-a884f8316809@googlegroups.com>
In reply to: #27048

I got my answer by reading your posts and referring to: http://docs.python.org/reference/compound_stmts.html#the-for-statement
(particularly the shaded grey box)

I guess I should have (obviously) looked at the docs before posting here, but I'm a noob.

Thanks for your help.

#27088

From: Alain Ketterlin <alain@dpt-info.u-strasbg.fr>
Date: 2012-08-15 11:57 +0200
Message-ID: <87d32spjdk.fsf@dpt-info.u-strasbg.fr>
In reply to: #27057

light1quark@gmail.com writes:

> I got my answer by reading your posts and referring to:
> http://docs.python.org/reference/compound_stmts.html#the-for-statement
> (particularly the shaded grey box)

Note that the problem is not specific to python (if you erase the current
element when traversing a STL list in C++ you'll get a crash as well).

> I guess I should have (obviously) looked at the docs before posting
> here, but I'm a noob.

Python has several surprising features. I think it is a good idea to
take some time to read the language reference, from cover to cover
(before or after the various tutorials, depending on your background).

-- Alain.

#27063

From: Chris Angelico <rosuav@gmail.com>
Date: 2012-08-15 07:55 +1000
Message-ID: <mailman.3289.1344981361.4697.python-list@python.org>
In reply to: #27048

On Wed, Aug 15, 2012 at 1:38 AM,  <light1quark@gmail.com> wrote:
> def testFunc(startingList):
>         xOnlyList = [];
>         for str in startingList:
>                 if (str[0] == 'x'):
>                         print str;
>                         xOnlyList.append(str)
>                         startingList.remove(str) #this seems to be the problem
>         print xOnlyList;
>         print startingList
> testFunc(['xasd', 'xjkl', 'sefwr', 'dfsews'])

Other people have explained the problem with your code. I'll take this
example as a way of introducing you to one of Python's handy features
- it's an idea borrowed from functional languages, and is extremely
handy. It's called the "list comprehension", and can be looked up in
the docs under that name:

def testFunc(startingList):
    xOnlyList = [strng for strng in startingList if strng[0] == 'x']
    startingList = [strng for strng in startingList if strng[0] != 'x']
    print(xOnlyList)
    print(startingList)

It's a compact notation for building a list from another list. (Note
that I changed "str" to "strng" to avoid shadowing the built-in name
"str", as others suggested.)

(Unrelated side point: Putting parentheses around the print statements
makes them compatible with Python 3, in which 'print' is a function.
Unless something's binding you to Python 2, consider working with the
current version - Python 2 won't get any new features.)

Python's an awesome language. You may have to get your head around a
few new concepts as you shift thinking from PHP's, but it's well worth
while.

Chris Angelico

#27087

From: Alain Ketterlin <alain@dpt-info.u-strasbg.fr>
Date: 2012-08-15 11:50 +0200
Message-ID: <87has4pjpx.fsf@dpt-info.u-strasbg.fr>
In reply to: #27063

Chris Angelico <rosuav@gmail.com> writes:

> Other people have explained the problem with your code. I'll take this
> example as a way of introducing you to one of Python's handy features
> - it's an idea borrowed from functional languages, and is extremely
> handy. It's called the "list comprehension", and can be looked up in
> the docs under that name:
>
> def testFunc(startingList):
>     xOnlyList = [strng for strng in startingList if strng[0] == 'x']
>     startingList = [strng for strng in startingList if strng[0] != 'x']
>     print(xOnlyList)
>     print(startingList)
>
> It's a compact notation for building a list from another list. (Note
> that I changed "str" to "strng" to avoid shadowing the built-in name
> "str", as others suggested.)

Fully agree with you: list comprehension is, imo, the most useful
program construct ever. Extremely useful.

But not when it makes the program traverse the same list twice, where
one traversal is enough.

-- Alain.
