Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #27155
| Path | csiph.com!usenet.pasdenom.info!weretis.net!feeder4.news.weretis.net!feeder2.ecngs.de!ecngs!feeder.ecngs.de!border1.nntp.ams.giganews.com!nntp.giganews.com!usenetcore.com!newsfeed.xs4all.nl!newsfeed6.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <vs@it.uu.se> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| X-Spam-Status | OK 0.000 |
| X-Spam-Evidence | '*H*': 1.00; '*S*': 0.00; 'author:': 0.03; 'else:': 0.03; 'algorithm': 0.03; 'essentially': 0.04; 'string.': 0.04; 'modified': 0.05; 'modify': 0.05; 'result,': 0.05; 'executed': 0.07; 'filename:fname piece:py': 0.07; 'sequences.': 0.07; 'skip:% 20': 0.07; 'strings.': 0.07; 'suppose': 0.07; 'python': 0.09; '#print': 0.09; "'''": 0.09; '32-bit': 0.09; '[];': 0.09; 'assumed': 0.09; 'behavior,': 0.09; 'enormous': 0.09; 'modifies': 0.09; 'skip:# 60': 0.09; 'toc': 0.09; 'def': 0.10; 'aug': 0.13; 'vista': 0.13; 'applies': 0.15; 'modification': 0.15; "'b',": 0.16; '(must': 0.16; '-1:': 0.16; "['a',": 0.16; 'code).': 0.16; 'downside': 0.16; 'mug': 0.16; 'mylist': 0.16; 'run.': 0.16; 'skip:[ 50': 0.16; 'true:': 0.16; 'typo': 0.16; 'string': 0.17; 'wrote:': 0.17; 'char': 0.17; 'element': 0.17; 'test.': 0.17; 'obviously': 0.18; 'solution.': 0.18; 'memory': 0.18; 'windows': 0.19; 'math': 0.20; 'parameters': 0.20; 'import': 0.21; 'assignment': 0.22; 'clock': 0.22; 'delta': 0.22; 'displayed': 0.22; 'modifying': 0.22; 'simpler': 0.22; 'skip:% 10': 0.22; 'sorry,': 0.22; 'runs': 0.22; 'example': 0.23; 'elements': 0.23; 'errors': 0.23; 'task': 0.23; '(this': 0.24; 'random': 0.24; 'testing': 0.24; 'header:In-Reply-To:1': 0.25; 'header:User- Agent:1': 0.26; '---': 0.26; 'charset:iso-8859-15': 0.26; 'fit': 0.26; 'values': 0.26; '(see': 0.27; 'possibility': 0.27; 'instead.': 0.27; 'list:': 0.27; 'correct': 0.28; 'initial': 0.28; 'run': 0.28; '100000': 0.29; 'cpu': 0.29; "d'aprano": 0.29; 'once,': 0.29; 'seed': 0.29; "skip:' 50": 0.29; 'steven': 0.29; 'character': 0.29; 'summary': 0.29; 'words': 0.29; '(from': 0.30; 'checks': 0.30; 'function': 0.30; 'resolution': 0.30; 'seconds': 0.30; 'lists': 0.31; 'code': 0.31; 'skip:- 30': 0.31; 'problem.': 0.32; 'print': 0.32; '+0200,': 0.33; 'that!': 0.33; 'to:addr :python-list': 0.33; 'times.': 0.33; 'another': 0.33; 'version': 0.34; 'list': 0.35; 'clear': 0.35; 'needed': 0.35; 'fail': 0.35; 'false': 0.35; 'faster': 0.35; 'follows:': 0.35; 'returning': 0.35; 'doing': 0.35; 'posting': 0.35; 'sometimes': 0.35; 'something': 0.35; 'there': 0.35; 'list.': 0.35; 'but': 0.36; 'characters': 0.36; 'generation': 0.36; 'skip:g 30': 0.36; 'email addr:python.org': 0.36; 'method': 0.36; 'note:': 0.64; 'here': 0.65; 'charset:windows-1252': 0.65; 'results': 0.65; '10,000': 0.65; '10000': 0.65; 'study': 0.66; 'stated': 0.69; 'soon': 0.70; 'obtained': 0.71; 'obvious': 0.71; '100': 0.78; 'inform': 0.78; 'completion': 0.78; '50%': 0.81; 'day!': 0.83; 'avg': 0.84; "d'aprano)": 0.84; 'irrelevant': 0.84; '100,000': 0.91; 'average': 0.93 |
| X-SENDER-IP | [213.112.50.224] |
| X-LISTENER | [smtp.bredband.net] |
| X-IronPort-Anti-Spam-Filtered | true |
| X-IronPort-Anti-Spam-Result | AmYbAK3WLFDVcDLgPGdsb2JhbAANOItfrlwBAQEBN4JUAQEBAQMdWxELGAkWDwIHAwIBAgEPIgYBDRMGAgEBF4djA6Z0ihsNiU6KJWSGVwOOWoEghAKMX4dn |
| X-IronPort-AV | E=Sophos;i="4.77,778,1336341600"; d="py'?scan'208,217";a="98105791" |
| Date | Thu, 16 Aug 2012 13:18:59 +0200 |
| From | Virgil Stokes <vs@it.uu.se> |
| User-Agent | Mozilla/5.0 (Windows NT 6.0; rv:15.0) Gecko/20120808 Thunderbird/15.0 |
| MIME-Version | 1.0 |
| To | python-list@python.org |
| Subject | Re: Strange behavior |
| References | <1a1834ae-2b4a-473f-b626-f37a17588199@googlegroups.com> <mailman.3285.1344973216.4697.python-list@python.org> <502aeb2b$0$29978$c3e8da3$5496439d@news.astraweb.com> |
| In-Reply-To | <502aeb2b$0$29978$c3e8da3$5496439d@news.astraweb.com> |
| Content-Type | multipart/mixed; boundary="------------000402070808050100070305" |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.12 |
| Precedence | list |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.3356.1345117482.4697.python-list@python.org> (permalink) |
| Lines | 482 |
| NNTP-Posting-Host | 2001:888:2000:d::a6 |
| X-Trace | 1345117482 news.xs4all.nl 6982 [2001:888:2000:d::a6]:50644 |
| X-Complaints-To | abuse@xs4all.nl |
| Xref | csiph.com comp.lang.python:27155 |
Show key headers only | View raw
[Multipart message — attachments visible in raw view] - view raw
On 15-Aug-2012 02:19, Steven D'Aprano wrote:
> On Tue, 14 Aug 2012 21:40:10 +0200, Virgil Stokes wrote:
>
>> You might find the following useful:
>>
>> def testFunc(startingList):
>> xOnlyList = []; j = -1
>> for xl in startingList:
>> if (xl[0] == 'x'):
> That's going to fail in the starting list contains an empty string. Use
> xl.startswith('x') instead.
Yes, but this was by design (tacitly assumed that startingList was both a list
and non-empty).
>
>
>> xOnlyList.append(xl)
>> else:
>> j += 1
>> startingList[j] = xl
> Very cunning, but I have to say that your algorithm fails the "is this
> obviously correct without needing to study it?" test. Sometimes that is
> unavoidable, but for something like this, there are simpler ways to solve
> the same problem.
Sorry, but I do not sure what you mean here.
>
>
>> if j == -1:
>> startingList = []
>> else:
>> del startingList[j:-1]
>> return(xOnlyList)
>
>> And here is another version using list comprehension that I prefer
>> def testFunc2(startingList):
>> return([x for x in startingList if x[0] == 'x'], [x for x in
>> startingList if x[0] != 'x'])
> This walks over the starting list twice, doing essentially the same thing
> both times. It also fails to meet the stated requirement that
> startingList is modified in place, by returning a new list instead.
This can meet the requirement that startingList is modified in place via the
call to this function (see the attached code).
> Here's an example of what I mean:
>
> py> mylist = mylist2 = ['a', 'x', 'b', 'xx', 'cx'] # two names for one
> list
> py> result, mylist = testFunc2(mylist)
> py> mylist
> ['a', 'b', 'cx']
> py> mylist2 # should be same as mylist
> ['a', 'x', 'b', 'xx', 'cx']
Yes, I had a typo in my original posting --- sorry about that!
>
> Here is the obvious algorithm for extracting and removing words starting
> with 'x'. It walks the starting list only once, and modifies it in place.
> The only trick needed is list slice assignment at the end.
>
> def extract_x_words(words):
> words_with_x = []
> words_without_x = []
> for word in words:
> if word.startswith('x'):
> words_with_x.append(word)
> else:
> words_without_x.append(word)
> words[:] = words_without_x # slice assignment
> return words_with_x
Suppose words was not a list --- you have tacitly assumed that words is a list.
>
> The only downside of this is that if the list of words is so enormous
> that you can fit it in memory *once* but not *twice*, this may fail. But
> the same applies to the list comprehension solution.
But, this is not the only downside if speed is important --- it is slower than
the list comprehension method (see results that follows).
Here is a summary of three algorithms (algorithm-1, algorithm-2, algorithm-2A)
that I tested (see attached code). Note, algorithm-2A was obtained by removing
the slice assignment in the above code and modifying the return as follows
def extract_x_words(words):
words_with_x = []
words_without_x = []
for word in words:
if word.startswith('x'):
words_with_x.append(word)
else:
words_without_x.append(word)
#words[:] = words_without_x # slice assignment
return words_with_x, words_without_x
Of course, one needs to modify the call for "in-place" update of startingList as
follows:
xOnlyList,startingList = extract_x_words(startingList)
Here is a summary of my timing results obtained for 3 different algorithms for
lists with 100,000 strings of length 4 in each list:
Method
average (sd) time in seconds
algorithm-1 (list comprehension)
0.11630 (0.0014)
algorithm-2 (S. D'Aprano)
0.17594 (0.0014)
algorithm-2A (modified S. D'Aprano)
0.18217 (0.0023)
These values were obtained from 100 independent runs (MC simulations) on lists
that contain 100,000 strings. Approximately 50% of these strings contained a
leading 'x'. Note, that the results show that algorithm-2 (suggested by S.
D'Aprano) is approximately 51% slower than algorithm-1 (list comprehensions) and
algorithm-2A (simple modification of algorithm-2) is approximately 57% slower
than algorithm-1. Why is algorithm-2A slower than algorithm-2?
I would be interested in seeing code that is faster than algorithm-1 --- any
suggestions are welcomed. And of course, if there are any errors in my attached
code please inform me of them and I will try to correct them as soon as
possible. Note, some of the code is actually irrelevant for the original
"Strange behavior" post.
Have a good day!
Back to comp.lang.python | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
Strange behavior light1quark@gmail.com - 2012-08-14 08:38 -0700
Re: Strange behavior Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-08-14 17:59 +0200
Re: Strange behavior Terry Reedy <tjreedy@udel.edu> - 2012-08-14 15:05 -0400
Re: Strange behavior Virgil Stokes <vs@it.uu.se> - 2012-08-14 21:40 +0200
Re: Strange behavior Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-08-15 00:19 +0000
Re: Strange behavior Virgil Stokes <vs@it.uu.se> - 2012-08-16 13:18 +0200
Re: Strange behavior Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-08-16 17:40 +0000
Re: Strange behavior Peter Otten <__peter__@web.de> - 2012-08-16 15:02 +0200
Re: Strange behavior light1quark@gmail.com - 2012-08-14 12:20 -0700
Re: Strange behavior Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-08-15 11:57 +0200
Re: Strange behavior Chris Angelico <rosuav@gmail.com> - 2012-08-15 07:55 +1000
Re: Strange behavior Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-08-15 11:50 +0200
csiph-web