Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #106283
| Path | csiph.com!fu-berlin.de!uni-berlin.de!not-for-mail |
|---|---|
| From | Peter Otten <__peter__@web.de> |
| Newsgroups | comp.lang.python |
| Subject | Re: Drowning in a teacup? |
| Date | Sat, 02 Apr 2016 12:31:30 +0200 |
| Organization | None |
| Lines | 97 |
| Message-ID | <mailman.361.1459593100.28225.python-list@python.org> (permalink) |
| References | <ndmlj2$pb0$1@gioia.aioe.org> <ndnm9v$j07$1@ger.gmane.org> <CAGgTfkO4HuzoRBgrGJ3ccqcD+VB4_7O-g19NUk9FaYSBpQ2HkA@mail.gmail.com> <ndnomv$mf5$1@ger.gmane.org> |
| Mime-Version | 1.0 |
| Content-Type | text/plain; charset="ISO-8859-1" |
| Content-Transfer-Encoding | 7Bit |
| X-Trace | news.uni-berlin.de x4mHAmAQmm+aLuIw6q+CggUQTykTYDZTLRN1Lg/rYSoA== |
| Return-Path | <python-python-list@m.gmane.org> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| X-Spam-Status | OK 0.000 |
| X-Spam-Evidence | '*H*': 1.00; '*S*': 0.00; 'else:': 0.03; 'elif': 0.04; 'python3': 0.05; 'sys': 0.05; '__name__': 0.07; 'main()': 0.07; 'matches': 0.07; 'strings.': 0.07; 'indeed,': 0.09; 'invocation': 0.09; 'item,': 0.09; 'modifies': 0.09; 'received:80.91': 0.09; 'received:80.91.229': 0.09; 'received:gmane.org': 0.09; 'received:list': 0.09; 'suggestions.': 0.09; 'def': 0.13; 'suggest': 0.15; "'__main__':": 0.16; '...)': 0.16; 'beginning.': 0.16; 'invocations': 0.16; 'key?': 0.16; 'keyword,': 0.16; 'main():': 0.16; 'received:80.91.229.3': 0.16; 'received:dip0.t-ipconnect.de': 0.16; 'received:io': 0.16; 'received:plane.gmane.org': 0.16; 'received:psf.io': 0.16; 'received:t-ipconnect.de': 0.16; 'reverses': 0.16; 'sort()': 0.16; 'wrote:': 0.16; 'element': 0.18; 'tests': 0.18; 'runs': 0.18; '>>>': 0.20; 'pos': 0.22; 'elements': 0.23; "python's": 0.23; 'import': 0.24; 'sort': 0.25; 'script': 0.25; 'header:User- Agent:1': 0.26; 'header:X-Complaints-To:1': 0.26; 'skip:# 10': 0.27; 'function': 0.28; "skip:' 10": 0.28; 'cat': 0.29; 'correct,': 0.29; 'index,': 0.29; 'sensible': 0.29; 'raise': 0.29; 'relative': 0.30; "i'd": 0.31; 'post': 0.31; "can't": 0.32; 'knows': 0.32; 'michael': 0.33; "skip:' 20": 0.34; 'add': 0.34; 'list': 0.34; 'skip:3 10': 0.35; 'lists.': 0.35; 'skip:. 20': 0.35; 'something': 0.35; 'item': 0.35; 'but': 0.36; 'list,': 0.36; 'skip:i 20': 0.36; 'there': 0.36; 'beginning': 0.36; 'to:addr :python-list': 0.36; 'subject:?': 0.36; 'subject:: ': 0.37; 'two': 0.37; 'expect': 0.37; 'received:org': 0.37; 'spread': 0.37; 'list.': 0.37; 'difference': 0.38; 'front': 0.38; 'skip:p 20': 0.38; 'copying': 0.38; 'skip:o 20': 0.38; 'test': 0.39; 'to:addr:python.org': 0.40; 'where': 0.40; 'received:de': 0.40; 'your': 0.60; 'more': 0.63; 'between': 0.65; 'better.': 0.66; 'here': 0.66; 'results': 0.66; '100': 0.79; 'benchmark': 0.84; 'conclusions': 0.84; 'liberty': 0.84; 'rises': 0.84; 'snap': 0.84; 'fare': 0.93 |
| X-Injected-Via-Gmane | http://gmane.org/ |
| X-Gmane-NNTP-Posting-Host | p57bd90b1.dip0.t-ipconnect.de |
| User-Agent | KNode/4.13.3 |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.21 |
| Precedence | list |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <https://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list/> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <https://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Xref | csiph.com comp.lang.python:106283 |
Show key headers only | View raw
Vito De Tullio wrote:
> Michael Selik wrote:
>
>>> > I need to scan a list of strings. If one of the elements matches the
>>> > beginning of a search keyword, that element needs to snap to the front
>>> > of the list.
>>>
>>> I know this post regards the function passing, but, on you specific
>>> problem,
>>> can't you just ... sort the list with a custom key?
>>
>> If the number of matches is small relative to the size of the list, I'd
>> expect the sort would be slower than most of the other suggestions.
>
> umm... I take the liberty to set up a little test
>
> $ cat test_sort_prefix.py
> #!/usr/bin/env python3
>
> from sys import argv
> from timeit import timeit
>
>
> def sort_in_place(l):
> def key(e):
> return not e.startswith('yes')
> l.sort(key=key)
>
>
> def original_solution(mylist):
> for i in range(len(mylist)):
> if(mylist[i].startswith('yes')):
> mylist[:] = [mylist[i]] + mylist[:i] + mylist[i+1:]
original_solution() reverses the order of the matches while sort_in_place()
doesn't. If the order is not relevant I'd try something like
def exchange(items):
pos = 0
for index, item in enumerate(items):
if item.startswith("yes"):
items[pos], items[index] = item, items[pos]
pos += 1
> def main():
> if argv[1] == 'sort_in_place':
> f = sort_in_place
> elif argv[1] == 'original_solution':
> f = original_solution
> else:
> raise Exception()
>
> nomatches = int(argv[2])
> matches = int(argv[3])
>
> l = ["no_" for _ in range(nomatches)] + ["yes_" for _ in
> range(matches)]
Python's timsort knows how to deal with (partially) sorted lists. I suggest
that you add
random.seed(42) # same list for all script invocations
random.shuffle(l)
here to spread the matches somewhat over the list.
> print(timeit("f(l)", number=100, globals={"f": f, "l": l}))
Remember that f(l) modifies l, so all but the first invocation will see a
list where all matches are at the beginning. In other words: in 99 out of
100 runs the list is already sorted.
While adding the overhead of copying the list I would expect the results of
timeit("f(l[:])", ...)
to be more realistic.
> if __name__ == '__main__':
> main()
> $ ./test_sort_prefix.py sort_in_place 1000000 3
> 33.260575089996564
> $ ./test_sort_prefix.py original_solution 1000000 3
> 35.93009542999789
> $
>
>
> in my tests there is no sensible difference between the two solutions...
> and the number of matches is risible :)
Indeed, as the number of matches rises your sort() approach will likely fare
better.
Your conclusions may be correct, but the benchmark to support them is
flawed.
Back to comp.lang.python | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
Drowning in a teacup? Fillmore <fillmore_remove@hotmail.com> - 2016-04-01 16:27 -0400
Re: Drowning in a teacup? Rob Gaddi <rgaddi@highlandtechnology.invalid> - 2016-04-01 20:39 +0000
Re: Drowning in a teacup? Ian Kelly <ian.g.kelly@gmail.com> - 2016-04-01 14:52 -0600
Re: Drowning in a teacup? MRAB <python@mrabarnett.plus.com> - 2016-04-01 21:45 +0100
Re: Drowning in a teacup? Michael Selik <michael.selik@gmail.com> - 2016-04-01 20:46 +0000
Re: Drowning in a teacup? Steven D'Aprano <steve@pearwood.info> - 2016-04-02 09:46 +1100
Re: Drowning in a teacup? Fillmore <fillmore_remove@hotmail.com> - 2016-04-02 00:03 -0400
Re: Drowning in a teacup? Vito De Tullio <vito.detullio@gmail.com> - 2016-04-02 07:45 +0200
Re: Drowning in a teacup? Michael Selik <michael.selik@gmail.com> - 2016-04-02 05:51 +0000
Re: Drowning in a teacup? Vito De Tullio <vito.detullio@gmail.com> - 2016-04-02 08:26 +0200
Re: Drowning in a teacup? Peter Otten <__peter__@web.de> - 2016-04-02 12:31 +0200
Re: Drowning in a teacup? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-04-02 12:53 +0100
Re: Drowning in a teacup? Michael Selik <michael.selik@gmail.com> - 2016-04-02 14:36 +0000
Re: Drowning in a teacup? Ned Batchelder <ned@nedbatchelder.com> - 2016-04-02 12:28 -0700
Re: Drowning in a teacup? Random832 <random832@fastmail.com> - 2016-04-02 15:54 -0400
Re: Drowning in a teacup? Steven D'Aprano <steve@pearwood.info> - 2016-04-03 12:33 +1000
Re: Drowning in a teacup? Ethan Furman <ethan@stoneleaf.us> - 2016-04-02 16:15 -0700
Re: Drowning in a teacup? Random832 <random832@fastmail.com> - 2016-04-02 19:19 -0400
csiph-web