
Re: Review Request of Python Code

From Matt Wheeler <m@funkyhat.org>
Newsgroups comp.lang.python
Subject Re: Review Request of Python Code
Date Wed, 9 Mar 2016 12:06:31 +0000
Message-ID <mailman.75.1457525219.15725.python-list@python.org>
References <f0973a0d-62ba-402b-ab23-cb68bdd15323@googlegroups.com>



I'm just going to focus on a couple of lines as others are already
looking at the whole thing:

On 9 March 2016 at 04:18,  <subhabangalore@gmail.com> wrote:
> [snip].........
>                 if word in a4:
>                     [stuff]
>                 elif word not in a4:
>                     [other stuff]
>                 else:
>                     print "None"

This is bad for a couple of reasons.

Most obviously, your `else: print "None"` case can never be reached:
`word not in a4` is the exact inverse of `word in a4`, so one of the
first two branches always runs.
That also means that in the `not` case the entire a4 list is scanned
*twice*, and the second scan is completely pointless.
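
To make the point about the unreachable branch concrete, here is a minimal sketch (written for Python 3, unlike the quoted code). The contents of a4, the function name classify and the test words are all made up for illustration; only the shape of the if/elif/else comes from the original post.

```python
# Hypothetical stand-in for the poster's word list; the contents are invented.
a4 = ["apple", "banana", "cherry"]

def classify(word):
    """Mirror the original if/elif/else and report which branch ran."""
    if word in a4:
        return "found"
    elif word not in a4:   # exact negation of the test above
        return "missing"
    else:                  # unreachable: one of the two tests is always true
        return "None"

# Whatever the input, only "found" or "missing" can come back.
results = [classify(w) for w in ["apple", "durian", "", "cherry"]]
print(results)
```

Note that for a word that is not in the list, both the `if` and the `elif` tests walk the whole list before the second branch is taken.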

This can be simplified to:
                 if word in a4:
                     [stuff]
                 else:
                     [other stuff]

But we can still do better. A list is a poor choice for this kind of
lookup, as Python has no way to find elements other than by checking
them one after another. (Given that one of the names you've given it
sounds a bit like "dictionary", I assume it contains rather a lot of
items.)

If you change one other line, from

> dict_word=dict_read.split()
> a4=dict_word #Assignment for code.

to

a4=set(dict_word)
#(this could of course be shortened further, but I'll leave that to you/others)
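
As a rough sketch of how that change might look in context (Python 3; the sample text in dict_read and the list of words being checked are invented, since the original file contents aren't shown):

```python
# dict_read stands in for the text the original script reads from its word
# file; this sample string is made up for illustration.
dict_read = "apple banana cherry banana"

# Build the set once, outside any loop. Duplicate words collapse to one entry.
a4 = set(dict_read.split())

# Membership tests read exactly as they did with the list.
found = [word for word in ["apple", "durian", "cherry"] if word in a4]
print(found)
```

The `if word in a4:` lines elsewhere in the script would not need to change, as `in` works the same way on a set as on a list.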

You'll probably see a massive speedup in your code, possibly even
dwarfing the speedup you see from more sensible database access like
INADA Naoki suggested (though you should definitely still do that
too!), especially if your word list is very large.

This is because the set type uses a hash table internally, so a
membership test takes roughly constant time on average, compared to
scanning through the list, which takes time proportional to its length.
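
If you want to check the difference on your own machine, something along these lines should do (Python 3, standard library only). The word list is synthetic and its size is an arbitrary choice; the actual timings will depend on your hardware and on how big your real list is.

```python
import timeit

# Synthetic word list; the size and the words themselves are arbitrary.
words = ["word%d" % i for i in range(50000)]
as_list = words
as_set = set(words)

# A word that is absent is the worst case for the list: every element
# has to be compared before Python can answer False.
missing = "not-a-word"

list_time = timeit.timeit(lambda: missing in as_list, number=100)
set_time = timeit.timeit(lambda: missing in as_set, number=100)

print("list: %.6f s, set: %.6f s" % (list_time, set_time))
```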


-- 
Matt Wheeler
http://funkyh.at



Thread

Review Request of Python Code subhabangalore@gmail.com - 2016-03-08 20:18 -0800
  Re: Review Request of Python Code Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-03-09 16:10 +1100
  Re: Review Request of Python Code INADA Naoki <songofacandy@gmail.com> - 2016-03-09 16:52 +0900
  Re: Review Request of Python Code Friedrich Rentsch <anthra.norell@bluewin.ch> - 2016-03-09 10:06 +0100
  Re: Review Request of Python Code Matt Wheeler <m@funkyhat.org> - 2016-03-09 12:06 +0000
  Re: Review Request of Python Code Matt Wheeler <m@funkyhat.org> - 2016-03-09 12:33 +0000
  Re: Review Request of Python Code subhabangalore@gmail.com - 2016-03-10 10:12 -0800
    Re: Review Request of Python Code BartC <bc@freeuk.com> - 2016-03-10 18:36 +0000
    Re: Review Request of Python Code Matt Wheeler <m@funkyhat.org> - 2016-03-10 18:51 +0000
      Re: Review Request of Python Code subhabangalore@gmail.com - 2016-03-10 12:14 -0800
    RE: Review Request of Python Code Joaquin Alzola <Joaquin.Alzola@lebara.com> - 2016-03-10 19:12 +0000
  Re: Review Request of Python Code Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-03-10 19:56 +0000
