From: MRAB
Date: Thu, 14 Jul 2011 20:12:05 +0100
Newsgroups: comp.lang.python
Subject: Re: Please critique my script

[snip]
raw_input() returns a string, so there's no need for these 3 lines:

> y = str(y)
> z = str(z)
> p = str(p)
>
> pagedef = ("http://www.localcallingguide.com/xmllocalprefix.php?npa=" + y + "&nxx=" + z)
> print "Querying", pagedef
>
> #------Get info from NANPA.com ----------
> urllib2.install_opener(opener)
> page = urllib2.urlopen(pagedef)
> soup = BeautifulSoup(page)
> soup = str(soup)
>
> #------Parse Gathered Data----------
> for line in npaReg.findall(soup):
>     npalist.insert(count, line)
>     count = count + 1
>
> for line2 in nxxReg.findall(soup):
>     nxxlist.insert(count2, line2)
>     count2 = count2 + 1
>
enumerate will help you here:

for count, line in enumerate(npaReg.findall(soup)):
    npalist.insert(count, line)

for count2, line2 in enumerate(nxxReg.findall(soup)):
    nxxlist.insert(count2, line2)

> #-----Sort, remove duplicates, concatenate the last digits for similar NPA/NXX ------
> for makenewlist in range(0, count):
>     sortlist.append(npalist.pop(0) + nxxlist.pop(0))
>
> sortlist.sort()
>
> for sortednumber in sortlist:
>     if sortednumber not in sortedlist:
>         sortedlist.append(sortednumber)
>
If you're going to sort them anyway (so you don't need to preserve the existing order), the easiest way to remove duplicates is to use a set:

sortedlist = sorted(set(sortlist))
[snip]
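
One thing to watch with the enumerate version: assuming count started at 0 in the original script, it ended up holding the number of matches, whereas after an enumerate loop it holds the last index, so a later range(0, count) would come up one short. A rough sketch of one way to sidestep the counters entirely is below. The regex patterns and the sample text are made up for illustration (the real npaReg and nxxReg are not shown in the post), and the function name is hypothetical.

```python
import re

# Hypothetical patterns: stand-ins for the npaReg/nxxReg that the post does not show.
npa_re = re.compile(r"<npa>(\d{3})</npa>")
nxx_re = re.compile(r"<nxx>(\d{3})</nxx>")


def unique_sorted_prefixes(text):
    """Return the sorted, de-duplicated NPA+NXX strings found in text."""
    # findall() already returns a list, so no insert() loop or counter is needed.
    npas = npa_re.findall(text)
    nxxs = nxx_re.findall(text)
    # zip() pairs the n-th NPA with the n-th NXX, replacing the pop(0) loop.
    combined = [npa + nxx for npa, nxx in zip(npas, nxxs)]
    # A set drops duplicates; sorted() turns it back into an ordered list.
    return sorted(set(combined))


# Made-up sample input containing one duplicate NPA/NXX pair.
sample = (
    "<npa>704</npa><nxx>555</nxx>"
    "<npa>704</npa><nxx>200</nxx>"
    "<npa>704</npa><nxx>555</nxx>"
)
result = unique_sorted_prefixes(sample)
```

Note that zip() stops at the shorter of the two lists, so this only makes sense if every NPA match has a corresponding NXX match, which the original pop(0) loop also relied on.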