From: Peter Otten <__peter__@web.de>
Subject: Re: My first ever Python program, comments welcome
Date: Sun, 22 Jul 2012 09:56:50 +0200
Newsgroups: comp.lang.python

Lipska the Kat wrote:

> Greetings Pythoners
>
> A short while back I posted a message that described a task I had set
> myself. I wanted to implement the following bash shell script in Python
>
> Here's the script
>
> sort -nr $1 | head -${2:-10}
>
> this script takes a filename and an optional number of lines to display
> and sorts the lines in numerical order, printing them to standard out.
> if no optional number of lines are input the script prints 10 lines
>
> Here's the file.
>
> 50 Parrots
> 12 Storage Jars
> 6 Lemon Currys
> 2 Pythons
> 14 Spam Fritters
> 23 Flying Circuses
> 1 Meaning Of Life
> 123 Holy Grails
> 76 Secret Policemans Balls
> 8 Something Completely Differents
> 12 Lives of Brian
> 49 Spatulas
>
>
> ...
> and here's my very first attempt at a Python program
> I'd be interested to know what you think, you can't hurt my feelings
> just be brutal (but fair). There is very little error checking as you
> can see and I'm sure you can crash the program easily.
> 'Better' implementations most welcome

> #! /usr/bin/env python3.2
>
> import fileinput
> from sys import argv
> from operator import itemgetter
>
> l=[]
> t = tuple
> filename=argv[1]
> lineCount=10
>
> with fileinput.input(files=(filename)) as f:

Note that (filename) is not a tuple, just a string surrounded by superfluous
parens.

>>> filename = "foo.bar"
>>> (filename)
'foo.bar'
>>> (filename,)
('foo.bar',)
>>> filename,
('foo.bar',)

You are lucky that FileInput() tests whether its files argument is just a
single string.

>     for line in f:
>         t=(line.split('\t'))
>         t[0]=int(t[0])
>         l.append(t)
>     l=sorted(l, key=itemgetter(0))
>
> try:
>     inCount = int(argv[2])
>     lineCount = inCount
> except IndexError:
>     #just catch the error and continue
>     None
>
> for c in range(lineCount):
>     t=l[c]
>     print(t[0], t[1], sep='\t', end='')
>

I prefer a more structured approach even for such a tiny program:

- process all commandline args
- read data
- sort
- clip extra lines
- write data

I'd break it into these functions:

def get_commandline_args():
    """Recommended library: argparse.
       Its FileType can deal with stdin/stdout.
    """

def get_quantity(line):
    return int(line.split("\t", 1)[0])

def sorted_by_quantity(lines):
    """Leaves the lines intact, so you don't have to
       reassemble them later on."""
    return sorted(lines, key=get_quantity)

def head(lines, count):
    """Have a look at itertools.islice() for a more general approach"""
    return lines[:count]

if __name__ == "__main__":
    # protecting the script body allows you to import
    # the script as a library into other programs
    # and reuse its functions and classes.
    # Also: play nice with pydoc.
    # Try
    # $ python -m pydoc -w ./yourscript.py
    args = get_commandline_args()
    with args.infile as f:
        lines = sorted_by_quantity(f)
    with args.outfile as f:
        f.writelines(head(lines, args.line_count))

Note that if you want to handle large files gracefully you need to recombine
sorted_by_quantity() and head() (have a look at heapq.nsmallest() which was
already mentioned in the other thread).
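
For what it's worth, here is a rough sketch of that recombination, together
with one possible body for get_commandline_args(). Treat it as an
illustration, not the definitive version: the argument layout (infile, then
an optional line count, then an optional outfile) and the name
smallest_by_quantity() are my assumptions, and the input is assumed to be
tab-separated as in the original script.

```python
import argparse
import heapq
import sys


def get_commandline_args(argv=None):
    """Parse "infile [line_count] [outfile]".

    The argument layout is an assumption made for this sketch.
    argparse.FileType maps "-" to stdin/stdout.
    """
    parser = argparse.ArgumentParser(
        description="Print the lines with the smallest leading numbers.")
    parser.add_argument("infile", type=argparse.FileType("r"))
    parser.add_argument("line_count", type=int, nargs="?", default=10)
    parser.add_argument("outfile", type=argparse.FileType("w"),
                        nargs="?", default=sys.stdout)
    return parser.parse_args(argv)


def get_quantity(line):
    # The number is everything before the first tab.
    return int(line.split("\t", 1)[0])


def smallest_by_quantity(lines, count):
    """Hypothetical replacement for head(sorted_by_quantity(lines), count).

    heapq.nsmallest() keeps only `count` candidates in memory at a time,
    so `lines` may be an open file of any size.
    """
    return heapq.nsmallest(count, lines, key=get_quantity)
```

In the script body this would replace the two separate steps with
lines = smallest_by_quantity(f, args.line_count). Like sorted_by_quantity()
it yields the smallest numbers first; the shell script's "sort -nr" puts the
largest first, and heapq.nlargest() with the same arguments would match that.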