To: python-list@python.org
From: Peter Otten <__peter__@web.de>
Subject: Re: Any algorithm to preserve whitespaces?
Date: Thu, 24 Jan 2013 11:47:03 +0100
Newsgroups: comp.lang.python

Santosh Kumar wrote:

> But I can; see: http://pastebin.com/ZGGeZ71r

You have messed with your cat command -- it adds line numbers.
Therefore the output of

    cat somefile | ./argpa.py

differs from

    ./argpa.py somefile

Try

    ./argpa.py < somefile

to confirm my analysis.

As to why your capitalisation algorithm fails on those augmented lines:
the number is separated from the rest of the line by a TAB -- therefore
the first word is "1\tthis" and the only candidate to be capitalised is
the "1". To fix this you could use regular expressions (which I wanted
to avoid initially):

>>> import re
>>> parts = re.compile(r"(\s+)").split(" 1\tthis is it")
>>> parts
['', ' ', '1', '\t', 'this', ' ', 'is', ' ', 'it']

Because the pattern has a capturing group, split() keeps the whitespace
runs in the result: the words sit at the even indices and the separators
at the odd ones. Process every other part as you wish and then join all
parts:

>>> parts[::2] = [s.upper() for s in parts[::2]]
>>> parts
['', ' ', '1', '\t', 'THIS', ' ', 'IS', ' ', 'IT']
>>> print("".join(parts))
 1	THIS IS IT
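
For reuse, the split/process/join steps could be wrapped in a small helper. This is only a sketch: the name transform_words and its func parameter are made up for illustration and do not come from argpa.py, and str.capitalize stands in for whatever capitalisation rule you actually want.

```python
import re

# Capturing group: re.split() then returns the whitespace runs as well,
# so nothing is lost when the parts are joined again.
_WHITESPACE = re.compile(r"(\s+)")


def transform_words(line, func=str.capitalize):
    """Apply func to every word in line, leaving all whitespace untouched.

    Hypothetical helper for illustration; not part of the original script.
    """
    parts = _WHITESPACE.split(line)
    # Even indices hold the words (possibly empty strings at either end),
    # odd indices hold the whitespace separators.
    parts[::2] = [func(word) for word in parts[::2]]
    return "".join(parts)


if __name__ == "__main__":
    # repr() makes the preserved TAB visible in the output.
    print(repr(transform_words(" 1\tthis is it")))
    print(repr(transform_words(" 1\tthis is it", str.upper)))
```

Note that on a line numbered by cat the first "word" is still the number, so capitalising only the first real word would need an extra check (for example, skipping parts that are all digits).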