To: python-list@python.org
From: Peter Otten <__peter__@web.de>
Subject: Re: Any algorithm to preserve whitespaces?
Date: Thu, 24 Jan 2013 11:47:03 +0100
Newsgroups: comp.lang.python
Message-ID: <mailman.955.1359024419.2939.python-list@python.org>

Santosh Kumar wrote:

> But I can; see: http://pastebin.com/ZGGeZ71r

You have messed with your cat command -- it adds line numbers.
Therefore the output of

cat somefile | ./argpa.py

differs from

./argpa.py somefile

Try

./argpa.py < somefile

to confirm my analysis. As to why your capitalisation algorithm fails on 
those augmented lines: the number is separated from the rest of the line by 
a TAB -- therefore the first word is "1\tthis" and the only candidate to be 
capitalised is the "1". To fix this you could use regular expressions (which 
I wanted to avoid initially):

>>> import re
>>> parts = re.compile(r"(\s+)").split(" 1\tthis is it")
>>> parts
['', ' ', '1', '\t', 'this', ' ', 'is', ' ', 'it']
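
The parentheses in the pattern are what keep the whitespace: re.split() includes the text matched by a capturing group in its result. A small comparison sketch on the same sample line may make that clearer; the variable names are only for illustration, and splitting on a single space is my assumption about how the failing script finds its words:

```python
import re

line = " 1\tthis is it"

# Splitting on a single space leaves the TAB inside the first word
# (assumed here to be what the failing script effectively does).
by_space = line.split(" ")

# Without a capturing group the whitespace separators are dropped.
without_group = re.split(r"\s+", line)

# With a capturing group the separators are kept, at the odd indices.
with_group = re.split(r"(\s+)", line)

print(by_space)
print(without_group)
print(with_group)
```

Only the last variant can be joined back into the original line unchanged.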

Process every other part as you wish and then join all parts:

>>> parts[::2] = [s.upper() for s in parts[::2]]
>>> parts
['', ' ', '1', '\t', 'THIS', ' ', 'IS', ' ', 'IT']
>>> print("".join(parts))
 1      THIS IS IT
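
Put together as a function, the split / process / join steps might look like the sketch below. The name capitalise_words is hypothetical, and str.capitalize() is only a stand-in for whatever capitalisation rule your script really applies (note that it also lowercases the rest of each word):

```python
import re

# Capturing group, so that split() keeps the whitespace runs.
_WHITESPACE = re.compile(r"(\s+)")


def capitalise_words(line):
    """Capitalise every word in line and leave all whitespace untouched."""
    parts = _WHITESPACE.split(line)
    # Even indices hold the words, odd indices hold the whitespace runs.
    parts[::2] = [word.capitalize() for word in parts[::2]]
    return "".join(parts)


print(repr(capitalise_words(" 1\tthis is  it\n")))
```

Because the whitespace parts are never modified, TABs, runs of spaces and the trailing newline all come out exactly as they went in.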