help me debug my "word capitalizer" script

Started by: Santosh Kumar <sntshkmr60@gmail.com>
First post: 2012-08-22 11:51 +0530
Last post: 2012-08-22 16:40 +0100
Articles: 3 (3 participants)


#27607 — help me debug my "word capitalizer" script

From: Santosh Kumar <sntshkmr60@gmail.com>
Date: 2012-08-22 11:51 +0530
Subject: help me debug my "word capitalizer" script
Message-ID: <mailman.3631.1345616511.4697.python-list@python.org>

Here is the script I am using:

from os import linesep
from string import punctuation
from sys import argv

script, givenfile = argv

with open(givenfile) as file:
    # List to store the capitalised lines.
    lines = []
    for line in file:
        # Split words by spaces.
        words = line.split(' ')
        for i, word in enumerate(words):
            if len(word.strip(punctuation)) > 3:
                # Capitalise and replace words longer than 3 (without
                # punctuation)
                words[i] = word.capitalize()
        # Join the capitalised words with spaces.
        lines.append(' '.join(words))
    # Join the capitalised lines by the line separator
    capitalised = linesep.join(lines)
# Optionally, write the capitalised words back to the file.

print(capitalised)


Purpose of the script:
To capitalize the first letter of any word in a given file, leaving
words which have 3 or fewer letters unchanged.

Bugs:
I know it has many bugs and/or it can be improved by cutting down the
code, but my current focus is to fix this bug:
  1. When I pass it any file, it does its stuff but inserts a blank
line every time it processes a new line. (Please notice that I don't
want the output in another file, I want it on screen).



#27618

From: Hans Mulder <hansmu@xs4all.nl>
Date: 2012-08-22 10:20 +0200
Message-ID: <50349669$0$6846$e4fe514c@news2.news.xs4all.nl>
In reply to: #27607

On 22/08/12 08:21:47, Santosh Kumar wrote:
> Here is the script I am using:
> 
> from os import linesep
> from string import punctuation
> from sys import argv
> 
> script, givenfile = argv
> 
> with open(givenfile) as file:
>     # List to store the capitalised lines.
>     lines = []
>     for line in file:
>         # Split words by spaces.
>         words = line.split(' ')
>         for i, word in enumerate(words):
>             if len(word.strip(punctuation)) > 3:
>                 # Capitalise and replace words longer than 3 (without
>                 # punctuation)
>                 words[i] = word.capitalize()
>         # Join the capitalised words with spaces.
>         lines.append(' '.join(words))
>     # Join the capitalised lines by the line separator
>     capitalised = linesep.join(lines)
> # Optionally, write the capitalised words back to the file.
> 
> print(capitalised)
> 
> 
> Purpose of the script:
> To capitalize the first letter of any word in a given file, leaving
> words which have 3 or less letters.
> 
> Bugs:
> I know it has many bugs or/and it can be improved by cutting down the
> code, but my current focus is to fix this bug:
>   1. When I pass it any file, it does its stuff but inserts a blank
> line every time it processes a new line. (Please notice that I don't
> want the output in another file, I want it on screen).

The lines you read from your input file end in a line separator.
When you join them with linesep.join(lines), another line separator
is inserted after each one (and 'print' adds one more at the very end).
This results in two line separators in a row, in other words, a blank
line.
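
A small sketch of the effect, for illustration only. It assumes an io.StringIO object standing in for the opened file, with made-up sample text. In the posted script it is the linesep.join call that puts a second separator after each line that already ends in one.

```python
import io
from os import linesep

# io.StringIO stands in for the opened input file (an assumption for this demo).
fake_file = io.StringIO("first line\nsecond line\n")

# Iterating over a file yields each line with its trailing newline intact.
lines = [line for line in fake_file]

# Joining with linesep therefore places a second separator after each newline.
joined = linesep.join(lines)

print(repr(lines))
print(repr(joined))
```

The repr() output makes the doubled separator visible, which a plain print() would only show as an empty line.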

The best way to solve this is usually to remove the line separator
right after you've read in the line.  You could do that by inserting
after line 10 (the 'for line in file:' line):

        line = line.rstrip()

That will remove all whitespace characters (spaces, tabs, carriage
returns, newlines) from the end of the line.
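
Putting that together, here is one possible sketch of the loop with the rstrip() call in place. The function name capitalise_lines and the sample text are hypothetical, and joining with '\n' instead of linesep is a choice made for this sketch, not something taken from the original script.

```python
import io
from string import punctuation


def capitalise_lines(lines):
    """Capitalise words longer than 3 characters (ignoring punctuation)."""
    result = []
    for line in lines:
        # Drop the trailing newline (and any other trailing whitespace).
        line = line.rstrip()
        words = line.split(' ')
        for i, word in enumerate(words):
            if len(word.strip(punctuation)) > 3:
                words[i] = word.capitalize()
        result.append(' '.join(words))
    # The lines no longer end in a newline, so joining adds exactly one each.
    return '\n'.join(result)


# io.StringIO stands in for the opened file (an assumption for this demo).
sample = io.StringIO("the quick brown fox\njumps over a lazy dog.\n")
print(capitalise_lines(sample))
```

With the sample text above this prints two lines with no blank line between them.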

Alternatively, if you want to remove only the line separator,
you could do:

        if line.endswith(linesep):
            line = line[:-len(linesep)]

The 'if' command is only necessary for the last line, which may or
may not end in a linesep.  All earlier lines are guaranteed to end
with a linesep.
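
To illustrate the difference between the two approaches, here is a hedged sketch. The helper name strip_newline is hypothetical, and it checks for '\n' instead of linesep on the assumption that the file was opened in text mode, where lines end in '\n' regardless of platform (a point made later in this thread).

```python
def strip_newline(line):
    """Remove one trailing newline, leaving other trailing whitespace alone."""
    if line.endswith('\n'):
        line = line[:-1]
    return line


with_newline = "trailing spaces  \n"
last_line = "no newline at end"

# Only the newline goes; the two trailing spaces survive.
print(repr(strip_newline(with_newline)))
# rstrip() with no argument removes the spaces as well.
print(repr(with_newline.rstrip()))
# A final line without a newline is returned unchanged.
print(repr(strip_newline(last_line)))
```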


Hope this helps,

-- HansM



#27646

From: MRAB <python@mrabarnett.plus.com>
Date: 2012-08-22 16:40 +0100
Message-ID: <mailman.3658.1345650036.4697.python-list@python.org>
In reply to: #27618

On 22/08/2012 09:20, Hans Mulder wrote:
[snip]
>
> Alternatively, if you want to remove only the line separator,
> you could do:
>
>          if line.endswith(linesep):
>              line = line[:-len(linesep)]
>
> The 'if' command is only necessary for the last line, which may or
> may not end in a linesep.  All earlier lines are guaranteed to end
> with a linesep.
>
Even better is:

     line = line.rstrip(linesep)

The line separator is '\n'.

Strictly speaking, the line separator varies according to platform
(Windows, *nix, etc), but it's translated to '\n' on reading from a
file which has been opened in text mode (the default).
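
A quick way to see that translation, sketched under the assumption of Python 3 (where open() uses universal newlines by default). The temporary file and its contents are made up for the demo. Note also that rstrip() treats its argument as a set of characters to strip, not as a literal suffix.

```python
import os
import tempfile

# Write Windows-style line endings in binary mode so nothing is translated.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"one\r\ntwo\r\n")
    path = tmp.name

try:
    # Text mode (the default) translates '\r\n' to '\n' on reading.
    with open(path) as f:
        text_lines = list(f)
finally:
    os.remove(path)

# Stripping '\n' is therefore enough, whatever os.linesep happens to be.
stripped = [line.rstrip('\n') for line in text_lines]

print(repr(text_lines))
print(repr(stripped))
```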
