Re: Reading \n unescaped from a file

Started by	Rob Hills <rhills@medimorphosis.com.au>
First post	2015-09-04 00:24 +0800
Last post	2015-09-04 00:24 +0800
Articles	1 — 1 participant

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.

#95958 — Re: Reading \n unescaped from a file

From	Rob Hills <rhills@medimorphosis.com.au>
Date	2015-09-04 00:24 +0800
Subject	Re: Reading \n unescaped from a file
Message-ID	<mailman.87.1441297494.8327.python-list@python.org>

Hi Chris,

On 03/09/15 06:10, Chris Angelico wrote:
> On Wed, Sep 2, 2015 at 12:03 PM, Rob Hills <rhills@medimorphosis.com.au> wrote:
>> My mapping file contents look like this:
>>
>> \r = \\n
>> â€œ = &quot;
> Oh, lovely. Code page 1252 when you're expecting UTF-8. Sadly, you're
> likely to have to cope with a whole pile of other mojibake if that
> happens :(

Yeah, tell me about it!!!
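
For readers following along, the mojibake in the mapping file can be reproduced in a few lines. This is only a sketch, assuming the original character was a left double quotation mark (U+201C) saved as UTF-8 and later read as Code page 1252; the variable names are illustrative.

```python
# Assumption: the original character was U+201C, stored as UTF-8.
original = "\u201c"  # LEFT DOUBLE QUOTATION MARK

# Reading the UTF-8 bytes as Code page 1252 yields three characters.
garbled = original.encode("utf-8").decode("cp1252")

# Reversing the two steps recovers the original character.
repaired = garbled.encode("cp1252").decode("utf-8")

print(ascii(garbled))   # escaped form of the three mojibake characters
print(repaired == original)
```

Some bytes have no Code page 1252 mapping in Python's codec, so not every character survives this round trip; real data usually needs case-by-case handling.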

> Technically, what's happening is that your "\r" is literally a
> backslash followed by the letter r; the transformation of backslash
> sequences into single characters is part of Python source code
> parsing. (Incidentally, why do you want to change a carriage return
> into backslash-n? Seems odd.)
>
> Probably the easiest solution would be a simple and naive replace(),
> looking for some very specific strings and ignoring everything else.
> Easy to do, but potentially confusing down the track if someone tries
> something fancy :)
>
> line = line.split('#')[0].strip() # trim any trailing comments
> # repeat this for as many backslash escapes as you want to handle
> line = line.replace(r"\r", "\r")
>
> Be aware that this, while simple, is NOT capable of handling escaped
> backslashes. In Python, "\\r" comes out the same as r"\r", but with
> this parser, it would come out the same as "\\\r". But it might be
> sufficient for you.
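
The limitation described above can be seen in a short sketch. The names `naive_unescape`, `unescape` and `ESCAPES` are hypothetical, and the single-pass regex variant is one possible way to cope with escaped backslashes, not something proposed in the thread.

```python
import re

# Text read from a file: a backslash followed by the letter r
# (two characters), not a carriage return.
from_file = r"\r"

def naive_unescape(text):
    """One replace() per escape, as in the quoted suggestion."""
    return text.replace(r"\r", "\r").replace(r"\n", "\n")

# Hypothetical escape table for the single-pass variant; extend as needed.
ESCAPES = {"r": "\r", "n": "\n", "t": "\t", "\\": "\\"}

def unescape(text):
    """Translate backslash escapes in one left-to-right pass."""
    def repl(match):
        # Unknown escapes such as \q are left exactly as written.
        return ESCAPES.get(match.group(1), match.group(0))
    return re.sub(r"\\(.)", repl, text)
```

With these definitions, `naive_unescape(r"\\r")` gives a backslash followed by a carriage return, while `unescape(r"\\r")` gives a backslash followed by the letter r, because the doubled backslash is consumed before the `r` is examined.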

Thanks for the explanation, which has helped me understand the problem.
I also tried your approach, but wound up with output data that somehow
had every single character escaped :-(

I've since decided I was being too obsessive trying to load *everything*
from my mapping file and have simply hard-coded my two escaped character
replacements for now and moved on to more important problems (i.e. the
Windoze character soup that comprises my data and which I have to clean
up!).
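
A hard-coded version might look something like the sketch below. It assumes the two mapping lines quoted earlier mean "carriage return becomes a literal backslash-n" and "the three-character mojibake sequence becomes &quot;"; the names `REPLACEMENTS` and `clean` are made up for illustration.

```python
# Assumed reading of the two mapping-file lines quoted earlier.
REPLACEMENTS = {
    "\r": "\\n",                      # carriage return -> backslash + n
    "\u00e2\u20ac\u0153": "&quot;",   # mojibake for a left double quote
}

def clean(text):
    """Apply each hard-coded replacement in turn."""
    for old, new in REPLACEMENTS.items():
        text = text.replace(old, new)
    return text
```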

Thanks again,

-- 
Rob Hills
Waikiki, Western Australia
