Re: Reading \n unescaped from a file

Subject	Re: Reading \n unescaped from a file
References	<55E65909.2080507@medimorphosis.com.au> <CAPTjJmqR8ZGXAa7BGwxP6kKJf+_FJkRxcLR61km+sn0FzGGVfA@mail.gmail.com>
From	Rob Hills <rhills@medimorphosis.com.au>
Date	2015-09-04 00:24 +0800
Newsgroups	comp.lang.python
Message-ID	<mailman.87.1441297494.8327.python-list@python.org>

Hi Chris,

On 03/09/15 06:10, Chris Angelico wrote:
> On Wed, Sep 2, 2015 at 12:03 PM, Rob Hills <rhills@medimorphosis.com.au> wrote:
>> My mapping file contents look like this:
>>
>> \r = \\n
>> â€œ = &quot;
> Oh, lovely. Code page 1252 when you're expecting UTF-8. Sadly, you're
> likely to have to cope with a whole pile of other mojibake if that
> happens :(

Yeah, tell me about it!!!
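
For anyone following along, here is a minimal sketch of how that particular
mojibake arises and how it can sometimes be reversed. The helper name
fix_mojibake is hypothetical, and the sketch assumes the text really was
UTF-8 that got decoded as Windows-1252; anything else is left unchanged.

```python
def fix_mojibake(text):
    """Try to undo UTF-8 bytes that were wrongly decoded as Windows-1252."""
    try:
        # Recover the original bytes, then decode them the way they
        # should have been decoded in the first place.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of mojibake (or not reversible): leave it alone.
        return text


# Reproduce the damage: a left double quotation mark (U+201C) encoded as
# UTF-8 and then misread as cp1252 becomes the three characters seen in
# the mapping file above.
garbled = "\u201c".encode("utf-8").decode("cp1252")
repaired = fix_mojibake(garbled)
```

This will not repair everything: a few byte values have no cp1252 mapping
in Python's codec, so some characters cannot be round-tripped this way.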

> Technically, what's happening is that your "\r" is literally a
> backslash followed by the letter r; the transformation of backslash
> sequences into single characters is part of Python source code
> parsing. (Incidentally, why do you want to change a carriage return
> into backslash-n? Seems odd.)
>
> Probably the easiest solution would be a simple and naive replace(),
> looking for some very specific strings and ignoring everything else.
> Easy to do, but potentially confusing down the track if someone tries
> something fancy :)
>
> # trim any trailing comments
> line = line.split('#', 1)[0].strip()
> # repeat this for as many backslash escapes as you want to handle
> line = line.replace(r"\r", "\r")
>
> Be aware that this, while simple, is NOT capable of handling escaped
> backslashes. In Python, "\\r" comes out the same as r"\r", but with
> this parser, it would come out the same as "\\\r". But it might be
> sufficient for you.

Thanks for the explanation, which has helped me understand the problem.
I also tried your approach, but wound up with output data in which every
single character was somehow escaped :-(
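
For the archive, a sketch of one way to handle the escaped-backslash case
that the plain replace() approach cannot. This is not what I ended up
using. The names ESCAPES, unescape and parse_mapping_line are made up for
illustration, and the "key = value" format with optional "#" comments is
assumed from the mapping file lines quoted above.

```python
import re

# Assumed set of escapes to support; extend as needed.
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "\\": "\\"}


def unescape(text):
    """Translate backslash escapes in a single left-to-right pass."""
    def repl(match):
        # Unknown escapes are left exactly as written.
        return ESCAPES.get(match.group(1), match.group(0))
    # Each match consumes the backslash AND the character after it, so
    # an escaped backslash is never re-read as the start of an escape.
    return re.sub(r"\\(.)", repl, text)


def parse_mapping_line(line):
    """Return an unescaped (key, value) pair, or None for an empty line."""
    # Caveat: a '#' inside a value would also be treated as a comment.
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    key, _, value = line.partition("=")  # split on the first '=' only
    return unescape(key.strip()), unescape(value.strip())


pair = parse_mapping_line(r"\r = \\n")
```

With the first line of my mapping file, pair should hold a carriage return
as the key and a literal backslash followed by "n" as the value.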

I've since decided I was being too obsessive in trying to load *everything*
from my mapping file, so I have simply hard-coded my two escaped-character
replacements for now and moved on to more important problems (i.e. the
Windoze character soup that comprises my data and which I have to clean
up!).

Thanks again,

-- 
Rob Hills
Waikiki, Western Australia
