| Subject | Re: Reading \n unescaped from a file |
|---|---|
| References | <55E65909.2080507@medimorphosis.com.au> <CAPTjJmqR8ZGXAa7BGwxP6kKJf+_FJkRxcLR61km+sn0FzGGVfA@mail.gmail.com> |
| From | Rob Hills <rhills@medimorphosis.com.au> |
| Date | 2015-09-04 00:24 +0800 |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.87.1441297494.8327.python-list@python.org> |
Hi Chris,
On 03/09/15 06:10, Chris Angelico wrote:
> On Wed, Sep 2, 2015 at 12:03 PM, Rob Hills <rhills@medimorphosis.com.au> wrote:
>> My mapping file contents look like this:
>>
>> \r = \\n
>> “ = "
> Oh, lovely. Code page 1252 when you're expecting UTF-8. Sadly, you're
> likely to have to cope with a whole pile of other mojibake if that
> happens :(
Yeah, tell me about it!!!
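For anyone following along, here is a minimal sketch of how the "“" sequence in the mapping file can arise. It assumes the original character was a left curly quote (U+201C) stored as UTF-8 and then decoded as code page 1252, which is what Chris's remark suggests. The variable names are only illustrative.

```python
# Assumption: the source character was U+201C (left double quotation mark).
original = "\u201c"

# Encoded as UTF-8 it becomes the three bytes E2 80 9C.
utf8_bytes = original.encode("utf-8")

# Decoding those bytes as cp1252 yields three separate characters.
mojibake = utf8_bytes.decode("cp1252")
print(mojibake)

# Reversing the mistake recovers the original, as long as every byte
# survived the round trip (some cp1252 byte values are undefined).
repaired = mojibake.encode("cp1252").decode("utf-8")
print(repaired == original)
```

The reverse step only works when the text was mis-decoded exactly once and no bytes were lost, so it is not a general cure for mixed-up data.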
> Technically, what's happening is that your "\r" is literally a
> backslash followed by the letter r; the transformation of backslash
> sequences into single characters is part of Python source code
> parsing. (Incidentally, why do you want to change a carriage return
> into backslash-n? Seems odd.)
>
> Probably the easiest solution would be a simple and naive replace(),
> looking for some very specific strings and ignoring everything else.
> Easy to do, but potentially confusing down the track if someone tries
> something fancy :)
>
> line = line.split('#')[0].strip()  # trim any trailing comments
> # repeat this for as many backslash escapes as you want to handle
> line = line.replace(r"\r", "\r")
>
> Be aware that this, while simple, is NOT capable of handling escaped
> backslashes. In Python, "\\r" comes out the same as r"\r", but with
> this parser, it would come out the same as "\\\r". But it might be
> sufficient for you.
Thanks for the explanation, which has helped me understand the problem.
I also tried your approach but wound up with output data that somehow
had every single character escaped :-(
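For reference, the limitation Chris mentions can be sketched roughly as follows. The function names `naive_unescape` and `unescape` and the `_ESCAPES` table are hypothetical, made up for this illustration, and the single-pass regex version is one possible alternative rather than anything proposed in the thread.

```python
import re

def naive_unescape(text):
    """The simple replace() approach: handles \\r and \\n only."""
    return text.replace(r"\r", "\r").replace(r"\n", "\n")

# Hypothetical table of the escapes we choose to recognise.
_ESCAPES = {"r": "\r", "n": "\n", "t": "\t", "\\": "\\"}

def unescape(text):
    """Single left-to-right pass, so an escaped backslash is consumed
    before the character after it can be mistaken for an escape."""
    def repl(match):
        # Unknown escapes are left exactly as they were written.
        return _ESCAPES.get(match.group(1), match.group(0))
    return re.sub(r"\\(.)", repl, text)

file_text = r"\\r"  # backslash, backslash, r as read from the file

print(repr(naive_unescape(file_text)))  # backslash + carriage return
print(repr(unescape(file_text)))        # backslash + letter r
```

Because the regex consumes each backslash together with the character that follows it, a value such as `\\n` from the mapping file comes out as a literal backslash followed by `n`, not as a newline.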
I've since decided I was being too obsessive trying to load *everything*
from my mapping file and have simply hard-coded my two escaped character
replacements for now and moved on to more important problems (i.e. the
Windoze Character soup that comprises my data and which I have to clean
up!).
Thanks again,
--
Rob Hills
Waikiki, Western Australia