| Started by | Rob Hills <rhills@medimorphosis.com.au> |
|---|---|
| First post | 2015-09-04 00:24 +0800 |
| Last post | 2015-09-04 00:24 +0800 |
| Articles | 1 — 1 participant |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: Reading \n unescaped from a file Rob Hills <rhills@medimorphosis.com.au> - 2015-09-04 00:24 +0800
| From | Rob Hills <rhills@medimorphosis.com.au> |
|---|---|
| Date | 2015-09-04 00:24 +0800 |
| Subject | Re: Reading \n unescaped from a file |
| Message-ID | <mailman.87.1441297494.8327.python-list@python.org> |
Hi Chris,
On 03/09/15 06:10, Chris Angelico wrote:
> On Wed, Sep 2, 2015 at 12:03 PM, Rob Hills <rhills@medimorphosis.com.au> wrote:
>> My mapping file contents look like this:
>>
>> \r = \\n
>> “ = "
> Oh, lovely. Code page 1252 when you're expecting UTF-8. Sadly, you're
> likely to have to cope with a whole pile of other mojibake if that
> happens :(
Yeah, tell me about it!!!
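For anyone following along, here is a minimal sketch of the kind of damage Chris describes, assuming the text was UTF-8 bytes that got decoded as code page 1252 somewhere upstream. The name `fix_mojibake` is hypothetical, and data with mixed or partial damage may not round-trip this cleanly.

```python
def fix_mojibake(text):
    """Try to undo UTF-8 bytes that were wrongly decoded as cp1252 (assumption)."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of damage, or only partly so: leave the text alone.
        return text

# The three characters on the left-hand side of the second mapping line,
# written as escapes so the example does not depend on the file's encoding.
garbled = "\u00e2\u20ac\u0153"
print(fix_mojibake(garbled) == "\u201c")  # left curly double quote
```

Text that is already clean either fails the round trip and is returned unchanged, or is pure ASCII and survives it untouched.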
> Technically, what's happening is that your "\r" is literally a
> backslash followed by the letter r; the transformation of backslash
> sequences into single characters is part of Python source code
> parsing. (Incidentally, why do you want to change a carriage return
> into backslash-n? Seems odd.)
>
> Probably the easiest solution would be a simple and naive replace(),
> looking for some very specific strings and ignoring everything else.
> Easy to do, but potentially confusing down the track if someone tries
> something fancy :)
>
> line = line.split('#')[0].strip()  # trim any trailing comments
> # repeat this for as many backslash escapes as you want to handle
> line = line.replace(r"\r", "\r")
>
> Be aware that this, while simple, is NOT capable of handling escaped
> backslashes. In Python, "\\r" comes out the same as r"\r", but with
> this parser, it would come out the same as "\\\r". But it might be
> sufficient for you.
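The escaped-backslash limitation described above can be avoided by decoding all escapes in one left-to-right pass instead of chaining replace() calls. What follows is only a sketch under assumptions: the `ESCAPES` table and the `unescape` name are hypothetical, and the set of escapes is a guess at what a mapping file like this might need.

```python
import re

# Escapes this sketch understands (an assumed set; extend as needed).
ESCAPES = {"\\": "\\", "n": "\n", "r": "\r", "t": "\t"}

def unescape(text):
    """Decode backslash escapes in a single left-to-right pass."""
    def repl(match):
        ch = match.group(1)
        # Unknown escapes are left exactly as written.
        return ESCAPES.get(ch, match.group(0))
    # Each backslash consumes the character after it, so an escaped
    # backslash can never be re-read as the start of another escape.
    return re.sub(r"\\(.)", repl, text)

print(repr(unescape(r"\r")))    # a real carriage return
print(repr(unescape(r"\\n")))   # a backslash followed by the letter n
```

With this approach the mapping-file text `\\n` decodes to a literal backslash followed by `n`, which matches how Python itself reads the source literal "\\n".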
Thanks for the explanation, which has helped me understand the problem.
I also tried your approach but wound up with output data that somehow
had every single character escaped :-(
I've since decided I was being too obsessive trying to load *everything*
from my mapping file and have simply hard-coded my two escaped character
replacements for now and moved on to more important problems (i.e. the
Windoze Character soup that comprises my data and which I have to clean
up!).
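A minimal sketch of what two hard-coded replacements along these lines might look like, assuming the two mappings quoted earlier in this message. The names `REPLACEMENTS` and `clean` are hypothetical and this is not the code actually used.

```python
# (old, new) pairs taken from the two mapping lines quoted above.
REPLACEMENTS = [
    ("\r", "\\n"),                # carriage return -> backslash + letter n
    ("\u00e2\u20ac\u0153", '"'),  # cp1252 mojibake of a left curly quote -> "
]

def clean(text):
    """Apply each hard-coded replacement in order."""
    for old, new in REPLACEMENTS:
        text = text.replace(old, new)
    return text

print(repr(clean("first line\rsecond line")))
```

Because the pairs live in Python source, the usual literal escaping applies and no run-time unescaping of file contents is needed.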
Thanks again,
--
Rob Hills
Waikiki, Western Australia