Performing a number of substitutions on a unicode string

Started by: Arnaud Delobelle <arnodel@gmail.com>
First post: 2011-12-20 14:02 +0000
Last post: 2011-12-20 14:02 +0000
Articles: 1 (1 participant)


#17575 — Performing a number of substitutions on a unicode string

From: Arnaud Delobelle <arnodel@gmail.com>
Date: 2011-12-20 14:02 +0000
Subject: Performing a number of substitutions on a unicode string
Message-ID: <mailman.3863.1324389753.27778.python-list@python.org>

Hi all,

I need to escape some Unicode text according to the following map:

escape_map = {
    u'\n': u'\\n',
    u'\t': u'\\t',
    u'\r': u'\\r',
    u'\f': u'\\f',
    u'\\': u'\\\\'
}

The simplest solution is to use str.replace:

def escape_text(text):
    # Escape backslashes first so the ones added below are not doubled.
    return (text.replace(u'\\', u'\\\\').replace(u'\n', u'\\n')
                .replace(u'\t', u'\\t').replace(u'\r', u'\\r')
                .replace(u'\f', u'\\f'))

But it creates four intermediate strings, which is quite inefficient
(I've got tens of MB worth of Unicode strings to escape).

I can think of another way using regular expressions:

import re

escape_ptn = re.compile(r"[\n\t\f\r\\]")

# escape_map is defined above
def escape_match(m, map=escape_map):
    return map[m.group(0)]

def escape_text(text, sub=escape_match):
    return escape_ptn.sub(sub, text)
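
As a quick sanity check (a sketch, not a benchmark), the two versions can be put side by side and compared on a small sample. The names escape_text_replace, escape_text_regex and sample are made up for this comparison; the map and the pattern are the ones given above.

```python
import re

escape_map = {
    u'\n': u'\\n',
    u'\t': u'\\t',
    u'\r': u'\\r',
    u'\f': u'\\f',
    u'\\': u'\\\\',
}

def escape_text_replace(text):
    # Chained replace: backslash goes first, otherwise the backslashes
    # introduced by the later replacements would be doubled.
    return (text.replace(u'\\', u'\\\\')
                .replace(u'\n', u'\\n')
                .replace(u'\t', u'\\t')
                .replace(u'\r', u'\\r')
                .replace(u'\f', u'\\f'))

escape_ptn = re.compile(r"[\n\t\f\r\\]")

def escape_text_regex(text):
    # Single pass: each matched character is looked up in escape_map.
    return escape_ptn.sub(lambda m: escape_map[m.group(0)], text)

# Hypothetical sample containing every character in the map.
sample = u'a\tb\\c\nd\re\ff'

# Both approaches should agree on the escaped result.
assert escape_text_replace(sample) == escape_text_regex(sample)
```

This only shows that the two give the same output; it says nothing about which one is faster on large inputs.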

Is there a better way?

Thanks,

-- 
Arnaud
