| Started by | Peter Otten <__peter__@web.de> |
|---|---|
| First post | 2011-12-04 11:22 +0100 |
| Last post | 2011-12-04 11:22 +0100 |
| Articles | 1 — 1 participant |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: How to generate java .properties files in python Peter Otten <__peter__@web.de> - 2011-12-04 11:22 +0100
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2011-12-04 11:22 +0100 |
| Subject | Re: How to generate java .properties files in python |
| Message-ID | <mailman.3261.1322994132.27778.python-list@python.org> |
Arnaud Delobelle wrote:
> On 3 December 2011 23:51, Peter Otten <__peter__@web.de> wrote:
>> Arnaud Delobelle wrote:
>>
>>> I need to generate some java .properties files in Python (2.6 / 2.7).
>>> It's a simple format to store key/value pairs e.g.
>>>
>>> blue=bleu
>>> green=vert
>>> red=rouge
>>>
>>> The key/value are unicode strings. The annoying thing is that the
>>> file is encoded in ISO 8859-1, with all non Latin1 characters escaped
>>> in the form \uHHHH (same as how unicode characters are escaped in
>>> Python).
>>>
>>> I thought I could use the "unicode_escape" codec. But it doesn't work
>>> because it escapes Latin1 characters with escape sequences of the form
>>> \xHH, which is not valid in a java .properties file.
>>>
>>> Is there a simple way to achieve this? I could do something like this:
>>>
>>> def encode(u):
>>>     """encode a unicode string in .properties format"""
>>>     return u"".join(u"\\u%04x" % ord(c) if ord(c) > 0xFF else c for c
>>>                     in u).encode("latin_1")
>>>
>>> but it would be quite inefficient as I have many to generate.
>>
>> >>> class D(dict):
>> ...     def __missing__(self, key):
>> ...         result = self[key] = u"\\u%04x" % key
>> ...         return result
>> ...
>> >>> d = D(enumerate(map(unichr, range(256))))
>> >>> u"ähnlich üblich nötig ΦΧΨ"
>> u'\xe4hnlich \xfcblich n\xf6tig \u03a6\u03a7\u03a8'
>> >>> u"ähnlich üblich nötig ΦΧΨ".translate(d)
>> u'\xe4hnlich \xfcblich n\xf6tig \\u03a6\\u03a7\\u03a8'
>> >>> u"ähnlich üblich nötig ΦΧΨ".translate(d).encode("latin1")
>> '\xe4hnlich \xfcblich n\xf6tig \\u03a6\\u03a7\\u03a8'
>
> A very nice solution - thanks, Peter.
I found another one:
>>> u"äöü ΦΧΨ".encode("latin1", "backslashreplace")
'\xe4\xf6\xfc \\u03a6\\u03a7\\u03a8'
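One caveat with "backslashreplace", as far as I can tell: on Python 3 (and wide Python 2 builds) a character outside the Basic Multilingual Plane comes out as \UHHHHHHHH, whereas Java's \uHHHH escapes are UTF-16 code units, so such a character needs a surrogate pair. The following is only a sketch, assuming Python 3; the handler name "java_escape" and the helper encode_properties are made up for illustration, and the code does not escape backslashes, '=', ':' or '#' in keys and values.

```python
import codecs


def _java_escape(err):
    """Hypothetical error handler: \\uHHHH escapes, surrogate pairs above U+FFFF."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    pieces = []
    for ch in err.object[err.start:err.end]:
        cp = ord(ch)
        if cp > 0xFFFF:
            # Split into a UTF-16 high/low surrogate pair.
            cp -= 0x10000
            pieces.append("\\u%04x\\u%04x" % (0xD800 + (cp >> 10),
                                              0xDC00 + (cp & 0x3FF)))
        else:
            pieces.append("\\u%04x" % cp)
    # Return the replacement text and the position where encoding resumes.
    return "".join(pieces), err.end


# The name "java_escape" is an assumption, not a built-in handler.
codecs.register_error("java_escape", _java_escape)


def encode_properties(text):
    """Encode text as ISO 8859-1 bytes, escaping everything else."""
    return text.encode("latin-1", "java_escape")


print(encode_properties("äöü ΦΧΨ"))
```

For strings that only contain BMP characters this should give the same bytes as the plain "backslashreplace" call above.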