


Re: xhtml encoding question

From Peter Otten <__peter__@web.de>
Subject Re: xhtml encoding question
Date Thu, 02 Feb 2012 12:02:21 +0100
Newsgroups comp.lang.python
Message-ID <mailman.5353.1328180546.27778.python-list@python.org>



Ulrich Eckhardt wrote:

> Am 01.02.2012 10:32, schrieb Peter Otten:
>> It doesn't matter for the OP (see Stefan Behnel's post), but if you want
>> to replace characters in a unicode string the best way is probably the
>> translate() method:
>>
>> >>> print u"\xa9\u2122"
>> ©™
>> >>> u"\xa9\u2122".translate({0xa9: u"&copy;", 0x2122: u"&trade;"})
>> u'&copy;&trade;'
>>
> 
> Yes, this is both more expressive and at the same time probably even
> more efficient.
> 
> 
> Question though:
> 
>  >>> u'abc'.translate({u'a': u'A'})
> u'abc'
> 
> I would call this a chance to improve Python. According to the
> documentation, using a string is invalid, but it neither raises an
> exception nor does it do the obvious and accept single-character strings
> as keys.
> 
> 
> Thoughts?

How could this raise an exception? You'd either need a typed dictionary (int
--> unicode), or translate() would have to verify that all keys are indeed
integers. The former would go against the grain of Python; the latter would
make the method less flexible, because currently the set of keys need not be
predefined:

>>> class A(object):
...     def __getitem__(self, key):
...         return unichr(key).upper()
...
>>> u"alpha".translate(A())
u'ALPHA'
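
For readers on Python 3, here is a rough sketch of the same idea, assuming chr() in place of unichr() and str in place of unicode. The class name VowelUpper is made up for illustration. It also shows that a table can decline to handle a character by raising LookupError, in which case translate() leaves that character unchanged:

```python
class VowelUpper:
    """Hypothetical translation table that computes its entries on demand."""

    def __getitem__(self, codepoint):
        # str.translate() looks up each character by its integer code point.
        ch = chr(codepoint)
        if ch in "aeiou":
            return ch.upper()
        # Raising LookupError tells translate() to keep the character as-is.
        raise LookupError(codepoint)


result = "alpha".translate(VowelUpper())
print(result)
```

Because the table is just an object with __getitem__, there is no fixed set of keys that translate() could check up front.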

Using unicode instead of integer keys would be nice but would break backwards
compatibility; accepting both could double the number of dictionary lookups.
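
As a side note for Python 3 users (not the Python 2 used in this thread): str.maketrans() accepts a dictionary whose keys are single-character strings and converts them to the integer keys that translate() expects. A minimal sketch, with the sample strings chosen only for illustration:

```python
# String keys passed straight to translate() never match, because the
# lookup is done with integer code points; the text comes back unchanged.
unchanged = "abc".translate({"a": "A"})

# str.maketrans() turns single-character string keys into code points.
table = str.maketrans({"a": "A", "\xa9": "&copy;", "\u2122": "&trade;"})
converted = "abc \xa9\u2122".translate(table)

print(unchanged)
print(sorted(table))
print(converted)
```

The conversion happens once when the table is built, so translate() itself still does a single integer lookup per character.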



