


Re: Letter class in re

From Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de>
Subject Re: Letter class in re
Date 2015-03-09 15:09 +0100
References <20150309061736.07e0d944@bigbox.christie.dr> <1425908003.40588.YahooMailBasic@web163805.mail.gq1.yahoo.com> <mdk9aj$357$1@ger.gmane.org>
Newsgroups comp.lang.python
Message-ID <mailman.212.1425910511.21433.python-list@python.org> (permalink)



On 03/09/2015 03:04 PM, Wolfgang Maier wrote:
> On 03/09/2015 02:33 PM, Albert-Jan Roskam wrote:
>> --------------------------------------------
>> On Mon, 3/9/15, Tim Chase <python.list@tim.thechases.com> wrote:
>>
>> "[^\d\W_]+" means something like "one or more (+) of 'not (a digit, a
>> non-word, an underscore)'".
>>
>
> Interesting (using Python 3.4 and
> U+2188     ROMAN NUMERAL ONE HUNDRED THOUSAND     ↈ):
>
>  >>> re.search('[^\d\W_]+', '\u2188', re.I | re.U)
> <_sre.SRE_Match object; span=(0, 1), match='ↈ'>
>
> ↈ and at least some other Nl (letter numbers) category characters seem
> to be part of \w (not part of \W).
>
> Would that be considered a bug?
>

Sorry for the potential confusion: I meant a possible bug in the pattern
search above (not in the definition of \w or \W).
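
For anyone who wants to reproduce the observation, here is a minimal sketch using only the standard library. It checks how U+2188 is classified and why the pattern lets it through. The helper is_letter is a hypothetical name of my own, not something from re or from the earlier posts, and filtering on the general category is just one possible stricter test, not a recommended fix.

```python
import re
import unicodedata

ch = '\u2188'  # ROMAN NUMERAL ONE HUNDRED THOUSAND

# General category of the character ("Nl" stands for "Number, letter").
category = unicodedata.category(ch)

# How re classifies the character on its own.
matches_word = re.fullmatch(r'\w', ch) is not None
matches_digit = re.fullmatch(r'\d', ch) is not None

# The pattern from the thread: "not a digit, not a non-word, not an underscore".
matches_pattern = re.fullmatch(r'[^\d\W_]+', ch) is not None

# str.isalpha() only accepts the letter categories (L*).
is_alpha = ch.isalpha()


def is_letter(c):
    """Hypothetical helper: True only if the general category starts with 'L'."""
    return unicodedata.category(c).startswith('L')


print(category, matches_word, matches_digit, matches_pattern, is_alpha, is_letter(ch))
```

The character counts as a word character but not as a decimal digit, so excluding \d from \w does not remove it. A per-character check such as str.isalpha() or the category test above would reject it, assuming "letter" is meant as the Unicode L* categories.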


