From: Tim Chase
Date: Mon, 1 Jun 2015 15:29:30 -0500
To: python-list@python.org
Newsgroups: comp.lang.python
Subject: Python re module and POSIX equivalence classes

Is Python supposed to support POSIX "equivalence classes"? I tried the following in Py2 and Py3:

>>> re.sub('[[=a=]]', 'A', 'aáàãâä', flags=re.U)
'aáàãâä'

which suggests that it doesn't (I would have expected "AAAAAA" as the result). Is there a way to get this behavior?

I found that Perl knows about them but treats them as an exception for now[1]. Supposedly GNU awk (and other GNU POSIXish tools) recognize character classes, as does Vim.

Thanks,

-tkc

[1] http://perldoc.perl.org/perlrecharclass.html
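
For what it's worth, here is one possible workaround, offered as a rough sketch rather than a real equivalence-class implementation. As far as I can tell, Python's re parses '[[=a=]]' as an ordinary set containing '[', '=' and 'a', followed by a literal ']', which is why nothing in the test string matches. The sketch below approximates "same base letter" by decomposing each character with unicodedata.normalize("NFD", ...) and dropping combining marks. The helper names base_char and sub_equivalent are made up for illustration, and the result is not locale-aware the way true POSIX equivalence classes are.

```python
import unicodedata


def base_char(ch):
    """Return ch with any combining marks removed (via NFD decomposition)."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def sub_equivalent(base, repl, text):
    """Replace each character whose base form equals `base` with `repl`.

    Hypothetical helper: approximates [[=base=]] using Unicode
    decomposition, not the locale's collation tables.
    """
    # Compose first so an accented letter is one code point, not
    # a base letter followed by a separate combining mark.
    text = unicodedata.normalize("NFC", text)
    return "".join(repl if base_char(ch) == base else ch for ch in text)


print(sub_equivalent("a", "A", "aáàãâä"))  # AAAAAA
```

This only handles single-character replacement. For anything resembling full regex support, the same base_char idea could be used to build an explicit character set such as '[aáàãâä]' to feed to re, at the cost of enumerating the candidate characters up front.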