From: Chris Angelico
Cc: "python-list@python.org"
Newsgroups: comp.lang.python
Date: Tue, 13 May 2014 23:55:57 +1000
Subject: Re: Why isn't my re.sub replacing the contents of my MS Word file?
In-Reply-To: <537222d8$0$29980$c3e8da3$5496439d@news.astraweb.com>

On Tue, May 13, 2014 at 11:49 PM, Steven D'Aprano wrote:
>
> This {EN DASH} is an n-dash.
>
> or:
>
> x\x9c\x0b\xc9\xc8,V\xa8v\xf5Spq\x0c\xf6\xa8U\x00r\x12
> \xf3\x14\xf2tS\x12\x8b3\xf4\x00\x82^\x08\xf8
>
>
> (that last one is the text passed through the zlib compressor)

I had to decompress that just to see what "text" you passed through
zlib, given that zlib is a *byte* compressor :) Turns out it's the
braced notation given above, encoded as ASCII/UTF-8.

ChrisA
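
For anyone who wants to repeat the check, here is a minimal sketch, assuming Python 3 and the standard zlib module. The two wrapped lines of quoted bytes are joined into one bytes literal; the variable names are purely illustrative and not from the thread.

```python
import zlib

# The quoted bytes, with the line wrap from the quoted message removed.
compressed = (
    b"x\x9c\x0b\xc9\xc8,V\xa8v\xf5Spq\x0c\xf6\xa8U\x00r\x12"
    b"\xf3\x14\xf2tS\x12\x8b3\xf4\x00\x82^\x08\xf8"
)

# zlib works on bytes, so decompressing yields bytes, not str.
raw = zlib.decompress(compressed)

# Every byte here is ASCII, so decoding as ASCII or UTF-8 gives the same str.
text = raw.decode("ascii")
print(text)  # This {EN DASH} is an n-dash.

# Going the other way, a str has to be encoded before it can be compressed.
try:
    zlib.compress(text)  # passing str instead of bytes
    str_rejected = False
except TypeError:
    str_rejected = True

# Encoding first works, and the round trip gives back the same bytes.
round_trip = zlib.decompress(zlib.compress(text.encode("ascii")))
```

Whether re-compressing reproduces the quoted bytes exactly may depend on the zlib build and compression level, so the sketch only relies on the round trip.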