From: Roy Smith
Newsgroups: comp.lang.python
Subject: Re: ASCII and Unicode [was Re: Managing Google Groups headaches]
Date: Fri, 6 Dec 2013 20:54:03 +0000 (UTC)

Steven D'Aprano <...pearwood.info> writes:

> Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about
> encodings and character sets. It doesn't just assume things are ASCII,
> but makes a half-hearted attempt to be charset-aware, but badly. I can
> only imagine that it was written back in the Dark Ages

Indeed. The basic codebase probably goes back 20 years. I'm posting this
from gmane, just so people don't think I'm a total luddite.

> When transmitting ASCII characters, the networking protocol could include
> various start and stop bits and parity codes. A single 7-bit ASCII
> character might be anything up to 12 bits in length on the wire.

Not to mention that some really old hardware used 1.5 stop bits!
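
For anyone who wants to see the framing arithmetic, here is a rough sketch. It assumes a UART-style asynchronous serial line (one start bit, seven data bits sent least significant bit first, an optional parity bit, then a stop interval); the thread doesn't name a specific protocol, so treat that layout as an assumption. The function name `frame_bits` and its parameters are made up for illustration. The point it shows is that the stop "bits" are really a length of idle time, which is why a value like 1.5 is possible.

```python
def frame_bits(ch, parity="even", stop_bits=1.0):
    """Frame one 7-bit ASCII character for an assumed UART-style line.

    Returns (bits, bit_times):
      bits      -- start bit, 7 data bits (LSB first), optional parity bit
      bit_times -- total frame length in bit periods, including the stop
                   interval, which may be fractional (e.g. 1.5)
    """
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")

    data = [(code >> i) & 1 for i in range(7)]  # least significant bit first
    bits = [0] + data                           # start bit is sent as 0

    if parity == "even":
        bits.append(sum(data) % 2)              # make the count of 1s even
    elif parity == "odd":
        bits.append(1 - sum(data) % 2)          # make the count of 1s odd
    elif parity != "none":
        raise ValueError("parity must be 'even', 'odd' or 'none'")

    # The stop interval is idle line time, not data, so it is only added
    # to the duration and never appears in the bit list.
    return bits, len(bits) + stop_bits


if __name__ == "__main__":
    print(frame_bits("A", parity="none", stop_bits=1)[1])    # 9.0
    print(frame_bits("A", parity="even", stop_bits=1.5)[1])  # 10.5
    print(frame_bits("A", parity="even", stop_bits=2)[1])    # 11.0
```

Under these assumptions a 7-bit character takes between 9 and 11 bit periods on the wire; other framings (more data bits, extra signalling) would push that higher.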