From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: Python NBSP DWIM
Date: Thu, 11 Jun 2015 13:37:59 +1000

On Thu, Jun 11, 2015 at 1:27 PM, Steven D'Aprano wrote:
> On Thu, 11 Jun 2015 01:05 pm, Chris Angelico wrote:
> [...]
>>> Why do the subtitles contain ZWNBSP in the first place? Surely they're
>>> not English subtitles?
>>
>> No, they're not :) The character comes up in the Cantonese and
>> Japanese subs for Once Upon A December.
>>
>> http://youtu.be/CEpcUeWP0bg
>> http://youtu.be/WFZAaHrHens
>>
>> Possibly some others in the series as well. It may well be a fault in
>> the subtitles, but most programs I've seen don't show U+FEFF as a big
>> fat box.
>
> I think that for backwards compatibility, applications (or fonts) are
> permitted to treat U+FEFF as a zero-width invisible character, so perhaps
> you can raise a feature request with VLC.

Yeah. Well, like I said - learn something new every day. I didn't know
it wasn't a bug.

(Though it'd still be a font issue, not a VLC one. With other fonts, it
comes up looking different, in some cases invisible. Unfortunately, the
fonts that look good aren't the fonts that have glyphs for all
characters, so I need to figure out why font substitution isn't working
right. But that's a separate issue.)

ChrisA
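
For anyone who wants to check whether their own subtitle text contains U+FEFF, something along these lines might do as a rough sketch. The sample string and the helper name strip_stray_zwnbsp are made up for illustration; nothing here is taken from the actual subtitle files or from VLC.

```python
import unicodedata

ZWNBSP = "\ufeff"  # U+FEFF, the character discussed above (also used as the BOM)


def strip_stray_zwnbsp(text):
    """Return (cleaned_text, count) with every U+FEFF removed.

    Hypothetical helper: it removes all occurrences, including a leading
    one, so only use it on text that has already been decoded.
    """
    count = text.count(ZWNBSP)
    return text.replace(ZWNBSP, ""), count


# Made-up sample line standing in for one subtitle cue.
sample = "Once\ufeff Upon A December"

print(unicodedata.name(ZWNBSP))      # the character's official Unicode name
print(unicodedata.category(ZWNBSP))  # its general category
cleaned, removed = strip_stray_zwnbsp(sample)
print(repr(cleaned), removed)
```

Whether stripping is the right thing to do depends on why the character is there in the first place; this only shows how to find it and take it out.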