From: Peter Pearson
Newsgroups: comp.lang.python
Subject: Re: Unicode normalisation [was Re: [beginner] What's wrong?]
Date: 8 Apr 2016 17:21:58 GMT

On Fri, 08 Apr 2016 16:00:10 +1000, Steven D'Aprano wrote:
> On Fri, 8 Apr 2016 02:51 am, Peter Pearson wrote:
>>
>> The Unicode consortium was certifiably insane when it went into the
>> typesetting business.
>
> They are not, and never have been, in the typesetting business. Perhaps
> characters are not the only things easily confused *wink*

Defining codepoints that deal with appearance but not with meaning is
going into the typesetting business.  Examples: ligatures, and spaces of
varying widths with specific typesetting properties like being non-breaking.

Typesetting done in MS Word using such Unicode codepoints will never be
more than a goofy approximation to real typesetting (e.g., TeX), but it
will cost a huge amount of everybody's time, with the current discussion
of ligatures in variable names being just a straw in the wind.

Getting all the world's writing systems into a single, coherent standard
was an extraordinarily ambitious, monumental undertaking, and I'm baffled
that the urge to broaden its scope in this irrelevant direction was
entertained at all.
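For concreteness, here is a minimal sketch of what the "ligatures in variable names" issue looks like in Python, using only the standard `unicodedata` module. The sample characters and the helper name `describe` are illustrative choices, not anything taken from the thread. It shows that NFC leaves a ligature and two typesetting spaces alone while NFKC folds them to plain equivalents, and that Python's parser applies NFKC to identifiers.

```python
import unicodedata

# Characters of the kind mentioned above: one ligature and two
# "typesetting" spaces (sample selection is an illustrative assumption).
samples = {
    "ligature fi": "\ufb01",     # LATIN SMALL LIGATURE FI
    "no-break space": "\u00a0",  # NO-BREAK SPACE
    "em space": "\u2003",        # EM SPACE
}


def describe(ch):
    """Return (official name, NFC form, NFKC form) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.normalize("NFC", ch),
        unicodedata.normalize("NFKC", ch),
    )


for label, ch in samples.items():
    name, nfc, nfkc = describe(ch)
    # !a prints ASCII-safe escapes, so the output survives any terminal.
    print(f"{label}: {name}  NFC={nfc!a}  NFKC={nfkc!a}")

# Python normalizes identifiers to NFKC while parsing, so a name typed
# with the ligature is stored under the plain two-letter spelling.
namespace = {}
exec("\ufb01le = 1", namespace)
print("file" in namespace)
```

The point of the sketch is only that these codepoints differ from their plain counterparts in presentation, which is why compatibility normalization can erase the difference.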
(Should this have been in cranky-geezer font?)

-- 
To email me, substitute nowhere->runbox, invalid->com.