Path: csiph.com!weretis.net!feeder9.news.weretis.net!panix!.POSTED.panix5.panix.com!qz!not-for-mail
From: Eli the Bearded <*@eli.users.panix.com>
Newsgroups: comp.os.linux.misc,alt.folklore.computers
Subject: Re: Recent history of vi
Date: Thu, 4 Dec 2025 07:00:08 -0000 (UTC)
Organization: Some absurd concept
Message-ID:
References: <10ga6r1$7ph$1@news.misty.com> <10gpatq$jpt$3@news.misty.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Date: Thu, 4 Dec 2025 07:00:08 -0000 (UTC)
Injection-Info: reader2.panix.com; posting-host="panix5.panix.com:166.84.1.5"; logging-data="3661"; mail-complaints-to="abuse@panix.com"
User-Agent: Vectrex rn 2.1 (beta)
X-Liz: It's actually happened, the entire Internet is a massive game of Redcode
X-Motto: "Erosion of rights never seems to reverse itself." -- kenny@panix
X-US-Congress: Moronic Fucks.
X-Attribution: EtB
XFrom: is a real address
Encrypted: double rot-13
Xref: csiph.com comp.os.linux.misc:78273 alt.folklore.computers:232366

In comp.os.linux.misc, Johnny Billquist wrote:
> I know that Unicode is here to stay. Said as much before. But it has
> introduced a whole range of problems that people tend to pretend don't
> exist. The most immediate one coming to my mind are all kind of scammers
> creating fake domains to phish stuff. Using known, trusted company
> names, but letters replaced by things that look visually equivalent, but
> actually are other characters, and then through those domains fool
> people to give information, such as passwords, account numbers, money,
> and god knows what else.

As opposed to scammers posting as J0HNNY BILLQUIST, or Johnny Bi11quist,
or JOHNNY BILLQUlST in ordinary ASCII. More alphabets compound the
problem, sure, but it was always there.

> A big part of the problem is that Unicode don't even seem to have known
> what problem is was supposed to solve. Was it about representing
> different characters that have different meanings?
> Was it about representing same characters but with different visual
> effects? Was it supposed to be some kind of generic system to modify
> characters through some clever system design?

It's pretty much never about "visual effects", although there are
semantic differences between some visually similar characters. Math is
a big offender in wanting ℤ to mean something different from Z or 𝐙.
But you could argue that Japanese style "fullwidth" Z is a visual
effect.

I would say the problem Unicode is trying to solve, albeit with some
inconsistency, is the communication of all written languages in a
standardized system of encoding. There are huge problems in that many
written languages have implicit presentation rules based on context.
The fullwidth Roman alphabet, for example, is there because English
letters in Japanese text are supposed to be the same size to fit the
grid of the surrounding material.

At different stages Unicode has solved this problem in different ways.
More recently there has been a trend towards encoding things with
combining characters (backspace overstrike style in the old manual
typewriter days) and with ligatures of a sort. Flags being represented
as a pair of "regional indicator" letters, where the letters are the
same country codes used in DNS, is an example of that.

Elijah
------
"Weird AI != Weird Al" being a confusable forming some recent jokes
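
For anyone who wants to poke at the points above, here is a rough
Python sketch using only the standard unicodedata module. It shows the
Z variants collapsing under NFKC compatibility normalization, a flag
built from two regional indicator letters, and a combining accent
composing under NFC. The names fold, flag, skeleton and TOY_CONFUSABLES
are made up for illustration, and the three-entry lookalike table is an
assumption of mine, not Unicode's published confusables data.

```python
import unicodedata

# The Z's mentioned in the post, plus the fullwidth form.
Z_VARIANTS = {
    "plain": "Z",
    "double-struck": "\u2124",    # the math "integers" Z
    "math bold": "\U0001D419",
    "fullwidth": "\uFF3A",
}


def fold(text):
    """Compatibility-normalize text (NFKC); font and width variants collapse."""
    return unicodedata.normalize("NFKC", text)


def flag(country_code):
    """Build a flag sequence from a two-letter code such as "JP"."""
    code = country_code.upper()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError("expected two ASCII letters")
    base = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A
    return "".join(chr(base + ord(c) - ord("A")) for c in code)


# Hypothetical toy table: just enough to catch the ASCII lookalikes in
# the post. A real check would need far more entries than this.
TOY_CONFUSABLES = str.maketrans({"0": "o", "1": "l", "i": "l"})


def skeleton(name):
    """Reduce a name to a crude lookalike-insensitive form."""
    return name.casefold().translate(TOY_CONFUSABLES)


if __name__ == "__main__":
    for label, ch in Z_VARIANTS.items():
        print(f"{label:14} U+{ord(ch):04X} folds to {fold(ch)!r}")

    for ch in flag("JP"):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

    # Combining character: "e" plus COMBINING ACUTE ACCENT composes to one
    # code point under NFC.
    composed = unicodedata.normalize("NFC", "e\u0301")
    print(len("e\u0301"), "code points ->", len(composed))

    names = ["Johnny Billquist", "J0HNNY BILLQUIST",
             "Johnny Bi11quist", "JOHNNY BILLQUlST"]
    print({skeleton(n) for n in names})
```

The skeleton idea (map every lookalike to one representative, then
compare) works the same way for pure ASCII as for mixed scripts, which
is the point made above: more alphabets only make the table bigger.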