| Started by | fir <profesor.fir@gmail.com> |
|---|---|
| First post | 2015-12-02 08:01 -0800 |
| Last post | 2015-12-06 13:45 +0000 |
| Articles | 20 on this page of 158 — 25 participants |
unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 08:01 -0800
Re: unicode is a fail me <self@example.org> - 2015-12-02 16:12 +0000
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 09:09 -0800
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 08:18 -0800
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 09:07 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 11:21 -0600
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 09:40 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 11:22 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 15:59 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 16:25 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 19:47 -0600
Re: unicode is a fail supercat@casperkitty.com - 2015-12-02 14:38 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 16:26 -0800
Re: unicode is a fail Tim Rentsch <txr@alumni.caltech.edu> - 2015-12-09 11:33 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-09 12:21 -0800
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 11:28 +0100
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 08:50 -0600
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 16:38 +0100
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 10:01 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-03 09:46 -0800
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-04 12:39 +0000
Re: unicode is a fail supercat@casperkitty.com - 2015-12-03 08:26 -0800
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-03 18:42 +0000
Re: unicode is a fail supercat@casperkitty.com - 2015-12-03 17:14 -0800
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 19:02 -0800
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-04 06:35 +0000
Re: unicode is a fail David Thompson <dave.thompson2@verizon.net> - 2015-12-28 05:11 -0500
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 10:24 -0600
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-03 22:37 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-04 11:32 +0100
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 11:10 -0600
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 09:24 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 13:10 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-02 19:45 +0000
Re: unicode is a fail Ian Collins <ian-news@hotmail.com> - 2015-12-03 09:08 +1300
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 14:10 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 11:27 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 15:21 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 15:18 -0800
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-04 12:45 +0000
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 09:43 -0800
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 11:40 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 12:19 -0800
Re: unicode is a fail Nobody <nobody@nowhere.invalid> - 2015-12-02 21:23 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 10:12 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 02:13 -0800
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 14:11 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 05:17 -0800
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 15:33 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 07:05 -0800
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 16:42 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 07:58 -0800
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 10:38 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 14:17 +0100
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-04 12:54 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-04 14:25 +0100
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-04 13:46 +0000
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-02 23:24 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-03 00:45 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 20:59 -0600
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 19:13 -0800
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-03 07:00 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-04 04:45 -0800
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-04 18:04 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-04 13:22 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-04 07:35 -0800
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-04 19:17 +0000
Re: unicode is a fail supercat@casperkitty.com - 2015-12-04 11:49 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 15:39 -0600
Re: unicode is a fail supercat@casperkitty.com - 2015-12-04 14:19 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-06 12:57 -0600
Re: unicode is a fail supercat@casperkitty.com - 2015-12-06 15:47 -0800
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-05 01:13 +0000
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-05 01:59 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-05 17:17 +0100
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-06 06:28 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-04 23:46 +0000
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-05 01:04 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 03:21 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-05 13:03 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-05 11:47 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 04:40 -0800
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-05 13:26 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-05 13:35 -0600
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-06 02:23 +0000
Re: unicode is a fail Udyant Wig <udyantw@gmail.com> - 2015-12-06 16:09 +0530
Re: unicode is a fail Xavier <zaz.colmant@free.fr> - 2015-12-05 15:45 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 07:42 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-05 16:32 -0800
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 18:11 -0800
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-06 02:19 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-06 13:09 +0000
Re: unicode is a fail Martin Shobe <martin.shobe@yahoo.com> - 2015-12-06 18:38 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-07 01:55 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-06 19:14 -0800
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-07 13:53 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-07 06:31 -0800
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-07 21:22 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-07 15:34 -0600
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-07 16:36 -0800
Re: unicode is a fail Lowell Gilbert <lgusenet@be-well.ilk.org> - 2015-12-08 11:40 -0500
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-08 17:18 +0000
Re: unicode is a fail "Osmium" <r124c4u102@comcast.net> - 2015-12-09 08:36 -0600
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-09 10:06 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-09 09:35 -0800
Re: unicode is a fail supercat@casperkitty.com - 2015-12-09 10:07 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-09 12:04 -0800
Re: unicode is a fail supercat@casperkitty.com - 2015-12-09 12:35 -0800
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-09 23:46 +0000
Re: unicode is a fail supercat@casperkitty.com - 2015-12-09 16:15 -0800
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-10 03:49 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-09 18:12 -0600
Re: unicode is a fail James Kuyper <jameskuyper@verizon.net> - 2015-12-09 13:12 -0500
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-09 12:12 -0800
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-10 20:48 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-09 23:44 +0000
Re: unicode is a fail Robert Wessel <robertwessel2@yahoo.com> - 2015-12-10 01:13 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-10 10:39 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-10 03:33 -0800
Re: unicode is a fail supercat@casperkitty.com - 2015-12-10 06:07 -0800
Re: unicode is a fail "Osmium" <r124c4u102@comcast.net> - 2015-12-10 08:21 -0600
Re: unicode is a fail Robert Wessel <robertwessel2@yahoo.com> - 2015-12-10 00:59 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-07 14:33 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-06 22:45 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-07 12:38 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-07 13:55 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-07 21:14 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-07 16:50 -0600
Re: unicode is a fail Robert Wessel <robertwessel2@yahoo.com> - 2015-12-07 02:38 -0600
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-06 07:34 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-06 00:24 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 19:49 -0600
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-05 21:32 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 13:50 -0800
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-05 22:15 +0000
Re: unicode is a fail James Kuyper <jameskuyper@verizon.net> - 2015-12-05 17:27 -0500
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-05 23:06 +0000
Re: unicode is a fail James Kuyper <jameskuyper@verizon.net> - 2015-12-05 18:29 -0500
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-05 23:50 +0000
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-06 06:38 +0000
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-06 13:33 +0000
Re: unicode is a fail James Kuyper <jameskuyper@verizon.net> - 2015-12-05 16:51 -0500
Re: unicode is a fail Ian Collins <ian-news@hotmail.com> - 2015-12-06 10:59 +1300
Re: unicode is a fail Ian Collins <ian-news@hotmail.com> - 2015-12-06 11:00 +1300
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-06 06:31 +0000
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 17:48 -0800
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-03 01:20 -0800
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-03 02:02 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 09:43 -0600
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-04 12:55 +0000
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-04 18:29 +0000
Re: unicode is a fail Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-12-05 16:42 +0000
Re: unicode is a fail Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-12-05 10:06 +0000
OT: Usenet (Was: unicode is a fail) Steve Thompson <stevet810@gmail.com> - 2015-12-05 20:41 +0000
Re: OT: Usenet (Was: unicode is a fail) Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 13:18 -0800
Re: unicode is a fail Udyant Wig <udyantw@gmail.com> - 2015-12-06 10:21 +0530
OT: Facebook (was Re: unicode is a fail) Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-12-06 08:51 +0000
Re: OT: Facebook (was Re: unicode is a fail) raltbos@xs4all.nl (Richard Bos) - 2015-12-06 13:45 +0000
| From | Keith Thompson <kst-u@mib.org> |
|---|---|
| Date | 2015-12-02 09:43 -0800 |
| Message-ID | <lnvb8gsq4f.fsf@kst-u.example.com> |
| In reply to | #77631 |
Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
[...]
> UTF-8 is the best compromise. But there are some problem that are very
> hard to avoid., like supporting archaic ash and thorn in English
> (mediaeval, ye olde coffee shoppe), when half the population think the
> latter is a y as in yellow.
UTF-8 has problems, but the ash and thorn characters aren't among them
as far as I know.
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
| From | Malcolm McLean <malcolm.mclean5@btinternet.com> |
|---|---|
| Date | 2015-12-02 11:40 -0800 |
| Message-ID | <9300d527-1233-4875-b028-0ef7a1c1e2a1@googlegroups.com> |
| In reply to | #77643 |
On Wednesday, December 2, 2015 at 5:43:39 PM UTC, Keith Thompson wrote:
> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
> [...]
> > UTF-8 is the best compromise. But there are some problem that are very
> > hard to avoid., like supporting archaic ash and thorn in English
> > (mediaeval, ye olde coffee shoppe), when half the population think the
> > latter is a y as in yellow.
>
> UTF-8 has problems, but the ash and thorn characters aren't among them
> as far as I know.
>
Pig-ignorant people will type "ye (as in yellow) olde coffee shoppe"
into a web browser and expect it to match. Pig-ignorant coffee shoppe
owners might even agree.
No system of encoding can support phenomena like that.
| From | Keith Thompson <kst-u@mib.org> |
|---|---|
| Date | 2015-12-02 12:19 -0800 |
| Message-ID | <lnfuzksiwp.fsf@kst-u.example.com> |
| In reply to | #77652 |
Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
> On Wednesday, December 2, 2015 at 5:43:39 PM UTC, Keith Thompson wrote:
>> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
>> [...]
>> > UTF-8 is the best compromise. But there are some problem that are very
>> > hard to avoid., like supporting archaic ash and thorn in English
>> > (mediaeval, ye olde coffee shoppe), when half the population think the
>> > latter is a y as in yellow.
>>
>> UTF-8 has problems, but the ash and thorn characters aren't among them
>> as far as I know.
>>
> Pig-ignorant people will type "ye (as in yellow) olde coffee shoppe"
> into a web browser and expect it to match. Pig-ignorant coffee shoppe
> owners might even agree.
> No system of encoding can support phenomena like that.
In what possible sense is that a UTF-8 problem?
(Completely off-topic, I reject your description of such people as
"pig-ignorant". It's a common and perfectly understandable error, and
it can be corrected by education, not by insults. And incidentally,
it's likely that it *will* match.)
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
| From | Nobody <nobody@nowhere.invalid> |
|---|---|
| Date | 2015-12-02 21:23 +0000 |
| Message-ID | <pan.2015.12.02.21.23.10.133000@nowhere.invalid> |
| In reply to | #77643 |
On Wed, 02 Dec 2015 09:43:28 -0800, Keith Thompson wrote:
>> UTF-8 is the best compromise. But there are some problem that are very
>> hard to avoid., like supporting archaic ash and thorn in English
>> (mediaeval, ye olde coffee shoppe), when half the population think the
>> latter is a y as in yellow.
>
> UTF-8 has problems, but the ash and thorn characters aren't among them as
> far as I know.

Once you provide a mechanism for a program to receive text that isn't limited to the Latin alphabet, you enable a wide range of issues which previously couldn't arise.

One of the most obvious examples is case conversion. Not all scripts have case. Those which do may have different rules to English.

E.g. Turkish (which primarily uses the Latin alphabet) has dotted and dotless I characters, each of which has upper and lower case versions. The lower-case dotted I and the upper-case dotless I are both present in the Latin alphabet. Unicode doesn't have separate characters for those characters depending upon whether the text in which they occur is Turkish.

For e.g. English text, the upper case version of "i" is "I" and vice-versa. For Turkish, the upper-case version of "i" is "İ" while the lower-case version of "I" is "ı".

More complex issues include right-to-left scripts (e.g. Hebrew, Arabic, Farsi), combining characters, and "what exactly is a 'character' anyhow?".

In short, when it comes to internationalisation, getting the text into or out of the program is the least of your problems, and is the only one for which Unicode really helps.
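The Turkish case rule described in this post can be sketched in a few lines of C. This is only an illustration, not anything posted in the thread: the function names `upper_cp` and `lower_cp` and the `turkish` flag are made up for the example, and the code covers just the ASCII letters plus the two Turkish-specific code points (U+0130 "İ" and U+0131 "ı"), using single-code-point mappings. Real code would normally rely on a library with full locale-aware case tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: upper-case a single code point.
   Handles only a-z plus the Turkish dotted/dotless i. */
uint32_t upper_cp(uint32_t cp, bool turkish)
{
    assert(cp <= 0x10FFFF);              /* outside this range is not Unicode */
    if (turkish && cp == 0x69)           /* 'i' -> U+0130 (dotted capital I)  */
        return 0x0130;
    if (cp == 0x0131)                    /* dotless i -> 'I' in any language  */
        return 0x49;
    if (cp >= 0x61 && cp <= 0x7A)        /* plain a-z                         */
        return cp - 0x20;
    return cp;
}

/* Hypothetical helper: lower-case a single code point. */
uint32_t lower_cp(uint32_t cp, bool turkish)
{
    assert(cp <= 0x10FFFF);
    if (turkish && cp == 0x49)           /* 'I' -> U+0131 (dotless small i)   */
        return 0x0131;
    if (cp == 0x0130)                    /* dotted capital I -> 'i' (simple)  */
        return 0x69;
    if (cp >= 0x41 && cp <= 0x5A)        /* plain A-Z                         */
        return cp + 0x20;
    return cp;
}
```

The point of the extra `turkish` parameter is that the same code point "i" maps to two different results, so the caller has to supply the language; the encoding alone cannot tell you which one is wanted.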
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2015-12-03 10:12 +0100 |
| Message-ID | <n3p0ta$b02$1@dont-email.me> |
| In reply to | #77631 |
On 02/12/15 17:18, Malcolm McLean wrote:
> On Wednesday, December 2, 2015 at 4:02:14 PM UTC, fir wrote:
>> Im personally still using asci in all my private apps and i shiver (a bit) to use unicode as i read
>> from time to time text that says unicode is a pain (at least in some situations)
>>
>> This directs me to think that unicode is in general a fail.. Unicode could go the way and
>> become something maybe even simpler than ascii but gone a bit in a wrong way of making
>> a lot additional mess
>>
>> I thing then that maybe one posible recovery scenerio is to use damn utf-32 only, everywhere
>> you coud and try to forget and deprecate the other part of the mess
>>
>> what do ya think?
>>
> If ascii had never achieved any traction outside of North America, then I think there would
> be a strong case for UTF-32. Reality is that there are masses and masses of ascii interfaces
> around, and it would be a nightmare job to track them all down and either rip them out
> or write little adapter functions to make them talk to the rest of the world in UTF-32.
>
> UTF-8 is the best compromise. But there are some problem that are very hard to avoid.,
> like supporting archaic ash and thorn in English (mediaeval, ye olde coffee shoppe),
> when half the population think the latter is a y as in yellow.
>

"þe olde coffee shoppe" - Unicode 00DE for the capital Þ, 00FE for the small thorn þ. These are standard letters in modern Icelandic as well as Old English. I type them with altgr+t and altgr+T, or compose+t+h. So thorn works perfectly well in UTF-8 - it's a fine example of a useful non-ASCII character.

I think you are wrong about half the population (presumably you mean the British population) thinking "ye olde" uses a "y". The proportion is /very/ much higher. In fact, I suspect that the only people who know that it should be a thorn are people who have studied mediaeval literature at a university level, people who are particularly interested in typography, and people who happened to have watched the QI episode where it turned up as a question.

But of course you are right that UTF-8 is the best compromise for an encoding if plain ASCII is not enough - and it's no surprise that it is dominant.
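To make the "thorn works perfectly well in UTF-8" point concrete, here is a minimal sketch of a UTF-8 encoder in C. The function name `utf8_encode` and its interface are assumptions chosen for this example, not an existing library call; the bit layout itself is the standard UTF-8 scheme.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-8 into out[0..3].
   Returns the number of bytes written, or 0 if cp is not a Unicode
   scalar value (a surrogate, or above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                         /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                        /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)        /* surrogates are not encodable */
        return 0;
    if (cp < 0x10000) {                      /* 3 bytes */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                    /* 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

With this sketch, the small thorn U+00FE comes out as the two bytes C3 BE and the capital thorn U+00DE as C3 9E, while plain ASCII such as "e" (U+0065) stays a single byte, which is why existing ASCII interfaces keep working.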
| From | Malcolm McLean <malcolm.mclean5@btinternet.com> |
|---|---|
| Date | 2015-12-03 02:13 -0800 |
| Message-ID | <39e42d64-58a5-466f-8564-e5b943f6f69b@googlegroups.com> |
| In reply to | #77714 |
On Thursday, December 3, 2015 at 9:12:39 AM UTC, David Brown wrote:
> On 02/12/15 17:18, Malcolm McLean wrote:
>
> "þe olde coffee shoppe" - Unicode 00DE for the capital Þ, 00FE for the
> small thorn þ. These are standard letters in modern Icelandic as well
> as Old English. I type them with altgr+t and altgr+T, or compose+t+h.
> So thorn works perfectly well in UTF-8 - it's a fine example of a useful
> non-ASCII character.
>
> I think you are wrong about half the population (presumably you mean the
> British population) thinking "ye olde" uses a "y". The proportion is
> /very/ much higher. In fact, I suspect that the only people who know
> that it should be a thorn are people who have studied mediaeval
> literature at a university level, people who are particularly interested
> in typography, and people who happened to have watched the QI episode
> where it turned up as a question.
>
There's a philosophical question here.
If the coffee shop owner who wrote the sign considers it to be a y,
and the customer attracted inside sipping coffee also considers it to
be a y, then can we hold that it is "really" a thorn?
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2015-12-03 14:11 +0100 |
| Message-ID | <n3pet8$tf2$1@dont-email.me> |
| In reply to | #77719 |
On 03/12/15 11:13, Malcolm McLean wrote:
> On Thursday, December 3, 2015 at 9:12:39 AM UTC, David Brown wrote:
>> On 02/12/15 17:18, Malcolm McLean wrote:
>>
>> "þe olde coffee shoppe" - Unicode 00DE for the capital Þ, 00FE for the
>> small thorn þ. These are standard letters in modern Icelandic as well
>> as Old English. I type them with altgr+t and altgr+T, or compose+t+h.
>> So thorn works perfectly well in UTF-8 - it's a fine example of a useful
>> non-ASCII character.
>>
>> I think you are wrong about half the population (presumably you mean the
>> British population) thinking "ye olde" uses a "y". The proportion is
>> /very/ much higher. In fact, I suspect that the only people who know
>> that it should be a thorn are people who have studied mediaeval
>> literature at a university level, people who are particularly interested
>> in typography, and people who happened to have watched the QI episode
>> where it turned up as a question.
>>
> There's a philosophical question here.
> If the coffee shop owner who wrote the sign considers it to be a y,
> and the customer attracted inside sipping coffee also considers it to
> be a y, then can we hold that it is "really" a thorn?
>

I think in such cases, it is a "y" - we don't really need philosophy. You would say your name is "Malcolm", because that's how /you/ spell it, that's how your parents spelt it when they named you, and that's how everyone else spells it. The fact that it comes from an old name "Máel Coluim" is interesting historically, but it does not mean that your name is "really" Máel Coluim. (I didn't use my name as an example, because not everyone will see the glyphs דָּוִד .)
| From | Malcolm McLean <malcolm.mclean5@btinternet.com> |
|---|---|
| Date | 2015-12-03 05:17 -0800 |
| Message-ID | <7d527eb1-5cbb-48e9-b57b-96bf7ca0f80b@googlegroups.com> |
| In reply to | #77731 |
On Thursday, December 3, 2015 at 1:11:32 PM UTC, David Brown wrote:
> On 03/12/15 11:13, Malcolm McLean wrote:
> > On Thursday, December 3, 2015 at 9:12:39 AM UTC, David Brown wrote:
> >> On 02/12/15 17:18, Malcolm McLean wrote:
> >>
> >> "þe olde coffee shoppe" - Unicode 00DE for the capital Þ, 00FE for the
> >> small thorn þ. These are standard letters in modern Icelandic as well
> >> as Old English. I type them with altgr+t and altgr+T, or compose+t+h.
> >> So thorn works perfectly well in UTF-8 - it's a fine example of a useful
> >> non-ASCII character.
> >>
> >> I think you are wrong about half the population (presumably you mean the
> >> British population) thinking "ye olde" uses a "y". The proportion is
> >> /very/ much higher. In fact, I suspect that the only people who know
> >> that it should be a thorn are people who have studied mediaeval
> >> literature at a university level, people who are particularly interested
> >> in typography, and people who happened to have watched the QI episode
> >> where it turned up as a question.
> >>
> > There's a philosophical question here.
> > If the coffee shop owner who wrote the sign considers it to be a y,
> > and the customer attracted inside sipping coffee also considers it to
> > be a y, then can we hold that it is "really" a thorn?
> >
>
> I think in such cases, it is a "y" - we don't really need philosophy.
> You would say your name is "Malcolm", because that's how /you/ spell it,
> that's how your parents spelt it when they named you, and that's how
> everyone else spells it. The fact that it comes from an old name "Máel
> Coluim" is interesting historically, but it does not mean that your name
> is "really" Máel Coluim. (I didn't use my name as an example, because
> not everyone will see the glyphs דָּוִד .)
>
Actually you don't see it with that font.
In the original Hebrew script your name consists of two triangles joined. So the
reinforcing strut pattern for a round shield.
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2015-12-03 15:33 +0100 |
| Message-ID | <n3pjnh$gio$1@dont-email.me> |
| In reply to | #77732 |
On 03/12/15 14:17, Malcolm McLean wrote:
> On Thursday, December 3, 2015 at 1:11:32 PM UTC, David Brown wrote:
>> On 03/12/15 11:13, Malcolm McLean wrote:
>>> On Thursday, December 3, 2015 at 9:12:39 AM UTC, David Brown wrote:
>>>> On 02/12/15 17:18, Malcolm McLean wrote:
>>>>
>>>> "þe olde coffee shoppe" - Unicode 00DE for the capital Þ, 00FE for the
>>>> small thorn þ. These are standard letters in modern Icelandic as well
>>>> as Old English. I type them with altgr+t and altgr+T, or compose+t+h.
>>>> So thorn works perfectly well in UTF-8 - it's a fine example of a useful
>>>> non-ASCII character.
>>>>
>>>> I think you are wrong about half the population (presumably you mean the
>>>> British population) thinking "ye olde" uses a "y". The proportion is
>>>> /very/ much higher. In fact, I suspect that the only people who know
>>>> that it should be a thorn are people who have studied mediaeval
>>>> literature at a university level, people who are particularly interested
>>>> in typography, and people who happened to have watched the QI episode
>>>> where it turned up as a question.
>>>>
>>> There's a philosophical question here.
>>> If the coffee shop owner who wrote the sign considers it to be a y,
>>> and the customer attracted inside sipping coffee also considers it to
>>> be a y, then can we hold that it is "really" a thorn?
>>>
>>
>> I think in such cases, it is a "y" - we don't really need philosophy.
>> You would say your name is "Malcolm", because that's how /you/ spell it,
>> that's how your parents spelt it when they named you, and that's how
>> everyone else spells it. The fact that it comes from an old name "Máel
>> Coluim" is interesting historically, but it does not mean that your name
>> is "really" Máel Coluim. (I didn't use my name as an example, because
>> not everyone will see the glyphs דָּוִד .)
>>
> Actually you don't see it with that font.
> In the original Hebrew script your name consists of two triangles joined. So the
> reinforcing strut pattern for a round shield.
>

It would be an understatement to say that my Hebrew is not as good as yours - it doesn't stretch much beyond "aleph" in mathematics, and recognising that something is written in Hebrew script. All I can say is that the דָּוִד looks on my machine pretty much the same as you get if you google for "David in Hebrew". But how that compares to early Hebrew scripts, I have no idea.
| From | Malcolm McLean <malcolm.mclean5@btinternet.com> |
|---|---|
| Date | 2015-12-03 07:05 -0800 |
| Message-ID | <d9d303fd-80fd-48af-991f-074c44868a92@googlegroups.com> |
| In reply to | #77741 |
On Thursday, December 3, 2015 at 2:33:47 PM UTC, David Brown wrote:
> On 03/12/15 14:17, Malcolm McLean wrote:
>
> It would be an understatement to say that my Hebrew is not as good as
> yours - it doesn't stretch much beyond "aleph" in mathematics, and
> recognising that something is written in Hebrew script. All I can say
> is that the דָּוִד looks on my machine pretty much the same as you get if
> you google for "David in Hebrew". But how that compares to early Hebrew
> scripts, I have no idea.
>
That script was introduced shortly before the time of Jesus. Moses and King David
used the paleo-Hebrew script. It was used for the last time seriously I think during
the bar Kochba revolt, when it appeared on the rebel coinage.
Dalet is a triangle, like a Greek delta. (It means "door", whether that means they
had triangular doors I don't know). Put two of them together, and you have the
star of David. It was also the pattern that appeared in a shield.

So "David" probably wasn't a given name. It was a nickname or nom de guerre,
meaning something like "our beloved leader".
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2015-12-03 16:42 +0100 |
| Message-ID | <n3pnpe$1on$1@dont-email.me> |
| In reply to | #77747 |
On 03/12/15 16:05, Malcolm McLean wrote:
> On Thursday, December 3, 2015 at 2:33:47 PM UTC, David Brown wrote:
>> On 03/12/15 14:17, Malcolm McLean wrote:
>>
>> It would be an understatement to say that my Hebrew is not as good as
>> yours - it doesn't stretch much beyond "aleph" in mathematics, and
>> recognising that something is written in Hebrew script. All I can say
>> is that the דָּוִד looks on my machine pretty much the same as you get if
>> you google for "David in Hebrew". But how that compares to early Hebrew
>> scripts, I have no idea.
>>
> That script was introduced shortly before the time of Jesus. Moses and King David
> used the paleo-Hebrew script. It was used for the last time seriously I think during
> the bar Kochba revolt, when it appeared on the rebel coinage.
> Dalet is a triangle, like a Greek delta. (It means "door", whether that means they
> had triangular doors I don't know). Put two of them together, and you have the
> star of David. It was also the pattern that appeared in a shield.
>

Do you know the historical connection between the Greek and Hebrew alphabets? There is enough similarity that there clearly was a connection or common ancestor (alpha, beta, aleph, beth), but also plenty of differences (Greek is left-to-right, Hebrew is right-to-left).

> So "David" probably wasn't a given name. It was a nickname or nom de guerre,
> meaning something like "our beloved leader".
>

That suits me :-)
| From | Malcolm McLean <malcolm.mclean5@btinternet.com> |
|---|---|
| Date | 2015-12-03 07:58 -0800 |
| Message-ID | <e853c81a-c034-464c-b589-faa905193451@googlegroups.com> |
| In reply to | #77753 |
On Thursday, December 3, 2015 at 3:43:05 PM UTC, David Brown wrote:
> On 03/12/15 16:05, Malcolm McLean wrote:
>
> Do you know the historical connection between the Greek and Hebrew
> alphabets? There is enough similarity that there clearly was a
> connection or common ancestor (alpha, beta, aleph, beth), but also
> plenty of differences (Greek is left-to-right, Hebrew is right-to-left).
>
The Egyptians invented writing, and the Israelites and Phoenicians took it from them. Phoenician is similar to Hebrew and had effectively the same alphabet. The Greeks then adopted writing from the Phoenicians.

However the Greek language is built on inflections - changing the ends of the words to adjust their meaning, whilst Hebrew and Phoenician are built on the root principle - you have three consonants that express a concept, then permute the vowels and prefixes and suffixes to take the concept through its grammatical forms. (So KTB means "writing" and you've got forms for book, scribe, library, the verb "to write", and so on).

It turns out that the first principle lends itself to left-right reading, the second to right-left reading, because the Hebrew reader understands the word as a whole whilst the Greek reader understands it as a developing concept. So the Hebrew reader wants it in his left visual field, where it goes to the right hemisphere first, the Greek in his right visual field. So the Greek alphabet was quickly inverted, and so most of the letters look different.

> > So "David" probably wasn't a given name. It was a nickname or nom de guerre,
> > meaning something like "our beloved leader".
> >
>
> That suits me :-)
[toc] | [prev] | [next] | [standalone]
| From | Richard Heathfield <rjh@cpax.org.uk> |
|---|---|
| Date | 2015-12-03 10:38 +0000 |
| Message-ID | <n3p5vf$s54$1@dont-email.me> |
| In reply to | #77714 |
On 03/12/15 09:12, David Brown wrote: <snip> > I think you are wrong about half the population (presumably you mean the > British population) thinking "ye olde" uses a "y". The proportion is > /very/ much higher. In fact, I suspect that the only people who know > that it should be a thorn are people who have studied mediaeval > literature at a university level, people who are particularly interested > in typography, and people who happened to have watched the QI episode > where it turned up as a question. Or, if you wished to be a little more precise, you might say: "In fact, I suspect that the only people who know that it should be a thorn are people who have studied mediaeval literature at a university level, people who are particularly interested in typography, people who happened to have watched the QI episode where it turned up as a question, and Richard." I haven't studied mediaeval literature at university level, I have the usual level of interest in typography but not enough interest for it to count as a /particular/ interest[1], and I have known about thorn since long before QI was a quite interesting gleam in John Lloyd's eye. [1] As any banker knows, interest is measured in per cent. My level of interest in typography is around 0.1% - just an ordinary typographical checking account. I think that to be /particularly/ interested in typography, you'd need to at least have the typographical equivalent of an ISA. -- Richard Heathfield Email: rjh at cpax dot org dot uk "Usenet is a strange place" - dmr 29 July 1999 Sig line 4 vacant - apply within
[toc] | [prev] | [next] | [standalone]
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2015-12-03 14:17 +0100 |
| Message-ID | <n3pf9d$v28$1@dont-email.me> |
| In reply to | #77722 |
On 03/12/15 11:38, Richard Heathfield wrote: > On 03/12/15 09:12, David Brown wrote: > > <snip> > >> I think you are wrong about half the population (presumably you mean the >> British population) thinking "ye olde" uses a "y". The proportion is >> /very/ much higher. In fact, I suspect that the only people who know >> that it should be a thorn are people who have studied mediaeval >> literature at a university level, people who are particularly interested >> in typography, and people who happened to have watched the QI episode >> where it turned up as a question. > > Or, if you wished to be a little more precise, you might say: > > "In fact, I suspect that the only people who know that it should be a > thorn are people who have studied mediaeval literature at a university > level, people who are particularly interested in typography, people who > happened to have watched the QI episode where it turned up as a > question, and Richard." > > I haven't studied mediaeval literature at university level, I have the > usual level of interest in typography but not enough interest for it to > count as a /particular/ interest[1], and I have known about thorn since > long before QI was a quite interesting gleam in John Lloyd's eye. Yes, there are always the know-it-alls - I should have included them in my list too. But I think they still would not bring the total proportion up to "about half". (I also knew about "ye" coming from "þe" - before reading the TeXbook and the METAFONT book, and long before watching QI. So my younger self was also missing from the list.) > > [1] As any banker knows, interest is measured in per cent. My level of > interest in typography is around 0.1% - just an ordinary typographical > checking account. I think that to be /particularly/ interested in > typography, you'd need to at least have the typographical equivalent of > an ISA. >
[toc] | [prev] | [next] | [standalone]
| From | raltbos@xs4all.nl (Richard Bos) |
|---|---|
| Date | 2015-12-04 12:54 +0000 |
| Message-ID | <56618cee.9818218@news.xs4all.nl> |
| In reply to | #77714 |
David Brown <david.brown@hesbynett.no> wrote: > I think you are wrong about half the population (presumably you mean the > British population) thinking "ye olde" uses a "y". The proportion is > /very/ much higher. In fact, I suspect that the only people who know > that it should be a thorn are people who have studied mediaeval > literature at a university level, people who are particularly interested > in typography, and people who happened to have watched the QI episode > where it turned up as a question. Which, by the way, got it wrong - as typical. It's a quite interesting show, but I wouldn't rely on it for facts. Richard
[toc] | [prev] | [next] | [standalone]
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2015-12-04 14:25 +0100 |
| Message-ID | <n3s44i$8gk$3@dont-email.me> |
| In reply to | #77816 |
On 04/12/15 13:54, Richard Bos wrote: > David Brown <david.brown@hesbynett.no> wrote: > >> I think you are wrong about half the population (presumably you mean the >> British population) thinking "ye olde" uses a "y". The proportion is >> /very/ much higher. In fact, I suspect that the only people who know >> that it should be a thorn are people who have studied mediaeval >> literature at a university level, people who are particularly interested >> in typography, and people who happened to have watched the QI episode >> where it turned up as a question. > > Which, by the way, got it wrong - as typical. It's a quite interesting > show, but I wouldn't rely on it for facts. > I have forgotten the details on this one in the show, but yes, their "facts" are not always accurate.
[toc] | [prev] | [next] | [standalone]
| From | Richard Heathfield <rjh@cpax.org.uk> |
|---|---|
| Date | 2015-12-04 13:46 +0000 |
| Message-ID | <n3s5bq$ec5$1@dont-email.me> |
| In reply to | #77822 |
On 04/12/15 13:25, David Brown wrote: > On 04/12/15 13:54, Richard Bos wrote: >> David Brown <david.brown@hesbynett.no> wrote: >> >>> I think you are wrong about half the population (presumably you mean the >>> British population) thinking "ye olde" uses a "y". The proportion is >>> /very/ much higher. In fact, I suspect that the only people who know >>> that it should be a thorn are people who have studied mediaeval >>> literature at a university level, people who are particularly interested >>> in typography, and people who happened to have watched the QI episode >>> where it turned up as a question. >> >> Which, by the way, got it wrong - as typical. It's a quite interesting >> show, but I wouldn't rely on it for facts. > > I have forgotten the details on this one in the show, but yes, their > "facts" are not always accurate. The idea of QI being a source of facts inescapably reminds me of Ed Zern's 1959 book review in "Field and Stream". (A Web search engine may prove to be of some service here.) -- Richard Heathfield Email: rjh at cpax dot org dot uk "Usenet is a strange place" - dmr 29 July 1999 Sig line 4 vacant - apply within
[toc] | [prev] | [next] | [standalone]
| From | Steve Thompson <stevet810@gmail.com> |
|---|---|
| Date | 2015-12-02 23:24 +0000 |
| Message-ID | <Lm9Wic.bdh.YZJEv@gmail.com> |
| In reply to | #77629 |
On Wed, Dec 02, 2015 at 08:01:52AM -0800, fir wrote:
> Im personally still using asci in all my private apps and i shiver
> (a bit) to use unicode as i read from time to time text that says
> unicode is a pain (at least in some situations)
> This directs me to think that unicode is in general a fail.. Unicode
> could go the way and become something maybe even simpler than ascii
> but gone a bit in a wrong way of making a lot additional mess

I don't know what you mean by this. How could Unicode be made simpler than ASCII?

> I thing then that maybe one possible recovery scenerio is to use damn
> utf-32 only, everywhere you coud and try to forget and deprecate the
> other part of the mess
> what do ya think?

In the long term, US-ASCII will be restricted to select niches, such as extremely tiny embedded processors. Or simulators of legacy hardware, etc. There is no reason to think that a text encoding scheme that cannot represent arbitrary language symbols will survive much into the future.

I would like to see Usenet evolve to the point where client software supports something like the open-document format currently used by Openoffice. It makes sense to eventually transition to an article format that supports a rich set of styles, etc. Something like [Display] Postscript might be another good choice. As a character-cell text-only medium Usenet has perhaps little future, never mind the socio-politics and socio-economics pushing everything to Web-2.0.

Whether or not this is true, UTF8 is a compact encoding scheme with a storage cost similar to that of ASCII. It is a natural successor, and absent a competing coding scheme that proves itself to be vastly superior, I would say it is here to stay.

Currently I concern myself with extended ASCII, but before long I will have to make the transition in my code. To do otherwise is silly as it alienates everyone who does not wish to use English.

Regards,

Steve Thompson

--
"If I had a nickel for every time some idiot called me about a computer problem that turned out to be user error, I would be able to retire and spend the rest of my days cultivating clues in my backyard hillside garden." -- MysteryDog in 24hoursupport.helpdesk.
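As an aside on the storage-cost point in the post above: the sketch below encodes a single Unicode code point as UTF-8 and returns how many bytes it took. ASCII characters stay at one byte each, which is why mostly-English text costs about the same in UTF-8 as in ASCII, while other scripts take two to four bytes per code point. The function name `utf8_encode` and its return convention are assumptions made for illustration; this is not code from any poster in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): encode one Unicode code
 * point as UTF-8. Writes up to 4 bytes into out and returns the number
 * of bytes written, or 0 if cp is a surrogate or lies above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                      /* ASCII range: 1 byte, unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not valid code points */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 4 bytes: 11110xxx then three 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode range */
}
```

Called with `0x41` ('A') this writes one byte; with `0x05D3` (Hebrew dalet) it writes two; with `0x20AC` (the euro sign) it writes three. UTF-32, by comparison, would spend four bytes on every one of them.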
[toc] | [prev] | [next] | [standalone]
| From | BartC <bc@freeuk.com> |
|---|---|
| Date | 2015-12-03 00:45 +0000 |
| Message-ID | <n3o36b$ud0$1@dont-email.me> |
| In reply to | #77682 |
On 02/12/2015 23:24, Steve Thompson wrote: > On Wed, Dec 02, 2015 at 08:01:52AM -0800, fir wrote: > In the long term, US-ASCII will be restricted to select niches, such > as extremely tiny embedded processors. Or simulators of legacy > hardware, etc. There is no reason to think that a text encoding > scheme that cannot represent arbitrary language symbols will survive > much into the future. An encoding scheme that can only describe English? (Or, with a few dozen additions to fill those spare 128 slots, French, Italian, Spanish, German, maybe even pin-yin.) You're right, that is no use at all... -- bARTC
[toc] | [prev] | [next] | [standalone]
| From | Stephen Sprunk <stephen@sprunk.org> |
|---|---|
| Date | 2015-12-02 20:59 -0600 |
| Message-ID | <n3ob1f$f28$1@dont-email.me> |
| In reply to | #77685 |
On 02-Dec-15 18:45, BartC wrote:
> On 02/12/2015 23:24, Steve Thompson wrote:
>> In the long term, US-ASCII will be restricted to select niches,
>> such as extremely tiny embedded processors. Or simulators of
>> legacy hardware, etc. There is no reason to think that a text
>> encoding scheme that cannot represent arbitrary language symbols
>> will survive much into the future.
>
> An encoding scheme that can only describe English?
>
> (Or, with a few dozen additions to fill those spare 128 slots,
> French, Italian, Spanish, German,

ISO 8859-1 (aka Latin-1) supports Albanian, Catalan, Danish, Dutch, English, Faeroese, Finnish, French, German, Galician, Irish, Icelandic, Italian, Norwegian, Portuguese, Spanish, and Swedish, except that it's missing Dutch's IJ/ij, French's Œ/œ and German's „/‟ because they ran out of code space.

Windows-1252 (aka Western European) adds Œ/œ and „/‟ plus a few other letters (but still not IJ/ij) and various symbols (including €) in place of the C1 control codes (0x80-0x9F).

However, that still leaves you without support for ~5983 of the world's ~6000 languages, which isn't much of an accomplishment.

> maybe even pin-yin.)

Not if you require precomposed characters. Pinyin would require at least 50 of them (plus most of ASCII), which would indeed fit in a single code page of its own, but there are few other scripts you could pair it with. That is why, until Unicode and combining accents, nearly everyone used Wade-Giles to transliterate Chinese.

Then you have to convince a billion people to change from the script they've been using for 4000+ years to Pinyin. Easy, right? The PRC actually tried that decades ago, and it was overwhelmingly rejected.

And then there are still over a billion people using _other_ scripts.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
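To make the relationship between the single-byte code page discussed above and Unicode concrete: the 256 values of ISO 8859-1 map one-to-one onto Unicode code points U+0000 to U+00FF, so converting Latin-1 text to UTF-8 needs no lookup table. The following is a minimal sketch under that assumption; the function name `latin1_to_utf8` and the caller-supplied output buffer are hypothetical choices for illustration. It is not correct for Windows-1252, whose bytes 0x80-0x9F hold printable characters rather than C1 controls and would need a table.

```c
#include <stddef.h>

/* Hypothetical helper (not from the thread): convert n bytes of
 * ISO 8859-1 text to UTF-8. Each Latin-1 byte value equals its Unicode
 * code point, so bytes below 0x80 are copied as-is and bytes 0x80-0xFF
 * become a two-byte sequence. The caller must provide room for 2*n
 * bytes in out. Returns the number of bytes written. */
size_t latin1_to_utf8(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            out[j++] = c;                                   /* ASCII subset */
        } else {
            out[j++] = (unsigned char)(0xC0 | (c >> 6));    /* lead byte: 0xC2 or 0xC3 */
            out[j++] = (unsigned char)(0x80 | (c & 0x3F));  /* continuation byte */
        }
    }
    return j;
}
```

For example, the four Latin-1 bytes of "café" (`'c'`, `'a'`, `'f'`, `0xE9`) come out as five UTF-8 bytes, with the `0xE9` expanding to `0xC3 0xA9`. The worst-case output size of 2*n is the reason the sketch leaves buffer allocation to the caller.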
[toc] | [prev] | [next] | [standalone]