| From | markspace <-@.> |
|---|---|
| Newsgroups | comp.lang.java.programmer |
| Subject | Re: unicode |
| Date | 2011-09-12 20:16 -0700 |
| Organization | A noiseless patient Spider |
| Message-ID | <j4mhtv$ppb$1@dont-email.me> (permalink) |
| References | <6c991195-ab57-417c-92e0-6d5ee1c451dc@dq7g2000vbb.googlegroups.com> <nfss679ije8c4r70tn9kmnr055vm6nfua0@4ax.com> <4e6e7a2a$0$309$14726298@news.sunsite.dk> <j4m4rs$l5g$1@dont-email.me> <88ff0d8c-af5f-4086-8232-26c80e5d8270@glegroupsg2000goo.googlegroups.com> |
On 9/12/2011 5:46 PM, Lew wrote:

> That would defeat its purpose, which is somewhat similar to the
> purpose of trigraphs in C, AIUI.

There are only nine trigraphs, so they're a lot harder to "hit" accidentally.

> That is, if your keyboard lacks
> certain characters, you can express source in "\u" notation and the
> source parser will read it correctly.

The problem is that \u is a lot more common than ??-. For example, \u also occurs in regex, which unfortunately seems to be the source of the OP's confusion.

> Its whole raison d'etre is to
> precede compilation, not to be part of it. So how could it go away?
> What would you do instead?

I'd make the \u sequence a string and character escape. \u000A would be interpreted the same as \n: it would put a newline in the string, not in the compiler input. Every other occurrence of \u (in comments or in the code proper) would be taken literally. Legacy code that relies on \u outside of strings and character constants would break.

If you need to type a character that your keyboard doesn't have, get your editor to recognize an escape sequence, not the compiler.

There are also digraphs in C, which are recognized only during tokenization, not as a preprocessing substitution. These are much better, as they are not recognized in string literals, character literals, or comments. I'd consider replacing \u for "missing keys" with something like C's digraphs. There are only five digraphs in C.

The presence of \u in comments is especially pernicious, imo. The Javadoc tool already has HTML escapes; we don't need a second, redundant method of specifying unusual characters.
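For anyone following along, here is a minimal sketch of the current behavior being discussed: the compiler translates \u escapes before it tokenizes anything, so they work in identifiers as well as in literals, and a regex that wants to see \u itself needs a doubled backslash. The class and method names (UnicodeEscapeDemo and so on) are made up for illustration. The comments deliberately spell out "backslash-u" in words, because a literal one inside a comment would itself be translated.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; none of these names come from the thread.
final class UnicodeEscapeDemo {

    // The escape is translated before the literal is tokenized, so the
    // string holds the single character 'A'.
    static String escapeInLiteral() {
        return "\u0041";
    }

    // The same translation happens outside literals: the declaration
    // below spells the identifier A with an escape, and the return
    // statement refers to it by its plain name.
    static int escapeInIdentifier() {
        int \u0041 = 42;
        return A;
    }

    // A doubled backslash is not a Unicode escape, so the compiler leaves
    // it alone and the regex engine receives backslash-u followed by the
    // four hex digits, which java.util.regex itself interprets as 'A'.
    static boolean regexSeesEscape() {
        return Pattern.matches("\\u0041", "A");
    }

    // Length of the doubled-backslash literal: backslash, u, and 4 digits.
    static int doubledBackslashLength() {
        return "\\u0041".length();
    }

    public static void main(String[] args) {
        System.out.println(escapeInLiteral());
        System.out.println(escapeInIdentifier());
        System.out.println(regexSeesEscape());
        System.out.println(doubledBackslashLength());
    }
}
```

The identifier case is the one the proposal above would break, and the regex case is the one that tends to trip people up: with a single backslash the compiler does the substitution, with two the regex engine does.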
unicode bob <bob@coolgroups.com> - 2011-09-12 12:24 -0700
Re: unicode Knute Johnson <nospam@knutejohnson.com> - 2011-09-12 14:04 -0700
Re: unicode Roedy Green <see_website@mindprod.com.invalid> - 2011-09-12 14:08 -0700
Re: unicode Arne Vajhøj <arne@vajhoej.dk> - 2011-09-12 17:31 -0400
Re: unicode markspace <-@.> - 2011-09-12 16:33 -0700
Re: unicode Lew <lewbloch@gmail.com> - 2011-09-12 17:46 -0700
Re: unicode markspace <-@.> - 2011-09-12 20:16 -0700
Re: unicode Roedy Green <see_website@mindprod.com.invalid> - 2011-09-12 22:05 -0700
Re: unicode Roedy Green <see_website@mindprod.com.invalid> - 2011-09-12 22:10 -0700
Re: unicode Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2011-09-13 07:18 +0000
Re: unicode Arne Vajhøj <arne@vajhoej.dk> - 2011-09-12 20:57 -0400
Re: unicode markspace <-@.> - 2011-09-12 19:51 -0700
Re: unicode Arne Vajhøj <arne@vajhoej.dk> - 2011-09-13 20:17 -0400
Re: unicode markspace <-@.> - 2011-09-13 19:32 -0700
Re: unicode Roedy Green <see_website@mindprod.com.invalid> - 2011-09-14 11:49 -0700
Re: unicode Paul Cager <paul.cager@googlemail.com> - 2011-09-13 04:05 -0700
Re: unicode Roedy Green <see_website@mindprod.com.invalid> - 2011-09-12 22:02 -0700
Re: unicode Arne Vajhøj <arne@vajhoej.dk> - 2011-09-13 20:30 -0400
Re: unicode Arne Vajhøj <arne@vajhoej.dk> - 2011-09-12 17:29 -0400
Re: unicode Lew <lewbloch@gmail.com> - 2011-09-12 15:48 -0700