Re: ascii char 26

From Joshua Cranmer <Pidgeot18@verizon.invalid>
Newsgroups comp.lang.java.programmer
Subject Re: ascii char 26
Date 2011-09-11 16:52 -0500
Organization A noiseless patient Spider
Message-ID <j4jakd$dfl$1@dont-email.me>
References <16f8836c-27b9-483b-a71f-61d7d6cfd188@i2g2000yqm.googlegroups.com>

On 9/11/2011 4:33 PM, bob wrote:
> Anyone know why ASCII char 26 is used in place of a hyphen in UTF-8?

The US-ASCII encoder can only encode characters in the range 0-127, 
i.e., the characters that exist in ASCII. Any other character is 
replaced with a substitution character; in this case, it looks like the 
charset implementation has chosen ^Z (ASCII 26, the SUB control 
character) as its "I don't know what this character is" marker. I would 
have guessed '?' instead, but I suppose they decided to go with the 
less commonly used variant.
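
To see which substitution byte a given platform actually uses, something 
like the following sketch may help. The class and method names 
(AsciiSubstitutionDemo, encodeAscii, replacementBytes) are made up for 
illustration, and the result depends on the charset implementation, so 
no particular output is claimed here.

```java
import java.nio.charset.StandardCharsets;

class AsciiSubstitutionDemo {
    /**
     * Encodes text as US-ASCII. String.getBytes(Charset) replaces every
     * unmappable character with the charset's default replacement bytes.
     */
    static byte[] encodeAscii(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    /** Returns the replacement byte sequence of a fresh US-ASCII encoder. */
    static byte[] replacementBytes() {
        return StandardCharsets.US_ASCII.newEncoder().replacement();
    }

    public static void main(String[] args) {
        // "a", EN DASH (U+2013, not in ASCII), "b"
        byte[] encoded = encodeAscii("a\u2013b");
        StringBuilder sb = new StringBuilder("encoded bytes:");
        for (byte b : encoded) {
            sb.append(' ').append(b & 0xFF);
        }
        System.out.println(sb);

        // Print the substitution byte(s) as decimal values; compare them
        // against 26 (^Z, SUB) and 63 ('?') to see what this platform uses.
        StringBuilder rep = new StringBuilder("replacement bytes:");
        for (byte b : replacementBytes()) {
            rep.append(' ').append(b & 0xFF);
        }
        System.out.println(rep);
    }
}
```

If the byte in the middle of the encoded output is 26, the platform is 
substituting ^Z; if it is 63, it is substituting '?'.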

My guess is that your input contains not an ASCII hyphen (U+002D) but 
one of its lookalikes, such as the minus sign (U+2212), the en dash 
(U+2013), or the em dash (U+2014); there may be others. These look 
much like a hyphen but have different Unicode code points, all of 
which fall outside ASCII.
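
One way to check that guess is to scan the input for anything outside 
0-127 and print its code point. This is only a sketch: NonAsciiScanner 
and findNonAscii are hypothetical names, and Character.getName needs 
Java 7 or later.

```java
import java.util.ArrayList;
import java.util.List;

class NonAsciiScanner {
    /**
     * Returns one description per non-ASCII code point in text, giving
     * its char index, its code point in U+XXXX form, and its Unicode name.
     */
    static List<String> findNonAscii(String text) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 127) {
                // Character.getName returns null for unassigned code points.
                hits.add(String.format("index %d: U+%04X %s",
                        i, cp, Character.getName(cp)));
            }
            // Advance by 1 or 2 chars so surrogate pairs are handled.
            i += Character.charCount(cp);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample input: a real hyphen, then an en dash and a minus sign.
        String sample = "a-b a\u2013b a\u2212b";
        for (String hit : findNonAscii(sample)) {
            System.out.println(hit);
        }
    }
}
```

Any line this prints for your own input marks a character that the 
US-ASCII encoder would have to substitute.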

-- 
Beware of bugs in the above code; I have only proved it correct, not 
tried it. -- Donald E. Knuth
