| Newsgroups | comp.lang.java.help |
|---|---|
| From | "Peter J. Holzer" <hjp-usenet2@hjp.at> |
| Subject | Re: Actual width of unicode chracters. |
| References | (2 earlier) <jr9h8n$ark$1@dont-email.me> <jr9j3p$qg0$1@tnews.hananet.net> <ac90eba4-9efe-4c35-9f2b-2590326f1fe5@googlegroups.com> <75foa9-b46.ln1@s.simpson148.btinternet.com> <4fd99db2$0$6118$426a74cc@news.free.fr> |
| Date | 2012-06-14 15:32 +0200 |
| Message-ID | <slrnjtjpvk.8bu.hjp-usenet2@hrunkner.hjp.at> |
On 2012-06-14 08:26, mayeul.marguet <mayeul.marguet@free.fr> wrote:
> On 14/06/2012 09:06, Steven Simpson wrote:
>> On 13/06/12 22:14, Lew wrote:
>>> Young wrote:
>>>> Thank you for the tries, I don't understand why I should use
>>>> codePointCount() method. The length() method gives same result. I
>>>> want to
>>> Not in general it doesn't.
>>>
>>> Read the Javadocs for the two methods and you'll see why.
>>
>> I've just read it, and not seen any surprises - it doesn't seem to have
>> anything to do with the OP's problem, counting spaces occupied by a
>> character when displayed on a console. Whether a code point takes up two
>> chars inside a program is unrelated to whether it takes up two display
>> positions on a console. Am I missing something?

Right. Counting Java chars is very wrong. Counting code points is less
wrong, but still wrong, since not every code point takes the same amount
of screen space: If we assume a text terminal, a code point may take up
0, 1 or 2 positions. You'll have to loop over the code points and add up
the width of each code point. (A method which does this probably already
exists, but it isn't codePointCount())

> From the start, what the OP calls a 'width' is actually the number of
> bytes used to represent the character.
> Korean characters might be big and large, but not to the point that
> they'd be twice as large as a monospace roman character. Even when using
> strange fonts where that would happen, they wouldn't be /exactly/ twice
> as large, and therefore trying to maintain alignment would be futile.

If the OP is trying to align them on a text terminal: No it wouldn't be
futile. Text terminals have a fixed character grid, and wide Asian
characters occupy 2 character cells. This is what the Unicode wide,
narrow, fullwidth and halfwidth properties are about (Somebody already
posted a link to the relevant specs). Just start a text terminal (xterm,
gnome-terminal, konsole, or whatever) and look at some text with Asian
characters.

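To illustrate the "loop over the code points and add up the width" idea, here is a minimal Java sketch. The class DisplayWidth and its methods are hypothetical names, not an existing library API. The range table is a hand-written approximation of the Unicode wide and fullwidth ranges, and treating control characters as width 0 is an assumption of this sketch; a real program would better take the East Asian Width property from the Unicode data files or from a library such as ICU4J.

```java
class DisplayWidth {

    /** Approximate test for "wide" or "fullwidth" code points (assumed ranges, not exhaustive). */
    static boolean isWide(int cp) {
        return (cp >= 0x1100 && cp <= 0x115F)                    // Hangul Jamo initial consonants
            || (cp >= 0x2E80 && cp <= 0xA4CF && cp != 0x303F)    // CJK radicals .. Yi
            || (cp >= 0xAC00 && cp <= 0xD7A3)                    // Hangul syllables
            || (cp >= 0xF900 && cp <= 0xFAFF)                    // CJK compatibility ideographs
            || (cp >= 0xFE30 && cp <= 0xFE6F)                    // CJK compatibility forms
            || (cp >= 0xFF00 && cp <= 0xFF60)                    // fullwidth forms
            || (cp >= 0xFFE0 && cp <= 0xFFE6)                    // fullwidth signs
            || (cp >= 0x20000 && cp <= 0x2FFFD)                  // supplementary ideographs
            || (cp >= 0x30000 && cp <= 0x3FFFD);
    }

    /** Number of terminal cells assumed for one code point: 0, 1 or 2. */
    static int codePointWidth(int cp) {
        if (Character.isISOControl(cp)) {
            return 0; // assumption: control characters occupy no cell
        }
        int type = Character.getType(cp);
        if (type == Character.NON_SPACING_MARK
                || type == Character.ENCLOSING_MARK
                || type == Character.FORMAT) {
            return 0; // combining marks and format characters
        }
        return isWide(cp) ? 2 : 1;
    }

    /** Sums the assumed width of every code point in the string. */
    static int displayWidth(String s) {
        return s.codePoints().map(DisplayWidth::codePointWidth).sum();
    }

    public static void main(String[] args) {
        String[] samples = {
            "abc",
            "\uD55C\uAE00",   // two Hangul syllables
            "e\u0301",        // 'e' followed by a combining acute accent
            "\uD840\uDC00"    // U+20000, one code point stored as two chars
        };
        for (String s : samples) {
            System.out.println("length=" + s.length()
                + " codePoints=" + s.codePointCount(0, s.length())
                + " width=" + displayWidth(s));
        }
    }
}
```

The four samples show how the three counts diverge: the Hangul string has 2 chars and 2 code points but needs 4 cells, the accented "e" has 2 chars and 2 code points but needs 1 cell, and U+20000 has 2 chars, 1 code point and needs 2 cells.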
> Some encodings for korean characters use two bytes for korean characters
> and one byte for ASCII characters.

Yes, but that's irrelevant for the OP's problem (although in some Asian
encodings the two-byte characters are exactly those which also occupy
two positions on the screen, so converting to such an encoding and
counting the number of bytes would yield the right answer).

        hp

-- 
   _  | Peter J. Holzer    | Deprecating human carelessness and
|_|_) | Sysadmin WSR       | ignorance has no successful track record.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |  -- Bill Code on asrg@irtf.org
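The encoding trick mentioned in the parenthesis can be sketched in Java as follows. This is only an illustration: the class EncodedWidth and the method byteWidth are hypothetical names, EUC-KR is picked here as one example of such an encoding, and the sketch assumes the JVM ships that charset (Charset.forName throws an exception otherwise). Characters that the charset cannot represent are replaced by the charset's replacement bytes (typically a single '?'), so the count is only meaningful for text that the encoding actually covers.

```java
import java.nio.charset.Charset;

class EncodedWidth {

    /**
     * Encodes the string in the given legacy charset and returns the byte count.
     * For EUC-KR, ASCII characters take one byte and Hangul syllables take two,
     * which matches their cell count on a text terminal.
     */
    static int byteWidth(String s, String charsetName) {
        Charset cs = Charset.forName(charsetName); // assumption: charset is installed
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String ascii = "abc";
        String hangul = "\uD55C\uAE00"; // two Hangul syllables
        System.out.println(byteWidth(ascii, "EUC-KR"));
        System.out.println(byteWidth(hangul, "EUC-KR"));
    }
}
```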
Actual width of unicode chracters. Young <ycp101@gmail.com> - 2012-06-13 02:23 +0000
Re: Actual width of unicode chracters. Roedy Green <see_website@mindprod.com.invalid> - 2012-06-12 20:42 -0700
Re: Actual width of unicode chracters. markspace <-@.> - 2012-06-12 20:45 -0700
Re: Actual width of unicode chracters. markspace <-@.> - 2012-06-13 00:58 -0700
Re: Actual width of unicode chracters. Young <ycp101@gmail.com> - 2012-06-13 08:30 +0000
Re: Actual width of unicode chracters. markspace <-@.> - 2012-06-13 08:45 -0700
Re: Actual width of unicode chracters. Lew <lewbloch@gmail.com> - 2012-06-13 14:14 -0700
Re: Actual width of unicode chracters. Steven Simpson <ss@domain.invalid> - 2012-06-14 08:06 +0100
Re: Actual width of unicode chracters. "mayeul.marguet" <mayeul.marguet@free.fr> - 2012-06-14 10:26 +0200
Re: Actual width of unicode chracters. "Peter J. Holzer" <hjp-usenet2@hjp.at> - 2012-06-14 15:32 +0200
Re: Actual width of unicode chracters. "mayeul.marguet" <mayeul.marguet@free.fr> - 2012-06-14 16:47 +0200
Re: Actual width of unicode chracters. "Peter J. Holzer" <hjp-usenet2@hjp.at> - 2012-06-16 20:39 +0200
Re: Actual width of unicode chracters. Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-06-13 00:00 -0400
Re: Actual width of unicode chracters. markspace <-@.> - 2012-06-13 00:24 -0700
Re: Actual width of unicode chracters. Steven Simpson <ss@domain.invalid> - 2012-06-13 10:24 +0100