
Re: number of bytes for each (uni)code point while using utf-8 as encoding ...

From rossum <rossum48@coldmail.com>
Newsgroups comp.lang.java.programmer
Subject Re: number of bytes for each (uni)code point while using utf-8 as encoding ...
Date 2012-07-11 16:09 +0100
Message-ID <0t4rv7d9lokdbm0287lf7h76u41a0qunvu@4ax.com>
References <1341965282.664308@nntp.aceinnovative.com>



On 11 Jul 2012 00:08:02 GMT, lbrt chx _ gemale wrote:

> how to get the length of the sequence of bytes defining a code point
Use a lookup table.

Start Code Point   End Code Point   Num Bytes  
----------------   --------------   ---------
     U+0000           U+007F            1
     U+0080           U+07FF            2
     U+0800           U+FFFF            3
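     U+10000          U+10FFFF          4

Note: the original UTF-8 definition (RFC 2279) extended the 4-byte range to U+1FFFFF and added 5-byte (U+200000 to U+3FFFFFF) and 6-byte (U+4000000 to U+7FFFFFFF) forms. RFC 3629 removed these, and Java code points stop at U+10FFFF, so 4 bytes is the maximum in practice.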
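
As a rough sketch, the lookup can be written in Java as a chain of range checks. It covers only the range Java can represent (code points up to U+10FFFF, i.e. Character.MAX_CODE_POINT), so it never returns more than 4. The class name Utf8Length and method name utf8Length are made up for this illustration; main cross-checks the result against the JDK's own UTF-8 encoder.

```java
import java.nio.charset.StandardCharsets;

class Utf8Length {
    /**
     * Returns the number of bytes UTF-8 uses for one code point,
     * following the range table. Hypothetical helper, not a JDK method.
     */
    static int utf8Length(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException(
                "Not a valid code point: " + Integer.toHexString(codePoint));
        }
        if (codePoint <= 0x7F) return 1;    // U+0000  .. U+007F
        if (codePoint <= 0x7FF) return 2;   // U+0080  .. U+07FF
        if (codePoint <= 0xFFFF) return 3;  // U+0800  .. U+FFFF
        return 4;                           // U+10000 .. U+10FFFF
    }

    public static void main(String[] args) {
        // Sample code points, one per row of the table (no surrogates,
        // since a lone surrogate cannot be encoded as UTF-8).
        int[] samples = {0x41, 0xE9, 0x20AC, 0x1F600};
        for (int cp : samples) {
            String s = new String(Character.toChars(cp));
            int encoded = s.getBytes(StandardCharsets.UTF_8).length;
            System.out.printf("U+%04X: table=%d, encoder=%d%n",
                              cp, utf8Length(cp), encoded);
        }
    }
}
```

To walk a whole String by code point rather than by char, String.codePointAt together with Character.charCount gives the step size, and utf8Length can be applied to each code point in turn.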


rossum
