

Groups > comp.lang.java.programmer > #15952 > unrolled thread

number of bytes for each (uni)code point while using utf-8 as encoding ...

Started by: lbrt chx _ gemale
First post: 2012-07-11 22:29 +0000
Last post: 2012-07-11 22:29 +0000
Articles: 1 (1 participant)




#15952 — number of bytes for each (uni)code point while using utf-8 as encoding ...

From: lbrt chx _ gemale
Date: 2012-07-11 22:29 +0000
Subject: number of bytes for each (uni)code point while using utf-8 as encoding ...
Message-ID: <1342045748.366554@nntp.aceinnovative.com>
~ 
 OK, in case someone is looking for something like this: there was one small statement that could (and should!) be optimized if you want the compiler to inline your code. The method contains no if statement at all (only the loop condition), so the sanity checks have to be done in the calling environment.
~ 
// __ 
class UniKd00{
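// __ unicode.org/versions/Unicode6.1.0/
// __ exclusive upper limits for the 1- to 6-byte sequences of the original UTF-8 scheme;
// __ note: Unicode itself ends at U+10FFFF, so UTF-8 as defined in RFC 3629 never needs more than 4 bytes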
 private final long[] lKpPntLims = new long[]{ 
           128
        , 2048
       , 65536
     , 2097152
    , 67108864
  , 2147483648L
 };

// __ 
 public final long lLastUniKd = lKpPntLims[lKpPntLims.length - 1];

// __ 
 public final String aUniKdVer = "6.1.0";

// __ 
 UniKd00(){}

// __ ((lKdPnt > -1) && (lKdPnt < lLastUniKd)) should be checked in calling env
// __ fewer conditional statements -> more inlinable by the compiler
 public  final int getKdPntLBytes(long lKdPnt){
  int iByts = 0;
  boolean Is = false;
  // __ the loop stops right after the first limit that exceeds lKdPnt
  for(; ((iByts < lKpPntLims.length) && !Is); ++iByts){ Is = (lKdPnt < lKpPntLims[iByts]); }// __ iByts in [1, lKpPntLims.length]
  return(iByts);
 }
 }
}
~ 
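 For readers who want to push the "no conditionals" idea further, here is a minimal sketch of a lookup with no loop and no explicit comparison in the method body. It is not from the original post: the class name Utf8LenTable and the table BYTES_BY_BIT_LENGTH are hypothetical, and it assumes the caller has already checked that the code point lies in 0 .. 0x10FFFF, exactly as the class above delegates its sanity checks. Whether the JIT really inlines it better has not been measured here.

```java
// Hypothetical sketch, not part of the original post.
final class Utf8LenTable {
    // Index = number of significant bits of the code point (0..21),
    // value = number of UTF-8 bytes needed for a code point of that size.
    private static final byte[] BYTES_BY_BIT_LENGTH = new byte[22];

    static {
        // The table is filled once at class-load time; the branches live here,
        // not in the lookup method.
        for (int bits = 0; bits < BYTES_BY_BIT_LENGTH.length; bits++) {
            BYTES_BY_BIT_LENGTH[bits] =
                (byte) (bits <= 7 ? 1 : bits <= 11 ? 2 : bits <= 16 ? 3 : 4);
        }
    }

    private Utf8LenTable() {}

    // Assumption: 0 <= codePoint <= 0x10FFFF has been checked by the caller.
    // Anything else either throws ArrayIndexOutOfBoundsException or is meaningless.
    static int utf8Length(int codePoint) {
        int bits = 32 - Integer.numberOfLeadingZeros(codePoint);
        return BYTES_BY_BIT_LENGTH[bits];
    }
}
```

 The bit length is used as the table index because the UTF-8 boundaries (0x80, 0x800, 0x10000) are all powers of two, so the highest set bit alone decides the sequence length.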
 The test harness for UniKd00 in the calling environment looks like this:
~ 
    lKdPnt = (long)" ... get() codepoint";
    if((lKdPnt > -1) && (lKdPnt < UniKd.lLastUniKd)){
     System.out.printf("// __ |%2d|%10d|%1d|\n", l, lKdPnt, UniKd.getKdPntLBytes(lKdPnt));
    }
    else{ throw new IOException("// __ Code point not mapped by Unicode Standard " + UniKd.aUniKdVer + "! lKdPnt: |" + lKdPnt + "|"); }
~ 
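 Since the harness above is only a fragment, here is a self-contained sketch of the same idea that can be compiled and run as is. It is an illustration under stated assumptions, not the author's code: the class Utf8Len and its methods are hypothetical names, the range check is narrowed to what RFC 3629 allows (0 .. 0x10FFFF without the surrogates, so at most 4 bytes), and the result is cross-checked against the JDK's own UTF-8 encoder.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original post.
final class Utf8Len {
    // Exclusive upper limits for 1-, 2-, 3- and 4-byte sequences (RFC 3629).
    private static final int[] LIMITS = {0x80, 0x800, 0x10000, 0x110000};

    private Utf8Len() {}

    // Returns the number of UTF-8 bytes for one Unicode scalar value.
    static int utf8Length(int codePoint) {
        // Sanity checks done here instead of in the caller: reject values
        // outside 0..0x10FFFF and the surrogate range, which UTF-8 cannot encode.
        if (!Character.isValidCodePoint(codePoint)
                || (codePoint >= 0xD800 && codePoint <= 0xDFFF)) {
            throw new IllegalArgumentException(
                "Not a Unicode scalar value: " + codePoint);
        }
        int n = 0;
        while (codePoint >= LIMITS[n]) {
            n++;
        }
        return n + 1;
    }

    // Same question answered by the JDK encoder, used only as a cross-check.
    // It does no validation of its own.
    static int utf8LengthViaJdk(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        int[] samples = {0x41, 0xE9, 0x20AC, 0x1F600};
        for (int cp : samples) {
            System.out.printf("// __ |U+%04X|%1d|%1d|%n",
                cp, utf8Length(cp), utf8LengthViaJdk(cp));
        }
    }
}
```

 The two columns printed by main should agree for every valid sample; a mismatch would point at a wrong limit in the table.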
> Would you also disclose why you need that information, or rather what you want to do with it?  I don't see the use case.
~ 
 Well, I was probably so deep into the things I mentioned that I "naturally" thought it should be part of the API.
~ 
 lbrtchx




