Re: How to handle LARGE UTF-8 file

From stevegdula@yahoo.com
Newsgroups comp.lang.basic.visual.misc
Subject Re: How to handle LARGE UTF-8 file
Date 2012-03-08 17:51 -0800
Organization http://groups.google.com
Message-ID <17156310.66.1331257903071.JavaMail.geo-discussion-forums@vbkc1>
References <29897294.1014.1331222704653.JavaMail.geo-discussion-forums@vblb5> <jjb6ma$4nq$1@speranza.aioe.org> <jjb6uu$5h2$1@speranza.aioe.org>



Farnsworth,

The byte order in your first reply actually seems to match my sample data:

Character code 254 (&HFE, outside the 7-bit ASCII range)
UTF-8 two-byte representation: 1100 0011 1011 1110 = &HC3BE
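
To double-check the arithmetic, here is a minimal sketch in Python (not VB, purely to verify the bit layout). The function name utf8_two_byte is made up for illustration; it only covers the two-byte range U+0080..U+07FF.

```python
def utf8_two_byte(code_point):
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("two-byte form covers U+0080..U+07FF only")
    # Lead byte: marker bits 110, followed by the top 5 bits of the code point.
    lead = 0b11000000 | (code_point >> 6)
    # Trail byte: marker bits 10, followed by the low 6 bits of the code point.
    trail = 0b10000000 | (code_point & 0b00111111)
    return bytes([lead, trail])


encoded = utf8_two_byte(254)
print(encoded.hex().upper())  # C3BE
```

This agrees with Python's built-in encoder, chr(254).encode("utf-8"), which gives the same two bytes, &HC3 followed by &HBE.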

I haven't digested the detailed UTF-8 Wikipedia explanation yet, and hopefully I won't have to unless I end up needing to write my own UTF-8 record decoder.

I am hoping to merely strip out the byte order mark (BOM) &HEFBBBF, scan for the end-of-record marker &H0D0A (one line = one record), and pass each record to the aforementioned API call.
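
A rough sketch of that plan, again in Python rather than VB, and only as an illustration under some assumptions: the file is read in binary chunks so it never has to fit in memory, the names iter_records and chunk_size are invented here, and the record.decode("utf-8") call in the usage note is just a stand-in for whatever conversion API is actually used.

```python
import io

BOM = b"\xef\xbb\xbf"  # UTF-8 byte order mark (&HEFBBBF)
EOR = b"\r\n"          # end of record (&H0D0A)


def iter_records(stream, chunk_size=64 * 1024):
    """Yield raw UTF-8 records (without the CR LF) from a binary stream."""
    # Drop the BOM if the stream starts with one; otherwise keep those bytes.
    head = stream.read(len(BOM))
    buffer = b"" if head == BOM else head
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        records = buffer.split(EOR)
        # The last piece may be an incomplete record (or a lone CR whose LF
        # arrives in the next chunk), so carry it over.
        buffer = records.pop()
        for record in records:
            yield record
    if buffer:
        yield buffer  # final record with no trailing CR LF


sample = BOM + "\u00fe one\r\ntwo\r\nlast".encode("utf-8")
records = list(iter_records(io.BytesIO(sample), chunk_size=4))
```

Each yielded record can then be converted on its own, for example with record.decode("utf-8"). Splitting on the raw bytes is safe because &H0D and &H0A never occur inside a multi-byte UTF-8 sequence: every byte of such a sequence is &H80 or higher.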

Thanks,

~Steve

On Thursday, March 8, 2012 3:05:30 PM UTC-6, Farnsworth wrote:
> Farnsworth wrote:
> > Besides what others suggested, check this link to see how the
> > characters are encoded:
> >
> > http://en.wikipedia.org/wiki/Utf-8#Description
> >
> > So ASCII 254(1111 1110) =
> >
> > Byte 1: 110 00011 = &HC3
> > Byte 2: 10 111110 = &HBE
> 
> I made a mistake in the byte order, so it should be the other way around:
> 
> Byte 1: 110 11110 = &HDE
> Byte 2: 10 000111 = &H87
