| From | "Farnsworth" <nospam@nospam.com> |
|---|---|
| Newsgroups | comp.lang.basic.visual.misc |
| Subject | Re: How to handle LARGE UTF-8 file |
| Date | 2012-03-08 23:32 -0500 |
| Organization | Aioe.org NNTP Server |
| Message-ID | <jjc14v$v0i$1@speranza.aioe.org> |
| References | <29897294.1014.1331222704653.JavaMail.geo-discussion-forums@vblb5> <jjb6ma$4nq$1@speranza.aioe.org> <jjb6uu$5h2$1@speranza.aioe.org> <17156310.66.1331257903071.JavaMail.geo-discussion-forums@vbkc1> |
stevegdula@yahoo.com wrote:

> Farnsworth,
>
> Your first reply, byte order actually seems to match my sample data.
>
> ASCII(254)
> UTF-8 Two Byte Representation: 1100 0011 1011 1110 &HC3BE
>
> I haven't currently digested the detailed UTF-8 Wiki explanation yet
> and I hopefully won't have to unless I end up needing to write my own
> UTF-8 record decoder.
>
> I am hoping to merely strip out the Byte Order Mark(BOM)
> &HEFBBBF,inspect for end of record &H0D0A (one line = one record),
> and pass that to the afore mentioned API call.

If you look at the list in the Wikipedia article, you will notice that every byte of a multi-byte character is >= 128. So you can read a large chunk, 1 MB or more, and then look at its last byte: if it is < 128, the chunk ends on a character boundary; if it is >= 128, the chunk may end in the middle of a character and you may need to read up to three more bytes.

As for CR LF, InStrB can be used on byte arrays. Note that vbCrLf is a Unicode string (0D 00 0A 00 in memory), so convert it to single bytes first, otherwise InStrB will not find the 0D 0A pair in UTF-8 data. Example:

    Debug.Print InStrB(arr, StrConv(vbCrLf, vbFromUnicode))

Finally, check the ParseCSV01 routine at this page to parse the lines:

http://www.xbeat.net/vbspeed/c_ParseCSV.php
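The chunk-reading advice above can be sketched roughly as follows. This is only an illustration, written in Python rather than VB6 to keep it short; the names `incomplete_tail_len` and `iter_records` and the 1 MB default are made up for this example and do not come from the thread. The first helper applies the ">= 128" check to the end of a chunk. The second strips the BOM and splits on CR LF, carrying any unfinished record over to the next chunk. Since CR and LF are both below 128, they never occur inside a multi-byte character, so cutting at the last CR LF also avoids splitting a character.

```python
import io

BOM = b"\xef\xbb\xbf"   # UTF-8 byte order mark (&HEFBBBF)
CRLF = b"\r\n"          # record terminator as single bytes (&H0D0A)


def incomplete_tail_len(buf):
    """Count trailing bytes of buf that begin a UTF-8 character but do not finish it."""
    if not buf or buf[-1] < 0x80:
        return 0                      # last byte is ASCII: chunk ends on a boundary
    # Step back over up to 3 continuation bytes (10xxxxxx) to reach the lead byte.
    i = len(buf) - 1
    while i > 0 and len(buf) - i < 4 and (buf[i] & 0xC0) == 0x80:
        i -= 1
    lead = buf[i]
    if lead >= 0xF0:
        need = 4
    elif lead >= 0xE0:
        need = 3
    elif lead >= 0xC0:
        need = 2
    else:
        return 0                      # no lead byte found: malformed input, do not guess
    have = len(buf) - i
    return have if have < need else 0


def iter_records(stream, chunk_size=1 << 20):
    """Yield decoded records from a binary stream, one per CR LF terminated line."""
    head = stream.read(len(BOM))
    carry = b"" if head == BOM else head   # drop the BOM if the file starts with one
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = carry + chunk
        cut = buf.rfind(CRLF)
        if cut < 0:
            carry = buf                    # no complete record yet, keep reading
            continue
        for line in buf[:cut].split(CRLF):
            yield line.decode("utf-8")
        carry = buf[cut + len(CRLF):]      # unfinished record, completed by next chunk
    if carry:
        yield carry.decode("utf-8")        # final record without a trailing CR LF


if __name__ == "__main__":
    sample = BOM + "id,name\r\n1,þorn\r\n2,€uro".encode("utf-8")
    for record in iter_records(io.BytesIO(sample), chunk_size=5):
        print(record)
```

In VB6 the same idea would apply to a Byte array filled with binary Get, with each complete record then handed to whatever UTF-8 conversion call is already in use.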
How to handle LARGE UTF-8 file stevegdula@yahoo.com - 2012-03-08 08:05 -0800
Re: How to handle LARGE UTF-8 file Deanna Earley <dee.earley@icode.co.uk> - 2012-03-08 16:55 +0000
Re: How to handle LARGE UTF-8 file "Bob Butler" <bob_butler@cox.invalid> - 2012-03-08 10:13 -0800
Re: How to handle LARGE UTF-8 file stevegdula@yahoo.com - 2012-03-08 10:49 -0800
Re: How to handle LARGE UTF-8 file "Farnsworth" <nospam@nospam.com> - 2012-03-08 16:00 -0500
Re: How to handle LARGE UTF-8 file "Farnsworth" <nospam@nospam.com> - 2012-03-08 16:05 -0500
Re: How to handle LARGE UTF-8 file stevegdula@yahoo.com - 2012-03-08 17:51 -0800
Re: How to handle LARGE UTF-8 file "Farnsworth" <nospam@nospam.com> - 2012-03-08 23:32 -0500
Re: How to handle LARGE UTF-8 file Schmidt <sss@online.de> - 2012-03-09 07:32 +0100
Re: How to handle LARGE UTF-8 file "Farnsworth" <nospam@nospam.com> - 2012-03-09 13:40 -0500
Re: How to handle LARGE UTF-8 file stevegdula@yahoo.com - 2012-03-14 08:54 -0700