| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Newsgroups | comp.lang.java.programmer |
| Subject | Re: replace extended characters |
| Date | 2011-02-11 15:11 -0800 |
| Organization | Canadian Mind Products |
| Message-ID | <0bgbl698d19hku2vlldf5rldbsebis933u@4ax.com> (permalink) |
| References | <15bd3363-c781-487b-98d5-2243eff7cc8f@24g2000yqa.googlegroups.com> <i4gbl6houh7bon6rvfjii3rdel829ql2hi@4ax.com> |
On Fri, 11 Feb 2011 15:07:22 -0800, Roedy Green <see_website@mindprod.com.invalid> wrote, quoted or indirectly quoted someone who said :

>Have a look at http://mindprod.com/products1.html#ENTITIES
>
>It includes a program called Entify that finds awkward chars and
>replaces them with &xxxx; entities in a set of files. There is also a
>program that does the reverse, DeEntify.
>
>You could use the code almost as is and simply modify the table of
>entities with your unaccented versions of the chars.

My version reads the entire file into RAM in one I/O. You could modify it to read the file one line at a time. For code to do that, consult http://mindprod.com/applet/fileio.html

By using whacking huge buffers, you can ensure the bottleneck is the CPU, not the I/O.

I use a big switch statement. For extra speed you could use an array lookup of the replacement string for each char. The compiler/JVM is not all that clever about generating code for switch statements.

-- 
Roedy Green Canadian Mind Products
http://mindprod.com
Refactor early. If you procrastinate, you will have even more code to adjust based on the faulty design.
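
To make the array-lookup idea concrete, here is a rough sketch. It is not the Entify code. The class name `AccentStripper`, the table contents, the 1 MiB buffer size and the UTF-8 encoding are all assumptions made for illustration. The table is indexed by char code, a null entry means "keep the char as is", and the file is read one line at a time through a large `BufferedReader`.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class AccentStripper {
    // Assumed buffer size; anything "whacking huge" relative to the file will do.
    private static final int BUFFER_CHARS = 1 << 20;

    // Hypothetical table: index = char code, value = replacement, null = leave the char alone.
    private static final String[] REPLACEMENTS = new String[256];

    static {
        REPLACEMENTS['\u00e0'] = "a";  // a with grave
        REPLACEMENTS['\u00e9'] = "e";  // e with acute
        REPLACEMENTS['\u00f1'] = "n";  // n with tilde
        REPLACEMENTS['\u00fc'] = "u";  // u with diaeresis
        REPLACEMENTS['\u00df'] = "ss"; // sharp s
    }

    // Replace each char that has a table entry; chars beyond the table pass through.
    static String replace(String line) {
        StringBuilder sb = new StringBuilder(line.length() + 16);
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            String r = c < REPLACEMENTS.length ? REPLACEMENTS[c] : null;
            if (r == null) {
                sb.append(c);
            } else {
                sb.append(r);
            }
        }
        return sb.toString();
    }

    // Process one line at a time. Note: every line terminator is written back as '\n'.
    static void copy(BufferedReader in, Writer out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            out.write(replace(line));
            out.write('\n');
        }
    }

    // Usage: java AccentStripper input.txt output.txt  (both assumed to be UTF-8)
    public static void main(String[] args) throws IOException {
        try (BufferedReader in = new BufferedReader(
                 new InputStreamReader(Files.newInputStream(Paths.get(args[0])),
                                       StandardCharsets.UTF_8), BUFFER_CHARS);
             Writer out = new BufferedWriter(
                 new OutputStreamWriter(Files.newOutputStream(Paths.get(args[1])),
                                        StandardCharsets.UTF_8), BUFFER_CHARS)) {
            copy(in, out);
        }
    }
}
```

A 256-entry table only covers Latin-1; for a wider range you would enlarge the array or fall back to a map for the rare chars. Whether this actually beats the switch is something to measure on your own JVM.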