| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Newsgroups | comp.lang.java.programmer |
| Subject | Re: replace extended characters |
| Date | 2011-02-11 15:07 -0800 |
| Organization | Canadian Mind Products |
| Message-ID | <i4gbl6houh7bon6rvfjii3rdel829ql2hi@4ax.com> (permalink) |
| References | <15bd3363-c781-487b-98d5-2243eff7cc8f@24g2000yqa.googlegroups.com> |
On Thu, 10 Feb 2011 15:33:39 -0800 (PST), VIDEO MAN <bigmush7@googlemail.com> wrote, quoted or indirectly quoted someone who said :

>I'm trying to create a java utility that will read in a file that may
>or may not contain extended ascii characters and replace these
>characters with a predetermined character e.g. replace é with e and
>then write the amended file out.
>
>How would people suggest I approach this from an efficiency point of
>view given that the input files could be pretty large?

Have a look at http://mindprod.com/products1.html#ENTITIES

It includes a program called Entify that finds awkward chars and replaces them with &xxxx; entities in a set of files. There is also a program that does the reverse, DeEntify.

You could use the code almost as is and simply modify the table of entities with your unaccented versions of the chars.
--
Roedy Green
Canadian Mind Products
http://mindprod.com
Refactor early. If you procrastinate, you will have even more code to adjust based on the faulty design.
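For illustration only, here is a rough sketch of the table-driven idea, written from scratch and not taken from Entify. The class name `Unaccent`, the contents of the table, and the assumption that the input is ISO-8859-1 are all mine; adjust them to whatever your files actually contain. It streams through a fixed buffer, so large files never have to fit in memory.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of the Entify package.
class Unaccent {
    // Lookup table indexed by char code; covers 0..255 only (an assumption).
    private static final char[] TABLE = new char[256];

    static {
        for (int i = 0; i < TABLE.length; i++) {
            TABLE[i] = (char) i; // default: leave the char unchanged
        }
        // A deliberately partial set of lower-case mappings; extend as needed.
        map("\u00e0\u00e1\u00e2\u00e3\u00e4\u00e5", 'a'); // a with accents
        map("\u00e7", 'c');                               // c cedilla
        map("\u00e8\u00e9\u00ea\u00eb", 'e');             // e with accents
        map("\u00ec\u00ed\u00ee\u00ef", 'i');             // i with accents
        map("\u00f1", 'n');                               // n tilde
        map("\u00f2\u00f3\u00f4\u00f5\u00f6", 'o');       // o with accents
        map("\u00f9\u00fa\u00fb\u00fc", 'u');             // u with accents
    }

    private static void map(String from, char to) {
        for (int i = 0; i < from.length(); i++) {
            TABLE[from.charAt(i)] = to;
        }
    }

    // Copies in to out, replacing chars through the table, one buffer at a time.
    static void copy(Reader in, Writer out) throws IOException {
        char[] buf = new char[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buf[i];
                if (c < TABLE.length) {
                    buf[i] = TABLE[c];
                }
            }
            out.write(buf, 0, n);
        }
    }

    // Convenience wrapper for short strings.
    static String replace(String s) {
        StringWriter out = new StringWriter();
        try {
            copy(new StringReader(s), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Usage: java Unaccent input.txt output.txt (input assumed ISO-8859-1).
    public static void main(String[] args) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(
                     Paths.get(args[0]), StandardCharsets.ISO_8859_1);
             BufferedWriter out = Files.newBufferedWriter(
                     Paths.get(args[1]), StandardCharsets.ISO_8859_1)) {
            copy(in, out);
        }
    }
}
```

A one-to-one char table keeps the inner loop to a single array lookup. If you need replacements longer than one char (ß to ss, say), a table of strings would be needed instead.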
replace extended characters VIDEO MAN <bigmush7@googlemail.com> - 2011-02-10 15:33 -0800
Re: replace extended characters RedGrittyBrick <RedGrittyBrick@spamweary.invalid> - 2011-02-11 15:31 +0000
Re: replace extended characters Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2011-02-10 21:27 -0400
Re: replace extended characters Arne Vajhøj <arne@vajhoej.dk> - 2011-02-10 21:42 -0500
Re: replace extended characters Lawrence D'Oliveiro <ldo@geek-central.gen.new_zealand> - 2011-02-11 15:35 +1300
Re: replace extended characters Lew <noone@lewscanon.com> - 2011-02-10 21:29 -0500
Re: replace extended characters Joshua Cranmer <Pidgeot18@verizon.invalid> - 2011-02-11 18:40 -0500
Re: replace extended characters Roedy Green <see_website@mindprod.com.invalid> - 2011-02-11 16:57 -0800
Re: replace extended characters v_borchert@despammed.com (Volker Borchert) - 2011-02-12 05:58 +0000
Re: replace extended characters Arne Vajhøj <arne@vajhoej.dk> - 2011-02-10 21:52 -0500
Re: replace extended characters Joshua Cranmer <Pidgeot18@verizon.invalid> - 2011-02-10 19:37 -0500
Re: replace extended characters Lew <noone@lewscanon.com> - 2011-02-10 19:18 -0500
Re: replace extended characters Roedy Green <see_website@mindprod.com.invalid> - 2011-02-11 16:55 -0800
Re: replace extended characters Owen Jacobson <angrybaldguy@gmail.com> - 2011-02-11 22:15 -0500
Re: replace extended characters Roedy Green <see_website@mindprod.com.invalid> - 2011-02-11 15:07 -0800
Re: replace extended characters Roedy Green <see_website@mindprod.com.invalid> - 2011-02-11 15:11 -0800