| From | Lew <noone@lewscanon.com> |
|---|---|
| Newsgroups | comp.lang.java.programmer |
| Subject | Re: replace extended characters |
| Date | 2011-02-10 19:18 -0500 |
| Organization | albasani.net |
| Message-ID | <ij1v7l$jel$1@news.albasani.net> (permalink) |
| References | <15bd3363-c781-487b-98d5-2243eff7cc8f@24g2000yqa.googlegroups.com> |
VIDEO MAN wrote:
> I'm trying to create a java [sic] utility that will read in a file that may
> or may not contain extended ascii [sic] characters and replace these
> characters with a predetermined character [sic] e.g. [sic] replace é with e and
> then write the amended file out.
>
> How would people suggest I approach this from an efficiency point of
> view given that the input files could be pretty large?
>
> Any guidance appreciated.

Read from a BufferedReader. Write to a BufferedWriter. Process one character at a time.

A fully general replacement won't be efficient unless you are guaranteed a limited character-set input. The Unicode code space is on the order of 2^20 code points (1,114,112 in total). "Extended ASCII" is a very tiny subset of that, and which characters it covers depends on the character encoding.

If you are certain that the set of possible input characters is small, and the set you wish to substitute even smaller, you can use a lookup table. Use a 'Map<Character,Character>' (it will choke on supplementary code points, which need two 'char' values each) for those, and only those, characters you wish to substitute. If the key is absent, pass the source character through unchanged. If present, replace it with the associated value.

--
Lew
Ceci n'est pas une fenêtre.
.___________.
|###] | [###|
|##/ | *\##|
|#/ * | \#|
|#----|----#|
|| | * ||
|o * | o|
|_____|_____|
|===========|
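The approach described in this post (buffered reader and writer, one character at a time, a 'Map<Character,Character>' lookup with pass-through for absent keys) might be sketched roughly as follows. This is only an illustration, not code from the thread: the class name `CharReplacer`, the method `replace`, and the two table entries are assumptions made up for the example, and the in-memory strings in `main` stand in for real files.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; the name and table contents are assumptions.
public class CharReplacer {

    // Holds only the characters to substitute; everything else passes through.
    private static final Map<Character, Character> SUBSTITUTIONS = new HashMap<>();
    static {
        SUBSTITUTIONS.put('\u00E9', 'e'); // e with acute accent
        SUBSTITUTIONS.put('\u00EF', 'i'); // i with diaeresis
    }

    // Copies source to sink one char at a time, applying the lookup table.
    // Works on UTF-16 char values, so supplementary code points (surrogate
    // pairs) are not substitutable; they are copied through unchanged.
    public static void replace(Reader source, Writer sink) throws IOException {
        BufferedReader in = new BufferedReader(source);
        BufferedWriter out = new BufferedWriter(sink);
        int c;
        while ((c = in.read()) != -1) {
            char ch = (char) c;
            Character sub = SUBSTITUTIONS.get(ch);
            char result = (sub != null) ? sub : ch; // absent key: unchanged
            out.write(result);
        }
        out.flush(); // push buffered output through to the underlying writer
    }

    public static void main(String[] args) throws IOException {
        StringWriter sink = new StringWriter();
        replace(new StringReader("caf\u00E9 na\u00EFve"), sink);
        System.out.println(sink); // prints "cafe naive"
    }
}
```

For real files, the reader and writer would presumably come from something like `Files.newBufferedReader(path, charset)` and `Files.newBufferedWriter(path, charset)`, with the input charset chosen to match how the file was actually encoded, since "extended ASCII" means different characters under different encodings.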
replace extended characters VIDEO MAN <bigmush7@googlemail.com> - 2011-02-10 15:33 -0800
Re: replace extended characters RedGrittyBrick <RedGrittyBrick@spamweary.invalid> - 2011-02-11 15:31 +0000
Re: replace extended characters Joshua Cranmer <Pidgeot18@verizon.invalid> - 2011-02-11 18:40 -0500
Re: replace extended characters Roedy Green <see_website@mindprod.com.invalid> - 2011-02-11 16:57 -0800
Re: replace extended characters v_borchert@despammed.com (Volker Borchert) - 2011-02-12 05:58 +0000
Re: replace extended characters Arne Vajhøj <arne@vajhoej.dk> - 2011-02-10 21:52 -0500
Re: replace extended characters Joshua Cranmer <Pidgeot18@verizon.invalid> - 2011-02-10 19:37 -0500
Re: replace extended characters Lew <noone@lewscanon.com> - 2011-02-10 19:18 -0500
Re: replace extended characters Roedy Green <see_website@mindprod.com.invalid> - 2011-02-11 16:55 -0800
Re: replace extended characters Owen Jacobson <angrybaldguy@gmail.com> - 2011-02-11 22:15 -0500
Re: replace extended characters Roedy Green <see_website@mindprod.com.invalid> - 2011-02-11 15:07 -0800
Re: replace extended characters Roedy Green <see_website@mindprod.com.invalid> - 2011-02-11 15:11 -0800