Re: retriving escape unicode sequences from files ...

Path csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!news.glorb.com!news-out.readnews.com!transit3.readnews.com!news-out.news.tds.net!newsreading01.news.tds.net!53ab2750!not-for-mail
From "Daniel Pitts" <daniel.pitts@1:261/38.remove-5qr-this>
Subject Re: retriving escape unicode sequences from files ...
Message-ID <501D6353.56120.calajapr@time.synchro.net> (permalink)
X-Comment-To Arne Vajhøj
Newsgroups comp.lang.java.programmer
In-Reply-To <501D6353.56117.calajapr@time.synchro.net>
References <501D6353.56117.calajapr@time.synchro.net>
X-FTN-AREA COMP.LANG.JAVA.PROGRAMMER
X-FTN-MSGID 1:261/38 d8bcb221
X-FTN-REPLY 1:261/38 5a4efe48
Content-Type text/plain; charset=IBM437
Content-Transfer-Encoding 8bit
X-Gateway time.synchro.net [Synchronet 3.16a-Win32 NewsLink 1.98]
Lines 43
Date Sat, 04 Aug 2012 18:41:42 GMT
NNTP-Posting-Host 69.21.70.65
X-Complaints-To news@tds.net
X-Trace newsreading01.news.tds.net 1344105702 69.21.70.65 (Sat, 04 Aug 2012 13:41:42 CDT)
NNTP-Posting-Date Sat, 04 Aug 2012 13:41:42 CDT
Organization tds.net
Xref csiph.com comp.lang.java.programmer:17160

  To: Arne Vajhøj
From: Daniel Pitts <newsgroup.nospam@virtualinfinity.net>

On 8/3/12 5:37 PM, Arne Vajhoj wrote:
> On 8/2/2012 11:52 PM, qwertmonkey@syberianoutpost.ru wrote:
>>   Why is it that if you save a unicode sequence in a file, say "français"
>> ~
>> \u0066\u0072\u0061\u006e\u00e7\u0061\u0069\u0073
>> ~
>>   and then retrieve as a String you can't then convert it back to a
>> UTF-8 String
>> ~
>
> Some code from my shelf:
>
> import java.util.regex.Matcher;
> import java.util.regex.Pattern;
>
> public class Unescape {
>      private static final Pattern p = Pattern.compile("\\\\u([0-9A-F]{4})");
>      public static String U2U(String s) {
>          String res = s;
>          Matcher m = p.matcher(res);
>          while(m.find()) {
>              res = res.replaceAll("\\" + m.group(0), Character.toString((char)Integer.parseInt(m.group(1), 16)));
>          }
>          return res;
>      }
>      public static void main(String[] args) {
>          System.out.println(U2U("\\u0041\\u0042\\u0043\\u000A\\u0031\\u0032\\u0033"));
>      }
> }
And if you wanted this to be efficient, you'd use Matcher.appendReplacement()
instead of res.replaceAll(), which rescans the whole string on every match.
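
A minimal sketch of that suggestion, not a definitive implementation. The class name UnescapeFast, the method name unescape, and the constant ESCAPE are hypothetical names chosen for illustration. Two things differ from the quoted code and are assumptions on my part: the pattern also accepts lower-case hex digits (the original question's example uses \u00e7), and the replacement goes through Matcher.quoteReplacement() so that a decoded backslash or dollar sign is not read as replacement syntax.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnescapeFast {
    // Assumption: accept both upper- and lower-case hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

    public static String unescape(String s) {
        Matcher m = ESCAPE.matcher(s);
        // StringBuffer is used because the StringBuilder overload only exists since Java 9.
        StringBuffer out = new StringBuffer(s.length());
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // Copies the text since the previous match, then the decoded character.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        // Copies whatever follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("\\u0041\\u0042\\u0043\\u000A\\u0031\\u0032\\u0033"));
    }
}
```

This walks the input once and builds the result as it goes, so the cost should stay roughly linear in the length of the input rather than growing with the number of escapes.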

