From: markspace <-@.>
Newsgroups: comp.lang.java.programmer
Subject: Re: retriving escape unicode sequences from files ...
Date: Thu, 02 Aug 2012 21:00:15 -0700

On 8/2/2012 8:52 PM, qwertmonkey@syberianoutpost.ru wrote:
> Why is it that if you save a unicode sequence in a file, say "français"
> ~
> \u0066\u0072\u0061\u006e\u00e7\u0061\u0069\u0073
> ~
> and then retrieve as a String you can't then convert it back to a UTF-8 String

Because it isn't French, it's just the ASCII characters \, u, 0, 0, 6, 6
etc. This is a totally different concept from the escape sequences in
source code, which the compiler interprets for you.

If you want to read French out of a file, put *French* in the file, not
ASCII. It can't work any other way.

If you want to interpret ASCII as escape sequences, you'll have to write
the interpreter yourself. The Java Properties object reads escape
sequences, but I don't think you can separate just the escape parser out.
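
For what "write the interpreter" might look like, here is a minimal sketch. It is not from the original post: the class name UnicodeEscapeDecoder and its methods are made up for illustration. It assumes the file's text has already been read into a String, and it handles only the backslash-u-plus-four-hex-digits form. Anything malformed is copied through unchanged. A doubled backslash is not treated as an escaped backslash, which differs from what the compiler does.

```java
import java.nio.charset.StandardCharsets;

public class UnicodeEscapeDecoder {

    /**
     * Replaces every backslash-u escape (backslash, 'u', four hex digits)
     * with the UTF-16 code unit it names. Other text is copied as-is.
     */
    public static String decode(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int codeUnit = parseEscape(text, i);
            if (codeUnit >= 0) {
                out.append((char) codeUnit);
                i += 6; // skip backslash, 'u', and four hex digits
            } else {
                out.append(text.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    /** Returns the escaped code unit starting at pos, or -1 if no valid escape starts there. */
    private static int parseEscape(String text, int pos) {
        if (pos + 6 > text.length()
                || text.charAt(pos) != '\\'
                || text.charAt(pos + 1) != 'u') {
            return -1;
        }
        int value = 0;
        for (int k = pos + 2; k < pos + 6; k++) {
            int digit = Character.digit(text.charAt(k), 16);
            if (digit < 0) {
                return -1; // not a hex digit: leave the text untouched
            }
            value = value * 16 + digit;
        }
        return value;
    }

    public static void main(String[] args) {
        // The doubled backslashes make this the literal ASCII text from the
        // question, exactly as it would arrive from the file.
        String fromFile = "\\u0066\\u0072\\u0061\\u006e\\u00e7\\u0061\\u0069\\u0073";
        String decoded = decode(fromFile);

        System.out.println(decoded.equals("fran\u00e7ais")); // prints "true"

        // Encoding to UTF-8 only makes sense once the escapes are decoded.
        byte[] utf8 = decoded.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length); // prints "9" (c-cedilla takes two bytes)
    }
}
```

Because the escapes name UTF-16 code units, a character outside the Basic Multilingual Plane written as two consecutive escapes (a surrogate pair) comes out correctly with this approach. java.util.Properties.load(Reader) also decodes these escapes, but it applies the rest of the properties-file syntax (key/value separators, comments, line continuations) at the same time, so it is only a fit when the file really is a properties file.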