| From | Sebastian <sebastian@undisclosed.invalid> |
|---|---|
| Newsgroups | comp.lang.java.programmer |
| Subject | Re: Detect XML document encodings with SAX |
| Date | 2012-11-25 10:50 +0100 |
| Organization | albasani.net |
| Message-ID | <k8sphg$hn4$1@news.albasani.net> |
| References | (1 earlier) <0b3b04bf-24dd-4d59-a16d-14c745b66c76@googlegroups.com> <50b02ee6$0$283$14726298@news.sunsite.dk> <d64baf3c-d582-4308-b6b4-714ef3049ef5@googlegroups.com> <k8rdfq$gbg$1@news.albasani.net> <50b14516$0$282$14726298@news.sunsite.dk> |
On 24.11.2012 23:07, Arne Vajhøj wrote:
[snip]
> I would consider it tempting to rewrite that app to use a standard
> XML parser.
>
> It would solve this problem and possibly also some future problems.
Yes, I wish I could do that (or rather, have that done...). It seems that
the app also handles other types of files (like CSV), and regardless of
file type it always does the same thing, namely open an InputStreamReader
given a charset name.
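
To make the constraint concrete, here is a rough sketch of that pattern. The class and method names (LegacyOpen, open, readAll) are hypothetical and not taken from the app; the only thing assumed is that the caller must supply a charset name before any content is read.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

class LegacyOpen {
    // Hypothetical helper: every file type goes through the same call,
    // so the charset name has to be known before the content is parsed.
    static Reader open(InputStream in, String charsetName) {
        return new InputStreamReader(in, Charset.forName(charsetName));
    }

    // Drains a Reader into a String; only here to make the sketch usable.
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = r.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }
}
```

For XML this means the encoding has to be found out separately and handed in as charsetName, which is what the rest of this post is about.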
[snip]
> What about just reading the first few lines until you have the
> XML declaration.
>
> Parsing the encoding out of that should be simple.
>
> private static final Pattern encpat =
>     Pattern.compile("encoding\\s*=\\s*['\"]([^'\"]+)['\"]");
>
> private static String detectSimple(String fnm) throws IOException {
>     BufferedReader br = new BufferedReader(new FileReader(fnm));
>     String firstpart = "";
>     String line;
>     // also stop at end of file, where readLine() returns null
>     while(!firstpart.contains(">") && (line = br.readLine()) != null) firstpart += line;
>     br.close();
>     Matcher m = encpat.matcher(firstpart);
>     if(m.find()) {
>         return m.group(1);
>     } else {
>         return "Unknown";
>     }
> }
>
> I do not like the solution, but given the restrictions in the
> context, then maybe it is what you need.
Thanks for the suggestion. I'll use that idea until a better solution
becomes feasible.
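
A minimal sketch of how that idea could be adapted, under a few assumptions of my own: it reads raw bytes and decodes them as ISO-8859-1 so the result does not depend on the platform default charset, it looks at a bounded prefix only, and it returns a caller-supplied fallback when no declaration is found. The names XmlEncodingSniffer, detect and MAX_PREFIX and the 1024-byte limit are made up for illustration. It only works for ASCII-compatible encodings; UTF-16 files and files starting with a byte order mark fall through to the fallback.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class XmlEncodingSniffer {
    // Same pattern as in the quoted suggestion.
    private static final Pattern ENCPAT =
        Pattern.compile("encoding\\s*=\\s*['\"]([^'\"]+)['\"]");

    // Assumption: the XML declaration fits into the first 1024 bytes.
    private static final int MAX_PREFIX = 1024;

    static String detect(InputStream in, String fallback) throws IOException {
        byte[] buf = new byte[MAX_PREFIX];
        int len = 0;
        int n;
        // Fill the buffer until it is full or the stream ends.
        while (len < buf.length
                && (n = in.read(buf, len, buf.length - len)) != -1) {
            len += n;
        }
        // ISO-8859-1 maps every byte to one char, so decoding cannot fail.
        String prefix = new String(buf, 0, len, StandardCharsets.ISO_8859_1);
        int end = prefix.indexOf("?>");
        if (!prefix.startsWith("<?xml") || end < 0) {
            return fallback; // no XML declaration at the start
        }
        // Search only inside the declaration, not in the document body.
        Matcher m = ENCPAT.matcher(prefix.substring(0, end));
        return m.find() ? m.group(1) : fallback;
    }
}
```

Passing "UTF-8" as the fallback matches what the XML specification prescribes for documents without a byte order mark or encoding declaration. The stream is consumed by the check, so the file would have to be reopened before handing it to the InputStreamReader.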
-- Sebastian