


Re: Problem w/ DocumentBuilder parse method

From Stanimir Stamenkov <s7an10@netscape.net>
Newsgroups comp.lang.java.programmer
Subject Re: Problem w/ DocumentBuilder parse method
Date 2013-01-01 02:17 +0200
Organization A noiseless patient Spider
Message-ID <kbt9un$gll$1@dont-email.me>
References <0a82c44b-eec4-4ea8-92e2-af61192eee1a@googlegroups.com>



Sun, 30 Dec 2012 11:30:24 -0800 (PST), /John L./:

> I'm pre-processing a file in an attempt to use the subject method, and receive the following error:
>
> [Fatal Error] EXTRACT.TMP:51:23: The entity "nbsp" was referenced, but not declared.
> [...]
> What is the required declaration syntax for &nbsp; to allow the file to be parsed?

As Arne Vajhøj points out in another reply, there should be an XHTML 
DOCTYPE declaration at the beginning of the document.  Browsers 
usually have no problem processing XHTML that contains entity 
references from the XHTML DTD, even without a DOCTYPE declaration, 
because either:

1. The document is served as text/html, which is not processed as 
XML at all; or

2. Browsers keep a local copy of the XHTML DTD and associate it 
automatically, based on the content type application/xhtml+xml or 
on xmlns="http://www.w3.org/1999/xhtml" on the root html element.

If the document you're trying to parse is under your control, you could:

1. Add the XHTML DOCTYPE declaration manually:

<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
            "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

    or even:

<!DOCTYPE html
     SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

    You may still want to supply an EntityResolver [1] to serve this 
DTD from a local resource, so the parser does not have to fetch it 
from w3.org;

2. Add a DOCTYPE with an internal subset containing just the 
necessary entity declarations, like:

<!DOCTYPE html [
   <!ENTITY nbsp "&#160;">
]>
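
Both options can be tried out with a small program.  What follows is 
only a sketch, not tested against your file: the class name, the 
method names and the sample document are made up for illustration, 
and LOCAL_DTD_STUB is a one-line stand-in for a real local copy of 
xhtml1-strict.dtd (which you would normally load from a file or a 
classpath resource).  It also assumes Java 8 or later for the lambda.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NbspDoctypeDemo {

    // Hypothetical stand-in for a locally stored copy of the XHTML DTD.
    // It declares only the one entity this demo needs.
    static final String LOCAL_DTD_STUB = "<!ENTITY nbsp \"&#160;\">";

    // Option 2: the document carries the entity declaration itself,
    // in the internal subset of its DOCTYPE.
    static String parseWithInternalSubset() throws Exception {
        String xml = "<!DOCTYPE html [\n"
                   + "  <!ENTITY nbsp \"&#160;\">\n"
                   + "]>\n"
                   + "<html><body><p>a&nbsp;b</p></body></html>";
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    // Option 1: the document names the XHTML DTD, and an EntityResolver
    // hands the parser local content instead of letting it go to w3.org.
    static String parseWithLocalDtd() throws Exception {
        String xml = "<!DOCTYPE html\n"
                   + "     PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"\n"
                   + "     \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">\n"
                   + "<html><body><p>a&nbsp;b</p></body></html>";
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            if ("-//W3C//DTD XHTML 1.0 Strict//EN".equals(publicId)) {
                return new InputSource(new StringReader(LOCAL_DTD_STUB));
            }
            return null; // anything else: default resolution
        });
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Each line reports whether the parsed text contains U+00A0.
        System.out.println(parseWithInternalSubset().indexOf('\u00A0') >= 0);
        System.out.println(parseWithLocalDtd().indexOf('\u00A0') >= 0);
    }
}
```

In both cases the entity reference should end up in the DOM as the 
single character U+00A0, since DocumentBuilderFactory expands entity 
references by default.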

If you're parsing documents that have no DOCTYPE declaration and 
are not under your control, you may supply an EntityResolver2 
implementation; that interface defines an additional method, 
getExternalSubset, for just that purpose:

http://docs.oracle.com/javase/6/docs/api/org/xml/sax/ext/EntityResolver2.html#getExternalSubset%28java.lang.String,%20java.lang.String%29
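
A rough sketch of such a resolver is below.  The class names and the 
sample document are hypothetical, and FALLBACK_DTD is a stand-in that 
declares only nbsp where you would normally supply a local copy of the 
XHTML DTD.  It assumes the parser behind DocumentBuilder honours 
EntityResolver2 (the one bundled with the JDK is expected to, since 
the use-entity-resolver2 feature is on by default); other 
implementations may ignore the extra methods.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.ext.EntityResolver2;

class ExternalSubsetDemo {

    // Hypothetical stand-in for a local copy of the XHTML DTD.
    static final String FALLBACK_DTD = "<!ENTITY nbsp \"&#160;\">";

    static class XhtmlFallbackResolver implements EntityResolver2 {

        // Asked for when the document has no DOCTYPE declaration.
        // "name" is the name of the root element.
        @Override
        public InputSource getExternalSubset(String name, String baseURI) {
            if ("html".equals(name)) {
                return new InputSource(new StringReader(FALLBACK_DTD));
            }
            return null; // no external subset for other documents
        }

        // Returning null from both resolveEntity variants leaves
        // ordinary entity resolution to the parser's defaults.
        @Override
        public InputSource resolveEntity(String name, String publicId,
                                         String baseURI, String systemId) {
            return null;
        }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            return null;
        }
    }

    static String parseWithoutDoctype() throws Exception {
        // No DOCTYPE at all, yet the document uses &nbsp;
        String xml = "<html><body><p>a&nbsp;b</p></body></html>";
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver(new XhtmlFallbackResolver());
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Reports whether the parsed text contains U+00A0.
        System.out.println(parseWithoutDoctype().indexOf('\u00A0') >= 0);
    }
}
```

The check on the root element name keeps the resolver from attaching 
XHTML entities to unrelated XML documents that happen to go through 
the same DocumentBuilder.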

[1] 
http://docs.oracle.com/javase/6/docs/api/javax/xml/parsers/DocumentBuilder.html#setEntityResolver%28org.xml.sax.EntityResolver%29

-- 
Stanimir



Thread

Problem w/ DocumentBuilder parse method "John L." <johnlarew@sbcglobal.net> - 2012-12-30 11:30 -0800
  Re: Problem w/ DocumentBuilder parse method Arne Vajhøj <arne@vajhoej.dk> - 2012-12-30 14:46 -0500
  Re: Problem w/ DocumentBuilder parse method Roedy Green <see_website@mindprod.com.invalid> - 2012-12-31 14:09 -0800
  Re: Problem w/ DocumentBuilder parse method Stanimir Stamenkov <s7an10@netscape.net> - 2013-01-01 02:17 +0200
