Re: SAX parser splits URL ...

From Robert Klemme <shortcutter@googlemail.com>
Newsgroups comp.lang.java.programmer
Subject Re: SAX parser splits URL ...
Date Wed, 27 Jun 2012 07:34:18 +0200
Lines 55
Message-ID <a4vkb1F60fU1@mid.individual.net>
References <1340769034.526896@nntp.aceinnovative.com>
In-Reply-To <1340769034.526896@nntp.aceinnovative.com>
Xref csiph.com comp.lang.java.programmer:15654


On 27.06.2012 05:50, lbrt chx _ gemale wrote:
>   I have a URL in an XML file that looks like this:
> ~
> ...
>    <Location>http://pagesinxt.com/?dn=www.outfo.org&flrdr=yes&nxte=zip</Location>
> ...
> ~
>   http://xsdvalidation.utilities-online.info/
> ~
> is telling me the document itself is valid, but the SAX parser is
> splitting the value at every "&"
> ~
> // __ start element iIxLvl: |3|Location
> // __ start characters iIxLvl: |3|http://pagesinxt.com/?dn=www.outfo.org|
> // __ start characters iIxLvl: |3|&|
> // __ start characters iIxLvl: |3|flrdr=yes|
> // __ start characters iIxLvl: |3|&|
> // __ start characters iIxLvl: |3|nxte=zip|
> // __ end element   iIxLvl: |2|Location|
> ~
>   I found some sort of an explanation here:
> ~
>   http://stackoverflow.com/questions/1328538/how-do-i-escape-ampersands-in-xml
> ~
>   which I couldn't make much sense of (I tried a few things)
> ~
>   Is this related to a setting in the parser? Is there a way to fix that problem?

That's not related to the parser - at least not to a particular one.  It 
is a feature of XML: "&" starts an entity reference, so a literal 
ampersand in the text has to be written as "&amp;".  References of this 
kind also allow you to include characters in the document which are not 
supported by the encoding you use when writing the document.

The concept is known as "XML entity".  Please see
http://www.tizag.com/xmlTutorial/xmlentity.php
http://www.javacommerce.com/displaypage.jsp?name=entities.sql&id=18238

The standard
http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-references
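If you generate the XML yourself, the escaping can be done in a few lines.  This is only a minimal sketch: the class and method names (XmlEscape, escapeText) are made up for illustration, and it covers element text only, not attribute values.

```java
public class XmlEscape {

    /**
     * Escapes the characters that are significant as markup in XML element text.
     * Hypothetical helper for illustration; attribute values would additionally
     * need the quote character escaped.
     */
    public static String escapeText(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;  // starts an entity reference
                case '<': out.append("&lt;");  break;  // starts a tag
                case '>': out.append("&gt;");  break;  // escaped for safety ("]]>")
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String url = "http://pagesinxt.com/?dn=www.outfo.org&flrdr=yes&nxte=zip";
        System.out.println("<Location>" + escapeText(url) + "</Location>");
    }
}
```

In practice an XML writer (for example javax.xml.stream.XMLStreamWriter) does this escaping for you, which is usually the safer choice.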

Bottom line, you can do

<Location>http://pagesinxt.com/?dn=www.outfo.org&amp;flrdr=yes&amp;nxte=zip</Location>
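On the reading side, keep in mind that SAX is allowed to deliver the text of one element in several characters() calls, and many parsers start a new chunk at each entity reference.  That would explain the pieces in your trace if the file already contains "&amp;" (a bare "&" in that position would be a well-formedness error, not a split).  The usual approach is to collect the chunks and use the text in endElement().  A minimal sketch, assuming a non-namespace-aware parser; the class name LocationReader, the method readLocation and the <Doc> root element are made up for the example:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class LocationReader {

    /** Returns the text of the first <Location> element, or null if there is none. */
    public static String readLocation(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        final String[] result = new String[1];

        DefaultHandler handler = new DefaultHandler() {
            private boolean inLocation;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if ("Location".equals(qName)) {
                    inLocation = true;
                    text.setLength(0);          // start with an empty buffer
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per element: append, do not assign.
                if (inLocation) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("Location".equals(qName)) {
                    inLocation = false;
                    if (result[0] == null) {
                        result[0] = text.toString();  // complete value is available here
                    }
                }
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Doc><Location>http://pagesinxt.com/?dn=www.outfo.org"
                   + "&amp;flrdr=yes&amp;nxte=zip</Location></Doc>";
        // prints http://pagesinxt.com/?dn=www.outfo.org&flrdr=yes&nxte=zip
        System.out.println(readLocation(xml));
    }
}
```

The handler sees the already decoded "&", so no unescaping is needed in your own code.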

But please read up on XML more thoroughly - it pays off.

Kind regards

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
