Re: Extract values from xml file using xpath/namespace/simplexml/php

From Peter Flynn <peter@silmaril.ie>
Newsgroups comp.text.xml
Subject Re: Extract values from xml file using xpath/namespace/simplexml/php
Date 2014-08-24 22:22 +0100
Message-ID <c5v3cpFdbkU1@mid.individual.net>
References <3f75d978-fb96-4f8b-a3f1-058a94d56c45@googlegroups.com>


On 08/13/2014 11:22 AM, ofuuzo@gmail.com wrote:
> Hi,
> I am new in XML. I can't figure out how to combine
> xpath/namespace/simplexml/php to extract the values of "dc.title" and
> "dc.date" in this xml file:

> <?xml version="1.0" encoding="UTF-8"?>

Tip: always post a complete, well-formed instance, not one with the
bottom snipped off the file, especially when it's in an uncommon vocabulary.

Given your sample instance...

<OAI-PMH
  xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
  http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"
  xmlns="http://www.openarchives.org/OAI/2.0/"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <ListRecords>
    <record>
      <metadata
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/">
	<oai_dc:dc
	  xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/">
	  <dc:title>GAFBone</dc:title>
	  <dc:date>2014-05-01T00:00:00Z</dc:date>
	</oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>

This XSLT2 script will extract the title and date:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:oai="http://www.openarchives.org/OAI/2.0/"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                version="2.0">

  <xsl:output method="text"/>

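  <xsl:template match="/">
    <!-- apply-templates (not value-of) so the oai:metadata template fires -->
    <xsl:apply-templates
	select="oai:OAI-PMH/oai:ListRecords/oai:record/oai:metadata"/>
  </xsl:template>

  <xsl:template match="oai:metadata">
    <xsl:value-of select="oai_dc:dc/dc:title"/>
    <!-- tab between the fields, newline after each record -->
    <xsl:text>&#9;</xsl:text>
    <xsl:value-of select="oai_dc:dc/dc:date"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>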

</xsl:stylesheet>

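The trick is that you have to write the XPaths using the namespaces
which are in effect in the document. The unprefixed elements (OAI-PMH,
ListRecords, record, metadata) are in the default namespace, so in the
XPath they need a prefix (oai: above) bound to that same URI; the prefix
name itself is arbitrary, only the URI has to match. A validating parser
such as rxp will let you check what namespaces are in effect on which
elements.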
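
The same idea can be sketched outside XSLT. What follows is only an
illustration in Python's standard xml.etree.ElementTree, not the PHP
answer: the SAMPLE string is a trimmed copy of the instance above
(schemaLocation attributes dropped), and the names NS and
extract_records are my own, not part of any library.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample instance (schemaLocation attributes omitted).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/">
        <oai_dc:dc>
          <dc:title>GAFBone</dc:title>
          <dc:date>2014-05-01T00:00:00Z</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Prefix-to-URI map used by the path expressions below. The prefixes are
# local to this script; only the URIs have to match the document.
# "oai" stands in for the document's default namespace.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_records(xml_text):
    """Return one (title, date) tuple per oai_dc:dc element found."""
    root = ET.fromstring(xml_text)  # root is the OAI-PMH element
    path = "oai:ListRecords/oai:record/oai:metadata/oai_dc:dc"
    rows = []
    for dc in root.findall(path, NS):
        title = dc.findtext("dc:title", namespaces=NS)
        date = dc.findtext("dc:date", namespaces=NS)
        rows.append((title, date))
    return rows


if __name__ == "__main__":
    for title, date in extract_records(SAMPLE):
        print(f"{title}\t{date}")
```

Leave the oai: prefix off the path and the search finds nothing,
because an unprefixed name in the path means "no namespace", which is
not the namespace those elements are actually in.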

Alternatively, use a tool which lets you omit the default namespace,
such as lxprintf, e.g. (assuming your document is test.xml):

$ lxprintf -e oai_dc:dc "%s\t%s\n" dc:title dc:date test.xml
GAFBone	2014-05-01T00:00:00Z

How to do the equivalent in PHP is left as an exercise to the reader.

///Peter
