Groups > comp.lang.ruby > #2500 > unrolled thread
| Started by | "Kyle X." <haebooty@yahoo.com> |
|---|---|
| First post | 2011-04-07 23:00 -0500 |
| Last post | 2011-04-08 01:46 -0500 |
| Articles | 4 — 3 participants |
REXML Speed Question "Kyle X." <haebooty@yahoo.com> - 2011-04-07 23:00 -0500
Re: REXML Speed Question Ryan Davis <ryand-ruby@zenspider.com> - 2011-04-08 00:03 -0500
Re: REXML Speed Question Mark Kremer <mark@without-brains.net> - 2011-04-08 00:07 -0500
Re: REXML Speed Question "Kyle X." <haebooty@yahoo.com> - 2011-04-08 01:46 -0500
| From | "Kyle X." <haebooty@yahoo.com> |
|---|---|
| Date | 2011-04-07 23:00 -0500 |
| Subject | REXML Speed Question |
| Message-ID | <59db28904786b1c32859ccdd60747e08@ruby-forum.com> |
Hello, I have been using REXML to extract information from an XML file
and I am having an issue with the amount of time it is taking. If I
point directly to what I want, it is pretty fast. The issue arises when
I have to grab a reference id, then search again for that id to get
another id, and so on until I finally reach the piece of information I
want. This is what a snippet of my code currently looks like:
---------------------------------------------
result = []
wall_refs1 = REXML::XPath.match( $doc,
  "doc:iso_10303_28/uos/IfcWallStandardCase//*[@pos='1']" )
wall_refs1 = grab_id(wall_refs1, 'ref')
# grab_id simply collects the ref ids into an array
# Output from this would be [["i1741"]]

wall_ref2 = []
wall_refs1.each do |ref|
  x = REXML::XPath.first($doc, "//*[@id='#{ref}']//IfcExtrudedAreaSolid").attribute("ref").value
  wall_ref2 << x
end
# Output [["i1738"]]

wall_depth = []
wall_ref2.each do |ref|
  x = REXML::XPath.match($doc, "//*[@id='#{ref}']//Depth").map { |element| element.text }
  wall_depth << x
end
# Output [["120."]]

wall_depth_final = wall_depth.map do |arr|
  arr.map do |arr2|
    # Convert to float and round to 2 decimals
    # (round_to is not core Ruby; it is a helper defined elsewhere)
    arr2.to_f.round_to(2)
  end
end

wall_depth_final
# Output [["120.00"]]
-----------------------------------------
The problem with this approach is that it takes a substantial amount of
time to run: doing this for, say, 200 elements can take 25 minutes. My
guess is that it takes so long because some of the XML files are
10,000+ lines and I imagine it takes a while to comb through that. I
have to start from the first location and work my way to the final one;
unfortunately I cannot simply run a search to grab //Depth.

Is there a quicker way of accomplishing the same thing, or is time
always going to be a burden?

Thank you for your time.

This is the XML I am reading:
<IfcWallStandardCase id="i1677">
  <Representation>
    <IfcProductDefinitionShape id="i1747">
      <Representations id="i1750" exp:cType="list">
        <IfcShapeRepresentation exp:pos="0" xsi:nil="true" ref="i1708"/>
        <IfcShapeRepresentation exp:pos="1" xsi:nil="true" ref="i1741"/>
      </Representations>
    </IfcProductDefinitionShape>
  </Representation>
</IfcWallStandardCase>
<IfcShapeRepresentation id="i1741">
  <Items id="i1746" exp:cType="set">
    <IfcExtrudedAreaSolid exp:pos="0" xsi:nil="true" ref="i1738"/>
  </Items>
</IfcShapeRepresentation>
<IfcExtrudedAreaSolid id="i1738">
  <Depth>120.</Depth>
</IfcExtrudedAreaSolid>
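
A likely contributor to the run time is that every `//*[@id='...']` query walks the whole document again. One possible workaround, sketched here and not taken from the thread, is to scan the document once, store each element that has an id in a Hash, and then follow the ref chain with hash lookups. The `<uos>` wrapper, the namespace URIs, and the names `SAMPLE_XML`, `index`, and `wall_depths` are assumptions made only so the fragment parses and runs on its own; core `Float#round(2)` stands in for `round_to`.

```ruby
require 'rexml/document'

# Sample data modelled on the XML above. The <uos> wrapper and the
# namespace URIs are placeholders so that the fragment is well-formed.
SAMPLE_XML = <<~DOC
  <uos xmlns:exp="urn:example:exp" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <IfcWallStandardCase id="i1677">
      <Representation>
        <IfcProductDefinitionShape id="i1747">
          <Representations id="i1750" exp:cType="list">
            <IfcShapeRepresentation exp:pos="0" xsi:nil="true" ref="i1708"/>
            <IfcShapeRepresentation exp:pos="1" xsi:nil="true" ref="i1741"/>
          </Representations>
        </IfcProductDefinitionShape>
      </Representation>
    </IfcWallStandardCase>
    <IfcShapeRepresentation id="i1741">
      <Items id="i1746" exp:cType="set">
        <IfcExtrudedAreaSolid exp:pos="0" xsi:nil="true" ref="i1738"/>
      </Items>
    </IfcShapeRepresentation>
    <IfcExtrudedAreaSolid id="i1738">
      <Depth>120.</Depth>
    </IfcExtrudedAreaSolid>
  </uos>
DOC

doc = REXML::Document.new(SAMPLE_XML)

# One pass over the document: remember every element that carries an id.
index = {}
REXML::XPath.each(doc, '//*[@id]') { |el| index[el.attributes['id']] = el }

# Follow wall -> shape representation -> extruded solid -> Depth.
# Only the searches inside a single small subtree still use XPath.
wall_depths = REXML::XPath.match(doc, '//IfcWallStandardCase').map do |wall|
  shape_refs = REXML::XPath.match(wall, './/IfcShapeRepresentation')
                           .select { |el| el.attributes['exp:pos'] == '1' }
                           .map { |el| el.attributes['ref'] }

  shape_refs.map do |shape_ref|
    solid_ref = REXML::XPath.first(index[shape_ref], './/IfcExtrudedAreaSolid').attributes['ref']
    REXML::XPath.first(index[solid_ref], './/Depth').text.to_f.round(2)
  end
end

p wall_depths
```

The index costs one full traversal up front; after that each hop in the chain is a Hash lookup plus a search inside one small element, so the work no longer grows with the document size for every reference followed.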
--
Posted via http://www.ruby-forum.com/.
| From | Ryan Davis <ryand-ruby@zenspider.com> |
|---|---|
| Date | 2011-04-08 00:03 -0500 |
| Message-ID | <BE7E937E-C1D5-41B0-B63D-9D4FA36FDAA1@zenspider.com> |
| In reply to | #2500 |
On Apr 7, 2011, at 21:00, Kyle X. wrote:
> Hello, I have been using REXML to extract information from an XML file
> and I am having an issue with the amount of time it is taking. If I
> point directly to what I want it is pretty fast. The issue arises when
> I have to grab a reference id, then research for that id to get another
> id, until I finally get to the piece of information I want. This is
> what a snippet my code currently looks like:

Switch to nokogiri and you'll be much much happier.
| From | Mark Kremer <mark@without-brains.net> |
|---|---|
| Date | 2011-04-08 00:07 -0500 |
| Message-ID | <4D9E980E.5030709@without-brains.net> |
| In reply to | #2500 |
For larger XML documents, SAX parsing can really improve performance,
specifically because a SAX parser doesn't build an entire DOM
structure; your handler only keeps the bits you are interested in.
Programming with a SAX parser is very different though :)
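
To give a feel for how different the style is, here is a minimal sketch using REXML's own stream (SAX-style) parser, so nothing beyond the standard REXML library is needed. It is only an illustration under assumptions: the `DepthListener` class, the `<uos>` wrapper, and the second solid (`i1739`, `87.5`) are made up for the example. The listener receives callbacks as tags open and close and records each `<Depth>` value under the id of the nearest enclosing element that has one.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: collects the text of every <Depth> element, keyed
# by the id of the closest ancestor element that has an id attribute.
class DepthListener
  include REXML::StreamListener

  attr_reader :depths

  def initialize
    @depths = {}
    @id_stack = []      # id (or nil) of each currently open element
    @in_depth = false
    @buffer = String.new
  end

  # Called for every opening tag; attrs is a Hash of attribute names to values.
  def tag_start(name, attrs)
    @id_stack.push(attrs['id'])
    return unless name == 'Depth'

    @in_depth = true
    @buffer = String.new
  end

  # Called for character data; only text inside <Depth> is kept.
  def text(data)
    @buffer << data if @in_depth
  end

  # Called for every closing tag.
  def tag_end(name)
    if name == 'Depth'
      owner = @id_stack[0...-1].compact.last
      @depths[owner] = @buffer.to_f
      @in_depth = false
    end
    @id_stack.pop
  end
end

# Made-up sample input; the second solid is invented for illustration.
xml = <<~DOC
  <uos>
    <IfcExtrudedAreaSolid id="i1738">
      <Depth>120.</Depth>
    </IfcExtrudedAreaSolid>
    <IfcExtrudedAreaSolid id="i1739">
      <Depth>87.5</Depth>
    </IfcExtrudedAreaSolid>
  </uos>
DOC

listener = DepthListener.new
REXML::Document.parse_stream(xml, listener)
p listener.depths
```

No tree is built at all here, so the state that a DOM would give you for free (which element am I inside?) has to be tracked by hand, which is what the id stack does.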
You can also switch to another library for handling your XML. The most
popular one (at least to my knowledge) is Nokogiri
(http://nokogiri.org/), and it is a great deal faster than REXML.
| From | "Kyle X." <haebooty@yahoo.com> |
|---|---|
| Date | 2011-04-08 01:46 -0500 |
| Message-ID | <cc2b08835e493e882b186d680d3a428e@ruby-forum.com> |
| In reply to | #2500 |
Thanks for the info. I am going to try Nokogiri, if I can only figure
out how to get it to work in SketchUp. There is a surprising dearth of
information on the topic; I have spent a few hours searching online
without finding an answer. Does anyone know how?