| Started by | Mausam <mausamhere@gmail.com> |
|---|---|
| First post | 2012-01-17 07:03 -0800 |
| Last post | 2012-01-17 21:29 -0800 |
| Articles | 8 — 3 participants |
what is the bettter/performant way to compare org.w3c.dom.DocumentFragment Mausam <mausamhere@gmail.com> - 2012-01-17 07:03 -0800
Re: what is the bettter/performant way to compare org.w3c.dom.DocumentFragment Jeff Higgins <jeff@invalid.invalid> - 2012-01-17 12:32 -0500
Re: what is the bettter/performant way to compare org.w3c.dom.DocumentFragment Mausam <mausamhere@gmail.com> - 2012-01-17 17:56 -0800
Re: what is the bettter/performant way to compare org.w3c.dom.DocumentFragment Jeff Higgins <jeff@invalid.invalid> - 2012-01-17 21:33 -0500
Re: what is the bettter/performant way to compare org.w3c.dom.DocumentFragment Arne Vajhøj <arne@vajhoej.dk> - 2012-01-17 21:56 -0500
Re: what is the bettter/performant way to compare org.w3c.dom.DocumentFragment Arne Vajhøj <arne@vajhoej.dk> - 2012-01-17 18:38 -0500
Re: what is the bettter/performant way to compare org.w3c.dom.DocumentFragment Arne Vajhøj <arne@vajhoej.dk> - 2012-01-17 21:55 -0500
Re: what is the bettter/performant way to compare org.w3c.dom.DocumentFragment Mausam <mausamhere@gmail.com> - 2012-01-17 21:29 -0800
| From | Mausam <mausamhere@gmail.com> |
|---|---|
| Date | 2012-01-17 07:03 -0800 |
| Subject | what is the bettter/performant way to compare org.w3c.dom.DocumentFragment |
| Message-ID | <16090858.1983.1326812606093.JavaMail.geo-discussion-forums@prig11> |
I have a Java class that contains a DocumentFragment.

In the equals method of my class, I convert the DocumentFragment to a String and compare the Strings with equals.

I know this is not the best way because, for example, attributes can change order within an Element of the DocumentFragment, or the documents may differ only in the sequence of unordered elements.

In such cases this equality check will fail.

Please suggest a better approach.
| From | Jeff Higgins <jeff@invalid.invalid> |
|---|---|
| Date | 2012-01-17 12:32 -0500 |
| Message-ID | <jf4asm$g9h$1@dont-email.me> |
| In reply to | #11401 |
On 01/17/2012 10:03 AM, Mausam wrote:
> I have a java class, whose contains a DocumentFragment.
>
> In the equals method of my class, I am converting the DocumentFragment to a String and comparing an equals on the String.
>
> I know this is not the best way, because "attributes" e.g can change order in Element of DocumentFragment, or e.g documents differ only in the sequence of unordered elements.
>
> So in such cases this equality will fail.
>
> Please suggest a better approach.

A my class is equal to another my class if and only if ...
| From | Mausam <mausamhere@gmail.com> |
|---|---|
| Date | 2012-01-17 17:56 -0800 |
| Message-ID | <11466894.1272.1326851809373.JavaMail.geo-discussion-forums@prdh15> |
| In reply to | #11411 |
On Tuesday, 17 January 2012 23:02:29 UTC+5:30, Jeff Higgins wrote:
> On 01/17/2012 10:03 AM, Mausam wrote:
> > I have a java class, whose contains a DocumentFragment.
> >
> > In the equals method of my class, I am converting the DocumentFragment to a String and comparing an equals on the String.
> >
> > I know this is not the best way, because "attributes" e.g can change order in Element of DocumentFragment, or e.g documents differ only in the sequence of unordered elements.
> >
> > So in such cases this equality will fail.
> >
> > Please suggest a better approach.
> A my class is equal to another my class if and only if ...

Thanks Jeff, I understand what you mean.

BTW, I was checking the API http://docs.oracle.com/javase/1.5.0/docs/api/org/w3c/dom/Node.html#isEqualNode%28org.w3c.dom.Node%29

The attributes NamedNodeMaps are equal. This is: they are both null, or they have the same length and for each node that exists in one map there is a node that exists in the other map and is equal, although not necessarily at the same index.

The childNodes NodeLists are equal. This is: they are both null, or they have the same length and contain equal nodes at the same index. Note that normalization can affect equality; to avoid this, nodes should be normalized before being compared.

Here, for attributes, they take care of "NOT necessarily at the same index", but for childNodes that is not taken care of. So if there is a sequence of unordered elements (<emp/><dept/> and <dept/><emp/>), they will be treated as NOT equal.

So otherwise I have to iterate through each node and attribute and do the comparison myself. That's the fallback. But before that, I wanted to check with the experts whether there are better options.
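The behaviour quoted from the Javadoc above can be checked with a small JDK-only sketch. The class name IsEqualNodeDemo and the helpers fragmentOf and sameFragment are made up for illustration and are not from the thread; the sketch assumes the default JAXP parser that ships with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.xml.sax.InputSource;

public class IsEqualNodeDemo {

    // Hypothetical helper: parses a well-formed XML string and moves its
    // root element into a fresh DocumentFragment.
    static DocumentFragment fragmentOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            DocumentFragment fragment = doc.createDocumentFragment();
            // appendChild detaches the root from the document before adding it.
            fragment.appendChild(doc.getDocumentElement());
            return fragment;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Normalizes both fragments, as the Javadoc recommends, then compares them.
    static boolean sameFragment(String xml1, String xml2) {
        DocumentFragment f1 = fragmentOf(xml1);
        DocumentFragment f2 = fragmentOf(xml2);
        f1.normalize();
        f2.normalize();
        return f1.isEqualNode(f2);
    }

    public static void main(String[] args) {
        // Attribute order differs: isEqualNode ignores it.
        System.out.println(sameFragment("<a><b c='1' d='2'/></a>",
                                        "<a><b d='2' c='1'/></a>")); // true
        // Child order differs: isEqualNode compares children index by index.
        System.out.println(sameFragment("<a><emp/><dept/></a>",
                                        "<a><dept/><emp/></a>"));    // false
    }
}
```

So isEqualNode already covers the attribute-order case without going through a String, but not the case of reordered child elements.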
| From | Jeff Higgins <jeff@invalid.invalid> |
|---|---|
| Date | 2012-01-17 21:33 -0500 |
| Message-ID | <jf5ajc$arn$1@dont-email.me> |
| In reply to | #11445 |
On 01/17/2012 08:56 PM, Mausam wrote:
> On Tuesday, 17 January 2012 23:02:29 UTC+5:30, Jeff Higgins wrote:
>> On 01/17/2012 10:03 AM, Mausam wrote:
>>> I have a java class, whose contains a DocumentFragment.
>>>
>>> In the equals method of my class, I am converting the DocumentFragment to a String and comparing an equals on the String.
>>>
>>> I know this is not the best way, because "attributes" e.g can change order in Element of DocumentFragment, or e.g documents differ only in the sequence of unordered elements.
>>>
>>> So in such cases this equality will fail.
>>>
>>> Please suggest a better approach.
>> A my class is equal to another my class if and only if ...
>
> Thanks Jeff, I understand what you mean.
>
> BTW, I was checking the API http://docs.oracle.com/javase/1.5.0/docs/api/org/w3c/dom/Node.html#isEqualNode%28org.w3c.dom.Node%29
>
> The attributes NamedNodeMaps are equal. This is: they are both null, or they have the same length and for each node that exists in one map there is a node that exists in the other map and is equal, although not necessarily at the same index.
>
> The childNodes NodeLists are equal. This is: they are both null, or they have the same length and contain equal nodes at the same index. Note that normalization can affect equality; to avoid this, nodes should be normalized before being compared.
>
> Here for attributes, they take care of "NOT necessarily at the same index" but in case of childNodes its not being taken care of. So if there is a sequence of unordered elements (<emp/><dept/> and<dept/><emp/> ) they will be treated as NOT equal.
>
> So either I iterate through each node and attribute and do a comparison. That's the fall back. But before that, I wanted to check the experts if there are better options.

Yep. I based my hair trigger response upon the .equals(Object) of the "known implementing classes" of Node. Sorry.

I'll be interested in finding out the "cost" associated with Arne Vajhøj's response.
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-01-17 21:56 -0500 |
| Message-ID | <4f1634eb$0$287$14726298@news.sunsite.dk> |
| In reply to | #11447 |
On 1/17/2012 9:33 PM, Jeff Higgins wrote:
> On 01/17/2012 08:56 PM, Mausam wrote:
>> On Tuesday, 17 January 2012 23:02:29 UTC+5:30, Jeff Higgins wrote:
>>> On 01/17/2012 10:03 AM, Mausam wrote:
>>>> I have a java class, whose contains a DocumentFragment.
>>>>
>>>> In the equals method of my class, I am converting the
>>>> DocumentFragment to a String and comparing an equals on the String.
>>>>
>>>> I know this is not the best way, because "attributes" e.g can change
>>>> order in Element of DocumentFragment, or e.g documents differ only
>>>> in the sequence of unordered elements.
>>>>
>>>> So in such cases this equality will fail.
>>>>
>>>> Please suggest a better approach.
>>> A my class is equal to another my class if and only if ...
>>
>> Thanks Jeff, I understand what you mean.
>>
>> BTW, I was checking the API
>> http://docs.oracle.com/javase/1.5.0/docs/api/org/w3c/dom/Node.html#isEqualNode%28org.w3c.dom.Node%29
>>
>>
>> The attributes NamedNodeMaps are equal. This is: they are both null,
>> or they have the same length and for each node that exists in one map
>> there is a node that exists in the other map and is equal, although
>> not necessarily at the same index.
>>
>>
>> The childNodes NodeLists are equal. This is: they are both null, or
>> they have the same length and contain equal nodes at the same index.
>> Note that normalization can affect equality; to avoid this, nodes
>> should be normalized before being compared.
>>
>> Here for attributes, they take care of "NOT necessarily at the same
>> index" but in case of childNodes its not being taken care of. So if
>> there is a sequence of unordered elements (<emp/><dept/>
>> and<dept/><emp/> ) they will be treated as NOT equal.
>>
>> So either I iterate through each node and attribute and do a
>> comparison. That's the fall back. But before that, I wanted to check
>> the experts if there are better options.
>
> Yep. I based my hair trigger response upon the .equals(Object) of the
> "known implementing classes" of Node. Sorry. I'll be interested in
> finding out the "cost" associated with Arne Vajhøj's response.

The cost is CPU time.

It costs a bit of CPU time to parse, reorganize, and serialize again.

Arne
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-01-17 18:38 -0500 |
| Message-ID | <4f160666$0$289$14726298@news.sunsite.dk> |
| In reply to | #11401 |
On 1/17/2012 10:03 AM, Mausam wrote:
> I have a java class, whose contains a DocumentFragment.
>
> In the equals method of my class, I am converting the DocumentFragment to a String and comparing an equals on the String.
>
> I know this is not the best way, because "attributes" e.g can change order in Element of DocumentFragment, or e.g documents differ only in the sequence of unordered elements.
>
> So in such cases this equality will fail.

I think XML Canonicalization will solve the problem.

It comes at a cost though.

Arne
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-01-17 21:55 -0500 |
| Message-ID | <4f1634ad$0$287$14726298@news.sunsite.dk> |
| In reply to | #11435 |
On 1/17/2012 6:38 PM, Arne Vajhøj wrote:
> On 1/17/2012 10:03 AM, Mausam wrote:
>> I have a java class, whose contains a DocumentFragment.
>>
>> In the equals method of my class, I am converting the DocumentFragment
>> to a String and comparing an equals on the String.
>>
>> I know this is not the best way, because "attributes" e.g can change
>> order in Element of DocumentFragment, or e.g documents differ only in
>> the sequence of unordered elements.
>>
>> So in such cases this equality will fail.
>
> I think XML Canonicalization will solve the problem.
>
> It comes as a cost though.
Example:
import java.io.IOException;
import java.io.UnsupportedEncodingException;

import javax.xml.parsers.ParserConfigurationException;

import org.apache.xml.security.Init;
import org.apache.xml.security.c14n.CanonicalizationException;
import org.apache.xml.security.c14n.Canonicalizer;
import org.apache.xml.security.c14n.InvalidCanonicalizerException;
import org.xml.sax.SAXException;

public class XmlComp {
    // The Apache XML Security library must be initialized before use.
    static {
        Init.init();
    }

    // Returns the canonical form (C14N, comments omitted) of an XML string.
    private static String canonicalize(String s) throws
            InvalidCanonicalizerException, UnsupportedEncodingException,
            CanonicalizationException, ParserConfigurationException, IOException,
            SAXException {
        Canonicalizer c14n =
                Canonicalizer.getInstance(Canonicalizer.ALGO_ID_C14N_OMIT_COMMENTS);
        String res = new String(
                c14n.canonicalize(s.getBytes(Canonicalizer.ENCODING)),
                Canonicalizer.ENCODING);
        return res;
    }

    public static void main(String[] args) throws Exception {
        // Same content, different attribute order.
        String s1 = "<a><b c='1' d='2'/></a>";
        String s2 = "<a><b d='2' c='1'/></a>";
        System.out.println(s1);
        System.out.println(s2);
        System.out.println(canonicalize(s1));
        System.out.println(canonicalize(s2));
    }
}
outputs:
<a><b c='1' d='2'/></a>
<a><b d='2' c='1'/></a>
<a><b c="1" d="2"></b></a>
<a><b c="1" d="2"></b></a>
Arne
| From | Mausam <mausamhere@gmail.com> |
|---|---|
| Date | 2012-01-17 21:29 -0800 |
| Message-ID | <17325668.425.1326864585936.JavaMail.geo-discussion-forums@yqih6> |
| In reply to | #11448 |
Thanks Arne,
I can achieve that using the Node.isEqualNode(Node) API, available since JDK 1.5.
I am worried about the following use cases (and wondering whether they are even valid use cases):
1)
Are these two nodes equal? Note that one has an empty street element and the other has no street element. That implies that the value for street is empty in both cases. So, going by how an employee object would be treated in Java, both would be equal.
<Employee company="example" xmlns="http://example.com" debug="true">
<Employeename>mausam</Employeename>
<email>a @example.com</email>
<street/>
</Employee>
<Employee debug="true" company="example" xmlns="http://example.com">
<Employeename>mausam</Employeename>
<email>a @example.com</email>
</Employee>
2)
Check the position of the street element. In node 1 it comes after email, and in node 2 it comes before.
<Employee company="example" xmlns="http://example.com" debug="true">
<Employeename>mausam</Employeename>
<email>a @example.com</email>
<street>Marienplatz</street>
</Employee>
<Employee debug="true" company="example" xmlns="http://example.com">
<Employeename>mausam</Employeename>
<street>Marienplatz</street>
<email>a @example.com</email>
</Employee>
--
Please note that I cannot create Java objects from the XML, as these are free-form XML fragments that do not comply with a schema. But thanks a lot for your effort and code example.
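For the two use cases above, one possible fallback is the "iterate through each node and attribute" approach mentioned earlier in the thread. The following is only a sketch under stated assumptions, not a method anyone in the thread proposed: it treats every element's children as unordered, and it treats an element with no attributes and no content as absent. Both rules are application-specific and would be wrong for order-sensitive or mixed content. The names UnorderedXmlCompare, signature, childSignatures and equalIgnoringOrder are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class UnorderedXmlCompare {

    // Builds a string signature in which attribute order and child order do
    // not matter. Values are not escaped, so this is not collision-proof.
    static String signature(Node node) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            String text = node.getNodeValue().trim();
            return text.isEmpty() ? "" : "T(" + text + ")";
        }
        if (type == Node.ELEMENT_NODE) {
            List<String> attrs = new ArrayList<>();
            NamedNodeMap map = node.getAttributes();
            for (int i = 0; i < map.getLength(); i++) {
                Node a = map.item(i);
                attrs.add(a.getNodeName() + "=" + a.getNodeValue());
            }
            Collections.sort(attrs);
            List<String> kids = childSignatures(node);
            // Assumed rule: an element with no attributes and no content
            // counts as absent (use case 1).
            if (attrs.isEmpty() && kids.isEmpty()) {
                return "";
            }
            return "E(" + node.getNamespaceURI() + "|" + node.getLocalName()
                    + attrs + kids + ")";
        }
        if (type == Node.DOCUMENT_NODE || type == Node.DOCUMENT_FRAGMENT_NODE) {
            return "F" + childSignatures(node);
        }
        return ""; // comments, processing instructions, etc. are ignored
    }

    // Collects the non-empty signatures of all children and sorts them, so
    // that sibling order is irrelevant (use case 2).
    static List<String> childSignatures(Node parent) {
        List<String> kids = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            String s = signature(c);
            if (!s.isEmpty()) {
                kids.add(s);
            }
        }
        Collections.sort(kids);
        return kids;
    }

    // Convenience wrapper for well-formed XML strings; signature(Node) can
    // be called directly on a DocumentFragment instead.
    static boolean equalIgnoringOrder(String xml1, String xml2) {
        return signature(parse(xml1)).equals(signature(parse(xml2)));
    }

    private static Node parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String withEmptyStreet = "<Employee><email>a</email><street/></Employee>";
        String withoutStreet = "<Employee><email>a</email></Employee>";
        System.out.println(equalIgnoringOrder(withEmptyStreet, withoutStreet));
    }
}
```

A class holding a DocumentFragment could call signature(fragment) in equals and hash the same string in hashCode, which keeps the two methods consistent. The cost is a full tree walk plus sorting on every comparison, so caching the signature may be worthwhile if the fragment does not change.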