From: Yosi Izaq
Newsgroups: comp.lang.java.security
Subject: Re: X500Principal and UTF-16 encoded certificates
Date: Sun, 24 Apr 2011 02:21:36 -0700 (PDT)

On Apr 22, 6:35 pm, Daniele Futtorovic wrote:
> On 21/04/2011 17:27, Yosi Izaq allegedly wrote:
> > On Apr 21, 4:22 pm, Yosi Izaq wrote:
> >> Hi,
> >>
> >> I have a java application that parses certificates. It works perfectly
> >> for certificates that have their fields encoded in UTF-8.
> >> It doesn't work well for UTF-16 encoding. While debugging the problem
> >> I've found that getName(X500Principal.RFC2253) function returns the
> >> name with extra 0x00 bytes (as if it confuses the first byte of UTF-16
> >> to be a UTF-8 byte).
> >>
> >> I've also found in Java doc (http://download.oracle.com/javase/1.4.2/docs/api/javax/security/auth/x500/X500Principal.html#getName(java.lang.String)) that:
> >> "If "RFC2253" is specified as the format, this method emits the
> >> attribute type keywords defined in RFC 2253 (CN, L, ST, O, OU, C,
> >> STREET, DC, UID). Any other attribute type is emitted as an OID. Under
> >> a strict reading, RFC 2253 only specifies a UTF-8 string
> >> representation. The String returned by this method is the Unicode
> >> string achieved by decoding this UTF-8 representation."
> >> This is consistent with the behavior that I've observed.
> >>
> >> I would like to ask what are my options for correctly parsing the name
> >> value in accordance with RFC2253 when encoded in UTF-16?
> >>
> >> TIA,
> >> Yosi
> >
> > Just an update, rfc2253 (http://www.ietf.org/rfc/rfc2253.txt) states
> > its objective as "UTF-8 String Representation of Distinguished
> > Names". Clearly, the legacy code I'm dealing with didn't take this
> > into account.
> > I'm currently experimenting with rfc1779
> > (http://www.ietf.org/rfc/rfc1779.txt?number=1779) using all manner of
> > UTF-16 encoded certificate subjects.
> > Is there any specific reason why
> > X500Principal:getName(X500Principal.RFC2253) may be preferable to
> > X500Principal:getName(X500Principal.RFC1779)?
> >
> > 10x,
> > Yosi

My response inline.

Thanks,
Yosi

> I doubt your finding, for the very simple reason that
> X500Principal#getName returns a String, not a byte[]. So your extra null
> byte would have to come from whichever part it is that transforms the
> String to a byte[], or possibly from X500Principal#getEncoded().

I'm running the whole thing in the Eclipse debugger, and a watch
expression on X500Principal#getName shows the extra null, while a watch
expression on toString doesn't.
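
For what it's worth, here is a minimal sketch (not my actual code, which I can't reach right now) of how the same check could be done outside the debugger. The class name NameCheck, its helper methods and the sample DN are all made up for illustration. It dumps the UTF-16 code units of getName() for both formats so that a stray U+0000 becomes visible, and misdecode() shows how UTF-16BE bytes read one byte per character produce exactly this kind of 0x00 padding. That last part is only my guess at the cause, not something I have confirmed.

```java
import java.nio.charset.StandardCharsets;
import javax.security.auth.x500.X500Principal;

// Hypothetical diagnostic helper; names and sample data are illustrative only.
class NameCheck {

    /** Returns the index of the first U+0000 char in s, or -1 if there is none. */
    static int indexOfNul(String s) {
        return s.indexOf('\u0000');
    }

    /** Renders every char as a 4-digit hex code unit, separated by spaces. */
    static String hexDump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    /**
     * Simulates the suspected bug: encode as UTF-16BE, then decode the bytes
     * one byte per char (ISO-8859-1). ASCII text comes back interleaved with NULs.
     */
    static String misdecode(String s) {
        byte[] utf16 = s.getBytes(StandardCharsets.UTF_16BE);
        return new String(utf16, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Sample DN built from a string; a real test would load the UTF-16 certificate.
        X500Principal p = new X500Principal("CN=Test User,O=Example");
        String[] formats = {X500Principal.RFC2253, X500Principal.RFC1779};
        for (String fmt : formats) {
            String name = p.getName(fmt);
            System.out.println(fmt + ": first NUL at " + indexOfNul(name));
            System.out.println(hexDump(name));
        }
        System.out.println("misdecoded: " + hexDump(misdecode("Test")));
    }
}
```

Replacing the sample principal with the subject taken from one of the problem certificates should show whether the NULs are really in the String that getName() returns.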
In addition, when I try to serialize the object to the DB using JiBX, I
get an exception:

Error writing marshalled document
java.io.IOException: Illegal character code 0x0 in content text
    at org.jibx.runtime.impl.UTF8Escaper.writeContent(UTF8Escaper.java:128)
    at org.jibx.runtime.impl.GenericXMLWriter.writeTextContent(GenericXMLWriter.java:221)
    at org.jibx.runtime.impl.MarshallingContext.element(MarshallingContext.java:707)
    at my.app.im.certificate.Certificate.JiBX_jibxBinding_marshal_3_1(Certificate.java)
    at my.app.im.certificate.TrustCertificate.JiBX_jibxBinding_marshal_4_0(TrustCertificate.java)
    at my.app.im.certificate.JiBX_jibxBindingTrustCertificate_access.marshal()
    at my.app.im.certificate.TrustCertificate.marshal(TrustCertificate.java)
    at org.jibx.runtime.impl.MarshallingContext.marshalRoot(MarshallingContext.java:1044)
    at org.jibx.runtime.impl.MarshallingContext.marshalDocument(MarshallingContext.java:1070)
    at my.app.mgmt.replication.RuntimeXMLTranslator.generateObject(RuntimeXMLTranslator.java:261)
    at my.app.mgmt.configloader.ConfigLoader.loadObjectXML(ConfigLoader.java:657)
    at my.app.mgmt.configloader.ConfigLoader.loadAllImObjectXML(ConfigLoader.java:400)
    at my.app.mgmt.configloader.ConfigLoader.loadAllImXML(ConfigLoader.java:360)
    at my.app.mgmt.configloader.ConfigLoader.main(ConfigLoader.java:1494)

> The
> problem may also be with the input, i.e. when and if the X500Principal
> instance is created using the byte[] or java.io.InputStream c'tor.

If that's so, then why does toString() not return the extra null? And
why do I see (Eclipse debugger, watch on the X500Principal) that the
cert. subject is stored correctly?
I will double check the instantiation just to be sure (it will take
some time since I'm currently on PTO w/o access to the source).

> I would suggest you posted an SSCCE.

Thanks. I'll do that.

> AFAIK, there is no intrinsic reason to use RFC2253 over RFC1779,
> although the former appears to me more recently widespread.
> I would say
> it boils down to what the entity you communicate with (be it a library
> or a third party) understands.

It's JiBX, and according to its documentation it understands UTF-8.

> --
> DF.
> An escaped convict once said to me:
> "Alcatraz is the place to be"