From: Yosi Izaq <izaqyos@gmail.com>
Newsgroups: comp.lang.java.security
Subject: Re: X500Principal and UTF-16 encoded certificates
Date: Sun, 24 Apr 2011 02:21:36 -0700 (PDT)

On Apr 22, 6:35 pm, Daniele Futtorovic <da.futt.n...@laposte-dot-net.invalid> wrote:
> On 21/04/2011 17:27, Yosi Izaq allegedly wrote:
>
> > On Apr 21, 4:22 pm, Yosi Izaq <izaq...@gmail.com> wrote:
> >> Hi,
>
> >> I have a java application that parses certificates. It works perfectly
> >> for certificates that have their fields encoded in UTF-8.
> >> It doesn't work well for UTF-16 encoding. While debugging the problem
> >> I've found that getName(X500Principal.RFC2253) function returns the
> >> name with extra 0x00 bytes (as if it confuses the first byte of UTF-16
> >> to be a UTF-8 byte).
>
> >> I've also found in Java doc
> >> (http://download.oracle.com/javase/1.4.2/docs/api/javax/security/auth/x500/X500Principal.html#getName(java.lang.String))
> >> that:
> >> "If "RFC2253" is specified as the format, this method emits the
> >> attribute type keywords defined in RFC 2253 (CN, L, ST, O, OU, C,
> >> STREET, DC, UID). Any other attribute type is emitted as an OID. Under
> >> a strict reading, RFC 2253 only specifies a UTF-8 string
> >> representation. The String returned by this method is the Unicode
> >> string achieved by decoding this UTF-8 representation."
> >> This is consistent with the behavior that I've observed.
>
> >> I would like to ask what are my options for correctly parsing the name
> >> value in accordance with RFC2253 when encoded in UTF-16?
>
> >> TIA,
> >> Yosi
>
> > Just an update, rfc2253 (http://www.ietf.org/rfc/rfc2253.txt) states
> > its objective as "UTF-8 String Representation of Distinguished
> > Names". Clearly, the legacy code I'm dealing with didn't take this
> > into account.
> > I'm currently experimenting with rfc1779
> > (http://www.ietf.org/rfc/rfc1779.txt?number=1779) using all manner of
> > UTF-16 encoded certificate subjects.
> > Is there any specific reason why
> > X500Principal:getName(X500Principal.RFC2253) may be preferable to
> > X500Principal:getName(X500Principal.RFC1779)?
>
> > 10x,
> > Yosi

My response inline.
Thanks,
Yosi

> I doubt your finding, for the very simple reason that
> X500Principal#getName returns a String, not a byte[]. So your extra null
> byte would have to come from whichever part it is that transforms the
> String to a byte[], or possibly from X500Principal#getEncoded().
I'm running the whole thing in the Eclipse debugger. A watch
expression on X500Principal#getName shows the extra null, while a watch
expression on toString() doesn't.
In addition, when I try to serialize the object to the DB using JiBX, I
get an exception:
Error writing marshalled document
java.io.IOException: Illegal character code 0x0 in content text
	at org.jibx.runtime.impl.UTF8Escaper.writeContent(UTF8Escaper.java:128)
	at org.jibx.runtime.impl.GenericXMLWriter.writeTextContent(GenericXMLWriter.java:221)
	at org.jibx.runtime.impl.MarshallingContext.element(MarshallingContext.java:707)
	at my.app.im.certificate.Certificate.JiBX_jibxBinding_marshal_3_1(Certificate.java)
	at my.app.im.certificate.TrustCertificate.JiBX_jibxBinding_marshal_4_0(TrustCertificate.java)
	at my.app.im.certificate.JiBX_jibxBindingTrustCertificate_access.marshal()
	at my.app.im.certificate.TrustCertificate.marshal(TrustCertificate.java)
	at org.jibx.runtime.impl.MarshallingContext.marshalRoot(MarshallingContext.java:1044)
	at org.jibx.runtime.impl.MarshallingContext.marshalDocument(MarshallingContext.java:1070)
	at my.app.mgmt.replication.RuntimeXMLTranslator.generateObject(RuntimeXMLTranslator.java:261)
	at my.app.mgmt.configloader.ConfigLoader.loadObjectXML(ConfigLoader.java:657)
	at my.app.mgmt.configloader.ConfigLoader.loadAllImObjectXML(ConfigLoader.java:400)
	at my.app.mgmt.configloader.ConfigLoader.loadAllImXML(ConfigLoader.java:360)
	at my.app.mgmt.configloader.ConfigLoader.main(ConfigLoader.java:1494)
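
To illustrate how a String can end up with embedded 0x0 characters like the one JiBX rejects above, here is a minimal sketch. It assumes the text was UTF-16BE bytes that some step decoded as UTF-8; it is not a claim about what X500Principal does internally. The class NulDemo and the method countNuls are hypothetical names used only for this example.

```java
import java.nio.charset.StandardCharsets;

final class NulDemo {
    /** Counts the NUL (code 0) characters in a String. */
    static int countNuls(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '\0') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "Yosi" as UTF-16BE is 00 59 00 6F 00 73 00 69.
        byte[] utf16 = "Yosi".getBytes(StandardCharsets.UTF_16BE);

        // Wrong charset: every 00 byte is a valid one-byte UTF-8 sequence
        // and becomes a NUL character in the resulting String.
        String wrong = new String(utf16, StandardCharsets.UTF_8);
        // Matching charset: no NULs appear.
        String right = new String(utf16, StandardCharsets.UTF_16BE);

        System.out.println(countNuls(wrong)); // prints 4
        System.out.println(countNuls(right)); // prints 0
    }
}
```

If the string returned by getName shows this every-other-character pattern, that would point at a charset mismatch somewhere between the certificate bytes and the String.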

> The
> problem may also be with the input, i.e. when and if the X500Principal
> instance is created using the byte[] or java.io.InputStream c'tor.
If that's so, then why doesn't toString() return the extra null? And
why do I see (in the Eclipse debugger, watching the X500Principal) that
the cert. subject is stored correctly?
I will double-check the instantiation just to be sure (it will take
some time since I'm currently on PTO without access to the source).

>
> I would suggest you posted an SSCCE <http://sscce.org/>.
Thanks. I'll do that.
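
A possible skeleton for such an SSCCE, as a sketch only: it loads a certificate from a file given on the command line and prints the subject in each format as hex code units, so a stray 0000 is visible regardless of how a debugger renders the string. The class SubjectDump and its methods are hypothetical names. With no argument it falls back to a principal built from a plain string, which is an assumption made just to keep the example runnable.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import javax.security.auth.x500.X500Principal;

final class SubjectDump {
    /** Renders each char of s as a 4-digit hex code unit, separated by spaces. */
    static String toCodeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    /** Prints the principal's name in each format, plus toString(). */
    static void dump(X500Principal p) {
        System.out.println("RFC2253:  " + toCodeUnits(p.getName(X500Principal.RFC2253)));
        System.out.println("RFC1779:  " + toCodeUnits(p.getName(X500Principal.RFC1779)));
        System.out.println("toString: " + toCodeUnits(p.toString()));
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // Fallback for a quick run without a certificate file.
            dump(new X500Principal("CN=Test, O=Example"));
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            CertificateFactory cf = CertificateFactory.getInstance("X.509");
            X509Certificate cert = (X509Certificate) cf.generateCertificate(in);
            dump(cert.getSubjectX500Principal());
        }
    }
}
```

Running it against one UTF-8 encoded and one UTF-16 encoded certificate should show whether the null comes out of getName itself or from a later step.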

>
> AFAIK, there is no intrinsic reason to use RFC2253 over RFC1779,
> although the former appears to me more recently widespread. I would say
> it boils down to what the entity you communicate with (be it a library
> or a third party) understands.
It's JiBX, and according to its documentation it understands UTF-8.
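
One caveat worth sketching: the character 0x0 is not allowed in XML 1.0 text in any encoding, so UTF-8 support alone would not make a string with embedded nulls marshallable. The following is a stopgap sketch, not a fix for the root cause: it drops every code point outside the XML 1.0 Char production before the value reaches the marshaller. The class XmlText and its methods are hypothetical names and are not part of JiBX.

```java
final class XmlText {
    /** True if cp is allowed by the XML 1.0 Char production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns s with every code point that XML 1.0 forbids removed. */
    static String stripInvalidXmlChars(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (isXmlChar(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String subject = "CN=\0Y\0o\0s\0i";
        System.out.println(stripInvalidXmlChars(subject)); // prints CN=Yosi
    }
}
```

Stripping the nulls only hides the symptom; if they come from a charset mismatch, non-ASCII subjects would still be mangled.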

>
> --
> DF.
> An escaped convict once said to me:
> "Alcatraz is the place to be"