From: Daniele Futtorovic
Newsgroups: comp.lang.java.security
Subject: Re: X500Principal and UTF-16 encoded certificates
Date: Fri, 22 Apr 2011 17:35:56 +0200

On 21/04/2011 17:27, Yosi Izaq allegedly wrote:
> On Apr 21, 4:22 pm, Yosi Izaq wrote:
>> Hi,
>>
>> I have a java application that parses certificates. It works perfectly
>> for certificates that have their fields encoded in UTF-8.
>> It doesn't work well for UTF-16 encoding. While debugging the problem
>> I've found that getName(X500Principal.RFC2253) function returns the
>> name with extra 0x00 bytes (as if it confuses the first byte of UTF-16
>> to be a UTF-8 byte).
>>
>> I've also found in Java doc
>> (http://download.oracle.com/javase/1.4.2/docs/api/javax/security/auth/x500/X500Principal.html#getName(java.lang.String))
>> that:
>> "If "RFC2253" is specified as the format, this method emits the
>> attribute type keywords defined in RFC 2253 (CN, L, ST, O, OU, C,
>> STREET, DC, UID). Any other attribute type is emitted as an OID. Under
>> a strict reading, RFC 2253 only specifies a UTF-8 string
>> representation.
>> The String returned by this method is the Unicode
>> string achieved by decoding this UTF-8 representation."
>> This is consistent with the behavior that I've observed.
>>
>> I would like to ask what are my options for correctly parsing the name
>> value in accordance with RFC2253 when encoded in UTF-16?
>>
>> TIA,
>> Yosi
>
> Just an update, rfc2253 (http://www.ietf.org/rfc/rfc2253.txt) states
> it's objective as "UTF-8 String Representation of Distinguished
> Names". Clearly, the legacy code I'm dealing with didn't take this
> into account.
> I'm currently experimenting with rfc1779
> (http://www.ietf.org/rfc/rfc1779.txt?number=1779) using all manner of
> UTF-16 encoded certificate subjects.
> Is there any specific reason why
> X500Principal:getName(X500Principal.RFC2253) may be preferable to
> X500Principal:getName(X500Principal.RFC1779)?
>
> 10x,
> Yosi

I doubt your finding, for the very simple reason that
X500Principal#getName returns a String, not a byte[]. So your extra null
bytes would have to come from whichever part of your code transforms the
String into a byte[], or possibly from X500Principal#getEncoded(). The
problem may also lie with the input, i.e. when and if the X500Principal
instance is created using the byte[] or java.io.InputStream constructor.

I would suggest you post an SSCCE.

AFAIK, there is no intrinsic reason to use RFC2253 over RFC1779, although
the former appears to me to be the more widespread of late. I would say
it boils down to what the entity you communicate with (be it a library
or a third party) understands.

--
DF.
An escaped convict once said to me:
"Alcatraz is the place to be"
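
As a rough illustration of the String-versus-byte[] point made in the reply, here is a minimal sketch; it is not the original poster's code. It assumes an ASCII-only subject built from a string rather than read from a real certificate; the class name PrincipalNameDemo, the helper countZeroBytes and the sample name are all made up for the example. It uses StandardCharsets, so it needs Java 7 or later.

```java
import java.nio.charset.StandardCharsets;
import javax.security.auth.x500.X500Principal;

// Hypothetical demo class: shows that getName() yields a String and that
// 0x00 bytes only show up once that String is encoded as UTF-16.
class PrincipalNameDemo {

    // Count the 0x00 bytes in an encoded form of the name.
    static int countZeroBytes(byte[] bytes) {
        int zeros = 0;
        for (byte b : bytes) {
            if (b == 0) {
                zeros++;
            }
        }
        return zeros;
    }

    public static void main(String[] args) {
        // Assumed sample subject; a real test would take it from a certificate.
        X500Principal principal =
                new X500Principal("CN=Duke, OU=JavaSoft, O=Sun Microsystems, C=US");

        // Both calls return a String; no charset is involved at this point.
        String rfc2253 = principal.getName(X500Principal.RFC2253);
        String rfc1779 = principal.getName(X500Principal.RFC1779);
        System.out.println("RFC2253: " + rfc2253);
        System.out.println("RFC1779: " + rfc1779);

        // The charset only matters when the String is turned into bytes.
        byte[] utf8 = rfc2253.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = rfc2253.getBytes(StandardCharsets.UTF_16BE);
        System.out.println("zero bytes in UTF-8 form:    " + countZeroBytes(utf8));
        System.out.println("zero bytes in UTF-16BE form: " + countZeroBytes(utf16));
    }
}
```

For an ASCII-only name the UTF-8 form contains no zero bytes, while the UTF-16BE form carries one 0x00 per character. If the extra bytes in the question follow that pattern, the String-to-byte[] step (or the code reading those bytes back as UTF-8) would be the first place to look.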