
Re: Ldap module and base64 oncoding

From Michael Ströder <michael@stroeder.com>
Newsgroups comp.lang.python
Subject Re: Ldap module and base64 oncoding
Date 2013-05-26 21:48 +0200
Organization A noiseless patient Spider
Message-ID <kntomp$jsn$1@dont-email.me> (permalink)
References <knt87q$cm3$1@dont-email.me> <mailman.2178.1369585225.3114.python-list@python.org>



Joseph L. Casale wrote:
>> I'm not sure what exactly you're asking for.
>> Especially "is not being interpreted as a string requiring base64 encoding" is
>> written without giving the right context.
>>
>> So I'm just guessing that this might be the usual misunderstandings with use
>> of base64 in LDIF. Read more about when LDIF requires base64-encoding here:
>>
>> http://tools.ietf.org/html/rfc2849
>>
>> To me everything looks right:
>>
>> Python 2.7.3 (default, Apr 14 2012, 08:58:41) [GCC] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> 'ZGV0XDMzMTB3YmJccGc='.decode('base64').decode('utf-8')
>> u'det\\3310wbb\\pg'
>> >>>
>>
>> What do you think is a problem?
> 
> Thanks for the reply. The issues, I am sure, are in my code. I read the LDIF source file and end up
> with values such as 'det\3310wbb\pg' after the base64-encoded entries are decoded.
> 
> The problem I am having is when I add this to an add/mod entry list and write it back out.
> As it does not get re-encoded to base64, the LDIF file ends up with a text entry containing a ^]
> character. If I re-read the file with the parser, that character causes the handle method to break
> midway through the entry dict, and so the last half reappears disjoint, without a dn.
> 
> Like I said, I am pretty sure it's my poor understanding of decoding and encoding.
> I am using the build from http://www.lfd.uci.edu/~gohlke/pythonlibs/ on a Windows
> 2008 R2 server.
> 
> I have re-implemented handle to create a cidict holding all the dn/entry pairs that are parsed, as
> I then perform some processing such as manipulating attribute values in the entry dict. I
> am pretty sure I am breaking things here. The data I am reading comes from utf-16-le
> encoded files and has Unicode characters, as the source directory is globally available, being
> written to in just about every country.

Processing LDIF is one thing, doing LDAP operations another.

LDIF itself is meant to be ASCII-clean. But each attribute value can carry any
byte sequence (e.g. attribute 'jpegPhoto'). The ldif module does no further
processing of values: it simply returns byte sequences.
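
As an illustration of when LDIF requires base64, here is a minimal sketch of the SAFE-STRING rule from RFC 2849, written for Python 3. The names needs_base64 and ldif_line are hypothetical helpers for this example and are not part of the ldif module; line folding is left out.

```python
import base64

# Bytes that RFC 2849 forbids as the first byte of an unencoded value:
# NUL, LF, CR, SPACE, colon, less-than.
UNSAFE_INIT = b"\x00\n\r :<"
# Bytes that are forbidden anywhere in an unencoded value.
UNSAFE_ANY = b"\x00\n\r"


def needs_base64(value: bytes) -> bool:
    """Return True if value cannot be written as an RFC 2849 SAFE-STRING."""
    if not value:
        return False  # an empty value needs no encoding
    if value[0] in UNSAFE_INIT:
        return True
    if value.endswith(b" "):
        return True  # RFC 2849 says such values SHOULD be base64-encoded
    # Any byte above 127, or an embedded NUL, LF or CR, forces base64.
    return any(b > 127 or b in UNSAFE_ANY for b in value)


def ldif_line(attr: str, value: bytes) -> str:
    """Format one attribute line (hypothetical helper, no line folding)."""
    if needs_base64(value):
        return attr + ":: " + base64.b64encode(value).decode("ascii")
    return attr + ": " + value.decode("ascii")
```

Under this grammar a control byte such as 0x1D still counts as a safe character, so a value containing it would be written as plain text rather than base64.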

The access protocol LDAPv3 mandates UTF-8 encoding for Unicode strings on the
wire if attribute syntax is DirectoryString, IA5String (mainly ASCII) or similar.

So if your LDIF input returns UTF-16 encoded attribute values for e.g.
attribute 'cn' or 'o', or for any other attribute not of OctetString or Binary
syntax, something's wrong with the producer of the LDIF data.
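
If a value really does arrive as UTF-16-LE bytes, one possible repair is to transcode it to UTF-8 before using it in an LDAP operation. This is only a sketch for Python 3, and utf16le_to_utf8 is a hypothetical helper name; it assumes the bytes are valid UTF-16-LE.

```python
def utf16le_to_utf8(raw: bytes) -> bytes:
    """Re-encode a UTF-16-LE byte string as UTF-8 (hypothetical helper)."""
    # Decode to Unicode text first, then encode in the form LDAPv3 expects.
    return raw.decode("utf-16-le").encode("utf-8")


# Example value as it might come from a UTF-16-LE source.
source_value = "Ströder".encode("utf-16-le")
wire_value = utf16le_to_utf8(source_value)
```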

> Is there a process for manipulating/adding data to the entry dict before I write it out that I
> should adhere to? For example, if I am adding a new attribute to be composed of part of
> another parsed attr for use in a modlist:
> 
>   {'customAttr': ['foo.{}.bar'.format(entry['uid'])]}
> 
> By looking at the value from above, 'det\3310wbb\pg', I gather the entry dict was parsed
> into byte strings. Should I have decoded this, whereas some of the data is Unicode and
> as such I should have encoded it?

I wonder what the string really is. At least the base64-encoding you provided
before decodes as UTF-8 but I'm not sure whether it's the right sequence of
Unicode code points you're expecting.

>>> 'ZGV0XDMzMTB3YmJccGc='.decode('base64').decode('utf-8')
u'det\\3310wbb\\pg'
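
The session above is Python 2. A rough Python 3 equivalent, assuming only the standard base64 module, would be the following sketch. It also counts the backslashes to show that the decoded text holds two literal backslash characters rather than escape sequences.

```python
import base64

# Decode the base64 payload to bytes, then interpret the bytes as UTF-8.
decoded = base64.b64decode("ZGV0XDMzMTB3YmJccGc=").decode("utf-8")

# Count the literal backslash characters in the decoded text.
backslashes = decoded.count("\\")
```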

I still can't figure out what you're really doing though. I'd recommend
stripping your operations down to a very simple test snippet that illustrates
the issue and posting that here.

Ciao, Michael.



