| Path | csiph.com!usenet.pasdenom.info!gegeweb.org!eternal-september.org!feeder.eternal-september.org!mx05.eternal-september.org!.POSTED!not-for-mail |
|---|---|
| From | Michael Ströder <michael@stroeder.com> |
| Newsgroups | comp.lang.python |
| Subject | Re: Ldap module and base64 oncoding |
| Date | Sun, 26 May 2013 21:48:38 +0200 |
| Organization | A noiseless patient Spider |
| Lines | 74 |
| Message-ID | <kntomp$jsn$1@dont-email.me> |
| References | <knt87q$cm3$1@dont-email.me> <mailman.2178.1369585225.3114.python-list@python.org> |
| Mime-Version | 1.0 |
| Content-Type | text/plain; charset=ISO-8859-1 |
| Content-Transfer-Encoding | 7bit |
| Injection-Date | Sun, 26 May 2013 19:44:25 +0000 (UTC) |
| Injection-Info | mx05.eternal-september.org; posting-host="854464e21e8cb84c218111ee10e8b116"; logging-data="20375"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+QFiMkAL8gMvWOJJWh5EVZV5qu/ylLqXE=" |
| User-Agent | Mozilla/5.0 (X11; Linux x86_64; rv:20.0) Gecko/20100101 Firefox/20.0 SeaMonkey/2.17.1 |
| In-Reply-To | <mailman.2178.1369585225.3114.python-list@python.org> |
| Cancel-Lock | sha1:KM0A6oxSl/ntJHU0vj8sTlvrRRw= |
| Xref | csiph.com comp.lang.python:46108 |
Joseph L. Casale wrote:
>> I'm not sure what exactly you're asking for.
>> Especially "is not being interpreted as a string requiring base64 encoding" is
>> written without giving the right context.
>>
>> So I'm just guessing that this might be the usual misunderstandings with use
>> of base64 in LDIF. Read more about when LDIF requires base64-encoding here:
>>
>> http://tools.ietf.org/html/rfc2849
>>
>> To me everything looks right:
>>
>> Python 2.7.3 (default, Apr 14 2012, 08:58:41) [GCC] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> 'ZGV0XDMzMTB3YmJccGc='.decode('base64').decode('utf-8')
>> u'det\\3310wbb\\pg'
>> >>>
>>
>> What do you think is a problem?
>
> Thanks for the reply. The issues, I am sure, are in my code. I read the LDIF source file and end up
> with values such as 'det\3310wbb\pg' after the base64-encoded entries are decoded.
>
> The problem I am having is when I add this to an add/mod entry list and write it back out.
> As it does not get re-encoded to base64, the LDIF file ends up containing a text entry with a ^]
> character. When I re-read it with the parser, that causes the handle method to break midway
> through the entry dict, and so the last half re-appears disjoint without a dn.
>
> Like I said, I am pretty sure it's my poor understanding of decoding and encoding.
> I am using the build from http://www.lfd.uci.edu/~gohlke/pythonlibs/ on a Windows
> 2008 R2 server.
>
> I have re-implemented handle to create a cidict holding all the dn/entry pairs that are parsed, as
> I then perform some processing such as manipulating attribute values in the entry dict. I
> am pretty sure I am breaking things here. The data I am reading comes from utf-16-le
> encoded files and has Unicode characters, as the source directory is globally available, being
> written to in just about every country.
Processing LDIF is one thing, doing LDAP operations is another.

LDIF itself is meant to be ASCII-clean. But each attribute value can carry any
byte sequence (e.g. attribute 'jpegPhoto'). There's no further processing by
module LDIF - it simply returns byte sequences.

The access protocol LDAPv3 mandates UTF-8 encoding for Unicode strings on the
wire if the attribute syntax is DirectoryString, IA5String (mainly ASCII) or similar.

So if your LDIF input returns UTF-16 encoded attribute values for e.g.
attribute 'cn' or 'o' or another attribute not being of OctetString or Binary
syntax, something's wrong with the producer of the LDIF data.
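
To illustrate the point about encodings, here is a minimal Python 3 sketch (not the python-ldap code itself). It transcodes a value that came from a UTF-16-LE source into the UTF-8 bytes LDAPv3 expects, and then decides, following the SAFE-STRING rule of RFC 2849, whether the value can be written as plain "attr: value" or has to be written as "attr:: base64". The helper names needs_base64 and ldif_line and the sample values are assumptions made up for this example.

```python
import base64


def needs_base64(value: bytes) -> bool:
    """Return True if value is not an RFC 2849 SAFE-STRING."""
    if not value:
        return False  # an empty value is a valid SAFE-STRING
    # SAFE-INIT-CHAR excludes NUL, LF, CR, space, ':' and '<'.
    if value[0] in b"\x00\n\r :<":
        return True
    # Values ending in a space should be base64-encoded as well.
    if value.endswith(b" "):
        return True
    # SAFE-CHAR is any byte <= 0x7F except NUL, LF and CR.
    return any(b > 0x7F or b in (0x00, 0x0A, 0x0D) for b in value)


def ldif_line(attr_type: str, value: bytes) -> str:
    """Format one attribute value as a single (unfolded) LDIF line."""
    if needs_base64(value):
        return "{}:: {}".format(attr_type, base64.b64encode(value).decode("ascii"))
    return "{}: {}".format(attr_type, value.decode("ascii"))


# Hypothetical value as it might sit in a UTF-16-LE encoded source file.
raw_utf16 = "Str\u00f6der".encode("utf-16-le")

# Decode with the source encoding, then re-encode as UTF-8 for LDAP/LDIF.
utf8_value = raw_utf16.decode("utf-16-le").encode("utf-8")

print(ldif_line("cn", utf8_value))    # non-ASCII bytes, so the '::' form is used
print(ldif_line("uid", b"michael"))   # plain ASCII, written as is
```

Under this rule a control character such as ^] (0x1D) still counts as a SAFE-CHAR, so its presence alone does not force base64 encoding.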
> Is there a process for manipulating/adding data to the entry dict before I write it out that I
> should adhere to? For example, if I am adding a new attribute to be composed of part of
> another parsed attr for use in a modlist:
>
> {'customAttr': ['foo.{}.bar'.format(entry['uid'])]}
>
> By looking at the value from above, 'det\3310wbb\pg', I gather the entry dict was parsed
> into byte strings. I should have decoded this, whereas some of the data is Unicode and
> as such I should have encoded it?
I wonder what the string really is. At least the base64-encoding you provided
before decodes as UTF-8 but I'm not sure whether it's the right sequence of
Unicode code points you're expecting.
>>> 'ZGV0XDMzMTB3YmJccGc='.decode('base64').decode('utf-8')
u'det\\3310wbb\\pg'
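
A rough Python 3 sketch of the same check, followed by the decode, modify, encode cycle for a derived attribute. It assumes the parsed entry maps attribute names to lists of byte strings; the entry contents and the helper name build_custom_attr are hypothetical and only serve the illustration.

```python
import base64

# Same base64 string as in the session above, decoded the Python 3 way.
raw = base64.b64decode("ZGV0XDMzMTB3YmJccGc=")  # bytes
text = raw.decode("utf-8")                       # str
print(repr(text))

# Hypothetical parsed entry: each attribute holds a list of byte strings.
entry = {"uid": [b"jdoe"]}


def build_custom_attr(entry):
    """Derive a new attribute from the first 'uid' value of an entry."""
    # Pick one value from the list and decode it before doing string work.
    uid = entry["uid"][0].decode("utf-8")
    value = "foo.{}.bar".format(uid)
    # Encode back to UTF-8 bytes before handing it to LDIF or LDAP code.
    return {"customAttr": [value.encode("utf-8")]}


print(build_custom_attr(entry))
```

Note that 'foo.{}.bar'.format(entry['uid']) formats the whole list of values, not a single value, which is why the sketch indexes into the list first.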
I still can't figure out what you're really doing though. I'd recommend
stripping your operations down to a very simple test code snippet that
illustrates the issue and posting that here.

Ciao, Michael.