


Re: Coding challenge: Optimise a custom string encoding

Date Tue, 19 Aug 2014 09:28:08 +1000
Subject Re: Coding challenge: Optimise a custom string encoding
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python



On Tue, Aug 19, 2014 at 5:16 AM, Alex Willmer <alex@moreati.org.uk> wrote:
> Back story:
> Last week we needed a custom encoding to store Unicode usernames in a config file that only allowed mixed-case ASCII, digits, underscore, dash, at-sign and plus sign. We also wanted to keep the encoded usernames somewhat human readable.
>

If you can drop the "somewhat human readable" requirement, this fits
perfectly into a Base 64 encoding. All you need to do is this:

>>> import base64
>>> base64.b64encode("alic€123".encode(),b"+@").replace(b'=',b'-')
b'YWxpY+KCrDEyMw--'


The second argument replaces the usual + and / (the last two characters
of the Base 64 alphabet) with + and @. (The final replace() is needed
because Python's b64encode doesn't allow customization of the padding
character. Alternatively, you could simply rstrip() the padding, and
reinstate it when decoding by rounding the encoded length up to a
multiple of four characters.)
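
The rstrip() alternative might look something like the sketch below. The
function names encode_stripped and decode_stripped are made up for
illustration; the only library calls are base64.b64encode and
base64.b64decode with their altchars argument. It assumes Python 3 and
UTF-8 as the text encoding.

```python
import base64

ALTCHARS = b"+@"  # stands in for the usual "+" and "/"


def encode_stripped(name):
    """Encode a username and drop the trailing '=' padding (hypothetical helper)."""
    encoded = base64.b64encode(name.encode("utf-8"), ALTCHARS)
    return encoded.rstrip(b"=").decode("ascii")


def decode_stripped(token):
    """Restore the padding, then decode (hypothetical helper)."""
    data = token.encode("ascii")
    # Base 64 output is always a multiple of four characters long,
    # so pad back up to the next multiple of four.
    data += b"=" * (-len(data) % 4)
    return base64.b64decode(data, ALTCHARS).decode("utf-8")
```

With this variant no padding character appears in the config file at all,
at the cost of having to recompute it on the way back in.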

Decoding is, obviously, the reverse:

>>> base64.b64decode(_.replace(b'-',b'='),b"+@").decode()
'alic€123'
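
As a rough sanity check, the two one-liners can be wrapped in functions
and run over a few sample names to confirm that they round-trip and that
the output stays inside the permitted character set. The names
encode_dashed, decode_dashed and ALLOWED are invented for this sketch,
and the sample usernames are arbitrary; Python 3 and UTF-8 are assumed.

```python
import base64
import string

# Characters the config file accepts, per the original question:
# mixed-case ASCII letters, digits, underscore, dash, at-sign, plus sign.
ALLOWED = set(string.ascii_letters + string.digits + "_-@+")


def encode_dashed(name):
    """Base 64 with '+@' as the last two characters and '-' as padding (hypothetical helper)."""
    encoded = base64.b64encode(name.encode("utf-8"), b"+@")
    return encoded.replace(b"=", b"-").decode("ascii")


def decode_dashed(token):
    """Reverse of encode_dashed (hypothetical helper)."""
    data = token.encode("ascii").replace(b"-", b"=")
    return base64.b64decode(data, b"+@").decode("utf-8")


for sample in ["alic\u20ac123", "bob", "\u00e9\u00e8"]:
    token = encode_dashed(sample)
    assert set(token) <= ALLOWED      # only permitted characters are produced
    assert decode_dashed(token) == sample  # decoding recovers the input
```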

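This was done in Python 3, not Python 2. I expect it'll work the same
way in 2.7, provided the input is a unicode string (u"...") and the
encoding is given explicitly, e.g. .encode("utf-8").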

ChrisA


