| Header | Value |
|---|---|
| From | Terry Reedy <tjreedy@udel.edu> |
| Subject | Re: Coding challenge: Optimise a custom string encoding |
| Date | Mon, 18 Aug 2014 16:16:26 -0400 |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.13113.1408393206.18130.python-list@python.org> (permalink) |
On 8/18/2014 3:16 PM, Alex Willmer wrote:
> A challenge, just for fun. Can you speed up this function?
You should give a specification here, with examples. You should perhaps
be using .maketrans and .translate.
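To make the .translate suggestion concrete, here is a minimal sketch. It assumes Python 3, where str.translate accepts a dict that maps code points to replacement strings of any length (the quoted code targets Python 2.7, whose byte-string translate only maps one byte to one byte). The names SAFE and TABLE are illustrative, not from the original post, and no speed claim is made; it would need measuring with timeit.

```python
import string

# Characters that pass through unchanged (same set as in the quoted post).
SAFE = string.ascii_letters + string.digits + '@_-'

# Map every byte value 0..255 either to itself or to '+xx' (two hex digits).
# In Python 3, str.translate accepts a dict of code point -> replacement string.
TABLE = {i: (chr(i) if chr(i) in SAFE else '+%02x' % i) for i in range(256)}


def plus_encode(s):
    """Python 3 sketch: UTF-8 encode, then escape every unsafe byte as +xx."""
    # latin-1 maps each byte 0..255 to the code point with the same value,
    # so translate() sees exactly one character per UTF-8 byte.
    return s.encode('utf-8').decode('latin-1').translate(TABLE)


print(plus_encode('alic\u20ac123'))  # prints alic+e2+82+ac123
```

Note that a literal '+' is itself escaped (as +2b), so a '+' in the output always introduces an escape.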
> import string
>
> charset = set(string.ascii_letters + string.digits + '@_-')
> byteseq = [chr(i) for i in xrange(256)]
> bytemap = {byte: byte if byte in charset else '+' + byte.encode('hex')
>            for byte in byteseq}
>
> def plus_encode(s):
>     """Encode a unicode string with only ascii letters, digits, _, -, @, +
>     """
>     bytemap_ = bytemap  # local alias avoids a global lookup per byte
>     s_utf8 = s.encode('utf-8')
>     return ''.join([bytemap_[byte] for byte in s_utf8])
>
> On my machine (Ubuntu 14.04, CPython 2.7.6, PyPy 2.2.1) this gets
>
> alex@martha:~$ python -m timeit -s 'import plus_encode' 'plus_encode.plus_encode(u"""qwertyuiop1234567890!"£$%^&*()€""")'
> 100000 loops, best of 3: 2.96 usec per loop
>
> alex@martha:~$ pypy -m timeit -s 'import plus_encode' 'plus_encode.plus_encode(u"""qwertyuiop1234567890!"£$%^&*()€""")'
> 1000000 loops, best of 3: 1.24 usec per loop
>
> Back story:
> Last week we needed a custom encoding to store unicode usernames in a config file that only allowed mixed case ascii, digits, underscore, dash, at-sign and plus sign. We also wanted to keep the encoded usernames somewhat human readable.
>
> My design was utf-8 and a variant of %-escaping, using the plus symbol. So u'alic€123' would be encoded as b'alic+e2+82+ac123'. This evening as a learning exercise I've tried to make it fast. This is the result.
>
> This challenge is just for fun. The chosen solution ended up being
>
> def name_encode(s):
>     return '%s_%s' % (s.encode('utf-8').encode('hex'),
>                       re.sub('[^A-Za-z0-9]', '', s))
>
> Regards, Alex
>
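As an illustration of what a specification with examples might look like for the quoted design, here is a sketch in Python 3. The encoder restates the quoted algorithm byte by byte; plus_decode is a hypothetical inverse that does not appear in the original post and is included only to show that the encoding round-trips.

```python
import re
import string

# Byte values that pass through unchanged (same set as in the quoted post).
SAFE_BYTES = frozenset((string.ascii_letters + string.digits + '@_-').encode('ascii'))


def plus_encode(s):
    """UTF-8 encode s, writing each unsafe byte as '+' and two hex digits."""
    return ''.join(chr(b) if b in SAFE_BYTES else '+%02x' % b
                   for b in s.encode('utf-8'))


def plus_decode(encoded):
    """Hypothetical inverse: turn each +xx back into a byte, then decode UTF-8."""
    raw = re.sub(rb'\+([0-9a-f]{2})',
                 lambda m: bytes([int(m.group(1), 16)]),
                 encoded.encode('ascii'))
    return raw.decode('utf-8')


# Specification by example.
assert plus_encode('alice') == 'alice'                   # safe characters unchanged
assert plus_encode('a b') == 'a+20b'                     # space is escaped
assert plus_encode('a+b') == 'a+2bb'                     # '+' itself is escaped
assert plus_encode('alic\u20ac123') == 'alic+e2+82+ac123'  # euro sign, 3 UTF-8 bytes
assert plus_decode(plus_encode('alic\u20ac123')) == 'alic\u20ac123'
```

Because a literal '+' is always escaped, every '+' in the encoded form starts an escape, which is what makes decoding unambiguous.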
--
Terry Jan Reedy