| Header | Value |
|---|---|
| From | Joshua Landau <joshua@landau.ws> |
| Date | Sun, 11 Aug 2013 07:17:42 +0100 |
| Subject | Could you verify this, Oh Great Unicode Experts of the Python-List? |
| To | python-list <python-list@python.org> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.468.1376201912.1251.python-list@python.org> |
Basically, I think Twitter's broken.
For my full discussion on the matter, see:
http://www.reddit.com/r/learnpython/comments/1k2yrn/help_with_len_and_input_function_33/cbku5e8
Here's the first post of mine, ineffectually edited for this list:
"""
<strikethrough>The obvious solution [to getting the length of a tweet]
is wrong. Like, slightly wrong¹.</strikethrough>
Given tweet = b"caf\x65\xCC\x81".decode():
>>> tweet
'café'
But:
>>> len(tweet)
5
So the solution is:
>>> import unicodedata
>>> len(unicodedata.normalize("NFC", tweet))
4
<strikethrough>Read twitter's commentary¹ for proof.</strikethrough>
<strikethrough>There are additional complications I'm trying to sort
out.</strikethrough>
________________________________
After further testing (I don't actually use Twitter) it seems the
whole thing was just smoke and mirrors. The linked article is a lie,
at least on the user's end.
On Linux you can prove this by running:
>>> import subprocess
>>> p = subprocess.Popen(['xsel', '-bi'], stdin=subprocess.PIPE)
>>> p.communicate(input=b"caf\x65\xCC\x81")
(None, None)
"café" will be in your Copy-Paste buffer, and you can paste it into
the tweet-box. It takes 5 characters. So much for testing ;).
________________________________
¹ https://dev.twitter.com/docs/counting-characters#Definition_of_a_Character
"""
I know this isn't *really* Python-related, but there's Python involved
and you're the sort of people who'll be able to tell me what I've done
wrong, if anything.
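
For anyone who wants to reproduce the numbers above, here is a minimal sketch of the two counts. The helper name tweet_length is hypothetical (my own, not anything from Twitter); it only assumes that "length" means the number of code points after NFC normalization, as the linked article describes.

```python
import unicodedata

def tweet_length(text):
    # Hypothetical helper: count code points after NFC normalization,
    # which is an assumption about how the length is meant to be measured.
    return len(unicodedata.normalize("NFC", text))

# "cafe" followed by U+0301 COMBINING ACUTE ACCENT, encoded as UTF-8.
tweet = b"caf\x65\xCC\x81".decode("utf-8")

# Show the individual code points that make up the string.
for ch in tweet:
    print("U+%04X" % ord(ch), unicodedata.name(ch))

print(len(tweet))           # 5: 'e' and the combining accent are separate
print(tweet_length(tweet))  # 4: NFC composes them into U+00E9
```

This only shows what Python sees; it says nothing about what the tweet-box itself does with the pasted text.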