Re: Looking for UNICODE to ASCII Conversion Example Code

Date Sat, 19 Oct 2013 11:14:30 -0300
From Zero Piraeus <z@etiol.net>
To python-list@python.org
Subject Re: Looking for UNICODE to ASCII Conversion Example Code
Newsgroups comp.lang.python
Message-ID <mailman.1260.1382192092.18130.python-list@python.org>


:

On Sat, Oct 19, 2013 at 09:19:12AM +0000, Steven D'Aprano wrote:
> Make no mistake, this sort of simple-minded stripping of accents and 
> diacritics is an extremely ham-fisted thing to do.

I used to live on a street called Calle Colón, so I'm aware of the
dangers of stripping diacritics:

https://es.wikipedia.org/wiki/Colón
https://es.wikipedia.org/wiki/Colon

... although in that particular case, there's a degree of poetic justice
in confusing Cristóbal Colón / Christopher Columbus with the back end of
a digestive tract:

  http://theoatmeal.com/comics/columbus_day

Joking aside, there is a legitimate use for asciifying text in this way:
creating unambiguous identifiers.

For example, a miscreant may create the username 'míguel' in order to
pose as another user 'miguel', relying on other users' inattentiveness.
Asciifying is one way of reducing the risk of that.
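
A minimal sketch of that check, assuming Python 3 and only the standard
library. The names asciify, canonical_username and taken are made up for
illustration; they are not from any existing code in this thread.

```python
import unicodedata


def asciify(text):
    """Strip diacritics and drop anything that still is not ASCII."""
    # NFKD splits 'í' into 'i' plus a combining acute accent (U+0301).
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding with errors="ignore" discards the combining marks and any
    # other character that has no ASCII form after decomposition.
    return decomposed.encode("ascii", "ignore").decode("ascii")


def canonical_username(name):
    """Hypothetical helper: the key used to spot look-alike usernames."""
    return asciify(name).casefold()


# Assumed store of canonical forms of usernames already registered.
taken = {canonical_username("miguel")}

candidate = "míguel"
if canonical_username(candidate) in taken:
    print("rejected:", candidate, "collides with an existing username")
```

This only catches accent-based look-alikes. A letter borrowed from another
script (a Cyrillic 'е', say) has no ASCII decomposition and is silently
dropped rather than mapped, so it would need separate handling. It also
turns 'Colón' into 'Colon', which is exactly the ham-fistedness Steven
was warning about, so it seems best kept to comparison keys and away from
anything shown to users.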

 -[]z.

-- 
Zero Piraeus: in ictu oculi
http://etiol.net/pubkey.asc
