Unicode 7

Started by	wxjmfauth@gmail.com
First post	2014-04-29 10:37 -0700
Last post	2014-04-30 23:00 -0700
Articles	16 on this page of 56 — 16 participants

  Unicode 7 wxjmfauth@gmail.com - 2014-04-29 10:37 -0700
    Re: Unicode 7 Tim Chase <python.list@tim.thechases.com> - 2014-04-29 12:59 -0500
      Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-04-29 21:53 -0700
        Re: Unicode 7 Steven D'Aprano <steve@pearwood.info> - 2014-05-01 05:00 +0000
          Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-01 11:04 -0700
            Re: Unicode 7 Terry Reedy <tjreedy@udel.edu> - 2014-05-01 18:38 -0400
              Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-01 19:29 -0700
                Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-01 19:39 -0700
                Re: Unicode 7 Chris Angelico <rosuav@gmail.com> - 2014-05-02 13:01 +1000
                  Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-01 20:16 -0700
                Re: Unicode 7 Terry Reedy <tjreedy@udel.edu> - 2014-05-02 01:05 -0400
              Re: Unicode 7 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-02 03:15 +0000
            Re: Unicode 7 MRAB <python@mrabarnett.plus.com> - 2014-05-02 00:33 +0100
              Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-01 19:02 -0700
                Re: Unicode 7 Ben Finney <ben@benfinney.id.au> - 2014-05-02 12:39 +1000
                  Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-01 19:59 -0700
                Re: Unicode 7 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-02 08:45 +0000
                  Re: Unicode 7 Chris Angelico <rosuav@gmail.com> - 2014-05-02 19:08 +1000
                    Re: Unicode 7 Jussi Piitulainen <jpiitula@ling.helsinki.fi> - 2014-05-02 13:04 +0300
                  Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-02 03:39 -0700
                    Re: Unicode 7 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-02 11:55 +0000
                      Re: Unicode 7 Marko Rauhamaa <marko@pacujo.net> - 2014-05-02 15:19 +0300
                        Re: Unicode 7 Ben Finney <ben@benfinney.id.au> - 2014-05-03 07:07 +1000
                          Re: Unicode 7 Roy Smith <roy@panix.com> - 2014-05-02 17:13 -0400
                      Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-02 09:03 -0700
                      Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-02 09:50 -0700
                        Re: Unicode 7 Michael Torrie <torriem@gmail.com> - 2014-05-02 11:39 -0600
                        Re: Unicode 7 Ned Batchelder <ned@nedbatchelder.com> - 2014-05-02 13:46 -0400
                        Re: Unicode 7 Peter Otten <__peter__@web.de> - 2014-05-02 20:07 +0200
                          Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-02 17:58 -0700
                            Re: Unicode 7 Ned Batchelder <ned@nedbatchelder.com> - 2014-05-02 21:18 -0400
                              Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-02 18:42 -0700
                                Re: Unicode 7 Chris Angelico <rosuav@gmail.com> - 2014-05-03 11:54 +1000
                                  Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-02 19:02 -0700
                            Re: Unicode 7 Chris Angelico <rosuav@gmail.com> - 2014-05-03 11:15 +1000
                            Re: Unicode 7 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-03 02:02 +0000
                              Re: Unicode 7 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-03 02:04 +0000
                              Re: Unicode 7 Chris Angelico <rosuav@gmail.com> - 2014-05-03 12:17 +1000
                            Re: Unicode 7 Terry Reedy <tjreedy@udel.edu> - 2014-05-02 22:19 -0400
                      Re: Unicode 7 Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2014-05-03 12:57 -0400
                  Re: Unicode 7 Tim Chase <python.list@tim.thechases.com> - 2014-05-02 07:58 -0500
                Re: Unicode 7 MRAB <python@mrabarnett.plus.com> - 2014-05-02 17:52 +0100
            Re: Unicode 7 Terry Reedy <tjreedy@udel.edu> - 2014-05-02 00:16 -0400
              Re: Unicode 7 Rustom Mody <rustompmody@gmail.com> - 2014-05-01 21:42 -0700
                Re: Unicode 7 Chris Angelico <rosuav@gmail.com> - 2014-05-02 14:54 +1000
                Re: Unicode 7 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-02 08:08 +0000
                  Re: Unicode 7 Chris Angelico <rosuav@gmail.com> - 2014-05-02 19:01 +1000
                    Re: Unicode 7 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-02 11:52 +0000
                  Re: Unicode 7 Ben Finney <ben@benfinney.id.au> - 2014-05-02 19:16 +1000
                    Re: Unicode 7 Marko Rauhamaa <marko@pacujo.net> - 2014-05-02 13:05 +0300
                  Re: Unicode 7 Chris Angelico <rosuav@gmail.com> - 2014-05-02 19:24 +1000
                  Re: Unicode 7 MRAB <python@mrabarnett.plus.com> - 2014-05-02 18:07 +0100
    Re: Unicode 7 MRAB <python@mrabarnett.plus.com> - 2014-04-29 19:12 +0100
      Re: Unicode 7 wxjmfauth@gmail.com - 2014-04-30 00:06 -0700
        Re: Unicode 7 Tim Chase <python.list@tim.thechases.com> - 2014-04-30 13:48 -0500
          Re: Unicode 7 wxjmfauth@gmail.com - 2014-04-30 23:00 -0700

#70866

From	Tim Chase <python.list@tim.thechases.com>
Date	2014-05-02 07:58 -0500
Message-ID	<mailman.9653.1399035500.18130.python-list@python.org>
In reply to	#70853

On 2014-05-02 19:08, Chris Angelico wrote:
> This is another area where Unicode has given us "a great improvement
> over the old method of giving satisfaction". Back in the 1990s on
> OS/2, DOS, and Windows, a missing glyph might be (a) blank, (b) a
> simple square with no information, or (c) copied from some other
> font (common with dingbats fonts). With Unicode, the standard is to
> show a little box *with the hex digits in it*. Granted, those boxes
> are a LOT more readable for BMP characters than SMP (unless your
> text is huge, six digits in the space of one character will make
> them pretty tiny), and a "Unicode" font will generally include all
> (or at least most) of the BMP, but it's still better than having no
> information at all.

I'm pleased when applications & fonts work properly, using both the
placeholder fonts for "this character is legitimate but I can't
display it with a font, so here, have a box with the codepoint
numbers in it until I'm directed to use a more appropriate font at
which point you'll see it correctly" and the "somebody crammed garbage
in here, so I'll display it with "�" (U+FFFD) which is designated for
exactly this purpose".

-tkc
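
The second case described above (garbage bytes rendered as U+FFFD) can be reproduced in Python 3. This is a minimal sketch; the sample byte string is an assumption made up for the example (Latin-1 data that is not valid UTF-8).

```python
import unicodedata

# Hypothetical input: "cafe" with an accented e, encoded as Latin-1.
# The byte 0xE9 followed by a space is not a valid UTF-8 sequence.
raw = b"caf\xe9 ok"

# errors="replace" substitutes U+FFFD for the undecodable byte
# instead of raising UnicodeDecodeError.
text = raw.decode("utf-8", errors="replace")

print(ascii(text))
print(unicodedata.name("\ufffd"))
```

The "box with hex digits" case is different: there the text is valid, and only the font lacks a glyph, so nothing in the string itself changes.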

#70870

From	MRAB <python@mrabarnett.plus.com>
Date	2014-05-02 17:52 +0100
Message-ID	<mailman.9654.1399049579.18130.python-list@python.org>
In reply to	#70834

On 2014-05-02 03:39, Ben Finney wrote:
> Rustom Mody <rustompmody@gmail.com> writes:
>
>> Yes, the headaches go a little further back than Unicode.
>
> Okay, so can you change your article to reflect the fact that the
> headaches both pre-date Unicode, and are made much easier by Unicode?
>
>> There is a certain large old book...
>
> Ah yes, the neo-Sumerian story “Enmerkar_and_the_Lord_of_Aratta”
> <URL:https://en.wikipedia.org/wiki/Enmerkar_and_the_Lord_of_Aratta>.
> Probably inspired by stories older than that, of course.
>
>> In which is described the building of a 'tower that reached up to heaven'...
>> At which point 'it was decided'¶ to do something to prevent that.
>> And our headaches started.
>
> And other myths with fantastic reasons for the diversity of language
> <URL:https://en.wikipedia.org/wiki/Mythical_origins_of_language>.
>
>> I never knew of any of this in the good ol days of ASCII
>
> Yes, by ignoring all other writing systems except one's own – and
> thereby excluding most of the world's people – the system can be made
> simpler.
>
ASCII lacked even £. I can remember assembly listings in magazines
containing lines such as:

     LDA £0

I even (vaguely) remember an advert with a character that looked like
Ł, presumably because they didn't have £. In a UK magazine? Very
strange!
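
The limitation is easy to see from Python 3. This is a small sketch using only the built-in codecs; the variable names are illustrative.

```python
pound = "\u00a3"  # POUND SIGN

# ASCII has no code for the pound sign, so encoding fails.
try:
    pound.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

latin1_bytes = pound.encode("latin-1")  # one byte in ISO 8859-1
utf8_bytes = pound.encode("utf-8")      # two bytes in UTF-8

print(ascii_ok, latin1_bytes, utf8_bytes)
```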

> Hopefully the proportion of programmers who still feel they can make
> such a parochial choice is rapidly shrinking.
>

#70845

From	Terry Reedy <tjreedy@udel.edu>
Date	2014-05-02 00:16 -0400
Message-ID	<mailman.9646.1399004255.18130.python-list@python.org>
In reply to	#70818

On 5/1/2014 7:33 PM, MRAB wrote:
> On 2014-05-01 23:38, Terry Reedy wrote:
>> On 5/1/2014 2:04 PM, Rustom Mody wrote:
>>
>>>>> Since its Unicode-troll time, here's my contribution
>>>>> http://blog.languager.org/2014/04/unicode-and-unix-assumption.html
>>
>> I will not comment on the Unix-assumption part, but I think you go wrong
>> with this:  "Unicode is a Headache". The major headache is that unicode
>> and its very few encodings are not universally used. The headache is all
>> the non-unicode legacy encodings still being used. So you better title
>> this section 'Non-Unicode is a Headache'.
>>
> [snip]
> I think he's right when he says "Unicode is a headache", but only
> because it's being used to handle languages which are, themselves, a
> "headache": left-to-right versus right-to-left, sometimes on the same
> line;

Handling that without unicode is even worse.

> diacritics, possibly several on a glyph; etc.

Ditto.
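
To make the two "headaches" concrete, here is a hedged Python 3 sketch using the standard unicodedata module. The sample characters are chosen for illustration only; they do not come from the thread.

```python
import unicodedata

# Two ways to write "e with acute": precomposed, or base letter + combining mark.
precomposed = "\u00e9"
decomposed = "e\u0301"

# They differ as code point sequences but are canonically equivalent.
same_raw = precomposed == decomposed
same_nfc = unicodedata.normalize("NFC", decomposed) == precomposed

# Several diacritics can stack on one base letter: a + diaeresis + macron.
stacked = "a\u0308\u0304"
marks = [unicodedata.combining(c) for c in stacked]

# Bidirectional classes drive mixed left-to-right / right-to-left layout.
bidi = [unicodedata.bidirectional(c) for c in "a\u05d0"]  # Latin a, Hebrew alef

print(same_raw, same_nfc, marks, bidi)
```

A nonzero combining class marks a character that attaches to the preceding base character, and the bidirectional class is what a layout engine consults when both directions share a line.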

-- 
Terry Jan Reedy

#70846

From	Rustom Mody <rustompmody@gmail.com>
Date	2014-05-01 21:42 -0700
Message-ID	<1c08837e-496b-4fb2-8ff9-f8a495b67d67@googlegroups.com>
In reply to	#70845

On Friday, May 2, 2014 9:46:36 AM UTC+5:30, Terry Reedy wrote:
> On 5/1/2014 7:33 PM, MRAB wrote:
> > On 2014-05-01 23:38, Terry Reedy wrote:
> >> On 5/1/2014 2:04 PM, Rustom Mody wrote:
> >>>>> Since its Unicode-troll time, here's my contribution
> >>>>> http://blog.languager.org/2014/04/unicode-and-unix-assumption.html
> >> I will not comment on the Unix-assumption part, but I think you go wrong
> >> with this:  "Unicode is a Headache". The major headache is that unicode
> >> and its very few encodings are not universally used. The headache is all
> >> the non-unicode legacy encodings still being used. So you better title
> >> this section 'Non-Unicode is a Headache'.
> > [snip]
> > I think he's right when he says "Unicode is a headache", but only
> > because it's being used to handle languages which are, themselves, a
> > "headache": left-to-right versus right-to-left, sometimes on the same
> > line;

> Handling that without unicode is even worse.

> > diacritics, possibly several on a glyph; etc.

> Ditto.

Whats the best cure for headache?

Cut off the head

Whats the best cure for Unicode?

Ascii

Saying however that there is no headache in unicode does not make the headache
go away:

http://lucumr.pocoo.org/2014/1/5/unicode-in-2-and-3/

No I am not saying that the contents/style/tone are right.
However people are evidently suffering the transition.
Denying it is not a help.

And unicode consortium's ways are not exactly helpful to its own cause:
Imagine the C standard committee deciding that adding mandatory garbage collection
to C is a neat idea

Unicode consortium's going from old BMP to current (6.0) SMPs to who-knows-what
in the future is similar.

#70847

From	Chris Angelico <rosuav@gmail.com>
Date	2014-05-02 14:54 +1000
Message-ID	<mailman.9647.1399006467.18130.python-list@python.org>
In reply to	#70846

On Fri, May 2, 2014 at 2:42 PM, Rustom Mody <rustompmody@gmail.com> wrote:
> Unicode consortium's going from old BMP to current (6.0) SMPs to who-knows-what
> in the future is similar.

Unicode 1.0: "Let's make a single universal character set that can
represent all the world's scripts. We'll define 65536 codepoints to do
that with."

Unicode 2.0: "Oh. That's not enough. Okay, let's define some more."

It's not a fundamental change, nor is it unhelpful to Unicode's cause.
It's simply an acknowledgement that 64K codepoints aren't enough. Yes,
that gave us the mess of UTF-16 being called "Unicode" (if it hadn't
been for Unicode 1.0, I doubt we'd now have so many languages using
and exposing UTF-16 - it'd be a simple judgment call, pick
UTF-8/UTF-16/UTF-32 based on what you expect your users to want to
use), but it doesn't change Unicode's goal, and it also doesn't
indicate that there's likely to be any more such changes in the
future. (Just look at how little of the Unicode space is allocated so
far.)
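
The practical effect of going past 64K code points shows up in the encodings. A minimal Python 3 sketch follows; the two sample code points are arbitrary picks, one inside the BMP and one outside it.

```python
import sys

bmp = "\u0fce"       # inside the Basic Multilingual Plane
smp = "\U0001f600"   # outside it, in the Supplementary Multilingual Plane

for ch in (bmp, smp):
    units16 = len(ch.encode("utf-16-be")) // 2  # number of 16-bit code units
    print(f"U+{ord(ch):04X}: {units16} UTF-16 unit(s), "
          f"{len(ch.encode('utf-8'))} UTF-8 byte(s), "
          f"{len(ch.encode('utf-32-be'))} UTF-32 byte(s)")

# The code space tops out at U+10FFFF: 17 planes of 65536 code points each.
total_code_points = sys.maxunicode + 1
print(total_code_points)
```

The SMP character needs two UTF-16 code units (a surrogate pair), which is why software that treats one 16-bit unit as one character miscounts it.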

ChrisA

#70849

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2014-05-02 08:08 +0000
Message-ID	<53635299$0$29965$c3e8da3$5496439d@news.astraweb.com>
In reply to	#70846

On Thu, 01 May 2014 21:42:21 -0700, Rustom Mody wrote:

> Whats the best cure for headache?
> 
> Cut off the head

o_O

I don't think so.

> Whats the best cure for Unicode?
> 
> Ascii

Unicode is not a problem to be solved.

The inability to write standard human text in ASCII is a problem, e.g. 
one cannot write

“ASCII For Dummies” © 2014 by Zöe Smith, now on sale 99¢

so even *Americans* cannot represent all their common characters in 
ASCII, let alone specialised characters from mathematics, science, the 
printing industry, and law. And even Americans sometimes need to write 
text in Foreign. Where is your ASCII now?
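
To see exactly which characters in that line fall outside ASCII, one could run something like the following Python 3 sketch. The sample string is written with escapes so the snippet itself stays ASCII; the spelling is copied from the line above.

```python
import unicodedata

# The sample line from the post, written with escapes.
sample = ("\u201cASCII For Dummies\u201d \u00a9 2014 by Z\u00f6e Smith, "
          "now on sale 99\u00a2")

# Collect every character that ASCII cannot represent.
non_ascii = [c for c in sample if ord(c) > 127]

for c in non_ascii:
    print(f"U+{ord(c):04X} {unicodedata.name(c)}")
```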

The solution is to have at least one encoding which contains the 
additional characters needed.

The plethora of such additional encodings is a problem. The solution is a 
single encoding that covers all needed characters, like Unicode, so that 
there is no need to handle multiple encodings.

The inability for plain text files to record metadata of what encoding 
they use is a problem. The solution is to standardize on a single, world-
wide encoding, like Unicode.

> Saying however that there is no headache in unicode does not make the
> headache go away:
> 
> http://lucumr.pocoo.org/2014/1/5/unicode-in-2-and-3/
> 
> No I am not saying that the contents/style/tone are right. However
> people are evidently suffering the transition. Denying it is not a help.

Transitions are always more painful than after the transition has settled 
down. As I have said repeatedly, I look forward to the day when nobody
but document archivists and academics need care about legacy encodings. 
But we're not there yet.

> And unicode consortium's ways are not exactly helpful to its own cause:
> Imagine the C standard committee deciding that adding mandatory garbage
> collection to C is a neat idea
> 
> Unicode consortium's going from old BMP to current (6.0) SMPs to
> who-knows-what in the future is similar.

I don't see the connection.

-- 
Steven D'Aprano
http://import-that.dreamwidth.org/

#70855

From	Chris Angelico <rosuav@gmail.com>
Date	2014-05-02 19:01 +1000
Message-ID	<mailman.9649.1399021311.18130.python-list@python.org>
In reply to	#70849

On Fri, May 2, 2014 at 6:08 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> ... even *Americans* cannot represent all their common characters in
> ASCII, let alone specialised characters from mathematics, science, the
> printing industry, and law.

Aside: What additional characters does law use that aren't in ASCII?
Section § and paragraph ¶ are used frequently, but you already
mentioned the printing industry. Are there other symbols?

ChrisA

#70863

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2014-05-02 11:52 +0000
Message-ID	<536386f5$0$29965$c3e8da3$5496439d@news.astraweb.com>
In reply to	#70855

On Fri, 02 May 2014 19:01:44 +1000, Chris Angelico wrote:

> On Fri, May 2, 2014 at 6:08 PM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
>> ... even *Americans* cannot represent all their common characters in
>> ASCII, let alone specialised characters from mathematics, science, the
>> printing industry, and law.
> 
> Aside: What additional characters does law use that aren't in ASCII?
> Section § and paragraph ¶ are used frequently, but you already mentioned
> the printing industry. Are there other symbols?

I was thinking of copyright, trademark, registered mark, and similar. I
think these are all of the relevant characters:

py> import unicodedata
py> for c in '©®℗™':
...     unicodedata.name(c)
...
'COPYRIGHT SIGN'
'REGISTERED SIGN'
'SOUND RECORDING COPYRIGHT'
'TRADE MARK SIGN'



-- 
Steven D'Aprano
http://import-that.dreamwidth.org/

#70857

From	Ben Finney <ben@benfinney.id.au>
Date	2014-05-02 19:16 +1000
Message-ID	<mailman.9651.1399022203.18130.python-list@python.org>
In reply to	#70849

Chris Angelico <rosuav@gmail.com> writes:

> On Fri, May 2, 2014 at 6:08 PM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
> > ... even *Americans* cannot represent all their common characters in
> > ASCII, let alone specialised characters from mathematics, science,
> > the printing industry, and law.
>
> Aside: What additional characters does law use that aren't in ASCII?
> Section § and paragraph ¶ are used frequently, but you already
> mentioned the printing industry. Are there other symbols?

ASCII does not contain “©” (U+00A9 COPYRIGHT SIGN) nor “®” (U+00AE
REGISTERED SIGN), for instance.

-- 
 \     “I got some new underwear the other day. Well, new to me.” —Emo |
  `\                                                           Philips |
_o__)                                                                  |
Ben Finney

#70861

From	Marko Rauhamaa <marko@pacujo.net>
Date	2014-05-02 13:05 +0300
Message-ID	<87iopojy25.fsf@elektro.pacujo.net>
In reply to	#70857

Ben Finney <ben@benfinney.id.au>:

>> Aside: What additional characters does law use that aren't in ASCII?
>> Section § and paragraph ¶ are used frequently, but you already
>> mentioned the printing industry. Are there other symbols?
>
> ASCII does not contain “©” (U+00A9 COPYRIGHT SIGN) nor “®” (U+00AE
> REGISTERED SIGN), for instance.

The em-dash is mapped on my keyboard — I use it quite often.


Marko

#70858

From	Chris Angelico <rosuav@gmail.com>
Date	2014-05-02 19:24 +1000
Message-ID	<mailman.9652.1399022682.18130.python-list@python.org>
In reply to	#70849

On Fri, May 2, 2014 at 7:16 PM, Ben Finney <ben@benfinney.id.au> wrote:
> Chris Angelico <rosuav@gmail.com> writes:
>
>> On Fri, May 2, 2014 at 6:08 PM, Steven D'Aprano
>> <steve+comp.lang.python@pearwood.info> wrote:
>> > ... even *Americans* cannot represent all their common characters in
>> > ASCII, let alone specialised characters from mathematics, science,
>> > the printing industry, and law.
>>
>> Aside: What additional characters does law use that aren't in ASCII?
>> Section § and paragraph ¶ are used frequently, but you already
>> mentioned the printing industry. Are there other symbols?
>
> ASCII does not contain “©” (U+00A9 COPYRIGHT SIGN) nor “®” (U+00AE
> REGISTERED SIGN), for instance.

Heh! I forgot about those. U+00A9 in particular has gone so mainstream
that it's easy to think of it not as "I'm going to switch to my
'British English + Legal' dictionary now" and just as "This is a
critical part of the basic dictionary".

ChrisA

#70871

From	MRAB <python@mrabarnett.plus.com>
Date	2014-05-02 18:07 +0100
Message-ID	<mailman.9655.1399050429.18130.python-list@python.org>
In reply to	#70849

On 2014-05-02 09:08, Steven D'Aprano wrote:
> On Thu, 01 May 2014 21:42:21 -0700, Rustom Mody wrote:
>
>
>> Whats the best cure for headache?
>>
>> Cut off the head
>
> o_O
>
> I don't think so.
>
>
>> Whats the best cure for Unicode?
>>
>> Ascii
>
> Unicode is not a problem to be solved.
>
> The inability to write standard human text in ASCII is a problem, e.g.
> one cannot write
>
> “ASCII For Dummies” © 2014 by Zöe Smith, now on sale 99¢
>
[snip]

Shouldn't that be "Zoë"?

#70724

From	MRAB <python@mrabarnett.plus.com>
Date	2014-04-29 19:12 +0100
Message-ID	<mailman.9580.1398795170.18130.python-list@python.org>
In reply to	#70722

On 2014-04-29 18:37, wxjmfauth@gmail.com wrote:
> Let's see how ready Python is for the next Unicode version
> (Unicode 7.0.0.Beta).
>
>
>>>> timeit.repeat("(x*1000 + y)[:-1]", setup="x = 'abc'; y = 'z'")
> [1.4027834829454946, 1.38714224331963, 1.3822586635296261]
>>>> timeit.repeat("(x*1000 + y)[:-1]", setup="x = 'abc'; y = '\u0fce'")
> [5.462776291480395, 5.4479432055423445, 5.447874284053398]
>>>>
>>>>
>>>> # more interesting
>>>> timeit.repeat("(x*1000 + y)[:-1]",\
> ...     setup="x = 'abc'.encode('utf-8'); y = '\u0fce'.encode('utf-8')")
> [1.3496489533188765, 1.328654286266783, 1.3300913977710707]
>>>>
>
Although the third example is the fastest, it's also the wrong way to
handle Unicode:

 >>> x = 'abc'.encode('utf-8'); y = '\u0fce'.encode('utf-8')
 >>> t = (x*1000 + y)[:-1].decode('utf-8')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 3000-3001: unexpected end of data
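
The difference between slicing text and slicing its encoded bytes can be spelled out in a short Python 3 sketch. It reuses the values from the benchmark; the variable names are illustrative.

```python
text = "abc" * 1000 + "\u0fce"
data = text.encode("utf-8")  # U+0FCE takes three bytes in UTF-8

# Slicing the str removes one whole character.
chopped_text = text[:-1]

# Slicing the bytes removes one byte and leaves a truncated sequence behind.
try:
    data[:-1].decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(len(text), len(data), error)
```

So the bytes version is faster partly because it is doing a different, and here incorrect, operation: it drops a byte, not a character.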

> Note 1:  "lookup" is not the problem.
>
> Note 2: From Unicode.org : "[...] We strongly encourage [...] and test
> them with their programs [...]"
>
> -> Done.
>
> jmf
>

#70765

From	wxjmfauth@gmail.com
Date	2014-04-30 00:06 -0700
Message-ID	<4fc3221d-9b2e-4c95-9d36-c10a8fca7259@googlegroups.com>
In reply to	#70724

@ Time Chase

I'm perfectly aware about what I'm doing.


@ MRAB

"...Although the third example is the fastest, it's also the wrong
way to handle Unicode: ..."

Maybe it's exactly the opposite. It illustrates very well
the quality of the coding schemes endorsed by Unicode.org.
I deliberately chose utf-8.


>>> sys.getsizeof('\u0fce')
40
>>> sys.getsizeof('\u0fce'.encode('utf-8'))
20
>>> sys.getsizeof('\u0fce'.encode('utf-16-be'))
19
>>> sys.getsizeof('\u0fce'.encode('utf-32-be'))
21
>>> 
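
One caveat when reading the numbers above: sys.getsizeof reports the whole object, including a fixed per-object header, so for a single character the header dominates. The sketch below separates payload from overhead. It assumes CPython, where a bytes object is a fixed header plus its payload; the header size varies by build, so no absolute sizes are shown.

```python
import sys

ch = "\u0fce"
payload = {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}

# On CPython, getsizeof(bytes) is a fixed header plus the payload length,
# so subtracting the size of an empty bytes object recovers the payload.
empty = sys.getsizeof(b"")
measured = {enc: sys.getsizeof(ch.encode(enc)) - empty for enc in payload}

print(payload)
print(measured)
```

Read this way, the session above shows payloads of 3, 2 and 4 bytes on top of the same header.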

Q. How do you save memory without wasting time on encoding?
By using products that natively use the Unicode coding schemes?

Do you understand Unicode? Or do you understand
Unicode via Python?

---

A Tibetan monk [*] using Py32:

>>> timeit.repeat("(x*1000 + y)[:-1]", setup="x = 'abc'; y = 'z'")
[2.3394840182882186, 2.3145832750782653, 2.3207231951529685]
>>> timeit.repeat("(x*1000 + y)[:-1]", setup="x = 'abc'; y = '\u0fce'")
[2.328517624800078, 2.3169403900011076, 2.317586282812048]
>>>

[*] Your curiosity has certainly shown you what this code point means.
For the others:
U+0FCE TIBETAN SIGN RDEL NAG RDEL DKAR
signifies good luck earlier, bad luck later


(My comment: Good luck with Python or bad luck with Python)

jmf

#70789

From	Tim Chase <python.list@tim.thechases.com>
Date	2014-04-30 13:48 -0500
Message-ID	<mailman.9614.1398883748.18130.python-list@python.org>
In reply to	#70765

On 2014-04-30 00:06, wxjmfauth@gmail.com wrote:
> @ Time Chase
> 
> I'm perfectly aware about what I'm doing.

Apparently, you're quite adept at appending superfluous characters to
sensible strings...did you benchmark your email composition, too? ;-)

-tkc (aka "Tim", not "Time")

#70808

From	wxjmfauth@gmail.com
Date	2014-04-30 23:00 -0700
Message-ID	<58c7a006-3058-4137-8ae2-f5effbdc0e1e@googlegroups.com>
In reply to	#70789

On Wednesday, 30 April 2014 20:48:48 UTC+2, Tim Chase wrote:
> On 2014-04-30 00:06, wxjmfauth@gmail.com wrote:
> > @ Time Chase
> >
> > I'm perfectly aware about what I'm doing.
>
> Apparently, you're quite adept at appending superfluous characters to
> sensible strings...did you benchmark your email composition, too? ;-)
>
> -tkc (aka "Tim", not "Time")

Mea culpa, ...
