comp.lang.python

Why Python 3?

Started by	Anthony Papillion <papillion@gmail.com>
First post	2014-04-18 22:28 -0500
Last post	2014-05-06 09:28 -0700
Articles	20 on this page of 72 — 23 participants


  Why Python 3? Anthony Papillion <papillion@gmail.com> - 2014-04-18 22:28 -0500
    Re: Why Python 3? Paul Rubin <no.email@nospam.invalid> - 2014-04-18 23:40 -0700
      Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-19 17:34 +1000
        Re: Why Python 3? Roy Smith <roy@panix.com> - 2014-04-19 09:26 -0400
          Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-19 23:42 +1000
          Re: Why Python 3? Albert-Jan Roskam <fomcl@yahoo.com> - 2014-04-19 10:57 -0700
          Re: Why Python 3? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-04-20 10:07 +0000
      Re: Why Python 3? Ian Kelly <ian.g.kelly@gmail.com> - 2014-04-19 03:25 -0600
        Re: Why Python 3? Marko Rauhamaa <marko@pacujo.net> - 2014-04-19 12:59 +0300
      Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-19 19:37 +1000
        Integer and float division [was Re: Why Python 3?] Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-04-20 11:02 +0000
          Re: Integer and float division [was Re: Why Python 3?] Jussi Piitulainen <jpiitula@ling.helsinki.fi> - 2014-04-20 15:38 +0300
            Re: Integer and float division [was Re: Why Python 3?] Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-04-20 15:09 +0000
          Re: Integer and float division [was Re: Why Python 3?] Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-21 11:44 +1200
      Re: Why Python 3? Terry Reedy <tjreedy@udel.edu> - 2014-04-19 13:23 -0400
        Re: Why Python 3? Paul Rubin <no.email@nospam.invalid> - 2014-04-19 20:25 -0700
          Re: Why Python 3? Ben Finney <ben+python@benfinney.id.au> - 2014-04-20 19:15 +1000
          Re: Why Python 3? Walter Hurry <walterhurry@lavabit.com> - 2014-04-20 23:50 +0000
            Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-21 10:00 +1000
              Re: Why Python 3? HoneyMonster <nobody@someplace.invalid> - 2014-04-21 04:08 +0000
            Re: Why Python 3? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-04-21 01:11 +0100
      Re: Why Python 3? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-04-19 18:31 +0100
      Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-20 03:53 +1000
      Re: Why Python 3? Ian Kelly <ian.g.kelly@gmail.com> - 2014-04-19 13:58 -0600
      Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-20 06:31 +1000
        Re: Why Python 3? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-20 13:06 +1200
          Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-20 11:28 +1000
            Re: Why Python 3? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-21 10:52 +1200
              Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-21 09:24 +1000
                Re: Why Python 3? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-04-21 03:43 +0000
                  Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-21 14:43 +1000
                    Re: Why Python 3? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-22 09:58 +1200
                  Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-21 14:48 +1000
                    Re: Why Python 3? wxjmfauth@gmail.com - 2014-04-21 02:42 -0700
                    Re: Why Python 3? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-22 10:28 +1200
                      Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-22 08:43 +1000
                        Re: Why Python 3? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-22 18:03 +1200
          Re: Why Python 3? Terry Reedy <tjreedy@udel.edu> - 2014-04-20 00:17 -0400
            Re: Why Python 3? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-21 11:13 +1200
              Re: Why Python 3? Terry Reedy <tjreedy@udel.edu> - 2014-04-20 20:09 -0400
      Re: Why Python 3? Ian Kelly <ian.g.kelly@gmail.com> - 2014-04-19 14:38 -0600
      Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-20 06:53 +1000
        Re: Why Python 3? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-20 13:35 +1200
      Re: Why Python 3? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-04-20 09:59 +0000
        Unicode in Python Rustom Mody <rustompmody@gmail.com> - 2014-04-21 20:57 -0700
          Re: Unicode in Python Terry Reedy <tjreedy@udel.edu> - 2014-04-22 01:44 -0400
            Re: Unicode in Python Rustom Mody <rustompmody@gmail.com> - 2014-04-21 23:18 -0700
              Re: Unicode in Python Chris Angelico <rosuav@gmail.com> - 2014-04-22 16:32 +1000
          Re: Unicode in Python Steven D'Aprano <steve@pearwood.info> - 2014-04-22 06:11 +0000
            Re: Unicode in Python Rustom Mody <rustompmody@gmail.com> - 2014-04-21 23:30 -0700
              Re: Unicode in Python Chris Angelico <rosuav@gmail.com> - 2014-04-22 16:44 +1000
              Re: Unicode in Python wxjmfauth@gmail.com - 2014-04-22 02:07 -0700
                Re: Unicode in Python Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-04-22 12:21 +0000
                  Re: Unicode in Python wxjmfauth@gmail.com - 2014-04-22 08:28 -0700
          Re: Unicode in Python Ian Kelly <ian.g.kelly@gmail.com> - 2014-04-22 00:31 -0600
            Re: Unicode in Python Rustom Mody <rustompmody@gmail.com> - 2014-04-22 02:23 -0700
            Re: Unicode in Python Rustom Mody <rustompmody@gmail.com> - 2014-04-22 11:09 -0700
      Re: Why Python 3? Ian Kelly <ian.g.kelly@gmail.com> - 2014-04-20 10:22 -0600
        Re: Why Python 3? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-21 11:56 +1200
          Re: Why Python 3? Ian Kelly <ian.g.kelly@gmail.com> - 2014-04-20 18:29 -0600
      Re: Why Python 3? MRAB <python@mrabarnett.plus.com> - 2014-04-20 17:41 +0100
      Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-21 02:46 +1000
        Re: Why Python 3? Roy Smith <roy@panix.com> - 2014-04-20 14:40 -0700
          Re: Why Python 3? Terry Reedy <tjreedy@udel.edu> - 2014-04-20 17:58 -0400
          Re: Why Python 3? Richard Damon <Richard@Damon-Family.org> - 2014-04-20 18:02 -0400
            Re: Why Python 3? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-21 12:22 +1200
          Re: Why Python 3? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-04-21 02:13 +0000
    Re: Why Python 3? Steve Hayes <hayesstw@telkomsa.net> - 2014-04-19 13:53 +0200
      Re: Why Python 3? Chris Angelico <rosuav@gmail.com> - 2014-04-19 22:46 +1000
      Re: Why Python 3? Rustom Mody <rustompmody@gmail.com> - 2014-04-19 08:59 -0700
    Re: Why Python 3? Rick Johnson <rantingrickjohnson@gmail.com> - 2014-04-19 07:41 -0700
    Re: Why Python 3? Thomas Lehmann <thomas.lehmann.private@googlemail.com> - 2014-05-06 09:28 -0700


#70398

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2014-04-19 14:38 -0600
Message-ID	<mailman.9371.1397939964.18130.python-list@python.org>
In reply to	#70371

On Sat, Apr 19, 2014 at 2:31 PM, Chris Angelico <rosuav@gmail.com> wrote:
> On Sun, Apr 20, 2014 at 5:58 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
>> Considering that Fraction and Decimal did not exist yet, what type do
>> you think the PEP 238 implementers should have chosen for the result
>> of dividing two ints?  If float is not acceptable, and int is not
>> acceptable (which was the whole point of the PEP), then the only
>> alternative I can see would have been to raise a TypeError and force
>> the user to upcast explicitly.  In that case, dividing arbitrary ints
>> using floating-point math would not be possible for those ints that
>> are outside the range of floats; you would get OverflowError on the
>> upcast operation, regardless of whether the result of division would
>> be within the range of a float.
>>
>>> Yes, I can see that it's nice for simple interactive use.
>>
>> More importantly, it's useful for implementers of generic mathematical
>> routines.  If you're passed arbitrary inputs, you don't have to check
>> the types of the values you were given and then branch if both of the
>> values you were about to divide happened to be ints just because the
>> division operator arbitrarily does something different on ints.
>
> Or you just cast one of them to float. That way you're sure you're
> working with floats.

Which is inappropriate if the type passed in was a Decimal or a complex.
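
To make the point about generic routines concrete, here is a minimal sketch, assuming Python 3 semantics for `/`. The helper names `mean` and `mean_via_float` are invented for illustration and do not come from any library.

```python
from decimal import Decimal
from fractions import Fraction


def mean(values):
    """Arithmetic mean using true division; the result type follows the inputs."""
    values = list(values)
    return sum(values) / len(values)


def mean_via_float(values):
    """The 'cast one operand to float' workaround discussed above."""
    values = list(values)
    return float(sum(values)) / len(values)


print(mean([1, 2]))                              # ints give a float
print(mean([Decimal("0.1"), Decimal("0.2")]))    # stays a Decimal
print(mean([Fraction(1, 3), Fraction(2, 3)]))    # stays a Fraction
print(mean([1 + 2j, 3 + 4j]))                    # stays complex

try:
    mean_via_float([1 + 2j, 3 + 4j])
except TypeError as exc:
    # float() refuses complex input, so the cast breaks the generic routine.
    print("float() cast fails for complex:", exc)
```

The cast-based version also silently turns a Decimal into a binary float, which is usually not what a Decimal user wants.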


#70399

From	Chris Angelico <rosuav@gmail.com>
Date	2014-04-20 06:53 +1000
Message-ID	<mailman.9372.1397940783.18130.python-list@python.org>
In reply to	#70371

On Sun, Apr 20, 2014 at 6:38 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
>> Or you just cast one of them to float. That way you're sure you're
>> working with floats.
>
> Which is inappropriate if the type passed in was a Decimal or a complex.

In that case, you already have a special case in your code, so whether
that special case is handled by the language or by your code makes
little difference. Is your function so generic that it has to be able
to handle float, Decimal, or complex, and not care about the
difference, and yet has to ensure that int divided by int doesn't
yield int? Then say so; put in that special check. Personally, I've
yet to meet any non-toy example of a function that needs that exact
handling; most code doesn't ever think about complex numbers, and a
lot of things look for one specific type:

>>> "asdf"*3.0
Traceback (most recent call last):
  File "<pyshell#19>", line 1, in <module>
    "asdf"*3.0
TypeError: can't multiply sequence by non-int of type 'float'

Maybe it's not your code that should be caring about what happens when
you divide two integers, but the calling code. If you're asking for
the average of a list of numbers, and they're all integers, and the
avg() function truncates to integer, then the solution is to use sum()
and explicitly cast to floating point before dividing. Why should the
language handle that? It's no different from trying to sum a bunch of
different numeric types:

>>> sum([1.0,decimal.Decimal("1")])
Traceback (most recent call last):
  File "<pyshell#25>", line 1, in <module>
    sum([1.0,decimal.Decimal("1")])
TypeError: unsupported operand type(s) for +: 'float' and 'decimal.Decimal'

The language doesn't specify a means of resolving the conflict between
float and Decimal, but for some reason the division of two integers is
blessed with a language feature. Again, it would make perfect sense if
float were a perfect superset of int, so that you could simply declare
that 1.0 and 1 behave absolutely identically in all arithmetic (they
already hash and compare equally), but that's not the case, so I don't
see that division should try to pretend they are.
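
A small sketch of why float is not a superset of int, and of the overflow case raised earlier in the thread. It assumes CPython 3 with IEEE 754 double-precision floats; the variable names are only for illustration.

```python
# 1 and 1.0 do hash and compare equal...
print(1 == 1.0, hash(1) == hash(1.0))

# ...but above 2**53 not every int has an exact float counterpart.
big = 2**53 + 1
print(float(big) == big)        # False: the conversion had to round
print(int(float(big)) - big)    # the rounding error, as an exact int

# True division of two huge ints works when the quotient fits in a float,
# even though neither operand can be converted to float on its own.
n, d = 10**400, 10**399
print(n / d)

try:
    float(n) / d
except OverflowError as exc:
    print("explicit upcast fails:", exc)
```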

ChrisA


#70402

From	Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date	2014-04-20 13:35 +1200
Message-ID	<brgmk0F8247U1@mid.individual.net>
In reply to	#70399

Chris Angelico wrote:
> Is your function so generic that it has to be able
> to handle float, Decimal, or complex, and not care about the
> difference, and yet has to ensure that int divided by int doesn't
> yield int?

It doesn't have to be that generic to cause pain. Even if
you're only dealing with floats, the old way meant you had
to stick float() calls all over the place in order to be
sure your divisions do what you want. Not only does that
clutter up and obscure the code, it's needlessly inefficient,
since *most* of the time they don't do anything.

There's also the annoyance that there's more than one
obvious way to do it. Do you write float(x)/y or
x/float(y)? Or do you go for a more symmetrical look
and write float(x)/float(y), even though it's redundant?

The new way makes *all* of that go away. The only downside
is that you need to keep your wits about you and select
the appropriate operator whenever you write a division.
But you had to think about that *anyway* under the old
system, or risk having your divisions silently do the
wrong thing under some circumstances -- and the remedy
for that was very clunky and inefficient.

I'm thoroughly convinced that the *old* way was the
mistake, and changing it was the right thing to do.

> The language doesn't specify a means of resolving the conflict between
> float and Decimal, but for some reason the division of two integers is
> blessed with a language feature.

No, it's not. What the language does is recognise that
there are two kinds of division frequently used, and that
the vast majority of the time you know *when you write the
code* which one you intend. To support this, it provides two
operators. It's still up to the types concerned to implement
those operators in a useful way.

The built-in int and float types cooperate to make // mean
integer division and / mean float division, because those are
the most convenient meanings for them on those types.
Other types are free to do what makes the most sense for
them.
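
As a rough illustration of "two operators, implemented by the types", the following sketch assumes Python 3. `Metres` is a hypothetical toy class invented here; the `__truediv__` and `__floordiv__` hooks are the standard special methods behind `/` and `//`.

```python
from decimal import Decimal
from fractions import Fraction

# Built-in and standard-library numeric types each define both operators.
print(7 / 2, 7 // 2)
print(7.0 / 2, 7.0 // 2)
print(Fraction(7) / 2, Fraction(7) // 2)
print(Decimal(7) / 2, Decimal(7) // 2)

# Types may choose their own rounding: int floors, Decimal truncates toward zero.
print(-7 // 2, Decimal(-7) // 2)


class Metres:
    """Hypothetical wrapper type, invented here to show the two hooks."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, divisor):       # called for  self / divisor
        return Metres(self.value / divisor)

    def __floordiv__(self, divisor):      # called for  self // divisor
        return Metres(self.value // divisor)


print((Metres(7) / 2).value, (Metres(7) // 2).value)
```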

-- 
Greg


#70409

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2014-04-20 09:59 +0000
Message-ID	<53539a64$0$29993$c3e8da3$5496439d@news.astraweb.com>
In reply to	#70371

On Fri, 18 Apr 2014 23:40:18 -0700, Paul Rubin wrote:

> It's just that the improvement
> from 2 to 3 is rather small, and 2 works perfectly well and people are
> used to it, so they keep using it.

Spoken like a true ASCII user :-)

The "killer feature" of Python 3 is improved handling of Unicode, which 
now brings Python 3 firmly into the (very small) group of programming 
languages with first-class support for more than 128 different characters 
by default.

Unfortunately, that made handling byte strings a bit more painful, but 
3.4 improves that, and 3.5 ought to fix it. People doing a lot of mixed 
Unicode text + bytes handling should pay attention to what goes on over 
the next 18 months, because the Python core developers are looking to fix 
the text/byte pain points. Your feedback is wanted.
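
For readers who have not met the text/bytes split, here is a minimal sketch assuming Python 3. It only shows the explicit encode/decode boundary, not any of the 3.4 or 3.5 changes mentioned above.

```python
text = "\u03bb\u03cc\u03b3\u03bf\u03c2"      # the word λόγος as five code points
data = text.encode("utf-8")                  # explicit text -> bytes step

print(type(text).__name__, len(text))        # length counted in code points
print(type(data).__name__, len(data))        # length counted in bytes
print(data.decode("utf-8") == text)          # round trip back to text

try:
    text + data                              # no implicit mixing of str and bytes
except TypeError as exc:
    print("TypeError:", exc)
```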

> There are nice tools that help port
> your codebase from 2 to 3 with fairly little effort. But, you can also
> keep your codebase on 2 with zero effort. So people choose zero over
> fairly little.

True. But for anyone wanting long term language support, a little bit of 
effort today will save them a lot of effort in six years time.

-- 
Steven D'Aprano
http://import-that.dreamwidth.org/


#70482 — Unicode in Python

From	Rustom Mody <rustompmody@gmail.com>
Date	2014-04-21 20:57 -0700
Subject	Unicode in Python
Message-ID	<de3f2a2d-8163-4059-b0a2-cca08bb2344b@googlegroups.com>
In reply to	#70409

On Sunday, April 20, 2014 3:29:00 PM UTC+5:30, Steven D'Aprano wrote:
> On Fri, 18 Apr 2014 23:40:18 -0700, Paul Rubin wrote:
> 
> > It's just that the improvement
> > from 2 to 3 is rather small, and 2 works perfectly well and people are
> > used to it, so they keep using it.
> 
> 
> Spoken like a true ASCII user :-)

Heh!

> 
> The "killer feature" of Python 3 is improved handling of Unicode, which 
> now brings Python 3 firmly into the (very small) group of programming 
> languages with first-class support for more than 128 different characters 
> by default.

As a Unicode user (OK, wannabe Unicode user :D ) I've
written up some Unicode ideas that have been discussed here in the
last couple of weeks:

http://blog.languager.org/2014/04/unicoded-python.html

If I've failed to attribute or misattributed any ideas, please excuse me and let me know!


#70483 — Re: Unicode in Python

From	Terry Reedy <tjreedy@udel.edu>
Date	2014-04-22 01:44 -0400
Subject	Re: Unicode in Python
Message-ID	<mailman.9425.1398145500.18130.python-list@python.org>
In reply to	#70482

On 4/21/2014 11:57 PM, Rustom Mody wrote:

> As a unicode user (ok wannabe unicode user :D ) Ive
> written up some unicode ideas that have been discussed here in the
> last couple of weeks:
>
> http://blog.languager.org/2014/04/unicoded-python.html

"With python 3 we are at a stage where python programs can support 
unicode well however python program- source is still completely ASCII."

In Python 3, "Python reads program text as Unicode code points; the 
encoding of a source file can be given by an encoding declaration and 
defaults to UTF-8". Why start off with an erroneous statement, which you 
even know and show is erroneous?
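
A short sketch of the two rules quoted above, assuming CPython 3. `declared_encoding` is a helper name made up for this example; `tokenize.detect_encoding` and `compile` on a bytes source are standard.

```python
import io
import tokenize

# No declaration: the bytes are read as UTF-8, and λ is a legal identifier.
src_utf8 = 'λ = "non-ASCII identifier, no declaration"\n'.encode("utf-8")

# With a declaration: the same machinery decodes the file as Latin-1 instead.
src_latin1 = '# -*- coding: latin-1 -*-\ns = "é"\n'.encode("latin-1")


def declared_encoding(source_bytes):
    """Return the encoding Python would use to decode this source file."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


print(declared_encoding(src_utf8))
print(declared_encoding(src_latin1))

namespace = {}
exec(compile(src_utf8, "<utf8 source>", "exec"), namespace)
exec(compile(src_latin1, "<latin-1 source>", "exec"), namespace)
print(namespace["λ"], namespace["s"])
```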

-- 
Terry Jan Reedy


#70486 — Re: Unicode in Python

From	Rustom Mody <rustompmody@gmail.com>
Date	2014-04-21 23:18 -0700
Subject	Re: Unicode in Python
Message-ID	<7a873fe6-a2fa-4c77-bb3a-b00eab2f7cff@googlegroups.com>
In reply to	#70483

On Tuesday, April 22, 2014 11:14:17 AM UTC+5:30, Terry Reedy wrote:
> On 4/21/2014 11:57 PM, Rustom Mody wrote:

> > As a unicode user (ok wannabe unicode user :D ) Ive
> > written up some unicode ideas that have been discussed here in the
> > last couple of weeks:
> > http://blog.languager.org/2014/04/unicoded-python.html

> "With python 3 we are at a stage where python programs can support 
> unicode well however python program- source is still completely ASCII."

> In Python 3, "Python reads program text as Unicode code points; the 
> encoding of a source file can be given by an encoding declaration and 
> defaults to UTF-8". Why start off with an erroneous statement, which you 
> even know and show is erroneous?

Ok

I've reworded it to make it clear that I am referring to the character-sets and
not encodings.


#70489 — Re: Unicode in Python

From	Chris Angelico <rosuav@gmail.com>
Date	2014-04-22 16:32 +1000
Subject	Re: Unicode in Python
Message-ID	<mailman.9427.1398148386.18130.python-list@python.org>
In reply to	#70486

On Tue, Apr 22, 2014 at 4:18 PM, Rustom Mody <rustompmody@gmail.com> wrote:
> Ive reworded it to make it clear that I am referring to the character-sets and
> not encodings.

It's still false, and was in Python 2 as well. The only difference on
that front is that, in the absence of an encoding cookie, Python 2
defaults to ASCII while Python 3 defaults to UTF-8. PEP 263 explains
the feature as it was added to Py2; PEP 3120 makes the change to a
UTF-8 default.

Python source code is Unicode text, and has been since 2001 and Python 2.3.

ChrisA


#70485 — Re: Unicode in Python

From	Steven D'Aprano <steve@pearwood.info>
Date	2014-04-22 06:11 +0000
Subject	Re: Unicode in Python
Message-ID	<5356082c$0$11109$c3e8da3@news.astraweb.com>
In reply to	#70482

On Mon, 21 Apr 2014 20:57:39 -0700, Rustom Mody wrote:

> As a unicode user (ok wannabe unicode user :D ) Ive written up some
> unicode ideas that have been discussed here in the last couple of weeks:
> 
> http://blog.languager.org/2014/04/unicoded-python.html

What you are talking about is not handling Unicode with Python, but 
extending the programming language to allow non-English *letters* to be 
used as if they were *symbols*.

That's very problematic, since it assumes that nobody would ever want to 
use non-English letters in an alphanumeric context. You write:

    [quote]
    Now to move ahead!
    We dont[sic] want

    >>> λ = 1
    >>> λ
    1

    We want

    >>> (λx : x+1)(2)
    3
    [end quote]

(Speak for yourself.) But this is a problem. Suppose I want to use a 
Greek word as a variable, as Python allows me to do:

λόγος = "a word"

Or perhaps as the parameter to a function. Take the random.expovariate 
function, which currently takes an argument "lambd" (since lambda is a 
reserved word). I might write instead:

def expovariate(self, λ): ...

After all, λ is an ordinary letter of the (Greek) alphabet, why shouldn't 
it be used in variable names? But if "λx" is syntax for "lambda x", then 
I'm going to get syntax errors:

λόγος = "a word"
=> like:  lambda όγος = "a word"

def expovariate(self, λ):
=> like:  def expovariate(self, lambda):

both of which are obviously syntax errors.

This is as hostile to Greek-using programmers as deciding that "f" should 
be reserved for functions would be to English-using programmers:

# space between the f and the function name is not needed
fspam(x, y):
    ...

class Thingy:
    f__init__(selF):
        ...
    fmethod(selF, arg):
        return arg + 1

Notice that I can't even write "self" any more, since that gives a syntax 
error. Presumably "if" is okay, as it is a keyword.

Using Unicode *symbols* rather than non-English letters is less of a 
problem, since they aren't valid in identifiers.
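
The letter/symbol distinction can be checked directly. This is a minimal sketch assuming Python 3; the function name `rate` is invented for the example.

```python
import keyword
import unicodedata

# Letters (category L*) can start identifiers; math symbols (Sm) cannot.
for candidate in ["λ", "λόγος", "lambda", "←", "∑"]:
    print(
        candidate,
        unicodedata.category(candidate[0]),
        candidate.isidentifier(),
        keyword.iskeyword(candidate),
    )

# Greek names work today as ordinary variables and parameters.
namespace = {}
exec('λόγος = "a word"\ndef rate(λ):\n    return 1 / λ\n', namespace)
print(namespace["λόγος"], namespace["rate"](4))
```

Note that `"lambda".isidentifier()` is true because `str.isidentifier` only checks the character rules; `keyword.iskeyword` is what reports that the word is reserved.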

More comments to follow later.

-- 
Steven


#70487 — Re: Unicode in Python

From	Rustom Mody <rustompmody@gmail.com>
Date	2014-04-21 23:30 -0700
Subject	Re: Unicode in Python
Message-ID	<9f94f6b0-ba35-41dc-95a5-48018412fdf6@googlegroups.com>
In reply to	#70485

On Tuesday, April 22, 2014 11:41:56 AM UTC+5:30, Steven D'Aprano wrote:
> On Mon, 21 Apr 2014 20:57:39 -0700, Rustom Mody wrote:

> > As a unicode user (ok wannabe unicode user :D ) Ive written up some
> > unicode ideas that have been discussed here in the last couple of weeks:
> > http://blog.languager.org/2014/04/unicoded-python.html

> What you are talking about is not handling Unicode with Python, but 
> extending the programming language to allow non-English *letters* to be 
> used as if they were *symbols*.

> That's very problematic, since it assumes that nobody would ever want to 
> use non-English letters in an alphanumeric context. You write:

>     [quote]
>     Now to move ahead!
>     We dont[sic] want

>     >>> λ = 1
>     >>> λ
>     1

>     We want

>     >>> (λx : x+1)(2)
>     3
>     [end quote]


> (Speak for yourself.) But this is a problem. Suppose I want to use a 
> Greek word as a variable, as Python allows me to do:

> λόγος = "a word"

> Or perhaps as the parameter to a function. Take the random.expovariate 
> function, which currently takes an argument "lambd" (since lambda is a 
> reserved word). I might write instead:

> def expovariate(self, λ): ...

> After all, λ is an ordinary letter of the (Greek) alphabet, why shouldn't 
> it be used in variable names? But if "λx" is syntax for "lambda x", then 
> I'm going to get syntax errors:

> λόγος = "a word"
> => like:  lambda όγος = "a word"

> def expovariate(self, λ):
> => like:  def expovariate(self, lambda):

> both of which are obviously syntax errors.

> This is as hostile to Greek-using programmers as deciding that "f" should 
> be reserved for functions would be to English-using programmers:

> # space between the f and the function name is not needed
> fspam(x, y):
>     ...

> class Thingy:
>     f__init__(selF):
>         ...
>     fmethod(selF, arg):
>         return arg + 1

> Notice that I can't even write "self" any more, since that gives a syntax 
> error. Presumable "if" is okay, as it is a keyword.

> Using Unicode *symbols* rather than non-English letters is less of a 
> problem, since they aren't valid in identifiers.

Ok point taken.

So instead of using λ (0x3bb) we should use 𝝀 (0x1d740) or something thereabouts like 𝜆.


#70490 — Re: Unicode in Python

From	Chris Angelico <rosuav@gmail.com>
Date	2014-04-22 16:44 +1000
Subject	Re: Unicode in Python
Message-ID	<mailman.9428.1398149050.18130.python-list@python.org>
In reply to	#70487

On Tue, Apr 22, 2014 at 4:30 PM, Rustom Mody <rustompmody@gmail.com> wrote:
> So instead of using λ (0x3bb) we should use  𝝀 (0x1d740)  or something thereabouts like 𝜆

You still have a major problem: How do you type that? It gives you
very little advantage over the word "lambda", it introduces
readability issues, it's impossible for most people to type (and
programming with a palette of arbitrary syntactic tokens isn't my idea
of fun), it's harder for a new programmer to get docs for (especially
if s/he reads the file in the wrong encoding), and all in all, it's a
pretty poor substitute for a word.

ChrisA


#70492 — Re: Unicode in Python

From	wxjmfauth@gmail.com
Date	2014-04-22 02:07 -0700
Subject	Re: Unicode in Python
Message-ID	<4e5ab7e6-a144-4709-8a6f-7bd540891ed2@googlegroups.com>
In reply to	#70487

On Tuesday, 22 April 2014 08:30:45 UTC+2, Rustom Mody wrote:

@ rusy

> "Ive reworded it to make it clear that I am referring to the
> character-sets and not encodings."

A very good, excellent comment. A healthy coding scheme can only
work properly with a unique character set, and the coding is achieved
with the help of a unique operator. There is no other way to do it,
and that's the reason why we have to live today with all these
coding schemes (Unicode or not). Note: a coding scheme can be
much more complex than the coding of "raw" characters (e.g. CID
fonts).

> "So instead of using λ (0x3bb) we should use  𝝀 (0x1d740)  or something thereabouts like 𝜆"

This is a very good understanding of Unicode. The letter lambda
is not the mathematical symbol lambda. Another example:
the micro sign is not the Greek letter mu, which is not the mathematical
mu. In short, it's maybe not a bad idea to use a plain ASCII "lambda"
instead of a wrong Unicode code point.

jmf


#70502 — Re: Unicode in Python

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2014-04-22 12:21 +0000
Subject	Re: Unicode in Python
Message-ID	<53565ed4$0$29993$c3e8da3$5496439d@news.astraweb.com>
In reply to	#70492

On Tue, 22 Apr 2014 02:07:58 -0700, wxjmfauth wrote:

> On Tuesday, 22 April 2014 08:30:45 UTC+2, Rustom Mody wrote:
>
> @ rusy
> 
>> "Ive reworded it to make it clear that I am referring to the
>> character-sets and not encodings."
> 
> Very good, excellent, comment. An healthy coding scheme can only work
> properly with a unique characters set and the coding is achieved with
> the help of a unique operator. There is no other way to do it and that's
> the reason why we have to live today with all these coding schemes
> (unicode or not). Note: A coding scheme can be much more complex than
> the coding of "raw" characters (eg. CID fonts).
>> "So instead of using λ (0x3bb) we should use  𝝀 (0x1d740)  or 
>> something thereabouts like 𝜆"

For those who cannot see them, they are:

py> unicodedata.name('\U0001d740')
'MATHEMATICAL BOLD ITALIC SMALL LAMDA'
py> unicodedata.name('\U0001d706')
'MATHEMATICAL ITALIC SMALL LAMDA'

("LAMDA" is the official Unicode name for Lambda.)

> This is a very good understanding of unicode. The letter lambda is not
> the mathematical symbole lambda. Another example, the micro sign is not
> the greek letter mu which is not the mathematical mu. 

Depends what you mean by "is not". The micro sign is a legacy 
compatibility character, we shouldn't use it except for compatibility 
with legacy (non-Unicode) character sets. Instead, we should use the NFKC 
or NFKD normalization forms to convert it to the recommended character.

py> import unicodedata
py> a = '\N{GREEK SMALL LETTER MU}'  # Preferred
py> b = '\N{MICRO SIGN}'  # Legacy
py> a == b
False
py> unicodedata.normalize('NFKD', b) == a
True
py> unicodedata.normalize('NFKC', b) == a
True

As for the mathematical mu, there is no separate Unicode "maths symbol 
mu" so far as I am aware. One would simply use '\N{MICRO SIGN}' or 
'\N{GREEK SMALL LETTER MU}' to get a μ.

Likewise, the λ used in mathematics is the Greek letter λ, not a separate 
symbol, just like the Latin letter x and the x used in mathematics are 
the same.
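
One related observation, offered as a sketch rather than as part of the argument above: per PEP 3131, Python 3 applies NFKC normalization to identifiers while parsing, so the mathematical italic lamda and the Greek letter end up as the same name. The variable names below are only for illustration.

```python
import unicodedata

math_lamda = "\U0001d706"    # MATHEMATICAL ITALIC SMALL LAMDA
greek_lamda = "\u03bb"       # GREEK SMALL LETTER LAMDA

print(unicodedata.name(math_lamda))
print(unicodedata.name(greek_lamda))

# The styled letter has a compatibility mapping to the plain Greek letter.
print(unicodedata.normalize("NFKC", math_lamda) == greek_lamda)

# Both are valid identifier characters, and the parser folds them together.
print(math_lamda.isidentifier())
namespace = {}
exec(math_lamda + " = 1", namespace)
print(greek_lamda in namespace, math_lamda in namespace)
```

So under current rules the styled lamda is itself an ordinary identifier character, bound under the normalized name λ.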

-- 
Steven D'Aprano
http://import-that.dreamwidth.org/


#70515 — Re: Unicode in Python

From	wxjmfauth@gmail.com
Date	2014-04-22 08:28 -0700
Subject	Re: Unicode in Python
Message-ID	<a4b4f695-e4eb-45a6-b9a2-fc85b98b4ba8@googlegroups.com>
In reply to	#70502

On Tuesday, 22 April 2014 14:21:40 UTC+2, Steven D'Aprano wrote:
> On Tue, 22 Apr 2014 02:07:58 -0700, wxjmfauth wrote:
>
> > On Tuesday, 22 April 2014 08:30:45 UTC+2, Rustom Mody wrote:
> >
> > @ rusy
> >
> >> "Ive reworded it to make it clear that I am referring to the
> >> character-sets and not encodings."
> >
> > Very good, excellent, comment. An healthy coding scheme can only work
> > properly with a unique characters set and the coding is achieved with
> > the help of a unique operator. There is no other way to do it and that's
> > the reason why we have to live today with all these coding schemes
> > (unicode or not). Note: A coding scheme can be much more complex than
> > the coding of "raw" characters (eg. CID fonts).
> >> "So instead of using λ (0x3bb) we should use  𝝀 (0x1d740)  or
> >> something thereabouts like 𝜆"
>
> For those who cannot see them, they are:
>
> py> unicodedata.name('\U0001d740')
> 'MATHEMATICAL BOLD ITALIC SMALL LAMDA'
> py> unicodedata.name('\U0001d706')
> 'MATHEMATICAL ITALIC SMALL LAMDA'
>
> ("LAMDA" is the official Unicode name for Lambda.)
>
> > This is a very good understanding of unicode. The letter lambda is not
> > the mathematical symbole lambda. Another example, the micro sign is not
> > the greek letter mu which is not the mathematical mu.
>
> Depends what you mean by "is not". The micro sign is a legacy
> compatibility character, we shouldn't use it except for compatibility
> with legacy (non-Unicode) character sets. Instead, we should use the NFKC
> or NFKD normalization forms to convert it to the recommended character.
>
> py> import unicodedata
> py> a = '\N{GREEK SMALL LETTER MU}'  # Preferred
> py> b = '\N{MICRO SIGN}'  # Legacy
> py> a == b
> False
> py> unicodedata.normalize('NFKD', b) == a
> True
> py> unicodedata.normalize('NFKC', b) == a
> True
>
> As for the mathematical mu, there is no separate Unicode "maths symbol
> mu" so far as I am aware. One would simply use '\N{MICRO SIGN}' or
> '\N{GREEK SMALL LETTER MU}' to get a μ.
>
> Likewise, the λ used in mathematics is the Greek letter λ, not a separate
> symbol, just like the Latin letter x and the x used in mathematics are
> the same.

Normalization works fine, but it proves nothing; it
has to use some convention.

There are several code point ranges (Latin + Greek) which can
be used for mathematical purposes (different mu's).

If you are interested, search for "unimath-symbols.pdf"
on CTAN (I have all this stuff on my hard drive).

...
"Likewise, the λ used in mathematics is the Greek letter λ, not a separate
symbol, just like the Latin letter x and the x used in mathematics are
the same."
...

Oh! Definitely not. A tool with a Unicode engine able to
produce "math text" will certainly not use the same code point
for a "textual x" and for a "mathematical x", even if one
enters/types/hits the same "x".

To be exaggeratedly strict, the real question is whether
a given "lambda" or "x" belongs to a "math Unicode range"
or not. This is quite a different approach. (Please do not
confuse this with a "text literal variable x".)

A text processing tool will notice the difference; it will
use different fonts.

jmf


#70488 — Re: Unicode in Python

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2014-04-22 00:31 -0600
Subject	Re: Unicode in Python
Message-ID	<mailman.9426.1398148276.18130.python-list@python.org>
In reply to	#70482


On Apr 22, 2014 12:01 AM, "Rustom Mody" <rustompmody@gmail.com> wrote:
> As a unicode user (ok wannabe unicode user :D ) Ive
> written up some unicode ideas that have been discussed here in the
> last couple of weeks:
>
> http://blog.languager.org/2014/04/unicoded-python.html

I'm reminded of this satire:
http://www.ojohaven.com/fun/spelling.html

#70495 — Re: Unicode in Python

From	Rustom Mody <rustompmody@gmail.com>
Date	2014-04-22 02:23 -0700
Subject	Re: Unicode in Python
Message-ID	<644ffa99-8288-460a-afb7-b9c208765892@googlegroups.com>
In reply to	#70488

On Tuesday, April 22, 2014 12:01:06 PM UTC+5:30, Ian wrote:
> On Apr 22, 2014 12:01 AM, "Rustom Mody" <rusto...@gmail.com> wrote:
> > As a unicode user (ok wannabe unicode user :D ) Ive
> > written up some unicode ideas that have been discussed here in the
> > last couple of weeks:
> > http://blog.languager.org/2014/04/unicoded-python.html
> I'm reminded of this satire:
> http://www.ojohaven.com/fun/spelling.html

Ha Ha!! Thanks much for that.

I've been looking for that for years but had no starting point for a
search.
[For some reason I always thought it was Bernard Shaw.]

#70517 — Re: Unicode in Python

From	Rustom Mody <rustompmody@gmail.com>
Date	2014-04-22 11:09 -0700
Subject	Re: Unicode in Python
Message-ID	<4990bd59-1ad4-42d6-acfc-0ad161e8d938@googlegroups.com>
In reply to	#70488

On Tuesday, April 22, 2014 12:01:06 PM UTC+5:30, Ian wrote:
> On Apr 22, 2014 12:01 AM, "Rustom Mody" <rusto...@gmail.com> wrote:
> > As a unicode user (ok wannabe unicode user :D ) Ive
> > written up some unicode ideas that have been discussed here in the
> > last couple of weeks:
> > http://blog.languager.org/2014/04/unicoded-python.html
> I'm reminded of this satire:
> http://www.ojohaven.com/fun/spelling.html

At the risk of 'explaining the joke', I believe it becomes comical due
to the cumulative effect of suggesting ← for assignment and then using it.
Since I don't like its look in any of the fonts I can check, I am returning
the (subsequent) examples to good (or bad) old =

Also, the λ is unnecessarily contentious. It has been replaced by more
straightforward introductory examples.

#70418

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2014-04-20 10:22 -0600
Message-ID	<mailman.9380.1398010958.18130.python-list@python.org>
In reply to	#70371

On Apr 19, 2014 2:54 PM, "Chris Angelico" <rosuav@gmail.com> wrote:
>
> On Sun, Apr 20, 2014 at 6:38 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> >> Or you just cast one of them to float. That way you're sure you're
> >> working with floats.
> >
> > Which is inappropriate if the type passed in was a Decimal or a complex.
>
> In that case, you already have a special case in your code, so whether
> that special case is handled by the language or by your code makes
> little difference. Is your function so generic that it has to be able
> to handle float, Decimal, or complex, and not care about the
> difference, and yet has to ensure that int divided by int doesn't
> yield int? Then say so; put in that special check. Personally, I've
> yet to meet any non-toy example of a function that needs that exact
> handling; most code doesn't ever think about complex numbers, and a
> lot of things look for one specific type:

When I'm writing a generic average function, I probably don't know whether
it will ever be used to average complex numbers. That shouldn't matter,
because I should be able to rely on this code working for whatever numeric
type I pass in:

def average(values):
    return sum(values) / len(values)

This works for decimals, it works for fractions, it works for complex
numbers, it works for numpy types, and in Python 3 it works for ints.
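
As a quick check of that claim, here is a small sketch that runs the same average() over several numeric types from the standard library. numpy is left out only to keep the example dependency-free, and the sample values are made up for illustration.

```python
from decimal import Decimal
from fractions import Fraction


def average(values):
    # True division in Python 3, whatever numeric type comes in.
    return sum(values) / len(values)


# Illustrative inputs, one list per numeric type.
samples = {
    "int": [1, 2],
    "Decimal": [Decimal("1.5"), Decimal("2.5")],
    "Fraction": [Fraction(1, 3), Fraction(2, 3)],
    "complex": [1 + 2j, 3 + 4j],
}

results = {name: average(vals) for name, vals in samples.items()}
for name, result in results.items():
    print(name, repr(result))
```

Each result keeps the type of its inputs, except for the ints, which give a float in Python 3. In Python 2 without `from __future__ import division`, the int case would be truncated to 1.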

> Maybe it's not your code that should be caring about what happens when
> you divide two integers, but the calling code. If you're asking for
> the average of a list of numbers, and they're all integers, and the
> avg() function truncates to integer, then the solution is to use sum()
> and explicitly cast to floating point before dividing.

First, that's not equivalent.  Try the following in Python 3:

import sys
values = [int(sys.float_info.max / 10)] * 20
print(average(values))

Now try this:

print(average(list(map(float, values))))

I don't have an interpreter handy to test, but I expect the former to
produce the correct result and the latter to raise OverflowError on the
call to sum.
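
For anyone who does have an interpreter handy, a sketch along these lines can be used to check both cases. Two caveats apply: average() calls len(), so the floats are put in a list rather than passed as a bare map iterator (which has no len() in Python 3), and adding floats past the largest finite value yields inf rather than raising OverflowError. The variable names are illustrative.

```python
import math
import sys


def average(values):
    return sum(values) / len(values)


values = [int(sys.float_info.max / 10)] * 20

# Integer case: the sum is an exact arbitrary-precision int, and
# int / int true division copes even though the sum exceeds the float range.
int_result = average(values)

# Float case: the running sum passes the largest finite float part-way through.
float_result = average(list(map(float, values)))

print(int_result == sys.float_info.max / 10)
print(math.isinf(float_result))
```

Either way the two calls are not equivalent: the integer version gives the correct average, while the float version loses it to overflow.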

Second, why should the calling code have to worry about this implementation
detail anyway? The point of a generic function is that it's generic.

#70443

From	Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date	2014-04-21 11:56 +1200
Message-ID	<brj55uFngdpU1@mid.individual.net>
In reply to	#70418

Ian Kelly wrote:

> def average(values):
>     return sum(values) / len(values)
> 
> This works for decimals, it works for fractions, it works for complex 
> numbers, it works for numpy types, and in Python 3 it works for ints.

That depends on what you mean by "works". I would actually
find it rather disturbing if an average() function implicitly
used floor division when given all ints.

The reason is that people often use ints as stand-ins for
floats in computations that are conceptually non-integer.
So a general-purpose function like average(), given a list
of ints, has no way of knowing whether they're intended
to be interpreted as ints or floats.

To my way of thinking, floor division is a specialised
operation that is only wanted in particular circumstances.
It's rare that I would actually want it done in the
context of taking an average, and if I do, I would rather
be explicit about it using e.g. int(floor(average(...)))
or a specialised int_average() function.

-- 
Greg
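
To illustrate the distinction drawn in this post, here is a hypothetical int_average(). The name comes from the post, but the body is only an assumed sketch. It spells out the floor division explicitly and leaves average() to use Python 3 true division. The sample data is made up.

```python
from math import floor


def average(values):
    # True division: never floors in Python 3.
    return sum(values) / len(values)


def int_average(values):
    # Hypothetical specialised variant: explicit floor division.
    return sum(values) // len(values)


data = [1, 2, 4]
true_avg = average(data)                    # 7 / 3
floored = int(floor(average(data)))         # floor applied after the fact
explicit = int_average(data)                # floor division up front

print(true_avg, floored, explicit)
```

For small inputs like these, the last two agree. The caller has to ask for the flooring in both cases, which is the explicitness argued for above.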

#70448

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2014-04-20 18:29 -0600
Message-ID	<mailman.9395.1398040148.18130.python-list@python.org>
In reply to	#70443

On Apr 20, 2014 8:01 PM, "Gregory Ewing" <greg.ewing@canterbury.ac.nz> wrote:
>
> Ian Kelly wrote:
>
>> def average(values):
>>     return sum(values) / len(values)
>>
>> This works for decimals, it works for fractions, it works for complex
>> numbers, it works for numpy types, and in Python 3 it works for ints.
>
>
> That depends on what you mean by "works". I would actually
> find it rather disturbing if an average() function implicitly
> used floor division when given all ints.

The code above never uses floor division in Python 3.  Therefore it "works".
