Groups > comp.lang.python > #69049 > unrolled thread
| Started by | Mark H Harris <harrismh777@gmail.com> |
|---|---|
| First post | 2014-03-25 13:30 -0500 |
| Last post | 2014-03-25 22:26 -0400 |
| Articles | 20 on this page of 75 — 22 participants |
unicode as valid naming symbols Mark H Harris <harrismh777@gmail.com> - 2014-03-25 13:30 -0500
Re: unicode as valid naming symbols wxjmfauth@gmail.com - 2014-03-25 11:52 -0700
Re: unicode as valid naming symbols Mark H Harris <harrismh777@gmail.com> - 2014-03-25 14:24 -0500
Re: unicode as valid naming symbols Rustom Mody <rustompmody@gmail.com> - 2014-03-25 19:16 -0700
Re: unicode as valid naming symbols MRAB <python@mrabarnett.plus.com> - 2014-03-25 19:24 +0000
Re: unicode as valid naming symbols Mark H Harris <harrismh777@gmail.com> - 2014-03-25 14:29 -0500
Re: unicode as valid naming symbols Marko Rauhamaa <marko@pacujo.net> - 2014-03-25 21:48 +0200
Re: unicode as valid naming symbols Skip Montanaro <skip@pobox.com> - 2014-03-25 14:54 -0500
Re: unicode as valid naming symbols Cameron Simpson <cs@zip.com.au> - 2014-03-26 09:16 +1100
Re: unicode as valid naming symbols Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-25 13:49 -0600
Re: unicode as valid naming symbols Tim Chase <python.list@tim.thechases.com> - 2014-03-25 15:29 -0500
Re: unicode as valid naming symbols Ethan Furman <ethan@stoneleaf.us> - 2014-03-25 15:47 -0700
Re: unicode as valid naming symbols Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-25 23:58 +0000
Re: unicode as valid naming symbols Mark H Harris <harrismh777@gmail.com> - 2014-03-27 10:28 -0500
Re: unicode as valid naming symbols Rustom Mody <rustompmody@gmail.com> - 2014-03-27 08:51 -0700
Re: unicode as valid naming symbols Mark H Harris <harrismh777@gmail.com> - 2014-03-27 11:03 -0500
Re: unicode as valid naming symbols Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-28 12:45 +1300
Re: unicode as valid naming symbols MRAB <python@mrabarnett.plus.com> - 2014-03-27 17:17 +0000
Re: unicode as valid naming symbols Rustom Mody <rustompmody@gmail.com> - 2014-03-27 10:53 -0700
Re: unicode as valid naming symbols Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-27 10:22 -0600
Re: unicode as valid naming symbols Rustom Mody <rustompmody@gmail.com> - 2014-03-27 10:41 -0700
Re: unicode as valid naming symbols Chris Angelico <rosuav@gmail.com> - 2014-03-28 03:23 +1100
Re: unicode as valid naming symbols Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-31 11:55 +0200
Re: unicode as valid naming symbols Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-31 11:40 -0600
Re: unicode as valid naming symbols Tim Chase <python.list@tim.thechases.com> - 2014-03-31 13:02 -0500
Re: unicode as valid naming symbols Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-31 12:10 -0600
Re: unicode as valid naming symbols Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-31 21:31 +0200
Re: unicode as valid naming symbols Terry Reedy <tjreedy@udel.edu> - 2014-03-31 16:12 -0400
Re: unicode as valid naming symbols Terry Reedy <tjreedy@udel.edu> - 2014-03-31 16:15 -0400
Re: unicode as valid naming symbols Marko Rauhamaa <marko@pacujo.net> - 2014-03-31 23:34 +0300
Re: unicode as valid naming symbols Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-31 18:47 -0600
Re: unicode as valid naming symbols David Hutto <dwightdhutto@gmail.com> - 2014-03-31 23:58 -0400
Re: unicode as valid naming symbols David Hutto <dwightdhutto@gmail.com> - 2014-04-01 00:11 -0400
Re: unicode as valid naming symbols Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-04-01 10:19 +0200
Re: unicode as valid naming symbols Ian Kelly <ian.g.kelly@gmail.com> - 2014-04-01 03:18 -0600
Re: unicode as valid naming symbols Marko Rauhamaa <marko@pacujo.net> - 2014-04-01 12:32 +0300
Re: unicode as valid naming symbols Ian Kelly <ian.g.kelly@gmail.com> - 2014-04-01 03:58 -0600
Re: unicode as valid naming symbols Marko Rauhamaa <marko@pacujo.net> - 2014-04-01 15:02 +0300
Re: unicode as valid naming symbols Chris Angelico <rosuav@gmail.com> - 2014-04-01 23:54 +1100
Re: unicode as valid naming symbols Marko Rauhamaa <marko@pacujo.net> - 2014-04-01 16:16 +0300
Re: unicode as valid naming symbols Chris Angelico <rosuav@gmail.com> - 2014-04-02 00:32 +1100
Re: unicode as valid naming symbols Marko Rauhamaa <marko@pacujo.net> - 2014-04-01 18:59 +0300
Re: unicode as valid naming symbols Rustom Mody <rustompmody@gmail.com> - 2014-04-01 19:58 -0700
Re: unicode as valid naming symbols Rustom Mody <rustompmody@gmail.com> - 2014-04-01 20:16 -0700
Re: unicode as valid naming symbols Marko Rauhamaa <marko@pacujo.net> - 2014-04-02 08:55 +0300
Re: unicode as valid naming symbols Chris Angelico <rosuav@gmail.com> - 2014-04-01 21:39 +1100
Re: unicode as valid naming symbols Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-04-01 12:37 +0200
Re: unicode as valid naming symbols Chris Angelico <rosuav@gmail.com> - 2014-04-01 21:58 +1100
Re: unicode as valid naming symbols Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-04-01 13:59 +0200
Re: unicode as valid naming symbols Roy Smith <roy@panix.com> - 2014-04-01 08:29 -0400
Re: unicode as valid naming symbols Chris Angelico <rosuav@gmail.com> - 2014-04-02 00:08 +1100
Re: unicode as valid naming symbols Rustom Mody <rustompmody@gmail.com> - 2014-04-01 06:34 -0700
Re: unicode as valid naming symbols Chris Angelico <rosuav@gmail.com> - 2014-04-02 00:00 +1100
Re: unicode as valid naming symbols Ned Batchelder <ned@nedbatchelder.com> - 2014-04-01 09:33 -0400
Re: unicode as valid naming symbols Chris Angelico <rosuav@gmail.com> - 2014-04-02 00:44 +1100
Re: unicode as valid naming symbols Rustom Mody <rustompmody@gmail.com> - 2014-04-01 06:58 -0700
Re: unicode as valid naming symbols Ian Kelly <ian.g.kelly@gmail.com> - 2014-04-01 09:53 -0600
Re: unicode as valid naming symbols MRAB <python@mrabarnett.plus.com> - 2014-03-26 02:56 +0000
Re: unicode as valid naming symbols Chris Angelico <rosuav@gmail.com> - 2014-03-26 14:09 +1100
Re: unicode as valid naming symbols Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-26 09:25 +0100
Re: unicode as valid naming symbols Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-26 09:52 +0100
Re: unicode as valid naming symbols Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-26 10:37 -0600
Re: unicode as valid naming symbols Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-27 10:36 +0100
Re: unicode as valid naming symbols Rustom Mody <rustompmody@gmail.com> - 2014-03-27 08:10 -0700
Re: unicode as valid naming symbols Tim Chase <python.list@tim.thechases.com> - 2014-03-27 10:34 -0500
Re: unicode as valid naming symbols random832@fastmail.us - 2014-03-28 14:55 -0400
Re: unicode as valid naming symbols Rustom Mody <rustompmody@gmail.com> - 2014-03-28 22:00 -0700
Re: unicode as valid naming symbols Chris Angelico <rosuav@gmail.com> - 2014-03-29 16:12 +1100
Re: unicode as valid naming symbols Ben Finney <ben+python@benfinney.id.au> - 2014-03-29 16:32 +1100
Re: unicode as valid naming symbols Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2014-03-29 14:11 -0400
Re: unicode as valid naming symbols Chris Angelico <rosuav@gmail.com> - 2014-03-30 09:01 +1100
Re: unicode as valid naming symbols Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-30 19:16 +1300
Re: unicode as valid naming symbols Mark H Harris <harrismh777@gmail.com> - 2014-03-25 14:29 -0500
Re:unicode as valid naming symbols Dave Angel <davea@davea.name> - 2014-03-25 15:45 -0400
Re: unicode as valid naming symbols Terry Reedy <tjreedy@udel.edu> - 2014-03-25 22:26 -0400
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-02 00:32 +1100 |
| Message-ID | <mailman.8800.1396359183.18130.python-list@python.org> |
| In reply to | #69511 |
On Wed, Apr 2, 2014 at 12:16 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
> I implemented the loops in the scheme way. Recursion is how iteration is
> done by the Believers. Traditional looping structures are available to
> scheme, but if you felt the need for them, you might as well program in
> Python.
Then I'm happily a pagan who uses while loops instead of recursion.
Why should every loop become a named function?
find_divisor: for ( factor = 2 ; i%factor ; factor++ )
{
    if ( factor == i )
    {
        printf("%d\n",i);
        count--;
        break;
    }
}
Does that label add anything? If you really need to put a name to
every loop you ever write, there's something wrong with the code; some
loops' purposes should be patently obvious by their body. All you do
is add duplicate information that might be wrong.
ChrisA
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-04-01 18:59 +0300 |
| Message-ID | <874n2dt50g.fsf@elektro.pacujo.net> |
| In reply to | #69512 |
Chris Angelico <rosuav@gmail.com>:
> On Wed, Apr 2, 2014 at 12:16 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
>> I implemented the loops in the scheme way. Recursion is how iteration
>> is done by the Believers.
>
> Then I'm happily a pagan who uses while loops instead of recursion.
> Why should every loop become a named function?

Every language has its idioms. The principal aesthetic motivation for
named-let loops is the avoidance of (set!), I think. Secondarily, you
get to shift gears in the middle of your loops; something you can often,
but not always, accomplish in Python with break, return and continue.

Don't get me wrong. Python has its own idioms and avoiding loops in
Python would be equally blasphemous. In C++ you avoid void pointers like
the plague, in C you celebrate them.

Marko
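The contrast between the two idioms can be sketched in Python. This is an illustration under stated assumptions, not code from the thread: `smallest_divisor_loop` and `smallest_divisor_named` are hypothetical names, and the inner function `loop` stands in for a Scheme named-let. CPython does not eliminate tail calls, so the recursive form is bounded by the interpreter's recursion limit.

```python
def smallest_divisor_loop(n):
    """Iterative form: a plain while loop that mutates `factor`."""
    factor = 2
    while n % factor:
        factor += 1
    return factor


def smallest_divisor_named(n):
    """Recursive form, imitating a Scheme named-let called `loop`."""
    def loop(factor):
        # The loop variable is rebound by the next call, never assigned to,
        # which is the "avoidance of (set!)" mentioned in the post.
        if n % factor == 0:
            return factor
        return loop(factor + 1)
    return loop(2)


if __name__ == "__main__":
    for n in (15, 91, 97):
        print(n, smallest_divisor_loop(n), smallest_divisor_named(n))
```

Both functions assume n is at least 2. The recursive version needs a name for its loop, while the iterative one does not, which is the trade-off being argued about.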
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2014-04-01 19:58 -0700 |
| Message-ID | <b6d2ce2f-5cf2-4aae-9b49-1c71c512de5e@googlegroups.com> |
| In reply to | #69523 |
On Tuesday, April 1, 2014 9:29:27 PM UTC+5:30, Marko Rauhamaa wrote:
> Chris Angelico :
> > On Wed, Apr 2, 2014 at 12:16 AM, Marko Rauhamaa wrote:
> >> I implemented the loops in the scheme way. Recursion is how iteration
> >> is done by the Believers.
> > Then I'm happily a pagan who uses while loops instead of recursion.
> > Why should every loop become a named function?
> Every language has its idioms. The principal aesthetic motivation for
> named-let loops is the avoidance of (set!), I think. Secondarily, you
> get to shift gears in the middle of your loops; something you can often,
> but not always, accomplish in Python with break, return and continue.

You are forgetting the main point: In scheme, in a named-let, the name
chosen was very often 'loop' (if I remember the PC scheme manuals
correctly). IOW if you had a dozen loops implemented with
named-letted-tail-recursion, you could call all of them 'loop'. How
is that different from calling all of them 'while' or 'for'?

> Don't get me wrong. Python has its own idioms and avoiding loops in
> Python would be equally blasphemous. In C++ you avoid void pointers like
> the plague, in C you celebrate them.

Yeah... I guess that is the issue. People brought up on imperative
(which includes OO) programming think recursion and iteration are
fundamentally different, just as assembly language programmers think of
memory and registers as fundamentally different. Sure they are, but if
you are a C programmer the distinction is irrelevant 99% of the time!

It continues downward... For an assembly language programmer, memory
versus cache memory is not a distinction he needs to make 99% of the
time. Not so for the hardware engineer.
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2014-04-01 20:16 -0700 |
| Message-ID | <ba7a9cb4-8203-430c-9bd3-2a5ecebaac22@googlegroups.com> |
| In reply to | #69534 |
On Wednesday, April 2, 2014 8:28:02 AM UTC+5:30, Rustom Mody wrote:
> On Tuesday, April 1, 2014 9:29:27 PM UTC+5:30, Marko Rauhamaa wrote:
> > Chris Angelico :
> > > On Wed, Apr 2, 2014 at 12:16 AM, Marko Rauhamaa wrote:
> > >> I implemented the loops in the scheme way. Recursion is how iteration
> > >> is done by the Believers.
> > > Then I'm happily a pagan who uses while loops instead of recursion.
> > > Why should every loop become a named function?
> > Every language has its idioms. The principal aesthetic motivation for
> > named-let loops is the avoidance of (set!), I think. Secondarily, you
> > get to shift gears in the middle of your loops; something you can often,
> > but not always, accomplish in Python with break, return and continue.
> You are forgetting the main point: In scheme, in a named-let, the name
> chosen was very often 'loop' (if I remember the PC scheme manuals
> correctly). IOW if you had a dozen loops implemented with
> named-letted-tail-recursion, you could call all of them 'loop'. How
> is that different from calling all of them 'while' or 'for' ?

Umm... I see from your prime number example that there are nested
loops in which sometimes you restart the inner and sometimes the
outer. So you could not possibly call both of them 'loop' :-).

So "you could call all of them 'loop'" is an overstatement. "A good
many" may be more appropriate?
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-04-02 08:55 +0300 |
| Message-ID | <87d2h09swm.fsf@elektro.pacujo.net> |
| In reply to | #69535 |
Rustom Mody <rustompmody@gmail.com>:
> On Wednesday, April 2, 2014 8:28:02 AM UTC+5:30, Rustom Mody wrote:
>> In scheme, in a named-let, the name
>> chosen was very often 'loop'
>
> Umm... I see from your prime number example that there are nested
> loops in which sometimes you restart the inner and sometimes the
> outer. So you could not possibly call both of them 'loop' :-).

Correct. I could call them "inner" and "outer". After all, the code uses
variables like "i", "c" and "n". However, it doesn't hurt to use
variable/function/loop names that convey meaning.

Marko
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-01 21:39 +1100 |
| Message-ID | <mailman.8793.1396348750.18130.python-list@python.org> |
| In reply to | #69499 |
On Tue, Apr 1, 2014 at 8:58 PM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> Setting aside the fact that C doesn't have anonymous functions, I'll
> approximate it as best I can:
>
> static int n = 3;
>
> int f()
> {
>     return n;
> }
>
> int main()
> {
>     n = 7;
>     return f();
> }
>
> C: 10
> Scheme: 20
And the less trivial the example, the more difference you'll see. This
is Scheme inside LilyPond, a little translator that lets me input
lyrics in a tidy way that can then be turned into something that works
well with both MIDI Karaoke and printed score:
#(define (bang2slashn lst) (
  cond ((null? lst) 0)
  (else (begin
    (if (equal? (ly:music-property (car lst) 'name) 'LyricEvent)
      (let ((txt (ly:music-property (car lst) 'text)))
        (if (equal? (string-ref txt 0) #\!) (begin
          ; Debugging display
          ; (display (ly:music-property (car lst) 'name)) (display " - ") (display txt) (newline)
          ; Prepend a newline instead of the exclamation mark - works for both MIDI Karaoke and page layout
          (ly:music-set-property! (car lst) 'text (string-append "\n" (substring txt 1 (string-length txt))))
        ))))
    (bang2slashn (ly:music-property (car lst) 'elements))
    (bang2slashn (cdr lst))
  ))
))
% Call the above recursive function
lyr=#(define-music-function (parser location lyrics) (ly:music?)
  (bang2slashn (ly:music-property lyrics 'elements))
  lyrics
)
Now, this was written by a non-Scheme programmer, so it's not going to
be optimal code, but I doubt it's going to lose a huge number of
parentheses. Not counting the commented-out debugging line, that's 41
pairs of them in a short but non-trivial piece of code. Translating it
to C isn't easy, in the same way that it's hard to explain how to do
client-side web form validation in Lua; but here's an attempt. It
assumes a broadly C-like structure to LilyPond (eg that the elements
are passed as a structure; they are a tree already, as you can see by
the double-recursive function above), which is of course not the case,
but here goes:
void bang2slashn(struct element *lst)
{
    while (lst)
    {
        if (!strcmp(lst->name, "LyricEvent"))
        {
            char *text = music_property(lst, "text");
            /* Okay, C doesn't have string manipulation, so I cheat here */
            /* If this were C++ or Pike, some notation nearer to the
               original would work */
            if (*text == '!') music_set_property(lst, "text", "\n" + text[1..]);
        }
        bang2slashn(lst->elements);
        lst = lst->next;
    }
}
DEFINE_MUSIC_FUNCTION(PARSER_LOCATION_LYRICS, bang2slashn);
That's nine pair parens, three braces, and one square. I assume a lot
about the supposed C-like interface to LilyPond, but I think anyone
who knows both C and Scheme would agree that I haven't been
horrendously unfair in the translation. (Though I will accept an
alternate implementation of the Scheme version. If you can cut it down
to just 26 pair parens, you'll achieve the 2:1 ratio that Ian
mentioned. And if you can cut it down to 13 pairs, you've equalled my
count.) The only way to have the C figure come up approximately equal
is to count a semicolon as if it were a pair of parens - Scheme has an
extra set of parens doing the job of separating one function call from
another. But that adds only another 5, bringing C up to a total of 18
(plus a few more if I used functions to do my string manipulation, so
let's say about 20-25) where Scheme is still at roughly twice that.
ChrisA
| From | Antoon Pardon <antoon.pardon@rece.vub.ac.be> |
|---|---|
| Date | 2014-04-01 12:37 +0200 |
| Message-ID | <mailman.8792.1396348682.18130.python-list@python.org> |
| In reply to | #69195 |
On 01-04-14 11:18, Ian Kelly wrote:
> On Tue, Apr 1, 2014 at 2:19 AM, Antoon Pardon
> <antoon.pardon@rece.vub.ac.be> wrote:
>> On 01-04-14 02:47, Ian Kelly wrote:
>>
>>> Well, this is the path taken by APL. It has its supporters. It's not
>>> known for being readable.
>> No that is not the path taken by APL. AFAICS identifiers in APL are just
>> like identifiers in python. The path taken by APL was that there were
>> a lot more operators available that used non-alphanumeric characters.
>>
>> AFAICS APL programs tend to be unreadable because they are mostly written
>> in a very concise style.
>>
>> I think this is more the path taken by lisp-like languages where '+' is
>> a name just like 'alpha' or 'r2d2'. In scheme I can just do the following.
>>
>> (define √ sqrt)
>> (√ 4)
> You're still using the symbol as the name of an operation, though, so
> I see no practical difference from the APL style. The operation just
> happens to be user-defined rather than built-in.

Python also uses symbols for names of operations, like '+'. And when
someone suggested python might consider increasing the number of
operations and gave some symbols for those extra operations, nobody
suggested that would make python unreadable, though it would be far
more like the path taken by APL than what we are discussing now.

But the idea we are discussing here has nothing to do with introducing
more operators and using symbolic characters for them, and as such it
wouldn't make python more APL-like.

You only bring up APL because it uses a number of unfamiliar symbols
and you attribute the unreadability of APL programs mostly to that.
But APL doesn't have the functionality we are talking about here. So
we are not talking about the path taken by APL.

-- 
Antoon Pardon
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-01 21:58 +1100 |
| Message-ID | <mailman.8794.1396349898.18130.python-list@python.org> |
| In reply to | #69195 |
On Tue, Apr 1, 2014 at 9:37 PM, Antoon Pardon
<antoon.pardon@rece.vub.ac.be> wrote:
> Python also uses symbols for names of operations, like '+'. And when
> someone suggested python might consider increasing the number of
> operations and gave some symbols for those extra operations, nobody
> suggested that would make python unreadable, though it would be far
> more like the path taken by APL than what we are discussing now.
Actually, people did. But mainly the thread (look up "Time we switched
to unicode?") went off looking at how hard it'd be to type those
operators, and therefore the more serious point that there would
either be hard-to-type language elements or duplicate syntactic tokens
("lambda" as well as "λ", etc). That isn't an issue with names,
because any name has only one, well, name. If you choose to use both
"alpha" and "α" as names, that's fine, and they're distinct names. You
can make your code unreadable, and it doesn't impact my code at all.
Language-level features like operators have stronger concerns.
But because, in the future, Python may choose to create new operators,
the simplest and safest way to ensure safety is to put a boundary on
what can be operators and what can be names; Unicode character classes
are perfect for this. It's also possible that all Unicode whitespace
characters might become legal for indentation and separation (maybe
they are already??), so obviously they're ruled out as identifiers;
anyway, I honestly do not think people would want to use U+2007 FIGURE
SPACE inside a name. So if we deny whitespace, and accept letters and
digits, it makes good sense to deny mathematical symbols so as to keep
them available for operators. (It also makes reasonable sense to
*permit* mathematical symbols, thus allowing you to use them for
functions/methods, in the same way that you can use "n", "o", and "t",
but not "not"; but with word operators, the entire word has to be used
as-is before it's a collision - with a symbolic one, any instance of
that symbol inside a name will change parsing entirely. It's a
trade-off, and Python's made a decision one way and not the other.)
ChrisA
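The boundary described above can be probed from Python itself. A minimal sketch, assuming Python 3 and only the standard library: `str.isidentifier()` reports whether a string is a valid name, and `unicodedata.category()` returns the Unicode general category the rule is built on. The `samples` list and the `compiles` helper are illustrative names, not anything from the thread.

```python
import unicodedata

# Illustrative characters: a letter, the connector '_', a math symbol,
# and U+2007 FIGURE SPACE.
samples = ["α", "_", "√", "\u2007"]

for ch in samples:
    # category() returns a two-letter code such as 'Ll', 'Pc', 'Sm' or 'Zs'.
    name = "x" + ch + "y"
    print(repr(ch), unicodedata.category(ch), name.isidentifier())


def compiles(source):
    """Return True if the compiler accepts `source`, False on SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print(compiles("α = 1"))  # a letter is accepted as a name
    print(compiles("√ = 1"))  # a math symbol is rejected
```

Letters and the connector are accepted inside a name, while the math symbol and the space are not, which matches the split between names and potential operators that the post argues for.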
| From | Antoon Pardon <antoon.pardon@rece.vub.ac.be> |
|---|---|
| Date | 2014-04-01 13:59 +0200 |
| Message-ID | <mailman.8796.1396354601.18130.python-list@python.org> |
| In reply to | #69195 |
On 01-04-14 12:58, Chris Angelico wrote:
> But because, in the future, Python may choose to create new operators,
> the simplest and safest way to ensure safety is to put a boundary on
> what can be operators and what can be names; Unicode character classes
> are perfect for this. It's also possible that all Unicode whitespace
> characters might become legal for indentation and separation (maybe
> they are already??), so obviously they're ruled out as identifiers;
> anyway, I honestly do not think people would want to use U+2007 FIGURE
> SPACE inside a name. So if we deny whitespace, and accept letters and
> digits, it makes good sense to deny mathematical symbols so as to keep
> them available for operators. (It also makes reasonable sense to
> *permit* mathematical symbols, thus allowing you to use them for
> functions/methods, in the same way that you can use "n", "o", and "t",
> but not "not"; but with word operators, the entire word has to be used
> as-is before it's a collision - with a symbolic one, any instance of
> that symbol inside a name will change parsing entirely. It's a
> trade-off, and Python's made a decision one way and not the other.)

This mostly makes sense to me. The only caveat I have is that since we
also allow _ (U+005F LOW LINE) in names, which belongs to the category
<punctuation, connector>, we should allow other symbols within this
category in a name.

But I confess that is mostly personal taste, since I find names_like_this
ugly. Names-like-this look better to me but that wouldn't be workable
in python. But maybe there is some connector that would be aesthetically
pleasing and not cause other problems.

-- 
Antoon Pardon
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2014-04-01 08:29 -0400 |
| Message-ID | <roy-677855.08291301042014@news.panix.com> |
| In reply to | #69506 |
In article <mailman.8796.1396354601.18130.python-list@python.org>,
Antoon Pardon <antoon.pardon@rece.vub.ac.be> wrote:
> On 01-04-14 12:58, Chris Angelico wrote:
> > But because, in the future, Python may choose to create new operators,
> > the simplest and safest way to ensure safety is to put a boundary on
> > what can be operators and what can be names; Unicode character classes
> > are perfect for this. It's also possible that all Unicode whitespace
> > characters might become legal for indentation and separation (maybe
> > they are already??), so obviously they're ruled out as identifiers;
> > anyway, I honestly do not think people would want to use U+2007 FIGURE
> > SPACE inside a name. So if we deny whitespace, and accept letters and
> > digits, it makes good sense to deny mathematical symbols so as to keep
> > them available for operators. (It also makes reasonable sense to
> > *permit* mathematical symbols, thus allowing you to use them for
> > functions/methods, in the same way that you can use "n", "o", and "t",
> > but not "not"; but with word operators, the entire word has to be used
> > as-is before it's a collision - with a symbolic one, any instance of
> > that symbol inside a name will change parsing entirely. It's a
> > trade-off, and Python's made a decision one way and not the other.)
>
> This mostly makes sense to me. The only caveat I have is that since we
> also allow _ (U+005F LOW LINE) in names which belongs to the category
> <punctuation, connector>, we should allow other symbols within this
> category in a name.
>
> But I confess that is mostly personal taste, since I find names_like_this
> ugly. Names-like-this look better to me but that wouldn't be workable
> in python. But maybe there is some connector that would be aesthetically
> pleasing and not causing other problems.

Semi-seriously, let me suggest (names like this). It's not valid syntax
now, so it can't break any existing code. It reuses existing
punctuation in a way which is a logical extension of its traditional
meaning, i.e. "group these things together".
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-02 00:08 +1100 |
| Message-ID | <mailman.8799.1396357704.18130.python-list@python.org> |
| In reply to | #69507 |
On Tue, Apr 1, 2014 at 11:29 PM, Roy Smith <roy@panix.com> wrote:
>> But I confess that is mostly personal taste, since I find names_like_this
>> ugly. Names-like-this look better to me but that wouldn't be workable
>> in python. But maybe there is some connector that would be aesthetically
>> pleasing and not causing other problems.
>
> Semi-seriously, let me suggest (names like this). It's not valid syntax
> now, so it can't break any existing code. It reuses existing
> punctuation in a way which is a logical extension of its traditional
> meaning, i.e. "group these things together".
I'd really rather not have a drastically different concept of "name"
to every other language's definition! Reading over COBOL code is
confusing in ways that reading, say, Ruby code isn't; the ? and !
suffixes aren't nearly as confusing as:
http://www.math-cs.gordon.edu/courses/cs323/COBOL/cobol.html
"""
COBOL identifiers are 1-30 alphanumeric characters, at least one of
which must be non-numeric. In certain contexts it is permissible to use
a totally numeric identifier; however, that usage is discouraged.
Hyphens may be included in an identifier anywhere except the first or
last character.
"""
Hyphens in names! Ugh! That means subtraction! :)
But there is a solution! You can have *anything you want* in your
identifiers. Watch:
v = {}
v["names like this"] = 42
print(v["names like this"])
Yes, that's a five-character delimiter/marker. But it works!!
ChrisA
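Both points in this post can be checked with the standard library. A small sketch, assuming Python 3: `ast.parse` shows how the parser reads a hyphenated "name" without evaluating it, and the dictionary lines repeat the workaround from the post. The variable names `tree` and `expr` are illustrative.

```python
import ast

# Parse only; the names are never looked up, so they need not exist.
tree = ast.parse("names-like-this", mode="eval")
expr = tree.body

# The outermost node is a subtraction whose left operand is itself a
# subtraction: (names - like) - this.
print(type(expr).__name__, type(expr.op).__name__)
print(type(expr.left).__name__, expr.right.id)

# The dictionary workaround: any string can serve as a key.
v = {}
v["names like this"] = 42
print(v["names like this"])
```

The parse result is why a hyphen cannot become a connector in Python names without changing the meaning of existing expressions.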
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2014-04-01 06:34 -0700 |
| Message-ID | <c50f8129-cc48-4981-bb60-43e680bdd155@googlegroups.com> |
| In reply to | #69510 |
On Tuesday, April 1, 2014 6:38:14 PM UTC+5:30, Chris Angelico wrote:
> On Tue, Apr 1, 2014 at 11:29 PM, Roy Smith wrote:
> >> But I confess that is mostly personal taste, since I find names_like_this
> >> ugly. Names-like-this look better to me but that wouldn't be workable
> >> in python. But maybe there is some connector that would be aesthetically
> >> pleasing and not causing other problems.
> > Semi-seriously, let me suggest (names like this). It's not valid syntax
> > now, so it can't break any existing code. It reuses existing
> > punctuation in a way which is a logical extension of its traditional
> > meaning, i.e. "group these things together".
> I'd really rather not have a drastically different concept of "name"
> to every other language's definition! Reading over COBOL code is
> confusing in ways that reading, say, Ruby code isn't; the ? and !
> suffixes aren't nearly as confusing as:
> http://www.math-cs.gordon.edu/courses/cs323/COBOL/cobol.html
> """
> COBOL identifiers are 1-30 alphanumeric characters, at least one of
> which must be non-numeric. In certain contexts it is permissible to use
> a totally numeric identifier; however, that usage is discouraged.
> Hyphens may be included in an identifier anywhere except the first or
> last character.
> """
> Hyphens in names! Ugh! That means subtraction! :)

Just temporarily switch to a domain other than programming -- one that
has not been under the absolute hegemony of ASCII for 40 years -- and
you may get different results. See the 1st item here:
http://searchengineland.com/9-seo-quirks-you-should-be-aware-of-146465
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-02 00:00 +1100 |
| Message-ID | <mailman.8798.1396357260.18130.python-list@python.org> |
| In reply to | #69195 |
On Tue, Apr 1, 2014 at 10:59 PM, Antoon Pardon
<antoon.pardon@rece.vub.ac.be> wrote:
> On 01-04-14 12:58, Chris Angelico wrote:
>> But because, in the future, Python may choose to create new operators,
>> the simplest and safest way to ensure safety is to put a boundary on
>> what can be operators and what can be names; Unicode character classes
>> are perfect for this. It's also possible that all Unicode whitespace
>> characters might become legal for indentation and separation (maybe
>> they are already??), so obviously they're ruled out as identifiers;
>> anyway, I honestly do not think people would want to use U+2007 FIGURE
>> SPACE inside a name. So if we deny whitespace, and accept letters and
>> digits, it makes good sense to deny mathematical symbols so as to keep
>> them available for operators. (It also makes reasonable sense to
>> *permit* mathematical symbols, thus allowing you to use them for
>> functions/methods, in the same way that you can use "n", "o", and "t",
>> but not "not"; but with word operators, the entire word has to be used
>> as-is before it's a collision - with a symbolic one, any instance of
>> that symbol inside a name will change parsing entirely. It's a
>> trade-off, and Python's made a decision one way and not the other.)
>
> This mostly makes sense to me. The only caveat I have is that since we
> also allow _ (U+005F LOW LINE) in names which belongs to the category
> <punctuation, connector>, we should allow other symbols within this
> category in a name.
>
> But I confess that is mostly personal taste, since I find names_like_this
> ugly. Names-like-this look better to me but that wouldn't be workable
> in python. But maybe there is some connector that would be aesthetically
> pleasing and not causing other problems.

That's reasonable. The Pc category doesn't have much in it:

http://www.fileformat.info/info/unicode/category/Pc/list.htm

If the definition of "characters permitted in identifiers" is derived
exclusively from the Unicode categories, including Pc would make fine
sense. Probably the definition should be: First character is L* or Pc,
subsequent characters are L*, N*, or Pc, and either Mn or M* (combining
characters). Or something like that.

ChrisA
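The rule floated at the end of this post is easy to prototype. The sketch below is one reading of it, not Python's actual rule (the language reference defines identifiers via the XID_Start and XID_Continue properties, which are narrower). `is_proposed_identifier` is a hypothetical helper name, and treating "Mn or M*" as "any mark category" is an assumption.

```python
import unicodedata


def is_proposed_identifier(name):
    """Check `name` against the rule sketched in the post.

    First character: any letter category (L*) or connector punctuation (Pc).
    Later characters: L*, N*, Pc, or any mark category (M*).
    """
    if not name:
        return False
    first_cat = unicodedata.category(name[0])
    if not (first_cat.startswith("L") or first_cat == "Pc"):
        return False
    for ch in name[1:]:
        cat = unicodedata.category(ch)
        # cat[0] is the major class: L(etter), N(umber), M(ark), ...
        if not (cat[0] in "LNM" or cat == "Pc"):
            return False
    return True


if __name__ == "__main__":
    # U+203F UNDERTIE is another member of the Pc category.
    for candidate in ["names_like_this", "names\u203flike\u203fthis",
                      "names-like-this", "2fast", "√x"]:
        print(repr(candidate), is_proposed_identifier(candidate))
```

Under this reading a connector such as U+203F would be allowed anywhere in a name, while the hyphen (category Pd) and the square root sign (category Sm) stay excluded.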
| From | Ned Batchelder <ned@nedbatchelder.com> |
|---|---|
| Date | 2014-04-01 09:33 -0400 |
| Message-ID | <mailman.8801.1396359227.18130.python-list@python.org> |
| In reply to | #69195 |
On 4/1/14 9:00 AM, Chris Angelico wrote:
> On Tue, Apr 1, 2014 at 10:59 PM, Antoon Pardon
> <antoon.pardon@rece.vub.ac.be> wrote:
>> On 01-04-14 12:58, Chris Angelico wrote:
>>> But because, in the future, Python may choose to create new operators,
>>> the simplest and safest way to ensure safety is to put a boundary on
>>> what can be operators and what can be names; Unicode character classes
>>> are perfect for this. It's also possible that all Unicode whitespace
>>> characters might become legal for indentation and separation (maybe
>>> they are already??), so obviously they're ruled out as identifiers;
>>> anyway, I honestly do not think people would want to use U+2007 FIGURE
>>> SPACE inside a name. So if we deny whitespace, and accept letters and
>>> digits, it makes good sense to deny mathematical symbols so as to keep
>>> them available for operators. (It also makes reasonable sense to
>>> *permit* mathematical symbols, thus allowing you to use them for
>>> functions/methods, in the same way that you can use "n", "o", and "t",
>>> but not "not"; but with word operators, the entire word has to be used
>>> as-is before it's a collision - with a symbolic one, any instance of
>>> that symbol inside a name will change parsing entirely. It's a
>>> trade-off, and Python's made a decision one way and not the other.)
>>
>> This mostly makes sense to me. The only caveat I have is that since we
>> also allow _ (U+005F LOW LINE) in names which belongs to the category
>> <punctuation, connector>, we should allow other symbols within this
>> category in a name.
>>
>> But I confess that is mostly personal taste, since I find names_like_this
>> ugly. Names-like-this look better to me but that wouldn't be workable
>> in python. But maybe there is some connector that would be aesthetically
>> pleasing and not causing other problems.
>
> That's reasonable. The Pc category doesn't have much in it:
>
> http://www.fileformat.info/info/unicode/category/Pc/list.htm
>
> If the definition of "characters permitted in identifiers" is derived
> exclusively from the Unicode categories, including Pc would make fine
> sense. Probably the definition should be: First character is L* or Pc,
> subsequent characters are L*, N*, or Pc, and either Mn or M*
> (combining characters). Or something like that.
Maybe I'm misunderstanding the discussion... It seems like we're talking
about a hypothetical definition of identifiers based on Unicode
character categories, but there's no need: Python 3 has defined
precisely that. From the docs
(https://docs.python.org/3/reference/lexical_analysis.html#identifiers):
---<snip>---------
Python 3.0 introduces additional characters from outside the ASCII range
(see PEP 3131). For these characters, the classification uses the
version of the Unicode Character Database as included in the unicodedata
module.
Identifiers are unlimited in length. Case is significant.
identifier ::= xid_start xid_continue*
id_start ::= <all characters in general categories Lu, Ll, Lt, Lm,
Lo, Nl, the underscore, and characters with the Other_ID_Start property>
id_continue ::= <all characters in id_start, plus characters in the
categories Mn, Mc, Nd, Pc and others with the Other_ID_Continue property>
xid_start ::= <all characters in id_start whose NFKC normalization
is in "id_start xid_continue*">
xid_continue ::= <all characters in id_continue whose NFKC
normalization is in "id_continue*">
The Unicode category codes mentioned above stand for:
Lu - uppercase letters
Ll - lowercase letters
Lt - titlecase letters
Lm - modifier letters
Lo - other letters
Nl - letter numbers
Mn - nonspacing marks
Mc - spacing combining marks
Nd - decimal numbers
Pc - connector punctuations
Other_ID_Start - explicit list of characters in PropList.txt to
support backwards compatibility
Other_ID_Continue - likewise
All identifiers are converted into the normal form NFKC while parsing;
comparison of identifiers is based on NFKC.
---<end snip>-----
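For readers who want to try the quoted grammar, here is a rough sketch of it in Python. It is only an approximation: the names `rough_is_identifier`, `START_CATEGORIES`, `CONTINUE_CATEGORIES` and `CANDIDATES` are made up for illustration, and the sketch ignores the Other_ID_Start / Other_ID_Continue lists as well as the NFKC-based xid_* refinement. The built-in `str.isidentifier()` remains the authoritative check (note that it also returns True for keywords).

```python
import unicodedata

# General categories named in the quoted grammar.
# Assumption: the Other_ID_Start / Other_ID_Continue lists are left out.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
CONTINUE_CATEGORIES = START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}

# Illustrative test strings: "a\u203fb" contains U+203F UNDERTIE (Pc),
# "\u221ax" starts with U+221A SQUARE ROOT (Sm).
CANDIDATES = ["spam", "_private", "λ", "名前", "2fast", "a\u203fb", "\u221ax"]


def rough_is_identifier(name):
    """Approximate id_start id_continue* using general categories only."""
    if not name:
        return False
    first, rest = name[0], name[1:]
    # The underscore is explicitly allowed as a starting character.
    if first != "_" and unicodedata.category(first) not in START_CATEGORIES:
        return False
    return all(unicodedata.category(ch) in CONTINUE_CATEGORIES for ch in rest)


for candidate in CANDIDATES:
    # Print the sketch's answer next to the interpreter's own answer.
    print(repr(candidate), rough_is_identifier(candidate), candidate.isidentifier())
```

The two columns should agree for these samples; any disagreement on other inputs would most likely come from the parts of the grammar the sketch leaves out.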
>
> ChrisA
>
--
Ned Batchelder, http://nedbatchelder.com
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-02 00:44 +1100 |
| Message-ID | <mailman.8802.1396359858.18130.python-list@python.org> |
| In reply to | #69195 |
On Wed, Apr 2, 2014 at 12:33 AM, Ned Batchelder <ned@nedbatchelder.com> wrote:
> Maybe I'm misunderstanding the discussion... It seems like we're talking
> about a hypothetical definition of identifiers based on Unicode character
> categories, but there's no need: Python 3 has defined precisely that. From
> the docs
> (https://docs.python.org/3/reference/lexical_analysis.html#identifiers):
>

"Python 3.0 introduces **additional characters** from outside the
ASCII range" - emphasis mine.

Python currently has - at least, per that documentation - a hybrid
system with ASCII characters defined in the classic way, and non-ASCII
characters defined by their Unicode character classes. I'm talking
about a system that's _purely_ defined by Unicode character classes.
It may turn out that the class list exactly compasses the ASCII
characters listed, though, in which case you'd be right: it's not
hypothetical.

In any case, Pc is included, which I should have checked beforehand.
So that part is, as you say, not hypothetical. Go for it! Use 'em.

ChrisA
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2014-04-01 06:58 -0700 |
| Message-ID | <d9f78a2a-94a9-41f4-8c5b-a7c519796946@googlegroups.com> |
| In reply to | #69515 |
On Tuesday, April 1, 2014 7:14:15 PM UTC+5:30, Chris Angelico wrote:
> On Wed, Apr 2, 2014 at 12:33 AM, Ned Batchelder wrote:
> > Maybe I'm misunderstanding the discussion... It seems like we're talking
> > about a hypothetical definition of identifiers based on Unicode character
> > categories, but there's no need: Python 3 has defined precisely that. From
> > the docs
> > (https://docs.python.org/3/reference/lexical_analysis.html#identifiers):
> "Python 3.0 introduces **additional characters** from outside the
> ASCII range" - emphasis mine.
> Python currently has - at least, per that documentation - a hybrid
> system with ASCII characters defined in the classic way, and non-ASCII
> characters defined by their Unicode character classes. I'm talking
> about a system that's _purely_ defined by Unicode character classes.
> It may turn out that the class list exactly compasses the ASCII
> characters listed, though, in which case you'd be right: it's not
> hypothetical.
> In any case, Pc is included, which I should have checked beforehand.
> So that part is, as you say, not hypothetical. Go for it! Use 'em.

Dunno if you really mean it or are just saying...

Steven gave the example the other day of confusing the identifiers A and А.
There must be easily hundreds (thousands?) of other such confusables.

So you think that's nice and APL(-ese), Scheme(-ish) is not...???
Confused by your stand...

Personally I don't believe that Unicode has been designed with programming
languages in mind. Assuming that Unicode categories will naturally and
easily fit programming language lexical/syntax categories is rather naive.
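The A / А example can be reproduced with a few lines of Python. This is a sketch only; the variable names are illustrative, and the second letter is assumed to be U+0410 CYRILLIC CAPITAL LETTER A. It shows that both characters are accepted as identifiers, and that the NFKC normalization Python applies to identifiers folds compatibility forms (such as the "fi" ligature) but does not merge look-alike letters from different scripts.

```python
import unicodedata

latin_a = "A"           # U+0041 LATIN CAPITAL LETTER A
cyrillic_a = "\u0410"   # U+0410, assumed to be the look-alike in the post

for ch in (latin_a, cyrillic_a):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), ch.isidentifier())

# NFKC leaves the Cyrillic letter alone, so the two names stay distinct.
same_after_nfkc = (
    unicodedata.normalize("NFKC", latin_a)
    == unicodedata.normalize("NFKC", cyrillic_a)
)

# Bind both names in a scratch namespace and list what was created.
namespace = {}
exec(f"{latin_a} = 1\n{cyrillic_a} = 2", namespace)
distinct_names = sorted(k for k in namespace if k != "__builtins__")
print(same_after_nfkc, len(distinct_names))

# By contrast, a compatibility character such as U+FB01 (the "fi" ligature)
# is folded to plain ASCII letters by NFKC.
ligature_name = "\ufb01le"
folded = unicodedata.normalize("NFKC", ligature_name)
print(folded)
```

So normalization helps with compatibility variants of the same letter, but it is not a defence against cross-script confusables.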
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2014-04-01 09:53 -0600 |
| Message-ID | <mailman.8807.1396367664.18130.python-list@python.org> |
| In reply to | #69195 |
On Tue, Apr 1, 2014 at 7:44 AM, Chris Angelico <rosuav@gmail.com> wrote:
> On Wed, Apr 2, 2014 at 12:33 AM, Ned Batchelder <ned@nedbatchelder.com> wrote:
>> Maybe I'm misunderstanding the discussion... It seems like we're talking
>> about a hypothetical definition of identifiers based on Unicode character
>> categories, but there's no need: Python 3 has defined precisely that. From
>> the docs
>> (https://docs.python.org/3/reference/lexical_analysis.html#identifiers):
>>
>
> "Python 3.0 introduces **additional characters** from outside the
> ASCII range" - emphasis mine.
>
> Python currently has - at least, per that documentation - a hybrid
> system with ASCII characters defined in the classic way, and non-ASCII
> characters defined by their Unicode character classes. I'm talking
> about a system that's _purely_ defined by Unicode character classes.
> It may turn out that the class list exactly compasses the ASCII
> characters listed, though, in which case you'd be right: it's not
> hypothetical.

The only ASCII exception is _, which is explicitly permitted to start an
identifier (for obvious reasons), whereas characters in Pc are otherwise
only permitted to continue identifiers.

There are also explicit lists of extra permitted characters in
PropList.txt for backward compatibility (once a character is permitted,
it should remain permitted even if its Unicode category changes). There
are currently 4 extra starting characters and 12 extra continuing
characters, but none of these are ASCII.
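The start-versus-continue distinction for Pc can be checked directly. The following is a minimal sketch that scans every code point with the standard `unicodedata` module; the name `connectors` is illustrative, and the exact list printed depends on the Unicode database shipped with the running Python.

```python
import sys
import unicodedata

# Collect every code point whose general category is Pc
# (connector punctuation) in this interpreter's Unicode database.
connectors = [
    chr(cp)
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Pc"
]

for ch in connectors:
    print(
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        "start:", ch.isidentifier(),             # valid as a one-character name?
        "continue:", ("a" + ch).isidentifier(),  # valid after a letter?
    )
```

For example, U+203F UNDERTIE can appear inside a name such as `a‿b` but cannot begin one, while `_` can do both.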
| From | MRAB <python@mrabarnett.plus.com> |
|---|---|
| Date | 2014-03-26 02:56 +0000 |
| Message-ID | <mailman.8560.1395802562.18130.python-list@python.org> |
| In reply to | #69057 |
On 2014-03-25 22:47, Ethan Furman wrote:
> On 03/25/2014 12:29 PM, Mark H Harris wrote:
>> On 3/25/14 2:24 PM, MRAB wrote:
>>> It's explained in PEP 3131.
>>>
>>> Basically, a name should start with a letter (this has been extended
>>> to include Chinese characters, etc) or an underscore.
>>>
>>> λ is classified as Lowercase_Letter.
>>>
>>> √ is classified as Math_Symbol.
>>
>> Thanks much! I'll note that for improvements. Any unicode symbol
>> (that is not a number) should be allowed as an identifier.
>
> No, it shouldn't. Doing so would mean we could not use √ as the square
> root operator in the future.
>
Or as a root operator, e.g. 3 √ x (the cube root of x).

> Identifiers are made up of letters, numbers, and the underscore.
> Considering all the unicode letters and unicode numbers out there, you
> shouldn't be lacking for names.
>
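The two classifications quoted above are easy to confirm. This is a small sketch using only the standard library; the helper `compiles` is a made-up name for illustration. `unicodedata.category` reports the short codes, Ll for Lowercase_Letter and Sm for Math_Symbol.

```python
import unicodedata

for ch in ("λ", "√"):
    # Show the general category, the official name, and whether the
    # character on its own is accepted as an identifier.
    print(ch, unicodedata.category(ch), unicodedata.name(ch), ch.isidentifier())


def compiles(source):
    """Return True if the source text parses as Python, False otherwise."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


lambda_ok = compiles("λ = 2")  # a letter, so a legal name
root_ok = compiles("√ = 2")    # a math symbol, so rejected by the parser
print(lambda_ok, root_ok)
```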
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-26 14:09 +1100 |
| Message-ID | <mailman.8561.1395803386.18130.python-list@python.org> |
| In reply to | #69057 |
On Wed, Mar 26, 2014 at 1:56 PM, MRAB <python@mrabarnett.plus.com> wrote:
>> No, it shouldn't. Doing so would mean we could not use √ as the square
>> root operator in the future.
>>
> Or as a root operator, e.g. 3 √ x (the cube root of x).

Or both! It could be like unary negation and binary subtraction.

ChrisA
| From | Antoon Pardon <antoon.pardon@rece.vub.ac.be> |
|---|---|
| Date | 2014-03-26 09:25 +0100 |
| Message-ID | <mailman.8564.1395822345.18130.python-list@python.org> |
| In reply to | #69057 |
On 25-03-14 23:47, Ethan Furman wrote:
> On 03/25/2014 12:29 PM, Mark H Harris wrote:
>> On 3/25/14 2:24 PM, MRAB wrote:
>>> It's explained in PEP 3131.
>>>
>>> Basically, a name should start with a letter (this has been extended
>>> to include Chinese characters, etc) or an underscore.
>>>
>>> λ is classified as Lowercase_Letter.
>>>
>>> √ is classified as Math_Symbol.
>>
>> Thanks much! I'll note that for improvements. Any unicode symbol
>> (that is not a number) should be allowed as an
>> identifier.
>
> No, it shouldn't. Doing so would mean we could not use √ as the
> square root operator in the future.

And what advantage would that bring over just using it as a function?

--
Antoon Pardon