Groups > comp.lang.python > #25841 > unrolled thread
| Started by | Chris Angelico <rosuav@gmail.com> |
|---|---|
| First post | 2012-07-23 17:59 +1000 |
| Last post | 2012-07-23 17:50 -0700 |
| Articles | 19 on this page of 59 — 23 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: the meaning of r’.......‘ Chris Angelico <rosuav@gmail.com> - 2012-07-23 17:59 +1000
Re: the meaning of r’.......‘ Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-23 08:07 +0000
Re: the meaning of rユ.......ï¾ Roy Smith <roy@panix.com> - 2012-07-23 08:55 -0400
Re: the meaning of rユ.......�¾ Chris Angelico <rosuav@gmail.com> - 2012-07-23 23:06 +1000
Re: the meaning of r¹.......?34 Roy Smith <roy@panix.com> - 2012-07-23 09:22 -0400
Re: the meaning of rユ.......�¾ Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-23 15:59 +0000
Re: the meaning of rユ.......�¾ Chris Angelico <rosuav@gmail.com> - 2012-07-24 02:10 +1000
Re: the meaning of r?.......?¾ Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-07-23 16:46 -0400
Re: the meaning of r`.......` Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> - 2012-07-24 10:45 +0200
Re: the meaning of rユ.......ï¾ Alex Strickland <sscc@mweb.co.za> - 2012-07-23 15:10 +0200
Re: the meaning of rユ.......�¾ Dave Angel <d@davea.name> - 2012-07-23 09:22 -0400
Re: the meaning of rユ.......�¾ Larry Hudson <orgnut@yahoo.com> - 2012-07-24 00:01 -0700
Re: the meaning of rユ.......ï¾ Henrik Faber <hfaber@invalid.net> - 2012-07-23 15:24 +0200
Re: the meaning of rユ.......ï¾ Chris Angelico <rosuav@gmail.com> - 2012-07-23 23:35 +1000
Re: the meaning of rユ.......ï¾ Henrik Faber <hfaber@invalid.net> - 2012-07-23 15:52 +0200
Re: the meaning of rユ.......ï¾ Henrik Faber <hfaber@invalid.net> - 2012-07-23 15:55 +0200
Re: the meaning of rユ.......ï¾ Henrik Faber <hfaber@invalid.net> - 2012-07-23 15:59 +0200
Re: the meaning of rユ.......ï¾ Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-07-23 16:08 +0200
Re: the meaning of rユ.......ï¾ Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-07-23 15:43 +0100
Re: the meaning of rユ.......ï¾ Henrik Faber <hfaber@invalid.net> - 2012-07-23 16:43 +0200
Re: the meaning of rユ.......ï¾ Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-07-23 16:29 +0100
Re: the meaning of rユ.......ï¾ John Gordon <gordon@panix.com> - 2012-07-23 15:56 +0000
Re: the meaning of rユ.......ï¾ Chris Angelico <rosuav@gmail.com> - 2012-07-24 02:09 +1000
Re: the meaning of rユ.......ï¾ Ben Finney <ben+python@benfinney.id.au> - 2012-07-25 00:35 +1000
Re: the meaning of rユ.......ï¾ Chris Angelico <rosuav@gmail.com> - 2012-07-25 00:50 +1000
Re: the meaning of rユ.......ï¾ Maarten <maarten.sneep@knmi.nl> - 2012-07-24 07:44 -0700
Re: the meaning of rユ.......ï¾ Ben Finney <ben+python@benfinney.id.au> - 2012-07-25 13:07 +1000
Re: the meaning of rユ.......ï¾ Chris Angelico <rosuav@gmail.com> - 2012-07-25 14:11 +1000
Re: the meaning of rユ.......ï¾ Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-25 06:07 +0000
Re: the meaning of rユ.......ï¾ Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-07-25 07:37 +0100
Re: the meaning of rユ.......ï¾ Ben Finney <ben+python@benfinney.id.au> - 2012-07-25 16:43 +1000
Re: the meaning of rユ.......ï¾ Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-07-25 08:13 +0100
Re: the meaning of rユ.......ï¾ Ian Kelly <ian.g.kelly@gmail.com> - 2012-07-25 09:21 -0600
Re: the meaning of rユ.......ï¾ Ian Kelly <ian.g.kelly@gmail.com> - 2012-07-24 10:36 -0600
Re: the meaning of rユ.......ï¾ Chris Angelico <rosuav@gmail.com> - 2012-07-25 02:42 +1000
Re: the meaning of rユ.......ï¾ Ethan Furman <ethan@stoneleaf.us> - 2012-07-24 09:59 -0700
Re: the meaning of rユ.......ï¾ Devin Jeanpierre <jeanpierreda@gmail.com> - 2012-07-24 10:02 -0400
Re: the meaning of rユ.......ï¾ Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-23 17:10 +0000
Re: the meaning of rユ.......ï¾ rusi <rustompmody@gmail.com> - 2012-07-23 08:08 -0700
Re: the meaning of rユ.......ï¾ Jan Riechers <janpeterr@freenet.de> - 2012-07-23 21:29 +0300
Re: the meaning of rユ.......ï¾ Devin Jeanpierre <jeanpierreda@gmail.com> - 2012-07-23 10:10 -0400
Re: the meaning of rユ.......ï¾ Henrik Faber <hfaber@invalid.net> - 2012-07-23 16:40 +0200
Re: the meaning of rユ.......ï¾ Chris Angelico <rosuav@gmail.com> - 2012-07-24 00:53 +1000
Re: the meaning of rユ.......ï¾ Devin Jeanpierre <jeanpierreda@gmail.com> - 2012-07-23 11:02 -0400
Re: the meaning of rユ.......ï¾ Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-23 16:54 +0000
Re: the meaning of rユ.......ï¾ Chris Angelico <rosuav@gmail.com> - 2012-07-24 00:19 +1000
Re: the meaning of rユ.......ï¾ Henrik Faber <hfaber@invalid.net> - 2012-07-23 16:42 +0200
Re: the meaning of rユ.......ï¾ Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-07-23 14:10 -0400
Re: the meaning of rユ.......ï¾ Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-23 16:53 +0000
Re: the meaning of r?.......ï¾ MRAB <python@mrabarnett.plus.com> - 2012-07-23 16:46 +0100
Re: the meaning of rユ.......ï¾ Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-23 16:33 +0000
Re: the meaning of rユ.......ï¾ Ross Ridge <rridge@csclub.uwaterloo.ca> - 2012-07-23 14:43 -0400
Re: the meaning of rユ.......ï¾ Henrik Faber <hfaber@invalid.net> - 2012-07-23 15:26 +0200
Re: the meaning of rïŸ.......ïŸ Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-23 15:49 +0000
Re: the meaning of rïŸ.......ïŸ Serhiy Storchaka <storchaka@gmail.com> - 2012-07-26 16:34 +0300
Re: the meaning of rユ.......ï¾ Ross Ridge <rridge@csclub.uwaterloo.ca> - 2012-07-23 12:19 -0400
Re: the meaning of r?.......?¾ Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-07-23 13:44 -0400
Re: the meaning of r’.......‘ John Roth <johnroth1@gmail.com> - 2012-07-23 17:50 -0700
Re: the meaning of r’.......‘ John Roth <johnroth1@gmail.com> - 2012-07-23 17:50 -0700
| From | Devin Jeanpierre <jeanpierreda@gmail.com> |
|---|---|
| Date | 2012-07-23 10:10 -0400 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <mailman.2471.1343052666.4697.python-list@python.org> |
| In reply to | #25869 |
On Mon, Jul 23, 2012 at 9:52 AM, Henrik Faber <hfaber@invalid.net> wrote:
> If you allow for UTF-8 identifiers you'll have to be horribly careful
> what to include and what to exclude. Is the non-breaking space a valid
> character for a identifier? Technically it's a different character than
> the normal space, so why shouldn't it be? What an awesome idea!
>
> What about × vs x? Or Ì vs Í vs Î vs Ï vs Ĩ vs Ī vs ī vs Ĭ vs ĭ vs Į vs
> į vs I vs İ? Do you think if you need to maintain such code you'll
> immediately know the difference between the 13 (!) different "I"s I just
> happened to pull out randomly you need to chose and how to get it? What
> about Ȝ vs ȝ? Or Ȣ vs ȣ? Or ȸ vs ȹ? Or d vs Ԁ vs ԁ vs ԃ vs Ԃ? Or ց vs g?
> Or ս vs u?
Yes, as soon as we add unicode to anything everyone will go insane and
write gibberish.
-- Devin
| From | Henrik Faber <hfaber@invalid.net> |
|---|---|
| Date | 2012-07-23 16:40 +0200 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <jujnnt$osp$1@speranza.aioe.org> |
| In reply to | #25874 |
On 23.07.2012 16:10, Devin Jeanpierre wrote:
> On Mon, Jul 23, 2012 at 9:52 AM, Henrik Faber <hfaber@invalid.net> wrote:
>> If you allow for UTF-8 identifiers you'll have to be horribly careful
>> what to include and what to exclude. Is the non-breaking space a valid
>> character for a identifier? Technically it's a different character than
>> the normal space, so why shouldn't it be? What an awesome idea!
>>
>> What about × vs x? Or Ì vs Í vs Î vs Ï vs Ĩ vs Ī vs ī vs Ĭ vs ĭ vs Į vs
>> į vs I vs İ? Do you think if you need to maintain such code you'll
>> immediately know the difference between the 13 (!) different "I"s I just
>> happened to pull out randomly you need to chose and how to get it? What
>> about Ȝ vs ȝ? Or Ȣ vs ȣ? Or ȸ vs ȹ? Or d vs Ԁ vs ԁ vs ԃ vs Ԃ? Or ց vs g?
>> Or ս vs u?
>
> Yes, as soon as we add unicode to anything everyone will go insane and
> write gibberish.
No, you misunderstood me. I didn't say people are going to write
gibberish. What I'm saying is that as a foreigner (who doesn't know most
of these characters), it can be hard to accurately choose which one is
the correct one. This is especially true if the appropriate keys are not
available on your keyboard. So it makes maintenance of other people's
code much more difficult if they didn't on their own choose to limit
themselves to ASCII.
Regards,
Henrik
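[Editor's sketch] One way a maintainer can tell look-alike characters apart is to ask the Unicode database for their names. The snippet below is only an illustration in Python 3; the helper name `describe` and the choice of sample characters are assumptions made for this example, not something from the thread.

```python
import unicodedata

# Four of the look-alike characters mentioned in the quoted post.
lookalikes = ["I", "\u0130", "d", "\u0501"]


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return "U+{:04X} {}".format(ord(ch), unicodedata.name(ch))


for ch in lookalikes:
    print(repr(ch), describe(ch))
```

This does not make the characters any easier to type, which is the other half of the complaint, but it does make them easy to identify once they are in a file.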
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2012-07-24 00:53 +1000 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <mailman.2479.1343055203.4697.python-list@python.org> |
| In reply to | #25878 |
On Tue, Jul 24, 2012 at 12:40 AM, Henrik Faber <hfaber@invalid.net> wrote:
> No, you misunderstood me. I didn't say people are going to write
> gibberish. What I'm saying is that as a foreigner (who doesn't know most
> of these characters), it can be hard to accurately choose which one is
> the correct one. This is especially true if the appropriate keys are not
> available on your keyboard. So it makes maintenance of other people's
> code much more difficult if they didn't on their own choose to limit
> themselves to ASCII.
This sounds like a job for a project's style guide. Have the language
support full Unicode (or maybe "everything except whitespace" or
whatever), and let each project head stipulate the project's own
restrictions.
ChrisA
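[Editor's sketch] A project-level restriction like this could be enforced by a small check rather than by the language. The following is a minimal sketch, assuming Python 3.7 or later (for `str.isascii`); the function name `non_ascii_identifiers` and the sample source are hypothetical and exist only for illustration.

```python
import io
import tokenize


def non_ascii_identifiers(source):
    """Return (line, name) pairs for every NAME token that is not pure ASCII."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and not tok.string.isascii():
            found.append((tok.start[0], tok.string))
    return found


# Hypothetical two-line module that uses a non-ASCII variable name.
sample = "gr\u00f6\u00dfe = 3\nsize = gr\u00f6\u00dfe + 1\n"
print(non_ascii_identifiers(sample))
```

A project that wants ASCII-only identifiers could run something like this in a pre-commit hook; a project that is happy with Arabic or German names simply would not.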
| From | Devin Jeanpierre <jeanpierreda@gmail.com> |
|---|---|
| Date | 2012-07-23 11:02 -0400 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <mailman.2480.1343055817.4697.python-list@python.org> |
| In reply to | #25878 |
On Mon, Jul 23, 2012 at 10:40 AM, Henrik Faber <hfaber@invalid.net> wrote:
> No, you misunderstood me. I didn't say people are going to write
> gibberish. What I'm saying is that as a foreigner (who doesn't know most
> of these characters), it can be hard to accurately choose which one is
> the correct one. This is especially true if the appropriate keys are not
> available on your keyboard. So it makes maintenance of other people's
> code much more difficult if they didn't on their own choose to limit
> themselves to ASCII.
I understand, and agree. But that's not always all that likely a
situation. If your company only employs native Arabic speakers, why not
use Arabic variable names and Arabic script? You'll get more out of
making the language easy to understand for the people that work there,
than out of hedging your bets on the off chance that some American will
stroll in and be confused.
(It is my understanding that, in any case, many non-English companies do
their coding in English. That doesn't mean it's a general rule that
should be forced on everyone.)
-- Devin
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-07-23 16:54 +0000 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <500d81d9$0$29978$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #25874 |
On Mon, 23 Jul 2012 10:10:22 -0400, Devin Jeanpierre wrote:
> Yes, as soon as we add unicode to anything everyone will go insane and
> write gibberish.
:)
--
Steven
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2012-07-24 00:19 +1000 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <mailman.2473.1343053204.4697.python-list@python.org> |
| In reply to | #25869 |
On Mon, Jul 23, 2012 at 11:52 PM, Henrik Faber <hfaber@invalid.net> wrote:
> What about × vs x? Or Ì vs Í vs Î vs Ï vs Ĩ vs Ī vs ī vs Ĭ vs ĭ vs Į vs
> į vs I vs İ? Do you think if you need to maintain such code you'll
> immediately know the difference between the 13 (!) different "I"s I just
> happened to pull out randomly you need to chose and how to get it? What
> about Ȝ vs ȝ? Or Ȣ vs ȣ? Or ȸ vs ȹ? Or d vs Ԁ vs ԁ vs ԃ vs Ԃ? Or ց vs g?
> Or ս vs u?
If they're different characters, they're different. It's not unlike the
confusion you can already get between uppercase I and lowercase l, or
between uppercase and lowercase of the same letter, or between rn and m,
or between any other of myriad confusingly-similar pairs that can be
found just in ASCII.
Of course, SOMEBODY is going to make use of those to improve upon this
sort of code:
http://thedailywtf.com/Articles/Uppity.aspx
ChrisA
| From | Henrik Faber <hfaber@invalid.net> |
|---|---|
| Date | 2012-07-23 16:42 +0200 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <jujnt8$osp$2@speranza.aioe.org> |
| In reply to | #25876 |
On 23.07.2012 16:19, Chris Angelico wrote:
> On Mon, Jul 23, 2012 at 11:52 PM, Henrik Faber <hfaber@invalid.net> wrote:
>> What about × vs x? Or Ì vs Í vs Î vs Ï vs Ĩ vs Ī vs ī vs Ĭ vs ĭ vs Į vs
>> į vs I vs İ? Do you think if you need to maintain such code you'll
>> immediately know the difference between the 13 (!) different "I"s I just
>> happened to pull out randomly you need to chose and how to get it? What
>> about Ȝ vs ȝ? Or Ȣ vs ȣ? Or ȸ vs ȹ? Or d vs Ԁ vs ԁ vs ԃ vs Ԃ? Or ց vs g?
>> Or ս vs u?
>
> If they're different characters, they're different. It's not unlike
> the confusion you can already get between uppercase I and lowercase l,
> or between uppercase and lowercase of the same letter, or between rn
> and m, or between any other of myriad confusingly-similar pairs that
> can be found just in ASCII.
But your reasoning is flawed: basically you're saying some things are
already confusing, so it's just fine to add more confusion. In my
opinion it is not. And that the computer can differentiate different
characters is also perfectly clear to me. The interpreter can also tell
the difference between a non-breaking space and a regular space. Yet the
non-breaking space is not valid as an identifier character. This is
because readability counts. People write and maintain code, not
machines. Confusion should be kept to the minimum if possible.
> Of course, SOMEBODY is going to make use of those to improve upon this
> sort of code:
>
> http://thedailywtf.com/Articles/Uppity.aspx
If that was written by my coworkers, I'd strangle them.
Regards,
Henrik
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2012-07-23 14:10 -0400 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <mailman.2489.1343067003.4697.python-list@python.org> |
| In reply to | #25880 |
On Mon, 23 Jul 2012 16:42:51 +0200, Henrik Faber <hfaber@invalid.net>
declaimed the following in gmane.comp.python.general:
>
> If that was written by my coworkers, I'd strangle them.
>
My first real assignment, 31 years ago, was porting an application
to CDC MP-60 FORTRAN (what I called "FORTRAN MINUS TWO"). This was a
minimal FORTRAN implementation in which one could not do things like:
ix = 20
call xyz(ix, ix+2, ix-2)
forcing us to produce such abominations as
ix = 20
jinx = ix + 2
minx = ix - 2
call xyz(ix, jinx, minx)
Now, when you take into consideration that this application,
purportedly a requirements traceability system, was only a few
functions* shy of being a (pre-SQL) relational database using a form of
relational algebra, and did lots of record<>disk-block^ packing, then
there were many calls to subprograms that needed things like block #,
record #, start&end bytes within record/blocks, with both source and
destination information needed.
* It needed a dynamic "project" operation; records contained a "format
ID" (formats were themselves stored in the "database") which identified
the layout of the data in the record. A dynamic "project" operation
would have to build a format from the specifications of the desired
fields. It also would have needed a simple join operation (also
dynamically creating a format for the result set).
^ The MP-60 did not make that easy either... The F-2 did not have direct
access I/O, so one had to use OS services to position to "block #" of a
file, and then perform the FORTRAN I/O operation. BUT: disk blocks were
512 bytes, but only around 488 bytes were available to the user -- the
OS stored something in the rest of the block. That meant we had to
change the program to work in 488 byte blocks, but to address the file
in 512 byte blocks.
My second major assignment was to maintain an application that had
been originally written (by a subcontractor) using a preprocessor to
convert "C" style block statements into FORTRAN-IV (the only valid use
I've ever seen for the ASSIGNed GOTO -- emulating the return from
"subroutine" calls). I got to maintain the /output/ of the preprocessor
-- where programs were 3000-30000 lines of code in a single file, all
variables were global...
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-07-23 16:53 +0000 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <500d819d$0$29978$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #25869 |
On Mon, 23 Jul 2012 15:52:32 +0200, Henrik Faber wrote:
> If you allow for UTF-8 identifiers you'll have to be horribly careful
> what to include and what to exclude. Is the non-breaking space a valid
> character for a identifier? Technically it's a different character than
> the normal space, so why shouldn't it be? What an awesome idea!
Because it's not a letter. Using Python 3:
py> nbs = '\N{NO-BREAK SPACE}'
py> import unicodedata
py> unicodedata.category(nbs)
'Zs'
py> unicodedata.category('a')
'Ll'
Not every character is valid in identifiers, not even in ASCII. Why would
Unicode be any different?
Before Python added unicode identifiers, many issues were discussed and
resolved. See the PEP that discusses it:
http://www.python.org/dev/peps/pep-3131/
> What about × vs x?
No, because × is not a letter.
> Or Ì vs Í vs Î vs Ï vs Ĩ vs Ī vs ī vs Ĭ vs ...
Yes, we get the point. Some letters look similar to other letters. Since
these are all different letters, they are treated differently in
identifiers, no differently from O vs 0 and I vs l vs 1.
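[Editor's sketch] The distinction drawn above can be checked directly in Python 3 with `str.isidentifier()`, which reports whether a string would be accepted as an identifier. The dictionary name `candidates` and the selection of characters are assumptions for this illustration only.

```python
import unicodedata

# A few of the characters discussed above, spelled by their Unicode names.
candidates = {
    "no-break space": "\N{NO-BREAK SPACE}",
    "multiplication sign": "\N{MULTIPLICATION SIGN}",
    "I with grave": "\N{LATIN CAPITAL LETTER I WITH GRAVE}",
    "I with acute": "\N{LATIN CAPITAL LETTER I WITH ACUTE}",
}

for label, ch in candidates.items():
    # Print the general category and whether the character alone is a valid name.
    print(label, unicodedata.category(ch), ch.isidentifier())
```

The space and the multiplication sign are rejected because they are not letters, while the two accented capitals are accepted as two different names.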
Dyslexics will rightly complain that s and z look too similar, and b and
d even more so. Perhaps they too should be banned from identifiers?
--
Steven
| From | MRAB <python@mrabarnett.plus.com> |
|---|---|
| Date | 2012-07-23 16:46 +0100 |
| Subject | Re: the meaning of r?.......ï¾ |
| Message-ID | <mailman.2484.1343058411.4697.python-list@python.org> |
| In reply to | #25863 |
On 23/07/2012 14:24, Henrik Faber wrote:
[snip]
> And if I think of PHP's latest fiasco that happened with unicode
> characters, it makes me shudder to think you'd want that stuff in
> Python. If I remember correctly, it was the Turkish locale that they
> stuggled with: Turkey apparently does not have a capital "I", so some
> weird PHP magic code broke with the Turkish locale in effect. Having to
> keep crap like that in mind is just plain horrible. I'm very happy with
> the way Python does it.
>
When Turkish changed to the Latin alphabet, the letter pair I/i was
split into 2 separate letters, each with an uppercase and a lowercase
form: I/ı (uppercase without dot and lowercase without dot) and İ/i
(uppercase with dot and lowercase with dot).
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-07-23 16:33 +0000 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <500d7ce8$0$29978$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #25863 |
On Mon, 23 Jul 2012 15:24:21 +0200, Henrik Faber wrote:
> I disagree. Firstly, Python could already support the different types of
> strings even with the ASCII character set. For example, the choice could
> have made to treat the apostophe string 'foo' differently from the
> double quote string "foo". Then, the backtick could have been used
> `foo`.
In Python 2, backticks are a synonym for the repr() function:
py> x = object()
py> `x`
'<object object at 0xb7ec7470>'
That's gone in Python 3.
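[Editor's sketch] For readers on Python 3, the equivalent of the backtick expression above is an explicit call to `repr()`. The variable names are just for illustration, and the address in the output will differ from run to run.

```python
x = object()

# Python 3 spelling of the Python 2 expression `x`
text = repr(x)
print(text)
```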
> Bash for example uses all three and all three have very different
> meanings. Python is different: explicit is better than implicit, and I'd
> rather have the "r" the signifies what weird magic is going on instead
> of having some weird language rules. It would not be different with some
> UTF-8 "rawstring" magic backticks.
I'm not entirely sure why you think that r"..." is more explicit than
some other quote marks, say `...`. Apart from the mnemonic "r for raw-
string", they're both equally mysterious language rules. (Just try
explaining why backslashes in raw-strings are normal characters, *unless*
the last character of the string is a backslash.)
py> print r"ab\"
File "<stdin>", line 1
print r"ab\"
^
SyntaxError: EOL while scanning single-quoted string
0_o
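[Editor's sketch] The quirk above can be reproduced in Python 3 without breaking the file that demonstrates it, by compiling the offending literal from a string. This is a small illustration only; the variable names are made up for the example. It shows that a backslash before a quote keeps the literal open (while both characters stay in the string), and one common workaround for a trailing backslash.

```python
# The six characters  r"ab\"  held in an ordinary string so this file still parses.
source = 'r"ab\\"'
try:
    compile(source, "<example>", "eval")
    outcome = "compiled"
except SyntaxError:
    outcome = "SyntaxError"
print(outcome)

# Inside a raw string, backslash-quote does not end the literal,
# and both the backslash and the quote are kept.
escaped_quote = r"a\"b"
print(len(escaped_quote))

# Workaround for a trailing backslash: concatenate a normal string literal.
trailing = r"ab" "\\"
print(trailing)
```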
Personally, I like Python's quoting rules, even with quirks like the
above. I don't like quoting rules that magically evaluate expressions or
interpolate variables.
> Secondly, there's a reason that >=, <= and friends are in use. Every
> keyboard has a > key and every keyboard has a = key. I don't know any
> that would have >=, <= or != as UTF-8. It is useful to use only a
> limited set of characters.
You don't need a keyboard with a bazillion keys. You just need a keyboard
with a couple of extra modifier keys, and some good mnemonics for what
characters go with what keys. Plus a mechanism for entering arbitrary
code points that otherwise aren't attached to a key.
Hey, if the Japanese and Chinese can manage it, English speakers can
surely find a way to enter π or ∞ without a keyboard the size of a
battleship.
> And if I think of PHP's latest fiasco that happened with unicode
> characters, it makes me shudder to think you'd want that stuff in
> Python.
Oh please. Just because the PHP "designers" couldn't design a hammer
doesn't mean that the idea of hammers is faulty.
(Cynical about PHP? Who, me?)
> If I remember correctly, it was the Turkish locale that they
> stuggled with: Turkey apparently does not have a capital "I",
Not exactly. It has two.
Unlike most other European languages, Turkish and a few other languages
include both a dotted and a dotless I. Other languages mix them up: dotted
lowercase i goes with dotless uppercase I, like in English. But in
Turkish, you have ı <=> I and i <=> İ.
http://en.wikipedia.org/wiki/Dotted_and_dotless_I
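[Editor's sketch] Python's own string methods are locale-independent, so they follow the default Unicode case mappings rather than the Turkish pairing. The snippet below assumes CPython 3.3 or later, where `str.lower()` and `str.upper()` apply the full Unicode case mappings; the dictionary name `results` is an assumption for illustration.

```python
dotless_small = "\N{LATIN SMALL LETTER DOTLESS I}"
dotted_capital = "\N{LATIN CAPITAL LETTER I WITH DOT ABOVE}"

results = {
    "I.lower()": "I".lower(),
    "i.upper()": "i".upper(),
    "dotless small .upper()": dotless_small.upper(),
    "dotted capital .lower()": dotted_capital.lower(),
}

for label, value in results.items():
    # Show code points so that look-alike results can be told apart.
    print(label, [hex(ord(c)) for c in value])
```

Note that lowercasing the dotted capital yields two code points (a plain i followed by a combining dot above), so a round trip through `lower()` and `upper()` does not give back the Turkish letters.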
And if this wasn't so serious, it would be hilarious:
http://gizmodo.com/382026/a-cellphones-missing-dot-kills-two-people-puts-three-more-in-jail
--
Steven
| From | Ross Ridge <rridge@csclub.uwaterloo.ca> |
|---|---|
| Date | 2012-07-23 14:43 -0400 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <juk60b$rhi$1@rumours.uwaterloo.ca> |
| In reply to | #25902 |
Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>Hey, if the Japanese and Chinese can manage it, English speakers can
>surely find a way to enter π or ∞ without a keyboard the size of a
>battleship.
Japanese and Chinese programmers don't use (and don't seem to want to
use) non-ASCII characters outside of strings and comments even when the
language (supposedly) allows it.
Ross Ridge
--
l/ // Ross Ridge -- The Great HTMU
[oo][oo] rridge@csclub.uwaterloo.ca
-()-/()/ http://www.csclub.uwaterloo.ca/~rridge/
db //
| From | Henrik Faber <hfaber@invalid.net> |
|---|---|
| Date | 2012-07-23 15:26 +0200 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <jujjf0$dfn$2@speranza.aioe.org> |
| In reply to | #25857 |
On 23.07.2012 14:55, Roy Smith wrote:
> Some day, we're going to have programming languages that take advantage
> of the full unicode character set.
Plus, if I may add this: It's *your* newsreader that broke the correctly
declared ISO-8859-7 encoded subject of the OP. What a bitter irony that
demonstrates nicely that even in the 2010s complete and ultimate Unicode
support is far from here.
Best regards,
Henrik
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-07-23 15:49 +0000 |
| Subject | Re: the meaning of rïŸ.......ïŸ |
| Message-ID | <500d7285$0$29978$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #25857 |
On Mon, 23 Jul 2012 08:55:22 -0400, Roy Smith wrote:
> Some day, we're going to have programming languages that take advantage
> of the full unicode character set.
I don't know about the full Unicode character set, since there are many
more than 10000 characters, and few languages require that many distinct
tokens.
But if you mean a richer character set than mere ASCII, then I agree,
provided that by "some day" you mean nearly half a century ago.
http://en.wikipedia.org/wiki/APL_%28programming_language%29
Sort an array by the length of the word:
X[⍋X+.≠' ';]
compared to Python:
words.sort(key=len)
I think that's a clear example of why APL has never quite taken the world
by storm. It makes Perl look like human-readable pseudo-code.
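[Editor's sketch] For completeness, here is the Python one-liner from above as a runnable snippet. The word list is made up for illustration. Because `list.sort` is stable, words of equal length keep their original relative order.

```python
words = ["banana", "fig", "cherry", "kiwi", "apple"]

# Sort in place by word length; ties keep their original order.
words.sort(key=len)
print(words)
```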
It isn't necessary to go to APL's extremes to get the power of a richer
character set. Back in the 1990s, Apple's Hypertalk language allowed
single character synonyms for certain operators, including:
≠ for <>
≤ for <=
≥ for >=
÷ for /
√ for sqrt
See OpenXION for a modern, non-GUI version:
http://www.openxion.org/
Unicode includes a very rich set of operator characters. Assuming the
character input problem were solved, it would be awesome to be able to
define a rich set of operators beyond the few Python already has. For
example, we could use proper ∩ and ∪ operators for set intersection and
union, or use chevrons «» as delimiters for types without clashing with
lists [], tuples(), sets and dicts {}.
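[Editor's sketch] For comparison, this is how set intersection and union are spelled in Python today, using the ASCII operators `&` and `|` or the equivalent named methods. The sample sets are assumptions chosen for illustration.

```python
evens = {0, 2, 4, 6, 8}
small = {0, 1, 2, 3}

both = evens & small      # what mathematics writes with the intersection sign
either = evens | small    # what mathematics writes with the union sign

print(sorted(both))
print(sorted(either))

# The named methods give the same results as the operators.
print(evens.intersection(small) == both, evens.union(small) == either)
```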
And wouldn't you rather see something like "string␊␍" instead of
"string\n\r"? I know I would.
Of course, the character input problem *is* a genuine problem. Between
that and the issue of character display (not all fonts are capable of
showing all characters) I wouldn't hold my breath for full Unicode syntax
any time soon.
--
Steven
| From | Serhiy Storchaka <storchaka@gmail.com> |
|---|---|
| Date | 2012-07-26 16:34 +0300 |
| Subject | Re: the meaning of rïŸ.......ïŸ |
| Message-ID | <mailman.2608.1343309673.4697.python-list@python.org> |
| In reply to | #25894 |
On 23.07.12 18:49, Steven D'Aprano wrote:
> ≤ for <=
> ≥ for >=
I insist on the use of ⩽ and ⩾ too.
| From | Ross Ridge <rridge@csclub.uwaterloo.ca> |
|---|---|
| Date | 2012-07-23 12:19 -0400 |
| Subject | Re: the meaning of rユ.......ï¾ |
| Message-ID | <jujtj5$8dh$1@rumours.uwaterloo.ca> |
| In reply to | #25857 |
Roy Smith <roy@panix.com> wrote:
>When I first started writing C code, it was on ASR-33s which did not
>support curly baces. We wrote \( for { and \) for } (although I think
>the translation was handled entirely in the TTY driver and the compiler
>was never in on the joke). 20 or 30 years from now, people are going to
>look back on us neanderthals and laugh about how we had to write r''.
No, it's not going to change in 20 or 30 years. The ASR-33 Teletype
was pretty much obsolete by the time C escaped Bell Labs. You were
programming in a 70's language using early 60's technology and suffered
accordingly. Today, the technology to support "Unicode" operators in
programming languages is both widespread and has existed for a long
time now. I'm sure you've heard of APL, which predates both Unicode and
C and is almost as old as the ASR-33. If anyone actually wanted another
programming language like this it would've come into existence 20 or 30
years ago, not 20 or 30 years from now.
Python actually chose to go the other direction and chose to use
keywords as operators instead of symbols in a number of instances.
Ross Ridge
--
l/ // Ross Ridge -- The Great HTMU
[oo][oo] rridge@csclub.uwaterloo.ca
-()-/()/ http://www.csclub.uwaterloo.ca/~rridge/
db //
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2012-07-23 13:44 -0400 |
| Subject | Re: the meaning of r?.......?¾ |
| Message-ID | <mailman.2488.1343065502.4697.python-list@python.org> |
| In reply to | #25857 |
On Mon, 23 Jul 2012 23:06:45 +1000, Chris Angelico <rosuav@gmail.com>
declaimed the following in gmane.comp.python.general:
> On Mon, Jul 23, 2012 at 10:55 PM, Roy Smith <roy@panix.com> wrote:
> > Some day, we're going to have programming languages that take advantage
> > of the full unicode character set. Right now, we're working in ASCII
> > and creating silly digrams/trigrams like r'' for raw strings (and triple-quotes for multi-line
> > strings). Not to mention <=, >=, ==, !=. And in languages other than
> > python, things like ->, => (arrows for structure membership), and so on.
>
> REXX predates Unicode, I think, or at least its widespread adoption,
> but it has a non-ASCII operator:
>
> http://www.rexswain.com/rexx.html#operators
>
REXX was created for IBM systems that used EBCDIC... And
IBM-specific terminals included that character.
http://en.wikipedia.org/wiki/EBCDIC#Codepage_layout
{EBCDIC was also used internally on SDS/XDS Sigma systems}
And APL really predates Unicode -- ever look at an APL terminal?
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/
| From | John Roth <johnroth1@gmail.com> |
|---|---|
| Date | 2012-07-23 17:50 -0700 |
| Message-ID | <mailman.2513.1343091007.4697.python-list@python.org> |
| In reply to | #25841 |
On Monday, July 23, 2012 1:59:42 AM UTC-6, Chris Angelico wrote:
> On Fri, Jul 20, 2012 at 5:56 PM, levi nie <levinie001@gmail.com> wrote:
> > the meaning of r’.......‘?
>
> It's a raw string.
>
> http://docs.python.org/py3k/tutorial/introduction.html#strings
>
> Chris Angelico
Since this seems to have wandered into the question of unicode support
for operators and identifiers in Python, I'll add a few general comments.
First, adding unicode operators has been discussed several times, and
the answer has always been what several people have stated: lack of
support from keyboards and fonts.
As far as I can tell, the only significant issue that unicode operators
would solve is set operators. Currently they overload the arithmetic
operators, so it's not possible to have an object with both arithmetic
and set behavior supported by operator syntax. I'm not sure how
important that is in the real world.
Identifiers can have any character that's marked as alphabetic in the
unicode database. The doc gives the exact criteria, but it's driven by
checking the actual database, or rather Python's copy of it.
PEP 8 requires sticking to the ASCII character set except for comments,
and then only when it's needed for properly spelling someone's name. Of
course, anyone can do anything they want, but Python doesn't have a
culture of deliberately trying to write illegible code.
And yes, Python did have a problem with the Turkish "i". It got fixed
without all the sturm and drang that seems to accompany PHP issues.
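[Editor's sketch] The operator sharing described above is easy to see side by side: the same symbols mean one thing for integers and another for sets, so a single class can give each symbol only one meaning. The values below are made up for illustration, and the variable names are hypothetical.

```python
# The same two operators on integers and on sets.
int_and = 6 & 3                 # bitwise AND on integers
set_and = {6, 3} & {3, 5}       # intersection on sets

int_sub = 6 - 3                 # arithmetic subtraction
set_sub = {6, 3} - {3}          # set difference

print(int_and, set_and)
print(int_sub, set_sub)
```

A type that wanted both the integer and the set behaviour under operator syntax would have to pick one meaning for `&` and `-` and expose the other through named methods.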