Groups > comp.lang.python > #70527 > unrolled thread
| Started by | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| First post | 2014-04-22 22:31 -0700 |
| Last post | 2014-04-23 16:41 +1000 |
| Articles | 20 on this page of 21 — 10 participants |
Unicode in Python Rustom Mody <rustompmody@gmail.com> - 2014-04-22 22:31 -0700
Re: Unicode in Python Chris Angelico <rosuav@gmail.com> - 2014-04-23 15:50 +1000
Re: Unicode in Python Rustom Mody <rustompmody@gmail.com> - 2014-04-22 23:57 -0700
Re: Unicode in Python Chris Angelico <rosuav@gmail.com> - 2014-04-23 17:06 +1000
Re: Unicode in Python Steven D'Aprano <steve@pearwood.info> - 2014-04-23 07:29 +0000
Re: Unicode in Python Steven D'Aprano <steve@pearwood.info> - 2014-04-23 07:53 +0000
Re: Unicode in Python Rustom Mody <rustompmody@gmail.com> - 2014-04-23 10:59 -0700
Re: Unicode in Python wxjmfauth@gmail.com - 2014-04-26 00:15 -0700
Re: Unicode in Python "Frank Millman" <frank@chagford.com> - 2014-04-26 09:45 +0200
Re: Unicode in Python Ben Finney <ben@benfinney.id.au> - 2014-04-26 17:50 +1000
Re: Unicode in Python Ian Kelly <ian.g.kelly@gmail.com> - 2014-04-26 09:38 -0400
Re: Unicode in Python wxjmfauth@gmail.com - 2014-04-27 07:29 -0700
Re: Unicode in Python wxjmfauth@gmail.com - 2014-04-28 01:57 -0700
Re: Unicode in Python random832@fastmail.us - 2014-05-01 13:21 -0400
Re: Unicode in Python wxjmfauth@gmail.com - 2014-05-07 23:04 -0700
Re: Unicode in Python Michael Torrie <torriem@gmail.com> - 2014-05-01 21:50 -0600
Re: Unicode in Python wxjmfauth@gmail.com - 2014-05-03 00:46 -0700
Re: Unicode in Python Rustom Mody <rustompmody@gmail.com> - 2014-04-27 10:39 -0700
Re: Unicode in Python Steven D'Aprano <steve@pearwood.info> - 2014-04-23 05:52 +0000
Re: Unicode in Python Devin Jeanpierre <jeanpierreda@gmail.com> - 2014-04-22 23:19 -0700
Re: Unicode in Python Ben Finney <ben@benfinney.id.au> - 2014-04-23 16:41 +1000
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2014-04-22 22:31 -0700 |
| Subject | Unicode in Python |
| Message-ID | <0f253434-5e7d-4eea-88e1-7997fec2bd2d@googlegroups.com> |
Chris Angelico wrote:
> it's impossible for most people to type (and programming with a palette
> of arbitrary syntactic tokens isn't my idea of fun)...
Where's the suggestion to use a "palette of arbitrary tokens"?
I just tried a Greek keyboard; i.e. do
$ setxkbmap -option "grp:switch,grp:alt_shift_toggle,grp_led:scroll" -layout "us,gr"
Thereafter typing
abcdefghijklmnopqrstuvwxyz
after a Shift-Alt
gives
αβψδεφγηιξκλμνοπ;ρστθωςχυζ
One more Shift-Alt and back to roman.
IOW the extra typing cost for Greek letters is negligible over the
corresponding roman ones.
Of course
- One would need to define such a keyboard (setxkb)
- One would have to find similar technologies for other OSes
(I'm on Debian; even Ubuntu/Unity grabs too many keys)
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-23 15:50 +1000 |
| Message-ID | <mailman.9452.1398232238.18130.python-list@python.org> |
| In reply to | #70527 |
On Wed, Apr 23, 2014 at 3:31 PM, Rustom Mody <rustompmody@gmail.com> wrote:
> Chris Angelico wrote:
>> it's impossible for most people to type (and programming with a palette
>> of arbitrary syntactic tokens isn't my idea of fun)...
>
> Where's the suggestion to use a "palette of arbitrary tokens" ?
>
> I just tried a greek keyboard; ie do
> $ setxkbmap -option "grp:switch,grp:alt_shift_toggle,grp_led:scroll" -layout "us,gr"
>
> Thereafter typing
> abcdefghijklmnopqrstuvwxyz
> after a Shift-Alt
> gives
> αβψδεφγηιξκλμνοπ;ρστθωςχυζ
>
> One more Shift-Alt and back to roman
Okay. Now what about your other symbols? Your alternative assignment
operator, for instance. How do you type that?
ChrisA
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2014-04-22 23:57 -0700 |
| Message-ID | <773afa7d-4b6d-4d67-8d40-ea90b335a1a2@googlegroups.com> |
| In reply to | #70529 |
On Wednesday, April 23, 2014 11:22:33 AM UTC+5:30, Steven D'Aprano wrote:
> 25 Unicode characters down, 1114000+ to go :-)
The question would arise if there was some suggestion to add 1114000(+)
characters to the syntactic/lexical definition of python.
IOW while it's true that unicode is a character-set, it's better to think of
it as a repertory -- here is the universal set from which a choice is
available.
On Wednesday, April 23, 2014 11:20:35 AM UTC+5:30, Chris Angelico wrote:
> On Wed, Apr 23, 2014 at 3:31 PM, Rustom Mody wrote:
> > Chris Angelico wrote:
> >> it's impossible for most people to type (and programming with a palette
> >> of arbitrary syntactic tokens isn't my idea of fun)...
> > Where's the suggestion to use a "palette of arbitrary tokens" ?
> > I just tried a greek keyboard; ie do
> > $ setxkbmap -option "grp:switch,grp:alt_shift_toggle,grp_led:scroll" -layout "us,gr"
> > Thereafter typing
> > abcdefghijklmnopqrstuvwxyz
> > after a Shift-Alt
> > gives
> > αβψδεφγηιξκλμνοπ;ρστθωςχυζ
> > One more Shift-Alt and back to roman
> Okay. Now what about your other symbols? Your alternative assignment
> operator, for instance. How do you type that?
In case you missed it, I said:
> Of course
> - One would need to define such a keyboard (setxkb)
> - One would have to find similar technologies for other OSes
In more detail:
In our normal use of a US-104 keyboard, every letter 'costs' something, e.g.
'a' costs 1 keystroke
'A' costs 2 (Shift+a)
Most people do not count that as a significant cost,
and when kids come on this list and talk smsese -- i wanna do so-n-so --
we chide them for saving keystrokes at the cost of readability.
In such a (default) setup typing a ∧ or ∨ is not possible at all without
something like a char-picker and at best has an ergonomic cost that is an
order of magnitude higher than the 'naturally available' characters.
On the other hand when/if a keyboard mapping is defined in which
the characters that are commonly needed are available, it is
reasonable to expect the ∨,∧ to cost no more than 2 strokes each
(i.e. about as much as an 'A'; slightly more than an 'a'). Which means
that '∨' is expected to cost about the same as 'or' and ∧ to cost less than an 'and'.
Readability is another question altogether.
Random example from my machine: calendar.py line 99.
If one finds this:
return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
more readable than
return year%4=0 ∧ (year%100≠0 ∨ year%400 = 0)
then perhaps the following is the most preferred?
COMPUTE YEAR MODULO 4 EQUALS 0 AND YEAR MODULO 100 NOT EQUAL TO ZERO OR YEAR MODULO 400 EQUAL to 0
IOW COBOL is desirable?
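For reference, the spelled-out form quoted from calendar.py can be checked against the standard library. This is only a minimal sketch: the helper name `is_leap` is made up for illustration, while `calendar.isleap` is the real stdlib function.

```python
import calendar


def is_leap(year):
    # The ASCII expression quoted in the post: divisible by 4, and either
    # not a century year or divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Compare the hand-written rule with the standard library over a range of years.
mismatches = [y for y in range(1, 3001) if is_leap(y) != calendar.isleap(y)]

for year in (1900, 2000, 2014, 2016):
    print(year, is_leap(year))
print("mismatches:", mismatches)
```

The symbolic spelling (∧, ∨, ≠) is shown in the post purely as notation; Python itself only accepts the `and` / `or` / `!=` form.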
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-23 17:06 +1000 |
| Message-ID | <mailman.9457.1398236806.18130.python-list@python.org> |
| In reply to | #70535 |
On Wed, Apr 23, 2014 at 4:57 PM, Rustom Mody <rustompmody@gmail.com> wrote:
> In such a (default) setup typing a ∧ or ∨ is not possible at all without
> something like a char-picker and at best has an ergonomic cost that is an
> order of magnitude higher than the 'naturally available' characters.
>
> On the other hand when/if a keyboard mapping is defined in which
> the characters that are commonly needed are available, it is
> reasonable to expect the ∨,∧ to cost no more than 2 strokes each
> (ie about as much as an 'A'; slightly more than an 'a'. Which means
> that '∨' is expected to cost about the same as 'or' and ∧ to cost less than an 'and'
So how much effort are you going to go to for, effectively, the same
end result? You can type "or" with the same keystrokes, and it takes
zero setup work and zero memorization (you may forget which keystroke
you set up for ∨, but I doubt you'll forget how to spell "or", even if
you think it means gold/yellow). Where's the benefit? I'm seriously
not seeing it.
ChrisA
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2014-04-23 07:29 +0000 |
| Message-ID | <53576bdd$0$11109$c3e8da3@news.astraweb.com> |
| In reply to | #70535 |
On Tue, 22 Apr 2014 23:57:46 -0700, Rustom Mody wrote:
> perhaps the following is the most preferred?
>
> COMPUTE YEAR MODULO 4 EQUALS 0 AND YEAR MODULO 100 NOT EQUAL TO ZERO OR
> YEAR MODULO 400 EQUAL to 0
>
> IOW COBOL is desirable?
If the only choices are COBOL on one hand and the mutant offspring of Perl
and APL on the other, I'd vote for COBOL.
But surely they aren't the only options, and it is possible to find a happy
medium which is neither excessively verbose nor painfully, cryptically terse.
Remember that we're talking about general purpose programming here. There
are domains which favour terseness and a vast number of symbols, e.g.
mathematics, but most programming is not in that domain, even when it uses
tools from that domain.
--
Steve
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2014-04-23 07:53 +0000 |
| Message-ID | <5357715c$0$11109$c3e8da3@news.astraweb.com> |
| In reply to | #70535 |
On Tue, 22 Apr 2014 23:57:46 -0700, Rustom Mody wrote:
> On the other hand when/if a keyboard mapping is defined in which the
> characters that are commonly needed are available, it is reasonable to
> expect the ∨,∧ to cost no more than 2 strokes each (ie about as much as
> an 'A'; slightly more than an 'a'. Which means that '∨' is expected to
> cost about the same as 'or' and ∧ to cost less than an 'and'
Oh, a further thought...
Consider your example:
return year%4=0 ∧ (year%100≠0 ∨ year%400 = 0)
vs
return year%4=0 and (year%100!=0 or year%400 = 0)
[aside: personally I like ≠ and if there was a platform independent way
to type it in any editor, I'd much prefer it over != or <> ]
Apart from the memorization problem, which I've already touched on, there
is the mode problem. Keyboard layouts are modes, and you're swapping
modes. Every time you swap modes, there is a small mental cost. Think of
it as an interrupt which has to be caught, pausing the current thought
and starting a new one. So rather than:
char char char char char char char ...
you have:
char char char INTERRUPT
char INTERRUPT
char char char ...
which is a heavier cost than it appears from just counting keystrokes. Of
course, the more experienced you become, the smaller that cost will be,
but it will never be quite as low as just a "regular" keystroke.
Normally, when people use multiple keyboards, it's because that interrupt
cost is amortized over a significant amount of typing:
INTERRUPT (English layout)
paragraph paragraph paragraph paragraph
INTERRUPT (Greek layout)
paragraph paragraph paragraph
INTERRUPT (English again)
paragraph ...
and possibly even lost in the noise of a far greater interrupt, namely
task-switching from one application to another. So it's manageable. But
switching layouts for a single character is likely to be far more
painful, especially for casual users of that layout.
Based on an extremely generous estimate that I use "lambda" four times in
100 lines of code, I might use λ perhaps once in a thousand non-Greek
characters. Similarly, I might use ∧ or ∨ maybe once per hundred
characters. That means I'm unlikely to ever get familiar enough with
those that the cost of two interrupts per use will be negligible.
--
Steven
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2014-04-23 10:59 -0700 |
| Message-ID | <aa55b40a-9032-401c-a24d-1b7518ebe1e1@googlegroups.com> |
| In reply to | #70538 |
On Wednesday, April 23, 2014 1:23:00 PM UTC+5:30, Steven D'Aprano wrote:
> On Tue, 22 Apr 2014 23:57:46 -0700, Rustom Mody wrote:
> > On the other hand when/if a keyboard mapping is defined in which the
> > characters that are commonly needed are available, it is reasonable to
> > expect the ∨,∧ to cost no more than 2 strokes each (ie about as much as
> > an 'A'; slightly more than an 'a'. Which means that '∨' is expected to
> > cost about the same as 'or' and ∧ to cost less than an 'and'
> Oh, a further thought...
> Consider your example:
> return year%4=0 ∧ (year%100≠0 ∨ year%400 = 0)
> vs
> return year%4=0 and (year%100!=0 or year%400 = 0)
> [aside: personally I like ≠ and if there was a platform independent way
> to type it in any editor, I'd much prefer it over != or <> ]
> Apart from the memorization problem, which I've already touched on, there
> is the mode problem. Keyboard layouts are modes, and you're swapping
> modes. Every time you swap modes, there is a small mental cost. Think of
> it as an interrupt which has to be caught, pausing the current thought
> and starting a new one. So rather than:
> char char char char char char char ...
> you have:
> char char char INTERRUPT
> char INTERRUPT
> char char char ...
> which is a heavier cost than it appears from just counting keystrokes. Of
> course, the more experienced you become, the smaller that cost will be,
> but it will never be quite as low as just a "regular" keystroke.
> Normally, when people use multiple keyboards, it's because that interrupt
> cost is amortized over a significant amount of typing:
> INTERRUPT (English layout)
> paragraph paragraph paragraph paragraph
> INTERRUPT (Greek layout)
> paragraph paragraph paragraph
> INTERRUPT (English again)
> paragraph ...
> and possibly even lost in the noise of a far greater interrupt, namely
> task-switching from one application to another. So it's manageable. But
> switching layouts for a single character is likely to be far more
> painful, especially for casual users of that layout.
> Based on an extremely generous estimate that I use "lambda" four times in
> 100 lines of code, I might use λ perhaps once in a thousand non-Greek
> characters. Similarly, I might use ∧ or ∨ maybe once per hundred
> characters. That means I'm unlikely to ever get familiar enough with
> those that the cost of two interrupts per use will be negligible.
It's gratifying to see an argument whose framing is cognitive-based!
More on that later. For now: mode/modeless.
Yes, most of us prefer the Shift key to the Caps Lock even for stretches of capitals.
So analogously here is a modeless solution.
Earlier I found this mode-switching version
$ setxkbmap -option "grp:switch,grp:alt_shift_toggle,grp_led:scroll" -layout "us,gr"
This makes Shift-Alt the mode-switcher.
This one on the other hand
$ setxkbmap -layout "us,gr" -option "grp:switch"
will make right-alt behave like 'Greek-Shift',
i.e. typing abcdefghijklmnopqrstuvwxyz with RAlt depressed throughout produces
αβψδεφγηιξκλμνοπ;ρστθωςχυζ
This makes a Greek letter's ergonomic cost identical to a capital English letter's:
for Greek use RAlt the way one uses Shift for English.
Notes:
1. Tried on Debian and Ubuntu -- recent Ubuntus are rather more ill-mannered
in the way they appropriate keys. Still, it works as far as I can see.
2. ';' ?? i.e. semicolon is produced from 'q'? What's that semicolon doing there??
But then Greek is -- well -- Greek to me! (As is xkb!)
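The key-to-character mapping reported above can be inspected from Python. This is a rough sketch that assumes the two strings in the post are an accurate transcript of what the `gr` layout produced; `layout` and `non_greek_keys` are names invented here for illustration.

```python
import unicodedata

# The two rows as typed in the post: Latin keys and what the "gr" group produced.
LATIN = "abcdefghijklmnopqrstuvwxyz"
GREEK = "αβψδεφγηιξκλμνοπ;ρστθωςχυζ"

# Pair each key with the character it produced.
layout = dict(zip(LATIN, GREEK))

# Print the official Unicode name of every produced character.
for key, char in layout.items():
    print(key, char, unicodedata.name(char))

# Keys whose output is not a Greek letter at all.
non_greek_keys = [
    key for key, char in layout.items()
    if not unicodedata.name(char).startswith("GREEK")
]
print("keys that do not yield a Greek letter:", non_greek_keys)
```

On this transcript the listing flags the 'q' key, which yields an ordinary semicolon, and shows that 'w' yields the word-final form of sigma rather than a separate letter.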
| From | wxjmfauth@gmail.com |
|---|---|
| Date | 2014-04-26 00:15 -0700 |
| Message-ID | <03bb12d8-93be-4ef6-94ae-4a02789aea2d@googlegroups.com> |
| In reply to | #70546 |
==========
I wrote once 90 % of Python 2 apps (a generic term) supposed to
process text, strings are not working.
In Python 3, that's 100 %. It is somehow only by chance, apps may
give the illusion they are properly working.
jmf
| From | "Frank Millman" <frank@chagford.com> |
|---|---|
| Date | 2014-04-26 09:45 +0200 |
| Message-ID | <mailman.9515.1398498323.18130.python-list@python.org> |
| In reply to | #70627 |
<wxjmfauth@gmail.com> wrote in message
news:03bb12d8-93be-4ef6-94ae-4a02789aea2d@googlegroups.com...
> ==========
>
> I wrote once 90 % of Python 2 apps (a generic term) supposed to
> process text, strings are not working.
>
> In Python 3, that's 100 %. It is somehow only by chance, apps may
> give the illusion they are properly working.
>
It is quite frustrating when you make these statements without explaining
what you mean by 'not working'.
It would be really useful if you could spell out -
1. what you did
2. what you expected to happen
3. what actually happened
Frank Millman
| From | Ben Finney <ben@benfinney.id.au> |
|---|---|
| Date | 2014-04-26 17:50 +1000 |
| Message-ID | <mailman.9516.1398498617.18130.python-list@python.org> |
| In reply to | #70627 |
"Frank Millman" <frank@chagford.com> writes:
> <wxjmfauth@gmail.com> wrote […]
> It is quite frustrating when you make these statements without
> explaining what you mean by 'not working'.
Please do not engage “wxjmfauth” on this topic; he is an
amply-demonstrated troll with nothing tangible to back up his incessant
complaints about Unicode in Python. He is best ignored, IMO.
--
 \      “As the evening sky faded from a salmon color to a sort of |
  `\   flint gray, I thought back to the salmon I caught that morning, |
_o__)    and how gray he was, and how I named him Flint.” -- Jack Handey |
Ben Finney
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2014-04-26 09:38 -0400 |
| Message-ID | <mailman.9518.1398519519.18130.python-list@python.org> |
| In reply to | #70627 |
On Apr 26, 2014 3:46 AM, "Frank Millman" <frank@chagford.com> wrote:
> <wxjmfauth@gmail.com> wrote in message
> news:03bb12d8-93be-4ef6-94ae-4a02789aea2d@googlegroups.com...
> > ==========
> >
> > I wrote once 90 % of Python 2 apps (a generic term) supposed to
> > process text, strings are not working.
> >
> > In Python 3, that's 100 %. It is somehow only by chance, apps may
> > give the illusion they are properly working.
> >
>
> It is quite frustrating when you make these statements without explaining
> what you mean by 'not working'.
As far as anybody has been able to determine, what jmf means by "not
working" is that strings containing the € character are handled less
efficiently than strings that do not contain it in certain contrived test
cases.
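The space side of that trade-off is easy to observe. A minimal sketch, assuming CPython 3.3 or later, where the flexible string representation (PEP 393) picks a per-string character width; the variable names are illustrative, and exact byte counts differ between versions and platforms, so none are quoted here.

```python
import sys

# Two strings of identical length; only the last character differs.
ascii_text = "a" * 1000
euro_text = "a" * 999 + "€"  # U+20AC does not fit in one byte per character

ascii_size = sys.getsizeof(ascii_text)
euro_size = sys.getsizeof(euro_text)

print("length:", len(ascii_text), "bytes:", ascii_size)
print("length:", len(euro_text), "bytes:", euro_size)
print("extra bytes for the wider representation:", euro_size - ascii_size)
```

Both strings behave identically as text: same length, same indexing and slicing results. Only the internal storage width differs, which is an implementation detail of CPython rather than something the Unicode standard speaks to.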
| From | wxjmfauth@gmail.com |
|---|---|
| Date | 2014-04-27 07:29 -0700 |
| Message-ID | <d9b3a423-259a-471f-a763-49a657416ae3@googlegroups.com> |
| In reply to | #70632 |
On Saturday, 26 April 2014 15:38:29 UTC+2, Ian wrote:
> On Apr 26, 2014 3:46 AM, "Frank Millman" <fr...@chagford.com> wrote:
> > <wxjm...@gmail.com> wrote in message
> > news:03bb12d8-93be-4ef6-94ae-4a02789aea2d@googlegroups.com...
> > > ==========
> > >
> > > I wrote once 90 % of Python 2 apps (a generic term) supposed to
> > > process text, strings are not working.
> > >
> > > In Python 3, that's 100 %. It is somehow only by chance, apps may
> > > give the illusion they are properly working.
> >
> > It is quite frustrating when you make these statements without explaining
> > what you mean by 'not working'.
> As far as anybody has been able to determine, what jmf means by "not working"
> is that strings containing the EURO character are handled less efficiently
> than strings that do not contain it in certain contrived test cases.
-----
'EURO SIGN' ?
No, it's just a character!
| From | wxjmfauth@gmail.com |
|---|---|
| Date | 2014-04-28 01:57 -0700 |
| Message-ID | <bcd76ee0-4703-45ed-95c3-ad0cac35a889@googlegroups.com> |
| In reply to | #70632 |
On Saturday, 26 April 2014 15:38:29 UTC+2, Ian wrote:
> On Apr 26, 2014 3:46 AM, "Frank Millman" <fr...@chagford.com> wrote:
> > <wxjm...@gmail.com> wrote in message
> > news:03bb12d8-93be-4ef6-94ae-4a02789aea2d@googlegroups.com...
> > > ==========
> > >
> > > I wrote once 90 % of Python 2 apps (a generic term) supposed to
> > > process text, strings are not working.
> > >
> > > In Python 3, that's 100 %. It is somehow only by chance, apps may
> > > give the illusion they are properly working.
> >
> > It is quite frustrating when you make these statements without explaining
> > what you mean by 'not working'.
> As far as anybody has been able to determine, what jmf means by "not working"
> is that strings containing the EURO character are handled less efficiently
> than strings that do not contain it in certain contrived test cases.
----
Python 2.7 + cp1252:
- Solid and coherent system (nothing to do with the Euro).
Python 3:
- It missed the unicode shift.
- Covering the whole unicode range will not make
Python a unicode compliant product.
- Flexible String Representation (a problem per se),
a mathematical absurdity which does the opposite of
the coding schemes endorsed by Unicode.org (sheet of
paper and pencil!)
- Very deeply buggy (quadrature of the circle problem).
Positive side:
- A very nice tool to teach the coding of characters
and unicode.
jmf
| From | random832@fastmail.us |
|---|---|
| Date | 2014-05-01 13:21 -0400 |
| Message-ID | <mailman.9631.1398964881.18130.python-list@python.org> |
| In reply to | #70671 |
On Mon, Apr 28, 2014, at 4:57, wxjmfauth@gmail.com wrote:
> Python 3:
> - It missed the unicode shift.
> - Covering the whole unicode range will not make
> Python a unicode compliant product.
Please cite exactly what portion of the unicode standard requires
operations with all characters to be handled in the same amount of time
and space, and forbids optimizations that make some characters handled
faster or in less space than others.
| From | wxjmfauth@gmail.com |
|---|---|
| Date | 2014-05-07 23:04 -0700 |
| Message-ID | <90b5fb36-e99d-4dcb-8df0-77044ff53be9@googlegroups.com> |
| In reply to | #70817 |
On Thursday, 1 May 2014 19:21:14 UTC+2, rand...@fastmail.us wrote:
> On Mon, Apr 28, 2014, at 4:57, wxjmfauth@gmail.com wrote:
> > Python 3:
> > - It missed the unicode shift.
> > - Covering the whole unicode range will not make
> > Python a unicode compliant product.
> Please cite exactly what portion of the unicode standard requires
> operations with all characters to be handled in the same amount of time
> and space, and forbids optimizations that make some characters handled
> faster or in less space than others.
==========
I missed your comment.
Regression is only a side effect. I can make Python
fail (lead Python to failures) with any
piece of text or valid sequence of characters I wish [*].
I'm no longer writing code (apps), only maintaining my
interactive interpreters.
[*] I do not count as failures issues like cp65001, only
"basic" text/string manipulations.
jmf
| From | Michael Torrie <torriem@gmail.com> |
|---|---|
| Date | 2014-05-01 21:50 -0600 |
| Message-ID | <mailman.9645.1399003680.18130.python-list@python.org> |
| In reply to | #70671 |
Can't help but feed the troll... forgive me.
On 04/28/2014 02:57 AM, wxjmfauth@gmail.com wrote:
> Python 2.7 + cp1252:
> - Solid and coherent system (nothing to do with the Euro).
Except that cp1252 is not unicode. Perhaps some subset of unicode can
be encoded into bytes using cp1252. But if it works for you keep using
it, and stop spreading nonsense about FSR.
> Python 3:
> - Flexible String Representation (a problem per se),
> a mathematical absurdity which does the opposite of
> the coding schemes endorsed by Unicode.org (sheet of
> paper and pencil!)
> - Very deeply buggy (quadrature of the circle problem).
Maybe it's the language barrier, but whatever it is you are talking
about, I certainly can't make out.
You've been ranting about FSR for years without being able to clearly
say what's wrong with it. Please quote unicode specifications that you
feel Python does not implement. What unicode characters cannot be
represented? Does Python choke on certain unicode strings or expose
entities it should not (like Javascript does)?
Why would you think that the unicode consortium's list of byte encodings
are the only possible valid ways of encoding unicode to a byte stream?
If you're going to continue to write this sort of stuff, please have the
decency to answer these questions at least.
> Positive side:
> - A very nice tool to teach the coding of characters
> and unicode.
Indeed.
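The "some subset" remark can be made concrete. A short sketch using only the standard codecs; the helper name `fits_cp1252` is hypothetical and not from any library.

```python
def fits_cp1252(text):
    """Return True if every character of text has a cp1252 byte."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


# The euro sign is one of the characters cp1252 does cover.
print("€".encode("cp1252"))

# A Greek letter is outside cp1252, but any Unicode encoding handles it.
print(fits_cp1252("α"))
print("α".encode("utf-8"))

# Within Python 3 the str itself is encoding-independent text.
print(len("€α"))
```

So cp1252 is a byte encoding for a small part of the Unicode repertoire, not an alternative to it: text that stays inside that part round-trips, and anything else raises `UnicodeEncodeError`.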
| From | wxjmfauth@gmail.com |
|---|---|
| Date | 2014-05-03 00:46 -0700 |
| Message-ID | <aa6da561-0284-4119-9217-57a8ef09ab96@googlegroups.com> |
| In reply to | #70844 |
On Friday, 2 May 2014 05:50:40 UTC+2, Michael Torrie wrote:
> Can't help but feed the troll... forgive me.
>
> On 04/28/2014 02:57 AM, wxjmfauth@gmail.com wrote:
> > Python 2.7 + cp1252:
> > - Solid and coherent system (nothing to do with the Euro).
>
> Except that cp1252 is not unicode. Perhaps some subset of unicode can
> be encoded into bytes using cp1252. But if it works for you keep using
> it, and stop spreading nonsense about FSR.
>
> > Python 3:
> > - Flexible String Representation (a problem per se),
> > a mathematical absurdity which does the opposite of
> > the coding schemes endorsed by Unicode.org (sheet of
> > paper and pencil!)
> > - Very deeply buggy (quadrature of the circle problem).
>
> Maybe it's the language barrier, but whatever it is you are talking
> about, I certainly can't make out.
>
> You've been ranting about FSR for years without being able to clearly
> say what's wrong with it. Please quote unicode specifications that you
> feel Python does not implement. What unicode characters cannot be
> represented? Does Python choke on certain unicode strings or expose
> entities it should not (like Javascript does)?
>
> Why would you think that the unicode consortium's list of byte encodings
> are the only possible valid ways of encoding unicode to a byte stream?
>
> If you're going to continue to write this sort of stuff, please have the
> decency to answer these questions at least.
>
> > Positive side:
> > - A very nice tool to teach the coding of characters
> > and unicode.
>
> Indeed.
========
-
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2014-04-27 10:39 -0700 |
| Message-ID | <ae5ba198-cf01-41a1-981f-307de7a460b7@googlegroups.com> |
| In reply to | #70546 |
On Wednesday, April 23, 2014 11:29:13 PM UTC+5:30, Rustom Mody wrote:
> On Wednesday, April 23, 2014 1:23:00 PM UTC+5:30, Steven D'Aprano wrote:
> > On Tue, 22 Apr 2014 23:57:46 -0700, Rustom Mody wrote:
> > > On the other hand when/if a keyboard mapping is defined in which the
> > > characters that are commonly needed are available, it is reasonable to
> > > expect the ∨,∧ to cost no more than 2 strokes each (ie about as much as
> > > an 'A'; slightly more than an 'a'. Which means that '∨' is expected to
> > > cost about the same as 'or' and ∧ to cost less than an 'and'
> > Oh, a further thought...
> > Consider your example:
> > return year%4=0 ∧ (year%100≠0 ∨ year%400 = 0)
> > vs
> > return year%4=0 and (year%100!=0 or year%400 = 0)
> > [aside: personally I like ≠ and if there was a platform independent way
> > to type it in any editor, I'd much prefer it over != or <> ]
I checked haskell and find the unicode support is better.
For variables (ie identifiers) python and haskell are much the same:
Python3:
>>> α = 1
>>> α
1
Haskell:
Prelude> let α = 1
Prelude> α
1
However in haskell one can also do this unlike python:
*Main> 2 ≠ 3
True
All that's needed to make this work is this set of new-in-terms-of-old definitions:
[The lines commented out with -- are the things that don't work as one might wish]
--------------
import qualified Data.Set as Set
-- Experimenting with Unicode in Haskell source
-- Numbers
x ≠ y = x /= y
x ≤ y = x <= y
x ≥ y = x >= y
x ÷ y = divMod x y
x ⇑ y = x ^ y
x × y = x * y -- readability hmmm !!!
π = pi
-- ⌊ x = floor x
-- ⌈ x = ceiling x
-- Lists
xs ⤚ ys = xs ++ ys
n ↑ xs = take n xs
n ↓ xs = drop n xs
-- Bools
x ∧ y = x && y
x ∨ y = x || y
-- ¬x = not x
-- Sets
x ∈ s = x `Set.member` s
s ∪ t = s `Set.union` t
s ∩ t = s `Set.intersection` t
s ⊆ t = s `Set.isSubsetOf` t
s ⊂ t = s `Set.isProperSubsetOf` t
s ⊈ t = not (s `Set.isSubsetOf` t)
-- ∅ = Set.empty
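The Python side of that comparison can be checked directly. A small sketch, with `parses` as a made-up helper around the built-in `compile`: it only asks whether the text is syntactically valid, and never evaluates it.

```python
def parses(source):
    """Return True if source is a syntactically valid Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Letters from other scripts are accepted in identifiers...
print("α".isidentifier())
print(parses("α + 1"))

# ...but the operator set is fixed by the grammar, so a mathematical
# symbol cannot stand in for != and cannot be defined by user code.
print("≠".isidentifier())
print(parses("2 != 3"))
print(parses("2 ≠ 3"))
```

This is the asymmetry the post describes: Python 3 opens up identifiers to Unicode letters, while Haskell additionally lets a program define new operators from symbol characters.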
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2014-04-23 05:52 +0000 |
| Message-ID | <53575521$0$11109$c3e8da3@news.astraweb.com> |
| In reply to | #70527 |
On Tue, 22 Apr 2014 22:31:41 -0700, Rustom Mody wrote:
> Chris Angelico wrote:
>> it's impossible for most people to type (and programming with a palette
>> of arbitrary syntactic tokens isn't my idea of fun)...
>
> Where's the suggestion to use a "palette of arbitrary tokens" ?
>
> I just tried a greek keyboard; ie do
> $ setxkbmap -option "grp:switch,grp:alt_shift_toggle,grp_led:scroll"
> -layout "us,gr"
>
> Thereafter typing
> abcdefghijklmnopqrstuvwxyz
> after a Shift-Alt
> gives
> αβψδεφγηιξκλμνοπ;ρστθωςχυζ
>
> One more Shift-Alt and back to roman
>
> IOW the extra typing cost for greek letters is negligible over the
> corresponding roman ones
25 Unicode characters down, 1114000+ to go :-)
There's not just the keyboard mapping. There's the mental cost of knowing
which keyboard mapping you need ("is it Greek, Hebrew, or maths
symbols?"), the cost of remembering the mapping from the keys you see on
the keyboard to the keys they are mapped to ("is Ω mapped to O or W?")
and so forth. If you know lambda-calculus, you might associate λ with
functions, but if you don't, it's as obfuscated as associating Ч with
raising exceptions.
if not isinstance(obj, int):
ЧTypeError("expected an int, got %r" % type(obj))
--
Steven
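Both points in this post can be poked at from the interpreter. A minimal sketch, assuming Python 3.3 or later (where `sys.maxunicode` is the same on every build); note that it counts code points, not assigned characters, and that `ЧTypeError` is just the undefined name from the example above.

```python
import sys
import unicodedata

# Size of the Unicode code space: code points 0 through sys.maxunicode.
total_code_points = sys.maxunicode + 1
print(total_code_points)

# Ч is an ordinary letter as far as the tokenizer is concerned, so
# "ЧTypeError" is read as one identifier, not as an operator plus a name.
print(unicodedata.name("Ч"))
print("ЧTypeError".isidentifier())

# Calling it therefore fails at run time with NameError, not SyntaxError.
try:
    eval('ЧTypeError("expected an int")')
except NameError as exc:
    print("NameError:", exc)
```

In other words the hypothetical `Ч` prefix is not something Python could parse as "raise" today; it only reads as a call to an unknown function.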
| From | Devin Jeanpierre <jeanpierreda@gmail.com> |
|---|---|
| Date | 2014-04-22 23:19 -0700 |
| Message-ID | <mailman.9455.1398234027.18130.python-list@python.org> |
| In reply to | #70530 |
On Tue, Apr 22, 2014 at 10:52 PM, Steven D'Aprano <steve@pearwood.info> wrote:
> There's not just the keyboard mapping. There's the mental cost of knowing
> which keyboard mapping you need ("is it Greek, Hebrew, or maths
> symbols?"), the cost of remembering the mapping from the keys you see on
> the keyboard to the keys they are mapped to ("is Ω mapped to O or W?")
> and so forth. If you know lambda-calculus, you might associate λ with
> functions, [...]
Or if you know Python and the name of the letter ("lambda").
But yes, typing out the special characters is annoying. I just use
words. The only downside to using words is, how do you specify capital
versus lowercase letters? "Gamma = ..." violates the style guide! :(
-- Devin