A few questiosn about encoding

Started by	Νικόλαος Κούρας <nikos.gr33k@gmail.com>
First post	2013-06-09 03:44 -0700
Last post	2013-06-14 10:28 +0300
Articles	20 on this page of 110 — 36 participants

  A few questiosn about encoding Νικόλαος Κούρας <nikos.gr33k@gmail.com> - 2013-06-09 03:44 -0700
    Re: A few questiosn about encoding Fábio Santos <fabiosantosart@gmail.com> - 2013-06-09 13:18 +0100
    Re: A few questiosn about encoding Nobody <nobody@nowhere.com> - 2013-06-09 18:01 +0100
    Re: A few questiosn about encoding Chris “Kwpolska” Warrick <kwpolska@gmail.com> - 2013-06-09 19:12 +0200
      Re: A few questiosn about encoding Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 09:09 +0000
        Re: A few questiosn about encoding Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-12 09:24 +0000
          Re: A few questiosn about encoding Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 14:23 +0300
            Re: A few questiosn about encoding Ulrich Eckhardt <ulrich.eckhardt@dominolaser.com> - 2013-06-12 14:52 +0200
            Re: A few questiosn about encoding Nobody <nobody@nowhere.com> - 2013-06-12 21:30 +0100
              Re: A few questiosn about encoding Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-13 01:40 +0000
                Re: A few questiosn about encoding Chris Angelico <rosuav@gmail.com> - 2013-06-13 12:01 +1000
                  Re: A few questiosn about encoding Nobody <nobody@nowhere.com> - 2013-06-13 11:02 +0100
              Re: A few questiosn about encoding Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 09:21 +0300
                Re: A few questiosn about encoding jmfauth <wxjmfauth@gmail.com> - 2013-06-12 23:28 -0700
                Re: A few questiosn about encoding Chris Angelico <rosuav@gmail.com> - 2013-06-13 16:48 +1000
            Re: A few questiosn about encoding Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-13 00:13 +0000
              Re: A few questiosn about encoding Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 09:09 +0300
                Re: A few questiosn about encoding Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-13 07:11 +0000
                  Re: A few questiosn about encoding Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 10:42 +0300
                    Re: A few questiosn about encoding Chris Angelico <rosuav@gmail.com> - 2013-06-13 17:58 +1000
                      Re: A few questiosn about encoding Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 11:08 +0300
                        Re: A few questiosn about encoding Chris Angelico <rosuav@gmail.com> - 2013-06-13 18:20 +1000
                          Re: A few questiosn about encoding Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 12:41 +0300
                            Re: A few questiosn about encoding Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-13 11:49 +0000
                              Re: A few questiosn about encoding Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 17:19 +0300
                                Re: A few questiosn about encoding Cameron Simpson <cs@zip.com.au> - 2013-06-14 11:00 +1000
                                  Re: A few questiosn about encoding Nick the Gr33k <support@superhost.gr> - 2013-06-14 09:59 +0300
                                    Re: A few questiosn about encoding Cameron Simpson <cs@zip.com.au> - 2013-06-14 20:14 +1000
                                      Re: A few questiosn about encoding Nick the Gr33k <support@superhost.gr> - 2013-06-14 16:58 +0300
                                        Re: A few questiosn about encoding Joel Goldstick <joel.goldstick@gmail.com> - 2013-06-14 11:21 -0400
                                          Re: A few questiosn about encoding Nick the Gr33k <support@superhost.gr> - 2013-06-14 18:26 +0300
                                            Re: A few questiosn about encoding Chris Angelico <rosuav@gmail.com> - 2013-06-15 03:03 +1000
                                              Re: A few questiosn about encoding Walter Hurry <walterhurry@lavabit.com> - 2013-06-14 23:32 +0000
                                        Re: A few questiosn about encoding Cameron Simpson <cs@zip.com.au> - 2013-06-15 10:26 +1000
                                        Re: A few questiosn about encoding Denis McMahon <denismfmcmahon@gmail.com> - 2013-06-15 06:34 +0000
                                          Re: A few questiosn about encoding Grant Edwards <invalid@invalid.invalid> - 2013-06-15 14:44 +0000
                                            Re: A few questiosn about encoding Nick the Gr33k <support@superhost.gr> - 2013-06-15 17:49 +0300
                                              Re: A few questiosn about encoding Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-15 15:30 +0000
                                            Re: A few questiosn about encoding Roy Smith <roy@panix.com> - 2013-06-15 10:59 -0400
                                              Re: A few questiosn about encoding Nick the Gr33k <support@superhost.gr> - 2013-06-15 18:14 +0300
                                                Re: A few questiosn about encoding Joel Goldstick <joel.goldstick@gmail.com> - 2013-06-15 11:35 -0400
                                        Re: A few questiosn about encoding Nick the Gr33k <support@superhost.gr> - 2013-06-15 22:26 +0300
                                          Re: A few questiosn about encoding Benjamin Schollnick <benjamin@schollnick.net> - 2013-06-15 16:35 -0400
                                          Re: A few questiosn about encoding Chris “Kwpolska” Warrick <kwpolska@gmail.com> - 2013-06-16 15:45 +0200
                        Re: A few questiosn about encoding Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-06-14 09:36 +0200
                          Re: A few questiosn about encoding Nick the Gr33k <support@superhost.gr> - 2013-06-14 10:49 +0300
                            Re: A few questiosn about encoding Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-06-14 10:22 +0200
                              Re: A few questiosn about encoding Nick the Gr33k <support@superhost.gr> - 2013-06-14 11:37 +0300
                                Don't feed the troll... (was: Re: A few questiosn about encoding) Heiko Wundram <modelnine@modelnine.org> - 2013-06-14 11:06 +0200
                                  Re: Don't feed the troll... Nick the Gr33k <support@superhost.gr> - 2013-06-14 12:32 +0300
                                    Re: Don't feed the troll... Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-06-14 13:09 +0200
                                      Re: Don't feed the troll... Nick the Gr33k <support@superhost.gr> - 2013-06-14 15:36 +0300
                                        Re: Don't feed the troll... Joel Goldstick <joel.goldstick@gmail.com> - 2013-06-14 08:44 -0400
                                        Re: Don't feed the troll... Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-06-14 15:25 +0200
                                          Re: Don't feed the troll... Neil Cerutti <neilc@norwich.edu> - 2013-06-14 15:54 +0000
                                    Re: Don't feed the troll... Heiko Wundram <modelnine@modelnine.org> - 2013-06-14 12:15 +0200
                                    Re: Don't feed the troll... Guy Scree <nobody@nowhere.com> - 2013-06-14 18:50 -0400
                                    Re: Don't feed the troll... Denis McMahon <denismfmcmahon@gmail.com> - 2013-06-15 06:31 +0000
                                      Re: Don't feed the troll... Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-06-15 13:04 -0400
                                    Re: Don't feed the troll... Guy Scree <nobody@nowhere.com> - 2013-06-17 16:15 -0400
                                      Re: Don't feed the troll... Chris Angelico <rosuav@gmail.com> - 2013-06-18 07:46 +1000
                                Re: A few questiosn about encoding Cameron Simpson <cs@zip.com.au> - 2013-06-14 20:19 +1000
                                  Re: A few questiosn about encoding Nick the Gr33k <support@superhost.gr> - 2013-06-14 15:41 +0300
                                Re: Don't feed the troll... (was: Re: A few questiosn about encoding) Fábio Santos <fabiosantosart@gmail.com> - 2013-06-14 11:20 +0100
                                  Re: Don't feed the troll... (was: Re: A few questiosn about encoding) rusi <rustompmody@gmail.com> - 2013-06-14 04:51 -0700
                                    Re: Don't feed the help-vampire rusi <rustompmody@gmail.com> - 2013-06-14 05:09 -0700
                                      Re: Don't feed the help-vampire Heiko Wundram <modelnine@modelnine.org> - 2013-06-14 14:31 +0200
                                      Re: Don't feed the help-vampire Ian Kelly <ian.g.kelly@gmail.com> - 2013-06-14 10:51 -0600
                                    Re: Don't feed the troll... Nick the Gr33k <support@superhost.gr> - 2013-06-14 15:50 +0300
                                      Re: Don't feed the troll... Zero Piraeus <schesis@gmail.com> - 2013-06-14 09:33 -0400
                                  Re: Don't feed the troll... Nick the Gr33k <support@superhost.gr> - 2013-06-14 15:45 +0300
                                    Re: Don't feed the troll... Heiko Wundram <modelnine@modelnine.org> - 2013-06-14 14:58 +0200
                                    Re: Don't feed the troll... Fábio Santos <fabiosantosart@gmail.com> - 2013-06-14 14:25 +0100
                                    Re: Don't feed the troll... Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-06-14 17:12 +0100
                                Re: A few questiosn about encoding Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-06-14 12:50 +0200
                                  Re: A few questiosn about encoding Nick the Gr33k <support@superhost.gr> - 2013-06-14 15:59 +0300
                                    Re: A few questiosn about encoding Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-06-14 15:52 +0200
                                    Re: A few questiosn about encoding Cameron Simpson <cs@zip.com.au> - 2013-06-15 10:28 +1000
                                    Re: A few questiosn about encoding Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-06-17 08:49 +0200
                                Re: Don't feed the troll... Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-06-14 12:57 +0100
                                Re: Don't feed the troll... (was: Re: A few questiosn about encoding) "D'Arcy J.M. Cain" <darcy@druid.net> - 2013-06-14 13:13 -0400
                                Re: Don't feed the troll... (was: Re: A few questiosn about encoding) Chris Angelico <rosuav@gmail.com> - 2013-06-15 03:31 +1000
                                  Re: Don't feed the troll... (was: Re: A few questiosn about encoding) Grant Edwards <invalid@invalid.invalid> - 2013-06-14 19:40 +0000
                                Re: Don't feed the troll "D'Arcy J.M. Cain" <darcy@druid.net> - 2013-06-14 13:56 -0400
                                Re: Don't feed the troll Tim Chase <python.list@tim.thechases.com> - 2013-06-14 14:00 -0500
                                Re: Don't feed the troll "D'Arcy J.M. Cain" <darcy@druid.net> - 2013-06-14 15:17 -0400
                                Re: Don't feed the troll... Ben Finney <ben+python@benfinney.id.au> - 2013-06-15 10:42 +1000
                  Re: A few questiosn about encoding Rick Johnson <rantingrickjohnson@gmail.com> - 2013-06-19 18:46 -0700
                    Re: A few questiosn about encoding Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-20 06:26 +0000
                      Re: A few questiosn about encoding MRAB <python@mrabarnett.plus.com> - 2013-06-20 12:43 +0100
                        Re: A few questiosn about encoding wxjmfauth@gmail.com - 2013-06-20 09:27 -0700
                          Re: A few questiosn about encoding Chris Angelico <rosuav@gmail.com> - 2013-06-21 02:37 +1000
                          Re: A few questiosn about encoding MRAB <python@mrabarnett.plus.com> - 2013-06-20 18:17 +0100
                            Re: A few questiosn about encoding wxjmfauth@gmail.com - 2013-06-23 08:51 -0700
                              Re: A few questiosn about encoding Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-23 16:30 +0000
                                Re: A few questiosn about encoding wxjmfauth@gmail.com - 2013-06-25 13:16 -0700
                          Re: A few questiosn about encoding Chris Angelico <rosuav@gmail.com> - 2013-06-21 03:21 +1000
                          Re: A few questiosn about encoding Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-06-20 20:43 +0100
                      Re: A few questiosn about encoding Rick Johnson <rantingrickjohnson@gmail.com> - 2013-06-20 06:40 -0700
                        Re: A few questiosn about encoding Andrew Berg <robotsondrugs@gmail.com> - 2013-06-20 09:04 -0500
                          Re: A few questiosn about encoding Rick Johnson <rantingrickjohnson@gmail.com> - 2013-06-20 08:12 -0700
                            Re: A few questiosn about encoding Chris Angelico <rosuav@gmail.com> - 2013-06-21 01:26 +1000
                            Re: A few questiosn about encoding Jussi Piitulainen <jpiitula@ling.helsinki.fi> - 2013-06-20 20:25 +0300
                        Re: A few questiosn about encoding Chris Angelico <rosuav@gmail.com> - 2013-06-21 01:28 +1000
                        Re: A few questiosn about encoding Andreas Perstinger <andipersti@gmail.com> - 2013-06-20 19:08 +0200
          Re: A few questiosn about encoding Dave Angel <davea@davea.name> - 2013-06-12 08:43 -0400
        Re: A few questiosn about encoding Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-06-13 18:46 -0400
          Re: A few questiosn about encoding Nick the Gr33k <support@superhost.gr> - 2013-06-14 08:34 +0300
            Re: A few questiosn about encoding Zero Piraeus <schesis@gmail.com> - 2013-06-14 02:00 -0400
              Re: A few questiosn about encoding Nick the Gr33k <support@superhost.gr> - 2013-06-14 10:28 +0300

#48188 — Re: Don't feed the troll... (was: Re: A few questiosn about encoding)

From	"D'Arcy J.M. Cain" <darcy@druid.net>
Date	2013-06-14 13:13 -0400
Subject	Re: Don't feed the troll... (was: Re: A few questiosn about encoding)
Message-ID	<mailman.3322.1371230016.3114.python-list@python.org>
In reply to	#48088

On Fri, 14 Jun 2013 11:06:55 +0200
Heiko Wundram <modelnine@modelnine.org> wrote:
> Come on now, this is _so_ obviously trolling, it's not even remotely 
> funny anymore. Why doesn't killfiling work with the mailing list
> version of the python list? :-(

A big problem, other than Mr. Support's shenanigans with his email
address, is that even those of us who seem to have successfully
*plonked* him get the responses to him.  The biggest issue with a troll
isn't so much the annoying emails from him but the amplified slew of
responses.  That's the point of a troll after all.

The answer is to always make sure that you include the previous poster
in the reply as a Cc or To.  I filter out any email that has the string
"support@superhost.gr" in a header so I would also filter out the
replies if people would follow that simple rule.
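A minimal sketch of the kind of header filter described above, using Python's standard `email` package. The function name `mentions_address` and the sample messages are hypothetical; a real filter would live in the mail client or delivery agent, not in a script like this.

```python
from email import message_from_string


def mentions_address(raw_message: str, needle: str) -> bool:
    """Return True if any header value contains `needle` (case-insensitive)."""
    msg = message_from_string(raw_message)
    needle = needle.lower()
    return any(needle in str(value).lower() for _name, value in msg.items())


# Hypothetical reply that Cc's the previous poster as well as the list.
reply_with_cc = (
    "From: someone@example.com\n"
    "To: python-list@python.org\n"
    "Cc: support@superhost.gr\n"
    "Subject: Re: A few questions\n"
    "\n"
    "body\n"
)
# The same reply sent to the list only: the address appears in no header.
reply_list_only = reply_with_cc.replace("Cc: support@superhost.gr\n", "")

print(mentions_address(reply_with_cc, "support@superhost.gr"))    # True
print(mentions_address(reply_list_only, "support@superhost.gr"))  # False
```

The second call shows why the filter misses replies that go to the list alone: nothing in their headers names the filtered address.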

I have suggested this before but the push back I get is that then
people would get two copies of the email, one to them and one to the
list.  My answer is simple.  Get a proper email system that filters out
duplicates.  Is there an email client out there that does not have this
facility?

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 788 2246     (DoD#0082)    (eNTP)   |  what's for dinner.
IM: darcy@Vex.Net, VOIP: sip:darcy@Vex.Net

#48194 — Re: Don't feed the troll... (was: Re: A few questiosn about encoding)

From	Chris Angelico <rosuav@gmail.com>
Date	2013-06-15 03:31 +1000
Subject	Re: Don't feed the troll... (was: Re: A few questiosn about encoding)
Message-ID	<mailman.3327.1371231081.3114.python-list@python.org>
In reply to	#48088

On Sat, Jun 15, 2013 at 3:13 AM, D'Arcy J.M. Cain <darcy@druid.net> wrote:
> The answer is to always make sure that you include the previous poster
> in the reply as a Cc or To.  I filter out any email that has the string
> "support@superhost.gr" in a header so I would also filter out the
> replies if people would follow that simple rule.
>
> I have suggested this before but the push back I get is that then
> people would get two copies of the email, one to them and one to the
> list.  My answer is simple.  Get a proper email system that filters out
> duplicates.  Is there an email client out there that does not have this
> facility?

The main downside to that is not the first response, to
somebody@somewhere and python-list, but the subsequent ones. Do you
include everyone's addresses? And if so, how do they then get off the
list? (This is a serious consideration. I had some very angry people
asking me to unsubscribe them from a (private) mailman list I run, but
they weren't subscribed at all - they were being cc'd.)

I prefer to simply mail the list. You should be able to mute entire
threads, and he doesn't start more than a couple a day usually.

ChrisA

#48216 — Re: Don't feed the troll... (was: Re: A few questiosn about encoding)

From	Grant Edwards <invalid@invalid.invalid>
Date	2013-06-14 19:40 +0000
Subject	Re: Don't feed the troll... (was: Re: A few questiosn about encoding)
Message-ID	<kpfriq$sd2$4@reader1.panix.com>
In reply to	#48194

On 2013-06-14, Chris Angelico <rosuav@gmail.com> wrote:
> On Sat, Jun 15, 2013 at 3:13 AM, D'Arcy J.M. Cain <darcy@druid.net> wrote:
>> The answer is to always make sure that you include the previous poster
>> in the reply as a Cc or To.  I filter out any email that has the string
>> "support@superhost.gr" in a header so I would also filter out the
>> replies if people would follow that simple rule.
>>
>> I have suggested this before but the push back I get is that then
>> people would get two copies of the email, one to them and one to the
>> list.  My answer is simple.  Get a proper email system that filters out
>> duplicates.  Is there an email client out there that does not have this
>> facility?
>
> The main downside to that is not the first response, to
> somebody@somewhere and python-list, but the subsequent ones. Do you
> include everyone's addresses? And if so, how do they then get off the
> list? (This is a serious consideration. I had some very angry people
> asking me to unsubscribe them from a (private) mailman list I run, but
> they weren't subscribed at all - they were being cc'd.)

I think the answer is to automatically kill all threads started by
"him".

Unfortunately, I don't know if that's possible in most newsreaders.

-- 
Grant Edwards               grant.b.edwards        Yow! A dwarf is passing out
                                  at               somewhere in Detroit!
                              gmail.com

#48198 — Re: Don't feed the troll

From	"D'Arcy J.M. Cain" <darcy@druid.net>
Date	2013-06-14 13:56 -0400
Subject	Re: Don't feed the troll
Message-ID	<mailman.3329.1371232585.3114.python-list@python.org>
In reply to	#48088

On Sat, 15 Jun 2013 03:31:12 +1000
Chris Angelico <rosuav@gmail.com> wrote:
> On Sat, Jun 15, 2013 at 3:13 AM, D'Arcy J.M. Cain <darcy@druid.net>
> wrote:
> > I have suggested this before but the push back I get is that then
> > people would get two copies of the email, one to them and one to the
> > list.  My answer is simple.  Get a proper email system that filters
> > out duplicates.  Is there an email client out there that does not
> > have this facility?
> 
> The main downside to that is not the first response, to
> somebody@somewhere and python-list, but the subsequent ones. Do you
> include everyone's addresses? And if so, how do they then get off the

No, I think Ccing the From is enough.  Other than the OP, who is already
*plonked*, replies to the replies tend to have at least a modicum of
information.
 
> I prefer to simply mail the list. You should be able to mute entire
> threads, and he doesn't start more than a couple a day usually.

But then I have to deal with each thread.  I don't want to deal with
them at all.

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 788 2246     (DoD#0082)    (eNTP)   |  what's for dinner.
IM: darcy@Vex.Net, VOIP: sip:darcy@Vex.Net

#48206 — Re: Don't feed the troll

From	Tim Chase <python.list@tim.thechases.com>
Date	2013-06-14 14:00 -0500
Subject	Re: Don't feed the troll
Message-ID	<mailman.3335.1371236310.3114.python-list@python.org>
In reply to	#48088

On 2013-06-14 13:56, D'Arcy J.M. Cain wrote:
> > I prefer to simply mail the list. You should be able to mute
> > entire threads, and he doesn't start more than a couple a day
> > usually.
> 
> But then I have to deal with each thread.  I don't want to deal with
> them at all.

At least Thunderbird had the ability to set up a filter of the form
"If the sender matches 'xyz@example.com' then kill this thread" so
the thread-killing (or sub-thread killing) was automatic.
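A rough sketch of how such a kill-thread rule could work, not Thunderbird's actual implementation. The `Msg` class and `kill_threads` function are hypothetical; the sketch assumes each message carries the Message-IDs of its ancestors (as a References header does) and that messages are processed in arrival order, parents first.

```python
from dataclasses import dataclass, field


@dataclass
class Msg:
    msg_id: str
    sender: str
    references: list = field(default_factory=list)  # ancestor Message-IDs


def kill_threads(messages, blocked_sender):
    """Drop every post by `blocked_sender` and every reply beneath such a post."""
    killed = set()
    kept = []
    for m in messages:
        # Kill the message if the sender matches or any ancestor was killed.
        if m.sender == blocked_sender or killed.intersection(m.references):
            killed.add(m.msg_id)
        else:
            kept.append(m)
    return kept


msgs = [
    Msg("<1>", "xyz@example.com"),
    Msg("<2>", "a@example.org", ["<1>"]),
    Msg("<3>", "b@example.org", ["<1>", "<2>"]),
    Msg("<4>", "c@example.org"),
]
print([m.msg_id for m in kill_threads(msgs, "xyz@example.com")])  # ['<4>']
```

Because replies are matched through their ancestors rather than their own headers, this also drops responses that never mention the blocked address.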

I set that up for Xah posts and my life was far better.

I've since switched to Claws for my mail and miss that kill-thread
functionality. :-/

-tkc

#48211 — Re: Don't feed the troll

From	"D'Arcy J.M. Cain" <darcy@druid.net>
Date	2013-06-14 15:17 -0400
Subject	Re: Don't feed the troll
Message-ID	<mailman.3337.1371237438.3114.python-list@python.org>
In reply to	#48088

On Fri, 14 Jun 2013 14:00:17 -0500
Tim Chase <python.list@tim.thechases.com> wrote:
> I set that up for Xah posts and my life was far better.

Has he disappeared or is my filtering just really successful?

> I've since switched to Claws for my mail and miss that kill-thread
> functionality. :-/

Heh.  Exactly what I am using.

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 788 2246     (DoD#0082)    (eNTP)   |  what's for dinner.
IM: darcy@Vex.Net, VOIP: sip:darcy@Vex.Net

#48237 — Re: Don't feed the troll...

From	Ben Finney <ben+python@benfinney.id.au>
Date	2013-06-15 10:42 +1000
Subject	Re: Don't feed the troll...
Message-ID	<mailman.3350.1371256953.3114.python-list@python.org>
In reply to	#48088

"D'Arcy J.M. Cain" <darcy@druid.net> writes:

> The answer is to always make sure that you include the previous poster
> in the reply as a Cc or To.

Dragging the discussion from one forum (comp.lang.python) to another
(every person's individual email) is obnoxious. Please don't.

> I have suggested this before but the push back I get is that then
> people would get two copies of the email, one to them and one to the
> list.

In my case, I don't want to receive the messages by email *at all*. I
participate in this forum using a non-email system, and it works fine so
long as people continue to participate in this forum.

Even for those who do participate by email, though, your approach is
broken:

> My answer is simple.  Get a proper email system that filters out
> duplicates.

The message sent to the individual typically arrives earlier (since it
is sent straight from you to the individual), and the message on the
forum arrives later (since it typically requires more processing).

But since we're participating in the discussion on the forum and not in
individual email, it is the later one we want, and the earlier one
should be deleted.

So at the point the first message arrives, it isn't a duplicate. The
mail program will show it anyway, because “remove duplicates” can't
catch it when there's no duplicate yet.
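The ordering problem can be sketched in a few lines. This assumes a duplicate filter keyed on Message-ID that keeps whichever copy arrives first; actual mail clients may differ, and the `deliver` function and the arrival data are hypothetical.

```python
def deliver(arrivals):
    """Show the first copy of each Message-ID; suppress later copies."""
    seen = set()
    shown = []
    for msg_id, route in arrivals:
        if msg_id in seen:
            continue  # a copy was already shown, so this one is dropped
        seen.add(msg_id)
        shown.append((msg_id, route))
    return shown


# The direct copy usually arrives before the list copy.
arrivals = [
    ("<abc@example.com>", "direct"),
    ("<abc@example.com>", "list"),
]
print(deliver(arrivals))  # [('<abc@example.com>', 'direct')]
```

Under these assumptions the copy that survives is the direct one, which is the opposite of what a list participant wants.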

The proper solution is for you not to send that one at all, and send
only the message to the forum.

You do this by using your mail client's “reply to list” function, which
uses the RFC 2369 information in every mailing list message.

Is there any mail client which doesn't have this function? If so, use
your vendor's bug reporting system to request this feature as standard,
and/or switch to a better mail client until they fix that.
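As an illustration of what a “reply to list” function has to work with, here is a minimal sketch that pulls the posting address out of a message's `List-Post` header field. The function name `list_post_address` and the sample message are hypothetical, and the parsing is deliberately simplistic (it only handles a single `mailto:` URL).

```python
import re
from email import message_from_string


def list_post_address(raw_message):
    """Return the mailto address from the List-Post header, or None."""
    msg = message_from_string(raw_message)
    header = msg.get("List-Post", "")
    match = re.search(r"<mailto:([^>?]+)", header)
    return match.group(1) if match else None


sample = (
    "From: someone@example.com\n"
    "To: python-list@python.org\n"
    "List-Post: <mailto:python-list@python.org>\n"
    "Subject: Re: Don't feed the troll...\n"
    "\n"
    "body\n"
)
print(list_post_address(sample))  # python-list@python.org
```

A reply addressed only to that value goes to the forum and to nobody's individual mailbox.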

-- 
 \       “Timid men prefer the calm of despotism to the boisterous sea |
  `\                                    of liberty.” —Thomas Jefferson |
_o__)                                                                  |
Ben Finney

#48767

From	Rick Johnson <rantingrickjohnson@gmail.com>
Date	2013-06-19 18:46 -0700
Message-ID	<77ba6b16-4b1d-47a6-9b9b-5af45335c4fe@googlegroups.com>
In reply to	#47912

On Thursday, June 13, 2013 2:11:08 AM UTC-5, Steven D'Aprano wrote:

> Gah! That's twice I've screwed that up. 
> Sorry about that!

Yeah, and your difficulty explaining the Unicode implementation reminds me of a passage from the Python zen:

 "If the implementation is hard to explain, it's a bad idea."

#48777

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2013-06-20 06:26 +0000
Message-ID	<51c2a089$0$29973$c3e8da3$5496439d@news.astraweb.com>
In reply to	#48767

On Wed, 19 Jun 2013 18:46:59 -0700, Rick Johnson wrote:

> On Thursday, June 13, 2013 2:11:08 AM UTC-5, Steven D'Aprano wrote:
>  
>> Gah! That's twice I've screwed that up. Sorry about that!
> 
> Yeah, and your difficulty explaining the Unicode implementation reminds
> me of a passage from the Python zen:
> 
>  "If the implementation is hard to explain, it's a bad idea."

The *implementation* is easy to explain. It's the names of the encodings 
which I get tangled up in.

ASCII: Supports exactly 127 code points, each of which takes up exactly 7 
bits. Each code point represents a character.

Latin-1, Latin-2, MacRoman, MacGreek, ISO-8859-7, Big5, Windows-1251, and 
about a gazillion other legacy charsets, all of which are mutually 
incompatible: supports anything from 127 to 65535 different code points, 
usually under 256.

UCS-2: Supports exactly 65535 code points, each of which takes up exactly 
two bytes. That's fewer than required, so it is obsoleted by:

UTF-16: Supports all 1114111 code points in the Unicode charset, using a 
variable-width system where the most popular characters use exactly two-
bytes and the remaining ones use a pair of characters.

UCS-4: Supports exactly 4294967295 code points, each of which takes up 
exactly four bytes. That is more than needed for the Unicode charset, so 
this is obsoleted by:

UTF-32: Supports all 1114111 code points, using exactly four bytes each. 
Code points outside of the range 0 through 1114111 inclusive are an error.

UTF-8: Supports all 1114111 code points, using a variable-width system 
where popular ASCII characters require 1 byte, and others use 2, 3 or 4 
bytes as needed.

Ignoring the legacy charsets, only UTF-16 is a terribly complicated 
implementation, due to the surrogate pairs. But even that is not too bad. 
The real complication comes from the interactions between systems which 
use different encodings, and that's nothing to do with Unicode.
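The sizes described above can be checked from the interpreter. A small sketch, assuming Python 3.3 or later (where `sys.maxunicode` is always 0x10FFFF); the helper name `encoded_sizes` is hypothetical. Note that code points 0 through 1114111 inclusive make 1114112 code points in total.

```python
import sys


def encoded_sizes(ch):
    """Return the encoded length in bytes of one character per encoding."""
    # The -le variants are used so that no byte order mark is counted.
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


print(sys.maxunicode)      # 1114111, the highest code point (0x10FFFF)
print(sys.maxunicode + 1)  # 1114112 code points in total

# An ASCII letter, a Greek letter, the euro sign, and an astral character.
for ch in ["a", "\u03b1", "\u20ac", "\U0001F40D"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

UTF-32 uses four bytes throughout, UTF-16 switches from two bytes to four for the astral character, and UTF-8 grows from one byte to four.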

-- 
Steven

#48785

From	MRAB <python@mrabarnett.plus.com>
Date	2013-06-20 12:43 +0100
Message-ID	<mailman.3620.1371728614.3114.python-list@python.org>
In reply to	#48777

On 20/06/2013 07:26, Steven D'Aprano wrote:
> On Wed, 19 Jun 2013 18:46:59 -0700, Rick Johnson wrote:
>
>> On Thursday, June 13, 2013 2:11:08 AM UTC-5, Steven D'Aprano wrote:
>>
>>> Gah! That's twice I've screwed that up. Sorry about that!
>>
>> Yeah, and your difficulty explaining the Unicode implementation reminds
>> me of a passage from the Python zen:
>>
>>  "If the implementation is hard to explain, it's a bad idea."
>
> The *implementation* is easy to explain. It's the names of the encodings
> which I get tangled up in.
>
You're off by one below!
>
> ASCII: Supports exactly 127 code points, each of which takes up exactly 7
> bits. Each code point represents a character.
>
128 codepoints.

> Latin-1, Latin-2, MacRoman, MacGreek, ISO-8859-7, Big5, Windows-1251, and
> about a gazillion other legacy charsets, all of which are mutually
> incompatible: supports anything from 127 to 65535 different code points,
> usually under 256.
>
128 to 65536 codepoints.

> UCS-2: Supports exactly 65535 code points, each of which takes up exactly
> two bytes. That's fewer than required, so it is obsoleted by:
>
65536 codepoints.

etc.

> UTF-16: Supports all 1114111 code points in the Unicode charset, using a
> variable-width system where the most popular characters use exactly two-
> bytes and the remaining ones use a pair of characters.
>
> UCS-4: Supports exactly 4294967295 code points, each of which takes up
> exactly four bytes. That is more than needed for the Unicode charset, so
> this is obsoleted by:
>
> UTF-32: Supports all 1114111 code points, using exactly four bytes each.
> Code points outside of the range 0 through 1114111 inclusive are an error.
>
> UTF-8: Supports all 1114111 code points, using a variable-width system
> where popular ASCII characters require 1 byte, and others use 2, 3 or 4
> bytes as needed.
>
>
> Ignoring the legacy charsets, only UTF-16 is a terribly complicated
> implementation, due to the surrogate pairs. But even that is not too bad.
> The real complication comes from the interactions between systems which
> use different encodings, and that's nothing to do with Unicode.
>
>

#48806

From	wxjmfauth@gmail.com
Date	2013-06-20 09:27 -0700
Message-ID	<114200cf-2d46-46cb-bb5f-7c5f8ab98a66@googlegroups.com>
In reply to	#48785

On Thursday, 20 June 2013 13:43:28 UTC+2, MRAB wrote:
> On 20/06/2013 07:26, Steven D'Aprano wrote:
> > On Wed, 19 Jun 2013 18:46:59 -0700, Rick Johnson wrote:
> >
> >> On Thursday, June 13, 2013 2:11:08 AM UTC-5, Steven D'Aprano wrote:
> >>
> >>> Gah! That's twice I've screwed that up. Sorry about that!
> >>
> >> Yeah, and your difficulty explaining the Unicode implementation reminds
> >> me of a passage from the Python zen:
> >>
> >>  "If the implementation is hard to explain, it's a bad idea."
> >
> > The *implementation* is easy to explain. It's the names of the encodings
> > which I get tangled up in.
> >
> You're off by one below!
> >
> > ASCII: Supports exactly 127 code points, each of which takes up exactly 7
> > bits. Each code point represents a character.
> >
> 128 codepoints.
>
> > Latin-1, Latin-2, MacRoman, MacGreek, ISO-8859-7, Big5, Windows-1251, and
> > about a gazillion other legacy charsets, all of which are mutually
> > incompatible: supports anything from 127 to 65535 different code points,
> > usually under 256.
> >
> 128 to 65536 codepoints.
>
> > UCS-2: Supports exactly 65535 code points, each of which takes up exactly
> > two bytes. That's fewer than required, so it is obsoleted by:
> >
> 65536 codepoints.
>
> etc.
>
> > UTF-16: Supports all 1114111 code points in the Unicode charset, using a
> > variable-width system where the most popular characters use exactly two-
> > bytes and the remaining ones use a pair of characters.
> >
> > UCS-4: Supports exactly 4294967295 code points, each of which takes up
> > exactly four bytes. That is more than needed for the Unicode charset, so
> > this is obsoleted by:
> >
> > UTF-32: Supports all 1114111 code points, using exactly four bytes each.
> > Code points outside of the range 0 through 1114111 inclusive are an error.
> >
> > UTF-8: Supports all 1114111 code points, using a variable-width system
> > where popular ASCII characters require 1 byte, and others use 2, 3 or 4
> > bytes as needed.
> >
> >
> > Ignoring the legacy charsets, only UTF-16 is a terribly complicated
> > implementation, due to the surrogate pairs. But even that is not too bad.
> > The real complication comes from the interactions between systems which
> > use different encodings, and that's nothing to do with Unicode.
> >
> >

And all these coding schemes have something in common:
they all work with a unique set of code points, more
precisely a unique set of encoded code points (not
the set of implemented code points (bytes)).

That is just what the flexible string representation is not
doing: it artificially divides Unicode into subsets and tries
to handle each subset differently.

On the other hand, it is because it is impossible to
work properly with multiple sets of encoded code points
that all these coding schemes exist today. There is simply
no other way.

Even "exotic" schemes like the "CID-fonts" used in PDF
are based on that scheme.

jmf

#48807

From	Chris Angelico <rosuav@gmail.com>
Date	2013-06-21 02:37 +1000
Message-ID	<mailman.3630.1371746277.3114.python-list@python.org>
In reply to	#48806

On Fri, Jun 21, 2013 at 2:27 AM,  <wxjmfauth@gmail.com> wrote:
> And all these coding schemes have something in common:
> they all work with a unique set of code points, more
> precisely a unique set of encoded code points (not
> the set of implemented code points (bytes)).
>
> That is just what the flexible string representation is not
> doing: it artificially divides Unicode into subsets and tries
> to handle each subset differently.
>

UTF-16 divides Unicode into two subsets: BMP characters (encoded using
one 16-bit unit) and astral characters (encoded using two 16-bit units
in the D800::/5 netblock, or equivalent thereof). Your beloved narrow
builds are guilty of exactly the same crime as the hated 3.3.
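
As a minimal sketch of that split (not part of the original post; the helper names `utf16_units` and `to_surrogates` are made up for illustration), the following Python 3 snippet shows a BMP character taking one 16-bit unit and an astral character taking a surrogate pair:

```python
def utf16_units(ch):
    """Return the UTF-16 code units (as ints) that encode one character."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def to_surrogates(cp):
    """Split an astral code point (above 0xFFFF) into (high, low) surrogates."""
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


# BMP character: a single 16-bit unit.
print([hex(u) for u in utf16_units("a")])
# Astral character U+1F600: two 16-bit units, both in the surrogate range.
print([hex(u) for u in utf16_units("\U0001F600")])
# The same pair computed by hand from the code point.
print([hex(u) for u in to_surrogates(0x1F600)])
```

The arithmetic in `to_surrogates` is the standard UTF-16 rule: subtract 0x10000, then put the top ten bits in the high surrogate and the bottom ten bits in the low one.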

ChrisA

#48810

From	MRAB <python@mrabarnett.plus.com>
Date	2013-06-20 18:17 +0100
Message-ID	<mailman.3632.1371748640.3114.python-list@python.org>
In reply to	#48806

On 20/06/2013 17:37, Chris Angelico wrote:
> On Fri, Jun 21, 2013 at 2:27 AM,  <wxjmfauth@gmail.com> wrote:
>> And all these coding schemes have something in common:
>> they all work with a unique set of code points, more
>> precisely a unique set of encoded code points (not
>> the set of implemented code points (bytes)).
>>
>> That is just what the flexible string representation is not
>> doing: it artificially divides Unicode into subsets and tries
>> to handle each subset differently.
>>
>
>
> UTF-16 divides Unicode into two subsets: BMP characters (encoded using
> one 16-bit unit) and astral characters (encoded using two 16-bit units
> in the D800::/5 netblock, or equivalent thereof). Your beloved narrow
> builds are guilty of exactly the same crime as the hated 3.3.
>
UTF-8 divides Unicode into subsets which are encoded in 1, 2, 3, or 4
bytes, and those who previously used ASCII still need only 1 byte per
codepoint!
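
For illustration only (this is not from the original post, and `utf8_width` is a hypothetical helper name), a short Python 3 sketch of those four subsets, checked against the built-in encoder:

```python
def utf8_width(cp):
    """Number of bytes UTF-8 uses for a given code point."""
    if cp < 0x80:
        return 1      # ASCII range
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3      # rest of the BMP
    return 4          # astral planes


for ch in ("a", "\xe9", "\u20ac", "\U0001F600"):
    encoded = ch.encode("utf-8")
    # The hand-written rule should agree with Python's own encoder.
    assert len(encoded) == utf8_width(ord(ch))
    print("U+%04X" % ord(ch), len(encoded), encoded)
```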

#48986

From	wxjmfauth@gmail.com
Date	2013-06-23 08:51 -0700
Message-ID	<28586b5f-cb51-4e41-a47d-38a18723b51c@googlegroups.com>
In reply to	#48810

On Thursday, 20 June 2013 19:17:12 UTC+2, MRAB wrote:
> On 20/06/2013 17:37, Chris Angelico wrote:
> > On Fri, Jun 21, 2013 at 2:27 AM,  <wxjmfauth@gmail.com> wrote:
> >> And all these coding schemes have something in common:
> >> they all work with a unique set of code points, more
> >> precisely a unique set of encoded code points (not
> >> the set of implemented code points (bytes)).
> >>
> >> That is just what the flexible string representation is not
> >> doing: it artificially divides Unicode into subsets and tries
> >> to handle each subset differently.
> >>
> >
> >
> > UTF-16 divides Unicode into two subsets: BMP characters (encoded using
> > one 16-bit unit) and astral characters (encoded using two 16-bit units
> > in the D800::/5 netblock, or equivalent thereof). Your beloved narrow
> > builds are guilty of exactly the same crime as the hated 3.3.
> >
> UTF-8 divides Unicode into subsets which are encoded in 1, 2, 3, or 4
> bytes, and those who previously used ASCII still need only 1 byte per
> codepoint!

Sorry, but no, it does not work that way. You are
confusing the set of encoded code points with
their implementation, the so-called code units.

utf-8: how many bytes to hold an "a" in memory?
One byte.

flexible string representation: how many bytes to
hold an "a" in memory? One byte? No, two.
(Funny, it consumes more memory to hold an ascii char
than ascii itself.)

utf-8: In a series of bytes implementing the encoded code
points supposed to hold a string, picking a byte and
finding which encoded code point it belongs to is no problem.

flexible string representation: In a series of bytes
implementing the encoded code points supposed to hold a
string, picking a byte and finding which encoded code
point it belongs to is ... impossible!

This is one of the causes of the poor behaviour of this flexible
string representation.

These are the basics of any coding scheme, Unicode included.

jmf

#48989

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2013-06-23 16:30 +0000
Message-ID	<51c722af$0$29999$c3e8da3$5496439d@news.astraweb.com>
In reply to	#48986

On Sun, 23 Jun 2013 08:51:41 -0700, wxjmfauth wrote:

> utf-8: how many bytes to hold an "a" in memory? one byte.
> 
> flexible string representation: how many bytes to hold an "a" in memory?
> One byte? No, two. (Funny, it consumes more memory to hold an ascii char
> than ascii itself)

Incorrect. Python strings have overhead because they are objects, so 
let's see the difference adding a single character makes:

# Python 3.3, with the hated flexible string representation:
py> import sys
py> sys.getsizeof('a'*100) - sys.getsizeof('a'*99)
1

# Python 3.2:
py> sys.getsizeof('a'*100) - sys.getsizeof('a'*99)
4

How about a French é character? Of course, ASCII cannot store it *at 
all*, but let's see what Python can do:

# The hated Python 3.3 again:
py> sys.getsizeof('é'*100) - sys.getsizeof('é'*99)
1

# And Python 3.2:
py> sys.getsizeof('é'*100) - sys.getsizeof('é'*99)
4
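
The same measurement can be wrapped in a small script. This is only a sketch: `bytes_per_char` is a made-up helper name, and the numbers are specific to CPython 3.3 or later, where they should come out as 1, 1, 2 and 4 for the four sample characters. Other implementations may report something else.

```python
import sys


def bytes_per_char(ch, n=100):
    """Extra bytes needed to add one more copy of ch to a long string."""
    return sys.getsizeof(ch * n) - sys.getsizeof(ch * (n - 1))


# ASCII, Latin-1, other BMP, and astral sample characters.
for ch in ("a", "\xe9", "\u20ac", "\U0001F600"):
    print("U+%04X" % ord(ch), bytes_per_char(ch))
```

Measuring the difference between two lengths cancels out the fixed per-object overhead, which is why the result reflects only the width of the character storage.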

> utf-8: In a series of bytes implementing the encoded code points
> supposed to hold a string, picking a byte and finding to which encoded
> code point it belongs is no problem.

Incorrect. UTF-8 is unsuitable for random access, since it has variable-
width characters, anything from 1 to 4 bytes. So you cannot just jump 
directly to character 1000 in a block of text, you have to inspect each 
byte one-by-one to decide whether it is a 1, 2, 3 or 4 byte character.
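
A rough sketch of what that scan looks like (illustrative only; `seq_len` and `utf8_char_at` are hypothetical names, and the input is assumed to be valid UTF-8). The lead byte of each sequence gives its length, so the scan can hop from lead byte to lead byte, but it still has to start from the beginning:

```python
def seq_len(lead):
    """Length in bytes of the UTF-8 sequence that starts with this lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte: 0x%02X" % lead)


def utf8_char_at(data, index):
    """Return character number `index` by walking the bytes from the start."""
    pos = 0
    for _ in range(index):          # cost grows with index: O(n), not O(1)
        pos += seq_len(data[pos])
    return data[pos:pos + seq_len(data[pos])].decode("utf-8")


text = "a\xe9\u20ac\U0001F600z"
data = text.encode("utf-8")
print([utf8_char_at(data, i) == ch for i, ch in enumerate(text)])
```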

> flexible string representation: In a series of bytes implementing the
> encoded code points supposed to hold a string, picking a byte and
> finding to which encoded code point it belongs is ... impossible !

Incorrect. It is absolutely trivial. Each string is marked as either 1-
byte, 2-byte or 4-byte. If it is a 1-byte string, then each byte is one 
character. If it is a 2-byte string, then it is just like Python 3.2 
narrow build, and each two bytes is a character. If it is a 4-byte 
string, then it is just like Python 3.2 wide build, and each four bytes 
is a character. Within a single string, the number of bytes per character 
is fixed, and random access is easy and fast.
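
The idea can be mimicked in pure Python. This is a loose sketch of the principle only, not CPython's actual implementation: `pack` and `char_at` are invented names, and strings containing lone surrogates are not handled.

```python
def pack(text):
    """Store text with the narrowest fixed width (1, 2 or 4 bytes per
    character) that fits its largest code point."""
    biggest = max(map(ord, text)) if text else 0
    if biggest < 0x100:
        width, codec = 1, "latin-1"
    elif biggest < 0x10000:
        width, codec = 2, "utf-16-le"   # no surrogate pairs needed here
    else:
        width, codec = 4, "utf-32-le"
    return width, text.encode(codec)


def char_at(width, buf, index):
    """Constant-time lookup: the offset is simply index * width."""
    start = index * width
    return chr(int.from_bytes(buf[start:start + width], "little"))


for text in ("abc", "ab\u20ac", "a\U0001F600c"):
    width, buf = pack(text)
    print(width, [char_at(width, buf, i) for i in range(len(text))] == list(text))
```

Because the width is chosen per string rather than per character, indexing never needs to scan, at the cost of widening the whole string when a single wide character is present.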

-- 
Steven

#49184

From	wxjmfauth@gmail.com
Date	2013-06-25 13:16 -0700
Message-ID	<d15fd63c-283a-437d-9b27-0c19a5b69430@googlegroups.com>
In reply to	#48989

On Sunday, 23 June 2013 18:30:40 UTC+2, Steven D'Aprano wrote:
> On Sun, 23 Jun 2013 08:51:41 -0700, wxjmfauth wrote:
>
> > utf-8: how many bytes to hold an "a" in memory? one byte.
> >
> > flexible string representation: how many bytes to hold an "a" in memory?
> > One byte? No, two. (Funny, it consumes more memory to hold an ascii char
> > than ascii itself)
>
> Incorrect. Python strings have overhead because they are objects, so
> let's see the difference adding a single character makes:
>
> # Python 3.3, with the hated flexible string representation:
> py> sys.getsizeof('a'*100) - sys.getsizeof('a'*99)
> 1
>
> # Python 3.2:
> py> sys.getsizeof('a'*100) - sys.getsizeof('a'*99)
> 4
>
> How about a French é character? Of course, ASCII cannot store it *at
> all*, but let's see what Python can do:
>
> # The hated Python 3.3 again:
> py> sys.getsizeof('é'*100) - sys.getsizeof('é'*99)
> 1
>
> # And Python 3.2:
> py> sys.getsizeof('é'*100) - sys.getsizeof('é'*99)
> 4
>
> > utf-8: In a series of bytes implementing the encoded code points
> > supposed to hold a string, picking a byte and finding to which encoded
> > code point it belongs is no problem.
>
> Incorrect. UTF-8 is unsuitable for random access, since it has variable-
> width characters, anything from 1 to 4 bytes. So you cannot just jump
> directly to character 1000 in a block of text, you have to inspect each
> byte one-by-one to decide whether it is a 1, 2, 3 or 4 byte character.
>
> > flexible string representation: In a series of bytes implementing the
> > encoded code points supposed to hold a string, picking a byte and
> > finding to which encoded code point it belongs is ... impossible !
>
> Incorrect. It is absolutely trivial. Each string is marked as either 1-
> byte, 2-byte or 4-byte. If it is a 1-byte string, then each byte is one
> character. If it is a 2-byte string, then it is just like Python 3.2
> narrow build, and each two bytes is a character. If it is a 4-byte
> string, then it is just like Python 3.2 wide build, and each four bytes
> is a character. Within a single string, the number of bytes per character
> is fixed, and random access is easy and fast.
>
> --
> Steven

:-)

#48813

From	Chris Angelico <rosuav@gmail.com>
Date	2013-06-21 03:21 +1000
Message-ID	<mailman.3635.1371748902.3114.python-list@python.org>
In reply to	#48806

On Fri, Jun 21, 2013 at 3:17 AM, MRAB <python@mrabarnett.plus.com> wrote:
> On 20/06/2013 17:37, Chris Angelico wrote:
>>
>> On Fri, Jun 21, 2013 at 2:27 AM,  <wxjmfauth@gmail.com> wrote:
>>>
>>> And all these coding schemes have something in common:
>>> they all work with a unique set of code points, more
>>> precisely a unique set of encoded code points (not
>>> the set of implemented code points (bytes)).
>>>
>>> That is just what the flexible string representation is not
>>> doing: it artificially divides Unicode into subsets and tries
>>> to handle each subset differently.
>>>
>>
>>
>> UTF-16 divides Unicode into two subsets: BMP characters (encoded using
>> one 16-bit unit) and astral characters (encoded using two 16-bit units
>> in the D800::/5 netblock, or equivalent thereof). Your beloved narrow
>> builds are guilty of exactly the same crime as the hated 3.3.
>>
> UTF-8 divides Unicode into subsets which are encoded in 1, 2, 3, or 4
> bytes, and those who previously used ASCII still need only 1 byte per
> codepoint!

Yes, but there's never (AFAIK) been a Python implementation that
represents strings in UTF-8; UTF-16 was one of two options for Python
2.2 through 3.2, and is the one that jmf always seems to be measuring
against.

ChrisA

#48825

From	Mark Lawrence <breamoreboy@yahoo.co.uk>
Date	2013-06-20 20:43 +0100
Message-ID	<mailman.3639.1371757432.3114.python-list@python.org>
In reply to	#48806

On 20/06/2013 17:27, wxjmfauth@gmail.com wrote:
> On Thursday, 20 June 2013 13:43:28 UTC+2, MRAB wrote:
>> On 20/06/2013 07:26, Steven D'Aprano wrote:
>>> On Wed, 19 Jun 2013 18:46:59 -0700, Rick Johnson wrote:
>>>
>>>> On Thursday, June 13, 2013 2:11:08 AM UTC-5, Steven D'Aprano wrote:
>>>>
>>>>> Gah! That's twice I've screwed that up. Sorry about that!
>>>>
>>>> Yeah, and your difficulty explaining the Unicode implementation reminds
>>>> me of a passage from the Python zen:
>>>>
>>>>   "If the implementation is hard to explain, it's a bad idea."
>>>
>>> The *implementation* is easy to explain. It's the names of the encodings
>>> which I get tangled up in.
>>>
>> You're off by one below!
>>>
>>> ASCII: Supports exactly 127 code points, each of which takes up exactly 7
>>> bits. Each code point represents a character.
>>>
>> 128 codepoints.
>>
>>> Latin-1, Latin-2, MacRoman, MacGreek, ISO-8859-7, Big5, Windows-1251, and
>>> about a gazillion other legacy charsets, all of which are mutually
>>> incompatible: supports anything from 127 to 65535 different code points,
>>> usually under 256.
>>>
>> 128 to 65536 codepoints.
>>
>>> UCS-2: Supports exactly 65535 code points, each of which takes up exactly
>>> two bytes. That's fewer than required, so it is obsoleted by:
>>>
>> 65536 codepoints.
>>
>> etc.
>>
>>> UTF-16: Supports all 1114111 code points in the Unicode charset, using a
>>> variable-width system where the most popular characters use exactly two-
>>> bytes and the remaining ones use a pair of characters.
>>>
>>> UCS-4: Supports exactly 4294967295 code points, each of which takes up
>>> exactly four bytes. That is more than needed for the Unicode charset, so
>>> this is obsoleted by:
>>>
>>> UTF-32: Supports all 1114111 code points, using exactly four bytes each.
>>> Code points outside of the range 0 through 1114111 inclusive are an error.
>>>
>>> UTF-8: Supports all 1114111 code points, using a variable-width system
>>> where popular ASCII characters require 1 byte, and others use 2, 3 or 4
>>> bytes as needed.
>>>
>>>
>>> Ignoring the legacy charsets, only UTF-16 is a terribly complicated
>>> implementation, due to the surrogate pairs. But even that is not too bad.
>>> The real complication comes from the interactions between systems which
>>> use different encodings, and that's nothing to do with Unicode.
>>>
>>>
>
> And all these coding schemes have something in common:
> they all work with a unique set of code points, more
> precisely a unique set of encoded code points (not
> the set of implemented code points (bytes)).
>
> That is just what the flexible string representation is not
> doing: it artificially divides Unicode into subsets and tries
> to handle each subset differently.
>
> On the other hand, it is because it is impossible to
> work properly with multiple sets of encoded code points
> that all these coding schemes exist today. There is simply
> no other way.
>
> Even "exotic" schemes like the "CID-fonts" used in PDF
> are based on that scheme.
>
> jmf
>

I entirely agree with the viewpoints of jmfauth, Nick the Greek, rr,
Xah Lee and Ilias Lazaridis on the grounds that disagreeing and stating
my beliefs ends up with the Python Mailing List police standing on my
back doorstep.  Give me the NSA or GCHQ any day of the week :(

-- 
"Steve is going for the pink ball - and for those of you who are 
watching in black and white, the pink is next to the green." Snooker 
commentator 'Whispering' Ted Lowe.

Mark Lawrence

#48791

From	Rick Johnson <rantingrickjohnson@gmail.com>
Date	2013-06-20 06:40 -0700
Message-ID	<4160f6c9-8a53-432f-b807-ae33ed69ac97@googlegroups.com>
In reply to	#48777

On Thursday, June 20, 2013 1:26:17 AM UTC-5, Steven D'Aprano wrote:
> The *implementation* is easy to explain. It's the names of
> the encodings which I get tangled up in.

Well, ignoring the fact that your last explanation is
still buggy, you have not actually described an
"implementation"; no, you've merely generalized (and quite
vaguely, I might add) the technical specification of a few
encodings. Let's ask Wikipedia to enlighten us on the
subject of "implementation":

    ############################################################
    #                  Define: Implementation                  #
    ############################################################
    # In computer science, an implementation is a realization  #
    # of a technical specification or algorithm as a program,  #
    # software component, or other computer system through     #
    # computer programming and deployment. Many                #
    # implementations may exist for a given specification or   #
    # standard. For example, web browsers contain              #
    # implementations of World Wide Web Consortium-recommended #
    # specifications, and software development tools contain   #
    # implementations of programming languages.                #
    ############################################################

Do you think someone could reliably implement the alphabet of a new
language in Unicode by using the general outline you
provided? -- again, ignoring your continual fumbling when
explaining that simple generalization :-)

Your generalization is analogous to explaining web browsers
as: "software that allows a user to view web pages in the
range www.*". Do you think someone could implement a web
browser from such a limited specification (if that was all
they knew)?

============================================================
 Since we're on the subject of Unicode:
============================================================
One of the most humorous aspects of Unicode is that it has
encodings for Braille characters. Hmm, this presents a
conundrum of sorts. RIDDLE ME THIS?!

    Since Braille is a type of "reading" for the blind by
    utilizing the sense of touch (therefore DEMANDING 3
    dimensions) and glyphs derived from Unicode are
    restrictively two dimensional, because let's face it people,
    Unicode exists in your computer, and computer screens are
    two dimensional... but you already knew that -- I think?,
    then what is the purpose of a Unicode Braille character set?

That should haunt your nightmares for some time.

#48794

From	Andrew Berg <robotsondrugs@gmail.com>
Date	2013-06-20 09:04 -0500
Message-ID	<mailman.3623.1371737101.3114.python-list@python.org>
In reply to	#48791

On 2013.06.20 08:40, Rick Johnson wrote:
> One of the most humorous aspects of Unicode is that it has
> encodings for Braille characters. Hmm, this presents a
> conundrum of sorts. RIDDLE ME THIS?!
> 
>     Since Braille is a type of "reading" for the blind by
>     utilizing the sense of touch (therefore DEMANDING 3
>     dimensions) and glyphs derived from Unicode are
>     restrictively two dimensional, because let's face it people,
>     Unicode exists in your computer, and computer screens are
>     two dimensional... but you already knew that -- I think?,
>     then what is the purpose of a Unicode Braille character set?
Two dimensional characters can be made into 3 dimensional shapes.
Building numbers are a good example of this.
We already have one Unicode troll; do we really need you too?

-- 
CPython 3.3.2 | Windows NT 6.2.9200 / FreeBSD 9.1
