
python 3.3 repr

Started by	Robin Becker <robin@reportlab.com>
First post	2013-11-15 11:28 +0000
Last post	2013-11-15 14:23 -0500
Articles	11 on this page of 31 — 14 participants


  python 3.3 repr Robin Becker <robin@reportlab.com> - 2013-11-15 11:28 +0000
    Re: python 3.3 repr Ned Batchelder <ned@nedbatchelder.com> - 2013-11-15 03:38 -0800
      Re: python 3.3 repr Robin Becker <robin@reportlab.com> - 2013-11-15 12:16 +0000
        Re: python 3.3 repr Ned Batchelder <ned@nedbatchelder.com> - 2013-11-15 05:54 -0800
          Re: python 3.3 repr Robin Becker <robin@reportlab.com> - 2013-11-15 14:29 +0000
          Re: python 3.3 repr Serhiy Storchaka <storchaka@gmail.com> - 2013-11-15 16:40 +0200
          Re: python 3.3 repr Robin Becker <robin@reportlab.com> - 2013-11-15 14:52 +0000
      Re: python 3.3 repr Roy Smith <roy@panix.com> - 2013-11-15 09:25 -0500
      Re: python 3.3 repr Robin Becker <robin@reportlab.com> - 2013-11-15 14:43 +0000
        Re: python 3.3 repr Ned Batchelder <ned@nedbatchelder.com> - 2013-11-15 07:08 -0800
          Re: python 3.3 repr Robin Becker <robin@reportlab.com> - 2013-11-15 15:39 +0000
          Re: python 3.3 repr Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-11-15 16:49 +0100
          Re: python 3.3 repr Chris Angelico <rosuav@gmail.com> - 2013-11-16 03:01 +1100
            Re: python 3.3 repr Neil Cerutti <neilc@norwich.edu> - 2013-11-15 17:47 +0000
              Re: python 3.3 repr Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-16 01:09 +0000
        Re: python 3.3 repr Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-15 17:10 +0000
          Re: python 3.3 repr Chris Angelico <rosuav@gmail.com> - 2013-11-16 04:29 +1100
          Re: python 3.3 repr Cousin Stanley <cousinstanley@gmail.com> - 2013-11-15 10:45 -0700
      Re: python 3.3 repr Joel Goldstick <joel.goldstick@gmail.com> - 2013-11-15 09:50 -0500
      Re: python 3.3 repr Robin Becker <robin@reportlab.com> - 2013-11-15 15:03 +0000
      Re: python 3.3 repr Joel Goldstick <joel.goldstick@gmail.com> - 2013-11-15 10:07 -0500
      Re: python 3.3 repr Chris Angelico <rosuav@gmail.com> - 2013-11-16 02:08 +1100
      Re: python 3.3 repr Robin Becker <robin@reportlab.com> - 2013-11-15 15:18 +0000
      Re: python 3.3 repr Roy Smith <roy@panix.com> - 2013-11-15 10:32 -0500
      Re: python 3.3 repr William Ray Wing <wrw@mac.com> - 2013-11-15 11:30 -0500
      Re: python 3.3 repr Zero Piraeus <z@etiol.net> - 2013-11-15 14:06 -0300
      Re: python 3.3 repr Chris Angelico <rosuav@gmail.com> - 2013-11-16 04:11 +1100
      Re: python 3.3 repr Serhiy Storchaka <storchaka@gmail.com> - 2013-11-15 19:37 +0200
    Re: python 3.3 repr Gene Heskett <gheskett@wdtv.com> - 2013-11-15 11:36 -0500
    Re: python 3.3 repr Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-11-15 17:58 +0000
    Re: python 3.3 repr Gene Heskett <gheskett@wdtv.com> - 2013-11-15 14:23 -0500


#59532

From	Joel Goldstick <joel.goldstick@gmail.com>
Date	2013-11-15 10:07 -0500
Message-ID	<mailman.2665.1384528079.18130.python-list@python.org>
In reply to	#59511

On Fri, Nov 15, 2013 at 10:03 AM, Robin Becker <robin@reportlab.com> wrote:
> ...........
>
>>> became popular.
>>>
>> Really? you cried and laughed over 7 vs. 8 bits?  That's lovely (?).
>> ;).  That eighth bit sure was less confusing than codepoint
>> translations
>
>
>
> no we had 6 bits in 60 bit words as I recall; extracting the nth character
> involved division by 6; smart people did tricks with inverted
> multiplications etc etc  :(
> --

Cool, someone here is older than me!  I came in with the 8080, and I
remember split octal, but sixes are something I missed out on.
> Robin Becker



-- 
Joel Goldstick
http://joelgoldstick.com


#59534

From	Chris Angelico <rosuav@gmail.com>
Date	2013-11-16 02:08 +1100
Message-ID	<mailman.2666.1384528119.18130.python-list@python.org>
In reply to	#59511

On Sat, Nov 16, 2013 at 1:43 AM, Robin Becker <robin@reportlab.com> wrote:
> ..........
>
>> I'm still stuck on Python 2, and while I can understand the controversy
>> ("It breaks my Python 2 code!"), this seems like the right thing to have
>> done.  In Python 2, unicode is an add-on.  One of the big design drivers in
>> Python 3 was to make unicode the standard.
>>
>> The idea behind repr() is to provide a "just plain text" representation of
>> an object.  In P2, "just plain text" means ascii, so escaping non-ascii
>> characters makes sense.  In P3, "just plain text" means unicode, so escaping
>> non-ascii characters no longer makes sense.
>>
>
> unfortunately the word 'printable' got into the definition of repr; it's
> clear that printability is not the same as unicode at least as far as the
> print function is concerned. In my opinion it would have been better to
> leave the old behaviour as that would have eased the compatibility.

"Printable" means many different things in different contexts. In some
contexts, the sequence \x66\x75\x63\x6b is considered unprintable, yet
each of those characters is perfectly displayable in its natural form.
Under IDLE, non-BMP characters can't be displayed (or at least, that's
how it has been; I haven't checked current status on that one). On
Windows, the console runs in codepage 437 by default (again, I may be
wrong here), so anything not representable in that has to be escaped.
My Linux box has its console set to full Unicode, everything working
perfectly, so any non-control character can be printed. As far as
Python's concerned, all of that is outside - something is "printable"
if it's printable within Unicode, and the other hassles are matters of
encoding. (Except the first one. I don't think there's an encoding
"g-rated".)
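
To make the distinction concrete, here is a minimal sketch (my own illustration, not code from the thread) comparing Python 3's repr() with the built-in ascii(), which escapes non-ASCII characters the way Python 2's repr() did. The sample string is arbitrary. Note that printing the repr() result can still fail if the console encoding cannot represent U+1234, which is the encoding hassle described above, so only the ASCII-safe values are printed here.

```python
s = "q\u1234zy\n"

# Python 3 repr(): printable non-ASCII characters are kept as-is,
# non-printable ones (such as the newline) are still escaped.
r = repr(s)

# ascii(): everything outside ASCII is escaped as well.
a = ascii(s)

print(a)                                            # 'q\u1234zy\n'
print("\u1234".isprintable(), "\n".isprintable())   # True False
```

If the old escaped form is needed for compatibility, ascii() is one way to get it in Python 3.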

> The python gods don't count that sort of thing as important enough so we get
> the mess that is the python2/3 split. ReportLab has to do both so it's a
> real issue; in addition swapping the str - unicode pair to bytes str doesn't
> help one's mental models either :(

That's fixing, in effect, a long-standing bug - of a sort. The name
"str" needs to be applied to the most normal string type. As of Python
3, that's a Unicode string, which is as it should be. In Python 2, it
was the ASCII/bytes string, which still fit the description of "most
normal string type", but that means that Python 2 programs are
Unicode-unaware by default, which is a flaw. Hence the Py3 fix.

> Things went wrong when utf8 was not adopted as the standard encoding thus
> requiring two string types, it would have been easier to have a len function
> to count bytes as before and a glyphlen to count glyphs. Now as I understand
> it we have a complicated mess under the hood for unicode objects so they
> have a variable representation to approximate an 8 bit representation when
> suitable etc etc etc.

http://unspecified.wordpress.com/2012/04/19/the-importance-of-language-level-abstract-unicode-strings/

There are languages that do what you describe. It's very VERY easy to
break stuff. What happens when you slice a string?

>>> foo = "asdf"
>>> foo[:2],foo[2:]
('as', 'df')

>>> foo = "q\u1234zy"
>>> foo[:2],foo[2:]
('qሴ', 'zy')

Looks good to me. I split a four-character string, I get two
two-character strings. If that had been done in UTF-8, either I would
need to know "don't split at that boundary, that's between bytes in a
character", or else the indexing and slicing would have to be done by
counting characters from the beginning of the string - an O(n)
operation, rather than an O(1) pointer arithmetic, not to mention that
it'll blow your CPU cache (touching every part of a potentially-long
string) just to find the position.
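
A rough sketch of both failure modes, assuming the string were held as raw UTF-8 bytes (the helper name byte_offset is made up for illustration): slicing at the same numeric offset cuts U+1234 in half, and finding the right offset means scanning from the start.

```python
text = "q\u1234zy"
data = text.encode("utf-8")      # 1 + 3 + 1 + 1 = 6 bytes

# Code point slicing always lands between characters.
head, tail = text[:2], text[2:]

# Byte slicing at offset 2 lands inside the 3-byte sequence for U+1234.
try:
    data[:2].decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False


def byte_offset(data: bytes, n: int) -> int:
    """Return the byte offset of code point n by scanning from the start (O(n))."""
    count = 0
    for i, b in enumerate(data):
        if b & 0xC0 != 0x80:     # not a continuation byte: a code point starts here
            if count == n:
                return i
            count += 1
    if count == n:
        return len(data)
    raise IndexError(n)


print(len(text), len(data))      # 4 6
print(decoded_ok)                # False
print(byte_offset(data, 2))      # 4
```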

The only reliable way to manage things is to work with true Unicode.
You can completely ignore the internal CPython representation; what
matters is that Python (any implementation, as long as it conforms
with version 3.3 or later) lets you index Unicode codepoints out of a
Unicode string, without differentiating between those that happen to
be ASCII, those that fit in a single byte, those that fit in two
bytes, and those that are flagged RTL, because none of those
considerations makes any difference to you.

It takes some getting your head around, but it's worth it - same as
using git instead of a Windows shared drive. (I'm still trying to push
my family to think git.)

ChrisA


#59536

From	Robin Becker <robin@reportlab.com>
Date	2013-11-15 15:18 +0000
Message-ID	<mailman.2668.1384528688.18130.python-list@python.org>
In reply to	#59511

On 15/11/2013 15:07, Joel Goldstick wrote:
........



>
> Cool, someone here is older than me!  I came in with the 8080, and I
> remember split octal, but sixes are something I missed out on.

The pdp 10/15 had 18 bit words and could be organized as 3*6 or 2*9, pdp 8s had 
12 bits I think, then came the IBM 7094 which had 36 bits and finally the 
CDC6000 & 7600 machines with 60 bits, some one must have liked 6's
-mumbling-ly yrs-
Robin Becker


#59538

From	Roy Smith <roy@panix.com>
Date	2013-11-15 10:32 -0500
Message-ID	<mailman.2669.1384529579.18130.python-list@python.org>
In reply to	#59511


On Nov 15, 2013, at 10:18 AM, Robin Becker wrote:

> The pdp 10/15 had 18 bit words and could be organized as 3*6 or 2*9

I don't know about the 15, but the 10 had 36 bit words (18-bit halfwords).  One common character packing was 5 7-bit characters per 36 bit word (with the sign bit left over).

Anybody remember RAD-50?  It let you represent a 6-character filename (plus a 3-character extension) in a 16 bit word.  RT-11 used it, not sure if it showed up anywhere else.

---
Roy Smith
roy@panix.com


#59545

From	William Ray Wing <wrw@mac.com>
Date	2013-11-15 11:30 -0500
Message-ID	<mailman.2675.1384533037.18130.python-list@python.org>
In reply to	#59511

On Nov 15, 2013, at 10:18 AM, Robin Becker <robin@reportlab.com> wrote:

> On 15/11/2013 15:07, Joel Goldstick wrote:
> ........
> 
> 
> 
>> 
>> Cool, someone here is older than me!  I came in with the 8080, and I
>> remember split octal, but sixes are something I missed out on.
> 
> The pdp 10/15 had 18 bit words and could be organized as 3*6 or 2*9, pdp 8s had 12 bits I think, then came the IBM 7094 which had 36 bits and finally the CDC6000 & 7600 machines with 60 bits, some one must have liked 6's
> -mumbling-ly yrs-
> Robin Becker

Yes, the PDP-8s, LINC-8s, and PDP-12s were all 12-bit computers.  However the LINC-8 operated with word-pairs (instruction in one location followed by address to be operated on in the next) so it was effectively a 24-bit computer and the PDP-12 was able to execute BOTH PDP-8 and LINC-8 instructions (it added one extra instruction to each set that flipped the mode).

First assembly language program I ever wrote was on a PDP-12.  (If there is an emoticon for a face with a gray beard, I don't know it.)

-Bill


#59548

From	Zero Piraeus <z@etiol.net>
Date	2013-11-15 14:06 -0300
Message-ID	<mailman.2677.1384535196.18130.python-list@python.org>
In reply to	#59511

:

On Fri, Nov 15, 2013 at 10:32:54AM -0500, Roy Smith wrote:
> Anybody remember RAD-50?  It let you represent a 6-character filename
> (plus a 3-character extension) in a 16 bit word.  RT-11 used it, not
> sure if it showed up anywhere else.

Presumably 16 is a typo, but I just had a moderate amount of fun
envisaging how that might work: if the characters were restricted to
vowels, then 5**6 < 2**14, giving a couple of bits left over for a
choice of four preset "three-character" extensions.

I can't say that AEIOUA.EX1 looks particularly appealing, though ...
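
For anyone who wants to check the arithmetic, a toy sketch of that vowels-only scheme (the extension list and the function names are invented for illustration; this is not a real DEC format):

```python
VOWELS = "AEIOU"
EXTENSIONS = ("EX0", "EX1", "EX2", "EX3")   # hypothetical preset extensions


def pack(name: str, ext_index: int) -> int:
    """Pack six vowels (base 5, 14 bits) plus a 2-bit extension index into 16 bits."""
    if len(name) != 6 or not 0 <= ext_index < 4:
        raise ValueError("need exactly six vowels and an extension index 0..3")
    value = 0
    for ch in name:
        value = value * 5 + VOWELS.index(ch)
    return (ext_index << 14) | value


def unpack(word: int):
    """Inverse of pack(): return (name, extension)."""
    ext_index, value = word >> 14, word & 0x3FFF
    chars = []
    for _ in range(6):
        value, digit = divmod(value, 5)
        chars.append(VOWELS[digit])
    return "".join(reversed(chars)), EXTENSIONS[ext_index]


print(5**6, 2**14)            # 15625 16384
print(pack("AEIOUA", 1))      # 17354
print(unpack(17354))          # ('AEIOUA', 'EX1')
```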

 -[]z.

-- 
Zero Piraeus: pollice verso
http://etiol.net/pubkey.asc


#59550

From	Chris Angelico <rosuav@gmail.com>
Date	2013-11-16 04:11 +1100
Message-ID	<mailman.2678.1384535511.18130.python-list@python.org>
In reply to	#59511

On Sat, Nov 16, 2013 at 4:06 AM, Zero Piraeus <z@etiol.net> wrote:
> :
>
> On Fri, Nov 15, 2013 at 10:32:54AM -0500, Roy Smith wrote:
>> Anybody remember RAD-50?  It let you represent a 6-character filename
>> (plus a 3-character extension) in a 16 bit word.  RT-11 used it, not
>> sure if it showed up anywhere else.
>
> Presumably 16 is a typo, but I just had a moderate amount of fun
> envisaging how that might work: if the characters were restricted to
> vowels, then 5**6 < 2**14, giving a couple of bits left over for a
> choice of four preset "three-character" extensions.
>
> I can't say that AEIOUA.EX1 looks particularly appealing, though ...

Looks like it might be this scheme:

https://en.wikipedia.org/wiki/DEC_Radix-50

36-bit word for a 6-char filename, but there was also a 16-bit
variant. I do like that filename scheme you describe, though it would
tend to produce names that would suit virulent diseases.

ChrisA


#59553

From	Serhiy Storchaka <storchaka@gmail.com>
Date	2013-11-15 19:37 +0200
Message-ID	<mailman.2680.1384537043.18130.python-list@python.org>
In reply to	#59511

15.11.13 17:32, Roy Smith написав(ла):
> Anybody remember RAD-50?  It let you represent a 6-character filename
> (plus a 3-character extension) in a 16 bit word.  RT-11 used it, not
> sure if it showed up anywhere else.

In three 16-bit words.
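
The numbers work because 40**3 = 64000 fits in 16 bits, so three characters from a 40-symbol set go into each word. A minimal sketch follows, assuming the PDP-11 character table as I remember it (space, A-Z, $, ., one unused code shown here as %, then 0-9); the function names are mine, not from any DEC software.

```python
# Assumed PDP-11 RADIX-50 table: 40 symbols, index = character code.
RAD50 = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"


def encode_word(chars: str) -> int:
    """Pack exactly three RAD50 characters into one 16-bit value."""
    if len(chars) != 3:
        raise ValueError("need exactly three characters")
    a, b, c = (RAD50.index(ch) for ch in chars)
    return (a * 40 + b) * 40 + c


def decode_word(word: int) -> str:
    """Unpack one 16-bit value into three RAD50 characters."""
    word, c = divmod(word, 40)
    a, b = divmod(word, 40)
    return RAD50[a] + RAD50[b] + RAD50[c]


def encode_filename(name: str, ext: str) -> list:
    """Encode a 6.3 filename as three words: two for the name, one for the extension."""
    if len(name) > 6 or len(ext) > 3:
        raise ValueError("name is limited to 6 characters, extension to 3")
    field = name.upper().ljust(6) + ext.upper().ljust(3)
    return [encode_word(field[i:i + 3]) for i in (0, 3, 6)]


print(encode_word("ABC"))                 # 1683
words = encode_filename("SWAP", "SYS")
print(all(w < 2**16 for w in words))      # True
print("".join(decode_word(w) for w in words))
```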


#59546

From	Gene Heskett <gheskett@wdtv.com>
Date	2013-11-15 11:36 -0500
Message-ID	<mailman.2676.1384533802.18130.python-list@python.org>
In reply to	#59510

On Friday 15 November 2013 11:28:19 Joel Goldstick did opine:

> On Fri, Nov 15, 2013 at 10:03 AM, Robin Becker <robin@reportlab.com> wrote:
> > ...........
> > 
> >>> became popular.
> >> 
> >> Really? you cried and laughed over 7 vs. 8 bits?  That's lovely (?).
> >> ;).  That eighth bit sure was less confusing than codepoint
> >> translations
> > 
> > no we had 6 bits in 60 bit words as I recall; extracting the nth
> > character involved division by 6; smart people did tricks with
> > inverted multiplications etc etc  :(
> > --
> 
> Cool, someone here is older than me!  I came in with the 8080, and I
> remember split octal, but sixes are something I missed out on.

Ok, if you are feeling old & decrepit, hows this for a birthday: 10/04/34, 
I came into micro computers about RCA 1802 time.  Wrote a program for the 
1802 without an assembler, for tape editing in '78 at KRCR-TV in Redding 
CA, that was still in use in '94, but never really wrote assembly code 
until the 6809 was out in the Radio Shack Color Computers.  os9 on the 
coco's was the best teacher about the unix way of doing things there ever 
was.  So I tell folks these days that I am 39, with 40 years experience at 
being 39. ;-)

> > Robin Becker

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)

Counting in binary is just like counting in decimal -- if you are all 
thumbs.
		-- Glaser and Way
A pen in the hand of this president is far more
dangerous than 200 million guns in the hands of
         law-abiding citizens.


#59557

From	Mark Lawrence <breamoreboy@yahoo.co.uk>
Date	2013-11-15 17:58 +0000
Message-ID	<mailman.2681.1384538347.18130.python-list@python.org>
In reply to	#59510

On 15/11/2013 16:36, Gene Heskett wrote:
> On Friday 15 November 2013 11:28:19 Joel Goldstick did opine:
>
>> On Fri, Nov 15, 2013 at 10:03 AM, Robin Becker <robin@reportlab.com> wrote:
>>> ...........
>>>
>>>>> became popular.
>>>>
>>>> Really? you cried and laughed over 7 vs. 8 bits?  That's lovely (?).
>>>> ;).  That eighth bit sure was less confusing than codepoint
>>>> translations
>>>
>>> no we had 6 bits in 60 bit words as I recall; extracting the nth
>>> character involved division by 6; smart people did tricks with
>>> inverted multiplications etc etc  :(
>>> --
>>
>> Cool, someone here is older than me!  I came in with the 8080, and I
>> remember split octal, but sixes are something I missed out on.
>
> Ok, if you are feeling old & decrepit, hows this for a birthday: 10/04/34,
> I came into micro computers about RCA 1802 time.  Wrote a program for the
> 1802 without an assembler, for tape editing in '78 at KRCR-TV in Redding
> CA, that was still in use in '94, but never really wrote assembly code
> until the 6809 was out in the Radio Shack Color Computers.  os9 on the
> coco's was the best teacher about the unix way of doing things there ever
> was.  So I tell folks these days that I am 39, with 40 years experience at
> being 39. ;-)
>
>>> Robin Becker
>
>
> Cheers, Gene
>

I also used the RCA 1802, but did you use the Ferranti F100L?  Rationale 
for the use of both, mid/late 70s they were the only processors of their 
respective type with military approvals.

Can't remember how we coded on the F100L, but the 1802 work was done on 
the Texas Instruments Silent 700, copying from one cassette tape to 
another.  Set the controls wrong when copying and whoops, you've just 
overwritten the work you've just done.  We could have had a decent 
development environment but it was on a UK MOD cost plus project, so the 
more inefficiently you worked, the more profit your employer made.

-- 
Python is the second best programming language in the world.
But the best has yet to be invented.  Christian Tismer

Mark Lawrence


#59559

From	Gene Heskett <gheskett@wdtv.com>
Date	2013-11-15 14:23 -0500
Message-ID	<mailman.2683.1384543434.18130.python-list@python.org>
In reply to	#59510

On Friday 15 November 2013 13:52:40 Mark Lawrence did opine:

> On 15/11/2013 16:36, Gene Heskett wrote:
> > On Friday 15 November 2013 11:28:19 Joel Goldstick did opine:
> >> On Fri, Nov 15, 2013 at 10:03 AM, Robin Becker <robin@reportlab.com> wrote:
> >>> ...........
> >>> 
> >>>>> became popular.
> >>>> 
> >>>> Really? you cried and laughed over 7 vs. 8 bits?  That's lovely
> >>>> (?). ;).  That eighth bit sure was less confusing than codepoint
> >>>> translations
> >>> 
> >>> no we had 6 bits in 60 bit words as I recall; extracting the nth
> >>> character involved division by 6; smart people did tricks with
> >>> inverted multiplications etc etc  :(
> >>> --
> >> 
> >> Cool, someone here is older than me!  I came in with the 8080, and I
> >> remember split octal, but sixes are something I missed out on.
> > 
> > Ok, if you are feeling old & decrepit, hows this for a birthday:
> > 10/04/34, I came into micro computers about RCA 1802 time.  Wrote a
> > program for the 1802 without an assembler, for tape editing in '78 at
> > KRCR-TV in Redding CA, that was still in use in '94, but never really
> > wrote assembly code until the 6809 was out in the Radio Shack Color
> > Computers.  os9 on the coco's was the best teacher about the unix way
> > of doing things there ever was.  So I tell folks these days that I am
> > 39, with 40 years experience at being 39. ;-)
> > 
> >>> Robin Becker
> > 
> > Cheers, Gene
> 
> I also used the RCA 1802, but did you use the Ferranti F100L?  Rationale
> for the use of both, mid/late 70s they were the only processors of their
> respective type with military approvals.
> 
> Can't remember how we coded on the F100L, but the 1802 work was done on
> the Texas Instruments Silent 700, copying from one cassette tape to
> another.  Set the controls wrong when copying and whoops, you've just
> overwritten the work you've just done.  We could have had a decent
> development environment but it was on a UK MOD cost plus project, so the
> more inefficiently you worked, the more profit your employer made.

BTDT but in 1959-60 era.  Testing the ullage pressure regulators for the 
early birds, including some that gave John Glenn his first ride or 2.  I 
don't recall the brand of paper tape recorders, but they used 12at7's & 
12au7's by the grocery sack full.  One or more got noisy & me being the 
budding C.E.T. that I now am, of course ran down the bad ones and requested 
new ones.  But you had to turn in the old ones, which Stellardyne Labs 
simply recycled back to you the next time you needed a few.  Hopeless 
management IMO, but that's cost plus for you.

At 10k$ a truckload for helium back then, each test lost about $3k worth of 
helium because the recycle catcher tank was so thin walled.  And the 6 
stage cardox re-compressor was so leaky, occasionally blowing up a pipe out 
of the last stage that put about 7800 lbs back in the monel tanks.

I considered that a huge waste compared to the cost of a 12au7, then about 
$1.35, and raised hell, so I got fired.  They simply did not care that a 
perfectly good regulator was being abused to death when it took 10 or more 
test runs to get one good recording for the certification. At those 
operating pressures, the valve faces erode just like the seats in your 
shower faucets do in 20 years.  Ten such runs and you may as well bin it, 
but they didn't.

I am amazed that as many of those birds worked as did.  Of course if it 
wasn't manned, they didn't talk about the roman candles on the launch pads. 
I heard one story that they had to regrade one pad's real estate at 
Vandenberg & start all over, seems some ID10T had left the cable to the 
explosive bolts hanging on the cable tower.  Ooops, and there's no off 
switch in many of those once the umbilical has been dropped.

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)

Tehee quod she, and clapte the wyndow to.
		-- Geoffrey Chaucer
A pen in the hand of this president is far more
dangerous than 200 million guns in the hands of
         law-abiding citizens.
