| Started by | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| First post | 2014-05-31 17:10 +0100 |
| Last post | 2014-06-03 14:22 -0400 |
| Articles | 20 on this page of 92 — 19 participants |
Python 3.2 has some deadly infection Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-05-31 17:10 +0100
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-05-31 22:55 +0300
Re: Python 3.2 has some deadly infection Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-01 02:26 +0000
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-01 12:43 +1000
Re: Python 3.2 has some deadly infection Tim Delaney <timothy.c.delaney@gmail.com> - 2014-06-02 08:54 +1000
Re: Python 3.2 has some deadly infection Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-02 01:14 +0000
Re: Python 3.2 has some deadly infection Tim Delaney <timothy.c.delaney@gmail.com> - 2014-06-02 12:23 +1000
Re: Python 3.2 has some deadly infection Rustom Mody <rustompmody@gmail.com> - 2014-06-01 19:46 -0700
Re: Python 3.2 has some deadly infection Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> - 2014-06-02 07:45 +0000
Re: Python 3.2 has some deadly infection Tim Delaney <timothy.c.delaney@gmail.com> - 2014-06-02 19:02 +1000
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-02 19:14 +1000
Re: Python 3.2 has some deadly infection Robin Becker <robin@reportlab.com> - 2014-06-02 12:10 +0100
Re: Python 3.2 has some deadly infection Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-03 16:34 +0000
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-04 02:43 +1000
Re: Python 3.2 has some deadly infection Terry Reedy <tjreedy@udel.edu> - 2014-06-02 17:34 -0400
Re: Python 3.2 has some deadly infection Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-06-03 17:16 +1200
Re: Python 3.2 has some deadly infection Terry Reedy <tjreedy@udel.edu> - 2014-06-03 02:21 -0400
Re: Python 3.2 has some deadly infection Robin Becker <robin@reportlab.com> - 2014-06-03 15:18 +0100
Re: Python 3.2 has some deadly infection Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-04 13:08 +0000
Re: Python 3.2 has some deadly infection Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-06-05 14:01 +1200
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-05 10:16 +0300
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-05 17:30 +1000
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-05 11:05 +0300
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-05 18:36 +1000
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-05 12:53 +0300
Re: Python 3.2 has some deadly infection wxjmfauth@gmail.com - 2014-06-05 05:43 -0700
Re: Python 3.2 has some deadly infection Terry Reedy <tjreedy@udel.edu> - 2014-06-05 14:50 -0400
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-05 23:21 +0300
Re: Python 3.2 has some deadly infection Terry Reedy <tjreedy@udel.edu> - 2014-06-05 18:09 -0400
Re: Python 3.2 has some deadly infection Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-05 23:13 +0000
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-06 02:30 +0300
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-06 09:39 +1000
Re: Python 3.2 has some deadly infection Terry Reedy <tjreedy@udel.edu> - 2014-06-05 22:08 -0400
Re: Python 3.2 has some deadly infection Ethan Furman <ethan@stoneleaf.us> - 2014-06-05 20:47 -0700
Re: Python 3.2 has some deadly infection Steven D'Aprano <steve@pearwood.info> - 2014-06-05 08:34 +0000
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-05 12:41 +0300
Re: Python 3.2 has some deadly infection Rustom Mody <rustompmody@gmail.com> - 2014-06-05 06:37 -0700
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-05 17:45 +0300
Re: Python 3.2 has some deadly infection Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-05 15:33 +0000
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-06 02:12 +1000
Re: Python 3.2 has some deadly infection Rustom Mody <rustompmody@gmail.com> - 2014-06-05 09:54 -0700
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-06 03:36 +1000
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-05 19:52 +0300
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-06 03:28 +1000
Re: Python 3.2 has some deadly infection Rustom Mody <rustompmody@gmail.com> - 2014-06-05 15:35 -0700
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-06 08:52 +1000
Re: Python 3.2 has some deadly infection Rustom Mody <rustompmody@gmail.com> - 2014-06-05 20:11 -0700
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-06 13:20 +1000
Re: Python 3.2 has some deadly infection Rustom Mody <rustompmody@gmail.com> - 2014-06-05 20:32 -0700
Re: Python 3.2 has some deadly infection Akira Li <4kir4.1i@gmail.com> - 2014-06-06 12:03 +0400
Re: Python 3.2 has some deadly infection Robin Becker <robin@reportlab.com> - 2014-06-05 16:37 +0100
Re: Python 3.2 has some deadly infection Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-05 16:16 +0000
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-06 01:50 +1000
Re: Python 3.2 has some deadly infection Robin Becker <robin@reportlab.com> - 2014-06-05 17:17 +0100
Re: Python 3.2 has some deadly infection Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-05 16:32 +0000
Re: Python 3.2 has some deadly infection Ethan Furman <ethan@stoneleaf.us> - 2014-06-06 07:40 -0700
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-06 03:14 +1000
Re: Python 3.2 has some deadly infection Ian Kelly <ian.g.kelly@gmail.com> - 2014-06-05 11:16 -0600
Re: Python 3.2 has some deadly infection Terry Reedy <tjreedy@udel.edu> - 2014-06-05 14:11 -0400
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-05 21:30 +0300
Re: Python 3.2 has some deadly infection Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-05 23:02 +0000
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-06 02:21 +0300
Re: Python 3.2 has some deadly infection Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-06 12:15 +0000
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-06 16:00 +0300
Re: Python 3.2 has some deadly infection rurpy@yahoo.com - 2014-06-07 21:34 -0700
Re: Python 3.2 has some deadly infection Ethan Furman <ethan@stoneleaf.us> - 2014-06-06 06:24 -0700
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-06 17:10 +0300
Re: Python 3.2 has some deadly infection Michael Torrie <torriem@gmail.com> - 2014-06-06 09:02 -0600
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-06 18:32 +0300
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-07 01:50 +1000
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-06 20:02 +0300
Re: Python 3.2 has some deadly infection Rustom Mody <rustompmody@gmail.com> - 2014-06-06 10:13 -0700
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-07 03:26 +1000
Re: Python 3.2 has some deadly infection wxjmfauth@gmail.com - 2014-06-06 11:03 -0700
Re: Python 3.2 has some deadly infection Denis McMahon <denismfmcmahon@gmail.com> - 2014-06-06 21:18 +0000
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-07 08:18 +1000
Re: Python 3.2 has some deadly infection Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-06 15:57 +0000
Re: Python 3.2 has some deadly infection Rustom Mody <rustompmody@gmail.com> - 2014-06-06 09:21 -0700
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-07 02:48 +1000
Re: Python 3.2 has some deadly infection Rustom Mody <rustompmody@gmail.com> - 2014-06-06 10:04 -0700
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-07 03:12 +1000
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-06 20:11 +0300
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-07 03:16 +1000
Re: Python 3.2 has some deadly infection Marko Rauhamaa <marko@pacujo.net> - 2014-06-06 20:18 +0300
Re: Python 3.2 has some deadly infection Ned Batchelder <ned@nedbatchelder.com> - 2014-06-06 13:33 -0400
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-07 01:25 +1000
Re: Python 3.2 has some deadly infection wxjmfauth@gmail.com - 2014-06-06 08:44 -0700
Re: Python 3.2 has some deadly infection wxjmfauth@gmail.com - 2014-06-06 08:48 -0700
Re: Python 3.2 has some deadly infection Robin Becker <robin@reportlab.com> - 2014-06-06 12:56 +0100
Re: Python 3.2 has some deadly infection Akira Li <4kir4.1i@gmail.com> - 2014-06-05 06:49 +0400
Re: Python 3.2 has some deadly infection Chris Angelico <rosuav@gmail.com> - 2014-06-04 00:25 +1000
Re: Python 3.2 has some deadly infection Terry Reedy <tjreedy@udel.edu> - 2014-06-03 14:22 -0400
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-06-05 10:16 +0300 |
| Message-ID | <87a99r7rmx.fsf@elektro.pacujo.net> |
| In reply to | #72665 |
Gregory Ewing <greg.ewing@canterbury.ac.nz>:
> As a result, most unix programs, most of the time, deal
> with text on stdin and stdout.
Well, ok. But even accepting that premise, that "text" might not be what
Python3 considers "text". For example, if your program reads in XML,
JSON or Python, the parser object might prefer to take it in as bytes
and not have it predecoded by sys.stdin.
> So, it makes sense for them to be text by default.
I'm not sure. That could lead to nasty surprises. I've experienced
analogous consternations when the "sort" utility hasn't worked
identically for identical input: it is heavily influenced by the (spit,
spit) locale. That's why 99.9% of your scripts should prefix "sort" and
"grep" with LC_ALL=C -- even when the input really is UTF-8.
Should I now take it further and prefix all Python programs with
LC_ALL=C? Probably not, since UTF-8 might cause sys.stdin to barf.
> And wherever there's text, there needs to be an encoding.
No problem there, only should sys.stdin and sys.stdout carry the
decoding/encoding out or should it be left for the program.
Marko
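As a small illustration of a parser that takes bytes directly: from Python 3.6 onward (later than the versions discussed in this thread), json.loads() accepts bytes and works out for itself whether they are UTF-8, UTF-16 or UTF-32. The sketch below assumes that version; load_json_from_binary and fake_stdin are names made up for the example, with an in-memory stream standing in for sys.stdin.buffer.

```python
import io
import json


def load_json_from_binary(stream):
    """Parse one JSON document from a binary stream without pre-decoding it."""
    raw = stream.read()      # bytes, exactly as they arrived
    return json.loads(raw)   # Python 3.6+: json detects UTF-8/16/32 itself


# Stand-in for sys.stdin.buffer so the sketch runs without a pipe.
fake_stdin = io.BytesIO('{"name": "Zo\u00eb"}'.encode("utf-16"))
document = load_json_from_binary(fake_stdin)
print(ascii(document["name"]))
```

In a real filter the call would be load_json_from_binary(sys.stdin.buffer), leaving the locale out of the picture entirely.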
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-06-05 17:30 +1000 |
| Message-ID | <mailman.10727.1401953433.18130.python-list@python.org> |
| In reply to | #72684 |
On Thu, Jun 5, 2014 at 5:16 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> No problem there, only should sys.stdin and sys.stdout carry the
> decoding/encoding out or should it be left for the program.
The most normal thing to do with the standard streams is to have them
produce text, and as much as possible, you shouldn't have to go to
great lengths to make that work. If, in Python, I say print("Hello,
world!"), I expect that to produce a line of text on the screen,
without my code having to encode that to bytes, figure out what sort
of newline to add, etc, etc.
Even if stdout isn't a tty, chances are you're still working with
text. Only an extreme few Unix programs actually manipulate binary
standard streams (some, like cat, will pipe binary through unchanged,
but even cat assumes text for options like -n); those few should be
the ones to have to worry about setting stdin and stdout to be binary.
In the same way that we have double-quoted strings being Unicode
strings, we should have print() and input() "naturally just work" with
Unicode, which means they should negotiate encodings with the system
without the programmer having to lift a finger.
ChrisA
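A minimal sketch of what the text layer does on behalf of print(), using an in-memory byte sink rather than the real stdout. The encoding and newline settings are pinned here as assumptions so the result is the same everywhere; the real sys.stdout takes them from the environment.

```python
import io

# An in-memory byte sink standing in for file descriptor 1.
raw = io.BytesIO()
# newline="\n" pins the line ending so the result is the same on every OS.
text_out = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

print("Hello, w\u00f6rld!", file=text_out)
text_out.flush()   # push buffered text down to the bytes layer

encoded = raw.getvalue()
print(encoded)
```

The caller passes a str and never mentions bytes; the wrapper does the encoding and appends the line ending.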
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-06-05 11:05 +0300 |
| Message-ID | <871tv37pdt.fsf@elektro.pacujo.net> |
| In reply to | #72687 |
Chris Angelico <rosuav@gmail.com>:
> If, in Python, I say print("Hello, world!"), I expect that to produce
> a line of text on the screen, without my code having to encode that to
> bytes, figure out what sort of newline to add, etc, etc.
That example in no way represents the typical Python program (if there
is one).
> Only an extreme few Unix programs actually manipulate binary standard
> streams
That's quite an assumption to make.
> we should have print() and input() "naturally just work" with Unicode
No problem there. I couldn't imagine using either function for anything
serious.
Marko
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-06-05 18:36 +1000 |
| Message-ID | <mailman.10730.1401957426.18130.python-list@python.org> |
| In reply to | #72689 |
On Thu, Jun 5, 2014 at 6:05 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Chris Angelico <rosuav@gmail.com>:
>
>> If, in Python, I say print("Hello, world!"), I expect that to produce
>> a line of text on the screen, without my code having to encode that to
>> bytes, figure out what sort of newline to add, etc, etc.
>
> That example in no way represents the typical Python program (if there
> is one).
It's simpler than most, but use of print() is certainly quite common.
A naive search of .py files in my /usr came up with five thousand
instances of ' print(', and given that that search won't necessarily
find a Python 2 print statement (and I'm on Debian Wheezy, so Py2 is
the system Python), I think that's a fairly respectable figure.
>> Only an extreme few Unix programs actually manipulate binary standard
>> streams
>
> That's quite an assumption to make.
Okay. Start listing some. You have (de)compression programs like gzip,
which primarily work with files but can work with standard streams;
some image or movie manipulation programs (eg avconv) can also read
from stdin, although again, it's far more common to use files; cat
will happily transmit binary untouched, but all its options (at least
the ones I can see in my 'man cat') are for working with text.
What else do you have? Let's see... grep, sort, less/more, sed, awk,
these are all text manipulation programs. All your "give me info about
the system" programs (ls, mount, pwd, hostname, date.......) print
text to stdout. Some also read from stdin, like md5sum and related.
Piles and piles of programs that work with text. A small handful that
work with binary, and most of them are more commonly used directly
with files, not with pipes. The most common case is that it all be
text.
>> we should have print() and input() "naturally just work" with Unicode
>
> No problem there. I couldn't imagine using either function for anything
> serious.
I don't know about those exact functions, but I do know that there are
plenty of Python programs that use the console (take hg as one fairly
hefty example). Maybe input() isn't all that heavily used, but
certainly print() is a fine function. I can not only imagine using
them seriously, I *have used* them, and their equivalents in other
languages, seriously.
If the standard streams are so crucial, why are their most obvious
interfaces insignificant to you?
ChrisA
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-06-05 12:53 +0300 |
| Message-ID | <87sinju1hf.fsf@elektro.pacujo.net> |
| In reply to | #72692 |
Chris Angelico <rosuav@gmail.com>:
> If the standard streams are so crucial, why are their most obvious
> interfaces insignificant to you?
I want the standard streams to consume and produce bytes. I do a lot of
system programming and connect processes to each other with socketpairs,
pipes and the like. I have dealt with plugin APIs that communicate over
stdin and stdout.
Python is clearly on a crusade to make *text* a first class system
entity. I don't believe that is possible (without casualties) in the
linux world. Python text should only exist inside string objects.
Marko
| From | wxjmfauth@gmail.com |
|---|---|
| Date | 2014-06-05 05:43 -0700 |
| Message-ID | <8637bf51-9909-45d4-a209-48347e533f8a@googlegroups.com> |
| In reply to | #72699 |
On Thursday, 5 June 2014 11:53:00 UTC+2, Marko Rauhamaa wrote:
> Chris Angelico <rosuav@gmail.com>:
>
> > If the standard streams are so crucial, why are their most obvious
> > interfaces insignificant to you?
>
> I want the standard streams to consume and produce bytes. I do a lot of
> system programming and connect processes to each other with socketpairs,
> pipes and the like. I have dealt with plugin APIs that communicate over
> stdin and stdout.
>
> Python is clearly on a crusade to make *text* a first class system
> entity. I don't believe that is possible (without casualties) in the
> linux world. Python text should only exist inside string objects.
>
> Marko
=====
Are you sure?
>>> timeit.repeat("(x*1000 + y)", setup="x = 'abc'; y = 'z'")
[0.9457552436453511, 0.9190932610143818, 0.9322044912393039]
>>> timeit.repeat("(x*1000 + y)", setup="x = 'abc'; y = '\u0fce'")
[2.5541921791045183, 2.52434366066052, 2.5337417948967413]
>>> timeit.repeat("(x*1000 + y)", setup="x = 'abc'.encode('utf-8'); y = 'z'.encode('utf-8')")
[0.9168235779232532, 0.8989583403075017, 0.8964204541650247]
>>> timeit.repeat("(x*1000 + y)", setup="x = 'abc'.encode('utf-8'); y = '\u0fce'.encode('utf-8')")
[0.9320969737165115, 0.9086006535332558, 0.9051715140790861]
>>>
>>>
>>> sys.getsizeof('abc'*1000 + '\u0fce')
6040
>>> sys.getsizeof(('abc'*1000 + '\u0fce').encode('utf-8'))
3020
>>>
jmf
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2014-06-05 14:50 -0400 |
| Message-ID | <mailman.10757.1401996534.18130.python-list@python.org> |
| In reply to | #72699 |
On 6/5/2014 5:53 AM, Marko Rauhamaa wrote:
> Chris Angelico <rosuav@gmail.com>:
>
>> If the standard streams are so crucial, why are their most obvious
>> interfaces insignificant to you?
>
> I want the standard streams to consume and produce bytes.
Easy. Read the manual entry for stdxxx. "To write or read binary data
from/to the standard streams, use the underlying binary buffer object.
For example, to write bytes to stdout, use
sys.stdout.buffer.write(b'abc')"
To make it easy, use bound methods.
myfilter.p
----------
import sys
sysin = sys.stdin.buffer.read
sysout = sys.stdout.buffer.write
syserr = sys.stderr.buffer.write
<filter code with calls to sysin, sysout, syserr.>
---
The same trick of defining bound methods to save both writing and
execution time is also useful for text filters when you use
sys.stdin.read, etc, more than once in the text.
When you try this, please report the result, either way.
> I do a lot of system programming and connect processes to each other
> with socketpairs, pipes and the like. I have dealt with plugin APIs
> that communicate over stdin and stdout.
Now you know how to do so on Python 3.
> Python is clearly on a crusade to make *text* a first class system
> entity. I don't believe that is possible (without casualties) in the
> linux world. Python text should only exist inside string objects.
You are clearly on a crusade to push a falsehood. Why? On Windows and, I
believe, Mac, utf-16 encoded text (C widechar type) *is* a 'first class
system entity'. The problem Python has with *nix is getting text bytes
from the system in an unknown or, worse, wrongly-claimed encoding.
The Python developers do their best to cope with the differences and
peculiarities of the systems it runs on.
--
Terry Jan Reedy
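One way the bound-method filter above might be filled in, as a sketch only: run_filter, the chunk size and the upper-casing transform are all invented for the example, and the in-memory streams at the bottom stand in for a real pipe so the snippet runs on its own.

```python
import io
import sys


def run_filter(read=None, write=None):
    """Copy input to output, upper-casing ASCII letters, without decoding."""
    # Bind once; fall back to the real binary streams when nothing is passed.
    read = read or sys.stdin.buffer.read
    write = write or sys.stdout.buffer.write
    while True:
        chunk = read(65536)
        if not chunk:
            break
        write(chunk.upper())   # bytes.upper() only touches ASCII a-z


# Demonstration with in-memory streams; note the bytes that are not valid UTF-8.
source = io.BytesIO(b"caf\xc3\xa9 \xff\x00 raw bytes\n")
sink = io.BytesIO()
run_filter(source.read, sink.write)
print(sink.getvalue())
```

Called with no arguments, run_filter() reads sys.stdin.buffer and writes sys.stdout.buffer, so no encoding is ever consulted.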
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-06-05 23:21 +0300 |
| Message-ID | <8738fjumy8.fsf@elektro.pacujo.net> |
| In reply to | #72748 |
Terry Reedy <tjreedy@udel.edu>:
> On 6/5/2014 5:53 AM, Marko Rauhamaa wrote:
>> Chris Angelico <rosuav@gmail.com>:
>>
>>> If the standard streams are so crucial, why are their most obvious
>>> interfaces insignificant to you?
>>
>> I want the standard streams to consume and produce bytes.
>
> Easy. Read the manual entry for stdxxx. "To write or read binary data
> from/to the standard streams, use the underlying binary buffer object.
> For example, to write bytes to stdout, use
> sys.stdout.buffer.write(b'abc')"
This note from the manual is a bit vague:
   Note that the streams can be replaced with objects (like io.StringIO)
   that do not support the buffer attribute or the detach() method
"Can be replaced" by who? By the Python developers? By me? By random
library calls?
Does it mean the buffer and detach are not guaranteed to stay with the
API?
Marko
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2014-06-05 18:09 -0400 |
| Message-ID | <mailman.10775.1402006179.18130.python-list@python.org> |
| In reply to | #72757 |
On 6/5/2014 4:21 PM, Marko Rauhamaa wrote:
> Terry Reedy <tjreedy@udel.edu>:
>
>> On 6/5/2014 5:53 AM, Marko Rauhamaa wrote:
>>> Chris Angelico <rosuav@gmail.com>:
>>>
>>>> If the standard streams are so crucial, why are their most obvious
>>>> interfaces insignificant to you?
>>>
>>> I want the standard streams to consume and produce bytes.
>>
>> Easy. Read the manual entry for stdxxx. "To write or read binary data
>> from/to the standard streams, use the underlying binary buffer object.
>> For example, to write bytes to stdout, use
>> sys.stdout.buffer.write(b'abc')"
>
> This note from the manual is a bit vague:
>
> Note that the streams can be replaced with objects (like io.StringIO)
> that do not support the buffer attribute or the detach() method
>
> "Can be replaced" by who? By the Python developers? By me? By random
> library calls?
Fair question. The Python developers will not fiddle with stdxxx for 3rd
party code on 3rd party systems. We do sometimes *temporarily* replace
the streams with StringIO, either directly or via test.support, when
testing Python itself or stdlib modules. That is done in Lib/test, and
except for testing StringIO, it is only done as a convenience, not a
necessity.
To test a binary stream filter, you would have to do something else,
like read from and write to actual files on disk. Otherwise, you seem
unlikely to sabotage yourself, even accidentally.
Random non-stdlib library calls could sabotage you. However, in my
opinion, an imported 3rd party module should never modify std streams,
with one exception. The exception would be a module whose entire purpose
was to put the streams in a known state, as documented, and only if
intentionally asked to.
Having said that, bound methods created (first) should work regardless
of any subsequent manipulation of sys. Here is an experiment, run from
an Idle editor.
import sys
sysout = sys.stdout.write
sys.stdout = None
sysout('works anyway\n')
>>>
works anyway
(Of course, subsequent attempts to continue interactively fail. But that
is not your use case.)
--
Terry Jan Reedy
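The same experiment can be made repeatable outside Idle by swapping in io.StringIO objects and restoring the real stream afterwards. This is a sketch; the names first, second and write_first are made up for the example. It also shows the contrast with print(), which looks up sys.stdout each time it is called.

```python
import io
import sys

original = sys.stdout
first = io.StringIO()
second = io.StringIO()
try:
    sys.stdout = first
    write_first = sys.stdout.write   # bound to the `first` object itself
    sys.stdout = second              # rebinding the name does not affect it
    write_first("via bound method\n")
    print("via print")               # print() looks up sys.stdout at call time
finally:
    sys.stdout = original            # always put the real stream back

print(repr(first.getvalue()), repr(second.getvalue()))
```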
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-06-05 23:13 +0000 |
| Message-ID | <5390f997$0$29978$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #72757 |
On Thu, 05 Jun 2014 23:21:35 +0300, Marko Rauhamaa wrote:
> Terry Reedy <tjreedy@udel.edu>:
>
>> On 6/5/2014 5:53 AM, Marko Rauhamaa wrote:
>>> Chris Angelico <rosuav@gmail.com>:
>>>
>>>> If the standard streams are so crucial, why are their most obvious
>>>> interfaces insignificant to you?
>>>
>>> I want the standard streams to consume and produce bytes.
>>
>> Easy. Read the manual entry for stdxxx. "To write or read binary data
>> from/to the standard streams, use the underlying binary buffer object.
>> For example, to write bytes to stdout, use
>> sys.stdout.buffer.write(b'abc')"
>
> This note from the manual is a bit vague:
>
>    Note that the streams can be replaced with objects (like io.StringIO)
>    that do not support the buffer attribute or the detach() method
>
> "Can be replaced" by who? By the Python developers? By me? By random
> library calls?
By you. sys.stdout and friends are writable. Any code you call may have
replaced them with another file-like object, and you should honour that.
The API could have/should have been a little more friendly, but it's
conceptually simple:
* Does sys.stdout have a buffer attribute? Then write raw bytes to the
  buffer.
* If not, then write raw bytes to sys.stdout.
* If either fails, then somebody has replaced stdout with something
  weird, and they deserve whatever horrible fate their damn fool move
  causes. It's not your responsibility to try to keep your application
  running under bizarre circumstances.
--
Steven D'Aprano
http://import-that.dreamwidth.org/
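The three steps above can be sketched as a small helper. write_bytes is a hypothetical name, not a standard-library function, and the flush before writing is an added assumption so that any text already queued in the wrapper is not overtaken by the raw bytes.

```python
import io
import sys


def write_bytes(data, stream=None):
    """Write bytes to stdout (or a replacement), preferring its .buffer."""
    if stream is None:
        stream = sys.stdout   # looked up at call time, honouring replacements
    stream.flush()            # keep earlier text output ahead of the raw bytes
    target = getattr(stream, "buffer", stream)
    target.write(data)        # raises if the replacement cannot take bytes
    target.flush()


# A text stream with a binary buffer underneath, like the default sys.stdout.
wrapped = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
write_bytes(b"\x00\xff", wrapped)

# A replacement that is already binary and has no .buffer attribute.
plain = io.BytesIO()
write_bytes(b"abc", plain)

print(wrapped.buffer.getvalue(), plain.getvalue())
```

A StringIO replacement falls into the third case: it has no buffer and rejects bytes, so the helper simply lets the TypeError propagate.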
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-06-06 02:30 +0300 |
| Message-ID | <87ioof53z1.fsf@elektro.pacujo.net> |
| In reply to | #72787 |
Steven D'Aprano <steve+comp.lang.python@pearwood.info>:
>> "Can be replaced" by who? By the Python developers? By me? By random
>> library calls?
>
> By you. sys.stdout and friends are writable. Any code you call may
> have replaced them with another file-like object, and you should
> honour that.
I can of course overwrite even sys and os and open and all. That hardly
merits mentioning in the API documentation.
What I'm afraid of is that the Python developers are reserving the right
to remove the buffer and detach attributes from the standard streams in
a future version. That would be terrible.
If it means some other module is allowed to commandeer the standard
streams, that would be bad as well.
Worst of all, I don't know why the caveat had to be there.
Or is it maybe because some python command line options could cause
buffer and detach not to be there? That would explain the caveat, but
still would be kinda sucky.
Marko
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-06-06 09:39 +1000 |
| Message-ID | <mailman.10787.1402011598.18130.python-list@python.org> |
| In reply to | #72791 |
On Fri, Jun 6, 2014 at 9:30 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Steven D'Aprano <steve+comp.lang.python@pearwood.info>:
>
>>> "Can be replaced" by who? By the Python developers? By me? By random
>>> library calls?
>>
>> By you. sys.stdout and friends are writable. Any code you call may
>> have replaced them with another file-like object, and you should
>> honour that.
>
> I can of course overwrite even sys and os and open and all. That hardly
> merits mentioning in the API documentation.
>
> What I'm afraid of is that the Python developers are reserving the right
> to remove the buffer and detach attributes from the standard streams in
> a future version. That would be terrible.
>
> If it means some other module is allowed to commandeer the standard
> streams, that would be bad as well.
>
> Worst of all, I don't know why the caveat had to be there.
>
> Or is it maybe because some python command line options could cause
> buffer and detach not to be there? That would explain the caveat, but
> still would be kinda sucky.
It's more that replacing sys.std* is considered reasonably normal
(unlike, say, replacing sys.float_info, which would be a weird thing to
do); and you could replace them with something that doesn't have those
attributes. If you're running a top-level script and you never import
anything that changes the streams, you should be able to depend on
those always being there.
ChrisA
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2014-06-05 22:08 -0400 |
| Message-ID | <mailman.10800.1402020549.18130.python-list@python.org> |
| In reply to | #72791 |
On 6/5/2014 7:30 PM, Marko Rauhamaa wrote:
> Steven D'Aprano <steve+comp.lang.python@pearwood.info>:
>
>>> "Can be replaced" by who? By the Python developers? By me? By random
>>> library calls?
>>
>> By you. sys.stdout and friends are writable. Any code you call may
>> have replaced them with another file-like object, and you should
>> honour that.
>
> I can of course overwrite even sys and os and open and all. That hardly
> merits mentioning in the API documentation.
>
> What I'm afraid of is that the Python developers are reserving the right
> to remove the buffer and detach attributes from the standard streams in
> a future version.
No, not at all.
> That would be terrible.
Agreed.
> If it means some other module is allowed to commandeer the standard
> streams, that would be bad as well.
I think that, for the most part, library modules should either open a
file given a filename from outside or read from and write to open files
handed to them from outside, but not hard-code the std streams. The
module doc should say if the file (name or object) must be text or in
particular binary.
The warning is also a hint as to how to solve a problem, such as testing
a binary filter. Assume the module reads from and writes to .buffer and
has a main function. One approach, untested:
import sys, io, unittest
from mod import main
class Binstd:
    # Minimal stand-in for a standard stream: only .buffer is provided.
    def __init__(self):
        self.buffer = io.BytesIO()
sys.stdin = Binstd()
sys.stdout = Binstd()
sys.stdin.buffer.write(b'test data')  # BytesIO accepts bytes, not str
sys.stdin.buffer.seek(0)
main()
out = sys.stdout.buffer.getvalue()
# test that out is as expected for the input
# seek to 0 and truncate for more tests
> Worst of all, I don't know why the caveat had to be there.
Because the streams can be replaced for a variety of good reasons, as above.
> Or is it maybe because some python command line options could cause
> buffer and detach not to be there? That would explain the caveat, but
> still would be kinda sucky.
The doc set documents the Python command line options, as well as any that
are CPython specific. It is possible that some implementation could add
one to open stdxyz in binary mode. CPython does not really need that.
--
Terry Jan Reedy
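A variation on the approach above, offered as a sketch rather than a recommended recipe: io.TextIOWrapper around io.BytesIO gives a replacement stream that has a real .buffer attribute, and a try/finally puts the original streams back. Here main is a made-up filter standing in for the module under test, and run_with_bytes is a hypothetical helper name.

```python
import io
import sys


def main():
    """Hypothetical filter under test: reverses the bytes on stdin."""
    data = sys.stdin.buffer.read()
    sys.stdout.buffer.write(data[::-1])


def run_with_bytes(func, payload):
    """Run func with stdin/stdout swapped for in-memory, binary-backed streams."""
    saved = sys.stdin, sys.stdout
    sys.stdin = io.TextIOWrapper(io.BytesIO(payload), encoding="utf-8")
    sys.stdout = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
    try:
        func()
        sys.stdout.flush()
        return sys.stdout.buffer.getvalue()
    finally:
        sys.stdin, sys.stdout = saved   # restore even if func raises


result = run_with_bytes(main, b"\x00\x01\xfe\xff")
print(result)
```

Because the payload never passes through the text layer, bytes that are not valid UTF-8 go through untouched.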
| From | Ethan Furman <ethan@stoneleaf.us> |
|---|---|
| Date | 2014-06-05 20:47 -0700 |
| Message-ID | <mailman.10805.1402029381.18130.python-list@python.org> |
| In reply to | #72791 |
On 06/05/2014 04:30 PM, Marko Rauhamaa wrote:
>
> What I'm afraid of is that the Python developers are reserving the right
> to remove the buffer and detach attributes from the standard streams in
> a future version.
Being afraid is silly. If you have a question, ask it.
--
~Ethan~
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2014-06-05 08:34 +0000 |
| Message-ID | <53902bb1$0$11109$c3e8da3@news.astraweb.com> |
| In reply to | #72665 |
On Thu, 05 Jun 2014 14:01:50 +1200, Gregory Ewing wrote:

> Steven D'Aprano wrote:
>> The whole concept of stdin and stdout is based on the idea of having a
>> console to read from and write to.
>
> Not really; stdin and stdout are frequently connected to files, or pipes
> to other processes. The console, if it exists, just happens to be a
> convenient default value for them. Even on a system without a console,
> they're still a useful abstraction.

If you had kept reading my post, including the bits you cut out *wink*,
you'd see that I did raise that same point. Having stdin and stdout
trivially generalises to the idea of replacing them with other files, or
pipes. But the idea of having standard input and standard output in the
first place comes about because they are useful for the console. I gave
the example of Mac, which didn't have a command-line interface at all,
hence no console, no stdin, no stdout. If a system had no command line
interface (hence no consoles), why would you bother with a *standard*
input file and output file that are never used?

> But we were talking about encodings, and whether stdin and stdout should
> be text or binary by default. Well, one of the design principles behind
> unix is to make use of plain text wherever possible.

What's plain text? *half a wink*

It's a serious question. Some people think that "good ol' plain text" is
EBCDIC, like IBM intended. To them, the letter "A" is synonymous with the
byte 0xC1, and there's no need for an encoding (or so they think) because
"A" *is* 0xC1. Of course, people on ASCII systems know better: who needs
encodings when it is a universal fact that "A" *is* 0x41? *wink*

> Not just for stuff
> meant to be seen on the screen, but for stuff kept in files as well.
>
> As a result, most unix programs, most of the time, deal with text on
> stdin and stdout. So, it makes sense for them to be text by default. And
> wherever there's text, there needs to be an encoding. This is true
> whether a console is involved or not.

Agreed.

--
Steven
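The EBCDIC/ASCII point can be checked directly. What follows is a minimal sketch, assuming Python 3 and using its bundled `cp037` codec as a stand-in for "EBCDIC" (the post does not name a specific code page, so that choice is an assumption):

```python
# The same letter maps to different bytes under different encodings.
# cp037 is one EBCDIC code page shipped with Python; it is used here
# only as a stand-in for "EBCDIC" in general (an assumption).
letter = "A"

ebcdic = letter.encode("cp037")
ascii_ = letter.encode("ascii")

print(ebcdic.hex())  # c1
print(ascii_.hex())  # 41

# Reading the EBCDIC byte under a different assumption does not fail;
# it quietly produces a different character.
wrong = ebcdic.decode("latin-1")
print(ascii(wrong))
```

Neither byte value is "the" letter A; each is A only once an encoding has been agreed on.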
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-06-05 12:41 +0300 |
| Message-ID | <87wqcvu20h.fsf@elektro.pacujo.net> |
| In reply to | #72691 |
Steven D'Aprano <steve@pearwood.info>:

> But the idea of having standard input and standard output in the first
> place comes about because they are useful for the console.

I doubt that. Classic programs take input and produce output. Standard
input and output are the default input and output. The textbook Pascal
programs started:

   program myprogram(input, output);

> If a system had no command line interface (hence no consoles), why
> would you bother with a *standard* input file and output file that are
> never used?

Because programs are supposed to do useful work. They consume input and
produce output. That concept is older than computers themselves and is
used to define things like computation, algorithm, halting etc.

> On Thu, 05 Jun 2014 14:01:50 +1200, Gregory Ewing wrote:
>> But we were talking about encodings, and whether stdin and stdout
>> should be text or binary by default. Well, one of the design
>> principles behind unix is to make use of plain text wherever
>> possible.

No, one of the design principles behind unix is that all data is bytes:
memory, files, devices, sockets, pathnames. Yes, the
ASCII-is-good-for-everybody assumption has been there since the
beginning, but Python will not be able to hide the fact that there is no
text data (in the Python sense). There are only bytes.

UTF-8 beautifully gives text a second-class citizenship in unix/linux.
It will never be granted first-class citizenship, though.

>> As a result, most unix programs, most of the time, deal with text on
>> stdin and stdout. So, it makes sense for them to be text by default.
>> And wherever there's text, there needs to be an encoding. This is
>> true whether a console is involved or not.
>
> Agreed.

Disagreed strongly.

   tcpdump -s 0 -w - >error.pcap
   tar zxf - <python.tar.gz
   sha1sum <smile.jpg
   base64 -d <a.dat >a.exe
   wget ftp://micorsops.com/something.avi -O - | mplayer -cache 8192 -

Unfortunately, the text/binary dichotomy breaks a beautiful principle in
Python as well. In numerous contexts, any file-like object will be
valid. Now there is no file-like object. Instead, you have
text-file-like objects and binary-file-like objects, which require
special attention since some operate on strings while others operate on
bytes.


Marko
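The split between text-file-like and binary-file-like objects can be seen in a few lines. This is only a sketch, assuming Python 3; `sha1_of_stream` is a hypothetical helper written for illustration, loosely mirroring the `sha1sum <smile.jpg` example above:

```python
import hashlib
import io


def sha1_of_stream(stream):
    """Hash a binary file-like object chunk by chunk (hypothetical helper)."""
    digest = hashlib.sha1()
    for chunk in iter(lambda: stream.read(8192), b""):
        digest.update(chunk)
    return digest.hexdigest()


# A binary stream works, whatever the bytes happen to be.
binary_stream = io.BytesIO(b"\xff\xd8\xff\xe0 not valid UTF-8")
print(sha1_of_stream(binary_stream))

# A text stream has the same read() method but yields str, which the
# hash object refuses, so "any file-like object" is no longer enough.
text_stream = io.StringIO("some text")
try:
    sha1_of_stream(text_stream)
except TypeError as exc:
    print("text stream rejected:", exc)
```

For the real standard streams, the binary layer underneath the text wrapper is reachable as `sys.stdin.buffer` and `sys.stdout.buffer`.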
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2014-06-05 06:37 -0700 |
| Message-ID | <7b3543f6-6f62-49c5-abdc-e2783fd6d629@googlegroups.com> |
| In reply to | #72697 |
On Thursday, June 5, 2014 3:11:34 PM UTC+5:30, Marko Rauhamaa wrote:
> Steven D'Aprano wrote:
> > But the idea of having standard input and standard output in the first
> > place comes about because they are useful for the console.
>
> I doubt that. Classic programs take input and produce output. Standard
> input and output are the default input and output. The textbook Pascal
> programs started:
>
>    program myprogram(input, output);
>
> > If a system had no command line interface (hence no consoles), why
> > would you bother with a *standard* input file and output file that are
> > never used?
>
> Because programs are supposed to do useful work. They consume input and
> produce output. That concept is older than computers themselves and is
> used to define things like computation, algorithm, halting etc.
>
> > On Thu, 05 Jun 2014 14:01:50 +1200, Gregory Ewing wrote:
> >> But we were talking about encodings, and whether stdin and stdout
> >> should be text or binary by default. Well, one of the design
> >> principles behind unix is to make use of plain text wherever
> >> possible.
>
> No, one of the design principles behind unix is that all data is bytes:
> memory, files, devices, sockets, pathnames. Yes, the
> ASCII-is-good-for-everybody assumption has been there since the
> beginning, but Python will not be able to hide the fact that there is no
> text data (in the Python sense). There are only bytes.
>
> UTF-8 beautifully gives text a second-class citizenship in unix/linux.
> It will never be granted first-class citizenship, though.
>
> >> As a result, most unix programs, most of the time, deal with text on
> >> stdin and stdout. So, it makes sense for them to be text by default.
> >> And wherever there's text, there needs to be an encoding. This is
> >> true whether a console is involved or not.
> >
> > Agreed.
>
> Disagreed strongly.
>
>    tcpdump -s 0 -w - >error.pcap
>    tar zxf - <python.tar.gz
>    sha1sum <smile.jpg
>    base64 -d <a.dat >a.exe
>    wget ftp://micorsops.com/something.avi -O - | mplayer -cache 8192 -
>
> Unfortunately, the text/binary dichotomy breaks a beautiful principle in
> Python as well. In numerous contexts, any file-like object will be
> valid. Now there is no file-like object. Instead, you have
> text-file-like objects and binary-file-like objects, which require
> special attention since some operate on strings while others operate on
> bytes.

  Pascal is for building pyramids — imposing, breathtaking, static
  structures built by armies pushing heavy blocks into place.
    — Alan Perlis

  Lisp is like a ball of mud. Add more and it's still a ball of mud —
  it still looks like Lisp.
    — Guy Steele

There are two fundamental outlooks in computer science — structuring and
universality. And they pull in opposite directions.

Universality happens when a data-structure can hold everything — a
universal data structure.

Some of the most significant advances in CS come from a universalist
vision:

- von Neumann machine storing data+code in memory
- Turing-tape able to store arbitrary Turing machines (∴ universal TM)
- Lisp program ≡ Lisp data
- Stream of bytes can handle/represent everything in Unix — memory,
  files, devices, sockets, pathnames.

However after the allurement of universality is over, the realization
dawns that we have a mess — Lisp is a 'mud-ball'. At which point people
start needing to make distinctions — code and data, different
data-structures, type-systems etc. IOW imposing structure on the
mud-ball.

Taking a broad view, while structuring trades the power for order, it is
universality that adds significant power.

Python is not as universal as Lisp — it has no homoiconicity. But it is
close enough in that any variable/data-structure can contain any value.

What Marko is saying is that by imposing the structuring of unicode on
the outside (Unix) world of text=byte, significant power is lost.

This is also Armin's crib.

How significant that loss is, is yet to be seen…
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-06-05 17:45 +0300 |
| Message-ID | <87oay7tnxt.fsf@elektro.pacujo.net> |
| In reply to | #72704 |
Rustom Mody <rustompmody@gmail.com>:

> What Marko is saying is that by imposing the structuring of unicode on
> the outside (Unix) world of text=byte, significant power is lost.

Mostly I'm saying Python3 will not be able to hide the fact that linux
data consists of bytes. It shouldn't even try. The linux OS outside the
Python process talks bytes, not strings.

A different OS might have different assumptions.


Marko
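One place where the bytes show through is data that is not valid in the expected encoding. The following is a minimal sketch, assuming Python 3 and a made-up filename; it uses the standard `surrogateescape` error handler (PEP 383) to carry undecodable bytes through a `str` and back:

```python
# Bytes as the OS might hand them over: a made-up filename that is not
# valid UTF-8 (0xE9 is "é" in Latin-1, but malformed here as UTF-8).
raw_name = b"caf\xe9.txt"

# Strict decoding refuses the bytes outright.
try:
    raw_name.decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict decode failed:", exc.reason)

# "surrogateescape" maps the undecodable byte to a lone surrogate, so
# the original bytes can be recovered exactly when encoding again.
as_text = raw_name.decode("utf-8", "surrogateescape")
round_trip = as_text.encode("utf-8", "surrogateescape")

print(ascii(as_text))
print(round_trip == raw_name)
```

The resulting string is a carrier for the original bytes more than it is text: it round-trips exactly, but the lone surrogate cannot be encoded by a strict codec.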
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-06-05 15:33 +0000 |
| Message-ID | <53908dd0$0$29978$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #72708 |
On Thu, 05 Jun 2014 17:45:34 +0300, Marko Rauhamaa wrote:

> Rustom Mody <rustompmody@gmail.com>:
>
>> What Marko is saying is that by imposing the structuring of unicode on
>> the outside (Unix) world of text=byte, significant power is lost.
>
> Mostly I'm saying Python3 will not be able to hide the fact that linux
> data consists of bytes. It shouldn't even try. The linux OS outside the
> Python process talks bytes, not strings.

Data on pretty much *all* computers consists of bytes, regardless of the
language or operating system. There may be a few esoteric or ancient
machines from the Dark Ages that aren't based on bytes, and even fewer
that aren't based on bits (ancient Soviet era mainframes, if any of them
still survive), but they aren't important. Someday esoteric non-byte
machines, perhaps quantum computers, or machines based on DNA, or
nano-sized analog computers made of carbon atoms, say, will be
important, but this is not that day. For now, bytes rule *everywhere*.

Nevertheless, there are important abstractions that are written on top
of the bytes layer, and in the Unix and Linux world, the most important
abstraction is *text*. In the Unix world, text formats and text
processing is much more common in user-space apps than binary
processing. Perhaps the definitive explanation and celebration of the
Unix way is Eric Raymond's "The Art Of Unix Programming":

http://www.catb.org/esr/writings/taoup/html/ch05s01.html

--
Steven D'Aprano
http://import-that.dreamwidth.org/
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-06-06 02:12 +1000 |
| Message-ID | <mailman.10743.1401984750.18130.python-list@python.org> |
| In reply to | #72710 |
On Fri, Jun 6, 2014 at 1:33 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> In the Unix world, text formats and text
> processing is much more common in user-space apps than binary processing.
> Perhaps the definitive explanation and celebration of the Unix way is
> Eric Raymond's "The Art Of Unix Programming":
>
> http://www.catb.org/esr/writings/taoup/html/ch05s01.html

Specifically, this from the opening paragraph:

"""
Text streams are a valuable universal format because they're easy for
human beings to read, write, and edit without specialized tools. These
formats are (or can be designed to be) transparent.
"""

He goes on to talk about network protocols, one of the best examples of
this. I've idly speculated at times about the possibility of rewriting
the Magic: The Gathering Online back-end with a view to making it easier
to work with. Among other changes, I'd be wanting to make the
client-server communication be plain text (an SMTP-style of protocol),
with an external layer of encryption (TLS). This would mean that:

1) Internal testing can be done without TLS, making the communication
absolutely transparent, easy to debug, easy to watch, everything. Adding
TLS later would have zero impact on the critical code internally - it's
just a layer around the outside.

2) Upgrades to crypto can simply follow industry best-practice.
(Reminder, to anyone who might have been mad enough to consider this: DO
NOT roll your own crypto! Ever! Even if you use a good library for the
heavy lifting!)

3) A debug log of what the client has sent and received could be
included, even in production, at very low cost. You don't need to decode
packets and pretty-print them - you just take the lines of text, maybe
adorn or color them according to which were sent/received, and dump them
into a display box or log file somewhere.

4) The server is forced to acknowledge that the client might not be the
one it expected. Not only do you get better security that way, but you
could also call this a feature.

5) Therefore, you can debug the system with a simple TELNET or MUD
client (okay, most MUD clients don't do SSL, but you can use "openssl
s_client"). As someone who's debugged myriad issues using his trusty MUD
client, I consider this to be a *huge* advantage.

All it takes is a few simple rules, like: All communication is text,
encoded down the wire as UTF-8, and consists of lines (terminated by
U+000A) which consist of a word, a U+0020 space, and then parameters to
the command. There, that's a rigorous definition that covers everything
you'll need of it; compare with what Flash uses, by default:

https://en.wikipedia.org/wiki/Action_Message_Format

Sure, it might be slightly more compact going down the wire; but what do
you really gain? Text wins.

ChrisA
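The line-oriented rule stated above (UTF-8, lines ending in U+000A, a word, a U+0020 space, then parameters) is small enough to sketch in full. This is an illustration only, assuming Python 3; the function names and command words are hypothetical, and treating a bare command with no parameters as valid is an assumption the post does not settle:

```python
def parse_line(raw: bytes):
    """Split one wire line into (command, parameters).

    Assumes the framing described above: UTF-8 text, terminated by
    U+000A, with a command word, one U+0020 space, then parameters.
    """
    if not raw.endswith(b"\n"):
        raise ValueError("line is not terminated by U+000A")
    text = raw[:-1].decode("utf-8")
    command, _, params = text.partition(" ")
    if not command:
        raise ValueError("missing command word")
    return command, params


def format_line(command: str, params: str = "") -> bytes:
    """Inverse of parse_line: build the bytes to send down the wire."""
    if not command or " " in command or "\n" in command or "\n" in params:
        raise ValueError("command or parameters would break the framing")
    line = command if not params else command + " " + params
    return (line + "\n").encode("utf-8")


# The command names below are made up for illustration.
print(parse_line(b"PLAY Forest\n"))
print(parse_line("SAY caf\u00e9 time\n".encode("utf-8")))
print(format_line("PASS"))
```

Because each message is one line of text, a debug log can simply record the decoded lines as they arrive, which is the low-cost logging described in point 3.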