| Started by | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| First post | 2014-08-11 07:44 +0000 |
| Last post | 2014-08-12 23:53 +1000 |
| Articles | 10 — 6 participants |
Is print thread safe? Steven D'Aprano <steve@pearwood.info> - 2014-08-11 07:44 +0000
Re: Is print thread safe? INADA Naoki <songofacandy@gmail.com> - 2014-08-11 19:19 +0900
Re: Is print thread safe? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-08-12 02:07 +1000
Re: Is print thread safe? Cameron Simpson <cs@zip.com.au> - 2014-08-12 07:53 +1000
Re: Is print thread safe? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-08-12 09:56 +1000
Re: Is print thread safe? Chris Angelico <rosuav@gmail.com> - 2014-08-12 10:14 +1000
Re: Is print thread safe? Marko Rauhamaa <marko@pacujo.net> - 2014-08-12 08:01 +0300
Re: Is print thread safe? Cameron Simpson <cs@zip.com.au> - 2014-08-12 16:15 +1000
Re: Is print thread safe? Cameron Simpson <cs@zip.com.au> - 2014-08-12 14:31 +1000
Re: Is print thread safe? Chris Angelico <rosuav@gmail.com> - 2014-08-12 23:53 +1000
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2014-08-11 07:44 +0000 |
| Subject | Is print thread safe? |
| Message-ID | <53e87451$0$29890$c3e8da3$5496439d@news.astraweb.com> |
Specifically for Python 2.6 and 2.7, but answers for 3.x appreciated as well.

Is print thread safe? That is, if I have two threads that each call print, say:

    print "spam spam spam"  # thread 1
    print "eggs eggs eggs"  # thread 2

I don't care which line prints first, but I do care if the two lines are mixed in together, something like this:

    spam spaeggs eggs m seggspams

Does print perform its own locking to prevent this?

--
Steven
| From | INADA Naoki <songofacandy@gmail.com> |
|---|---|
| Date | 2014-08-11 19:19 +0900 |
| Message-ID | <mailman.12845.1407752374.18130.python-list@python.org> |
| In reply to | #76029 |
On Python 3, print is thread safe.

But Python 2 has broken scenario:

    print "spam", "spam", "spam"  # thread 1
    print "eggs", "eggs", "eggs"  # thread 2

In this case, 2 lines are mixed.

In your case, "spam spam spam" and "eggs eggs eggs" are not mixed. But newline is mixed like:

    spam spam spameggs eggs eggs
    eggs eggs eggsspam spam spam
    eggs eggs eggs
    spam spam spam

On Mon, Aug 11, 2014 at 4:44 PM, Steven D'Aprano <steve@pearwood.info> wrote:
> Specifically for Python 2.6 and 2.7, but answers for 3.x appreciated as
> well.
>
> Is print thread safe? That is, if I have two threads that each call
> print, say:
>
>     print "spam spam spam"  # thread 1
>     print "eggs eggs eggs"  # thread 2
>
> I don't care which line prints first, but I do care if the two lines are
> mixed in together, something like this:
>
>     spam spaeggs eggs m seggspams
>
> Does print perform its own locking to prevent this?
>
> --
> Steven
> --
> https://mail.python.org/mailman/listinfo/python-list

--
INADA Naoki <songofacandy@gmail.com>
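The run-on newlines described above are what one would expect if a single print call reaches the file as several separate write() calls, leaving gaps where another thread can get in. The following is a minimal sketch of how to observe that, written for Python 3's print function rather than the Python 2 statement under discussion. `RecordingStream` is a hypothetical helper, not part of any library, and the number of writes it records is an implementation detail of the interpreter, not a documented guarantee.

```python
class RecordingStream(object):
    """Hypothetical file-like object that records each write() call."""

    def __init__(self):
        self.writes = []

    def write(self, text):
        # Store every chunk separately instead of joining them.
        self.writes.append(text)


stream = RecordingStream()
print("spam", "spam", "spam", file=stream)

# One list entry per write() call issued by the single print call above.
print(stream.writes)
```

If the list holds more than one entry, another thread writing to the same stream could in principle slip its own output between any two of them.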
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-08-12 02:07 +1000 |
| Message-ID | <53e8ea27$0$29981$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #76042 |
INADA Naoki wrote:

> On Python 3, print is thread safe.
>
> But Python 2 has broken scenario:

Is this documented somewhere?

--
Steven
| From | Cameron Simpson <cs@zip.com.au> |
|---|---|
| Date | 2014-08-12 07:53 +1000 |
| Message-ID | <mailman.12868.1407795799.18130.python-list@python.org> |
| In reply to | #76065 |
On 12Aug2014 02:07, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>INADA Naoki wrote:
>
>> On Python 3, print is thread safe.
>> But Python 2 has broken scenario:
>
>Is this documented somewhere?
In python/2.7.6/reference/simple_stmts.html#index-22, "print" is described in
terms of a "write" for each object, and a "write" for the separators. There is
no mention of locking.
On that basis, I would find the interleaving described normal and expected. And
certainly not "broken".
Just use a lock! And rebind "print"! Or use the logging system!
Cheers,
Cameron Simpson <cs@zip.com.au>
Wow! Yet another place that I've been quoted...
- Andy Beals <bandy@cinnamon.com>
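Of the three suggestions above, the logging route needs the least hand-written locking, since the logging module is documented as thread safe and each handler serialises its own output. Below is a minimal sketch, written for Python 3. The logger name `print_demo`, the helper `log_from_threads`, and the choice to capture output in an `io.StringIO` buffer instead of writing to stdout are all assumptions made for illustration.

```python
import io
import logging
import threading


def log_from_threads(n_threads=8, n_lines=100, text="spam spam spam"):
    """Log `text` from several threads; return the captured lines."""
    buf = io.StringIO()
    handler = logging.StreamHandler(buf)
    handler.setFormatter(logging.Formatter("%(message)s"))

    logger = logging.getLogger("print_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    logger.addHandler(handler)

    def worker():
        for _ in range(n_lines):
            logger.info(text)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    try:
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        logger.removeHandler(handler)

    return buf.getvalue().splitlines()


lines = log_from_threads()
print(len(lines), "lines,", sum(1 for l in lines if l != "spam spam spam"), "mangled")
```

The handler holds its own lock while it formats and writes each record, so callers do not need to wrap `logger.info` in a lock of their own.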
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-08-12 09:56 +1000 |
| Message-ID | <53e9583b$0$29973$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #76082 |
Cameron Simpson wrote:
> On 12Aug2014 02:07, Steven D'Aprano <steve+comp.lang.python@pearwood.info>
> wrote:
>>INADA Naoki wrote:
>>
>>> On Python 3, print is thread safe.
>>> But Python 2 has broken scenario:
>>
>>Is this documented somewhere?
>
> In python/2.7.6/reference/simple_stmts.html#index-22, "print" is described
> in terms of a "write" for each object, and a "write" for the separators.
> There is no mention of locking.
Ah, thanks!
> On that basis, I would find the interleaving described normal and
> expected. And certainly not "broken".
I personally didn't describe it as "broken", but it is, despite the
documentation. I just ran a couple of trials where I collected the output
of sys.stdout while 50 threads blasted "Spam ABCD EFGH" (plus the implicit
newline) to stdout as fast as possible using print. The result was that out
of 248165 lines[1], 595 were mangled. Many of the mangled lines were the
expected simple run-ons:
Spam ABCD EFGHSpam ABCD EFGH\n\n
which makes sense given the documentation, but there were lots of anomalies.
Mysterious spaces appearing in the strings:
Spam ABCD EFGH Spam ABCD EFGH\n\n
Spam ABCD EFGH Spam ABCD EFGH\n Spam ABCD EFGH\n
occasional collisions mid-string:
Spam ABSpam ABCD EFGH\nCD EFGH\n
letters disappearing:
Spam AB\nD EFGH\n
and at least one utterly perplexing (to me) block of ASCII NULs appearing in
the middle of the output:
\x00\x00\x00...\x00\x00\n
This is with Python 2.7.2 on Linux.
> Just use a lock! And rebind "print"! Or use the logging system!
Personally, I believe that print ought to do its own locking. And print is a
statement, although in this case there's no need to support anything older
than 2.6, so something like this ought to work:
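from __future__ import print_function
import threading

_print = print              # keep a reference to the built-in print
_rlock = threading.RLock()  # one lock shared by every caller

def print(*args, **kwargs):
    # Hold the lock for the whole call so its writes cannot interleave.
    with _rlock:
        _print(*args, **kwargs)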
Sadly, using print as a function alone isn't enough to fix this problem, but
in my quick tests, using locking as above does fix it, and with no
appreciable slowdown.
[1] Even the number of lines of output demonstrates a bug. I had fifty
threads printing 5000 times each, which makes 250000 lines, not 248165.
--
Steven
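A trial along the lines described in this post can be sketched as follows, written for Python 3. It is not the original test script: `run_trial` and `make_locked_print` are hypothetical names, output is captured in an `io.StringIO` buffer rather than read back from the real stdout, and the default of 200 lines per thread is smaller than the 5000 used above to keep the run short.

```python
import io
import threading


def make_locked_print(lock):
    """Return a print-like function that holds `lock` for each call."""
    def locked_print(*args, **kwargs):
        with lock:
            print(*args, **kwargs)
    return locked_print


def run_trial(print_fn, n_threads=50, n_lines=200, text="Spam ABCD EFGH"):
    """Call print_fn from many threads; return (total lines, mangled lines)."""
    buf = io.StringIO()

    def worker():
        for _ in range(n_lines):
            print_fn(text, file=buf)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    lines = buf.getvalue().splitlines()
    # Any line that is not exactly `text` counts as mangled.
    mangled = sum(1 for line in lines if line != text)
    return len(lines), mangled


total, mangled = run_trial(make_locked_print(threading.RLock()))
print("locked print:", total, "lines,", mangled, "mangled")
```

Passing the plain built-in `print` as `print_fn` gives the unlocked comparison. Whether that run shows any mangled lines depends on the interpreter version and on the stream being written to, so no result is claimed for it here.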
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-08-12 10:14 +1000 |
| Message-ID | <mailman.12875.1407802477.18130.python-list@python.org> |
| In reply to | #76090 |
On Tue, Aug 12, 2014 at 9:56 AM, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> from __future__ import print_function
>
> _print = print
> _rlock = threading.RLock()
> def print(*args, **kwargs):
>     with _rlock:
>         _print(*args, **kwargs)

You're conflating print and stdout here. Do you know which one is the cause of your problems? Alternatively, can you be certain that you never use either without the other?

ChrisA
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-08-12 08:01 +0300 |
| Message-ID | <87k36ee1xo.fsf@elektro.pacujo.net> |
| In reply to | #76090 |
Steven D'Aprano <steve+comp.lang.python@pearwood.info>:

> Personally, I believe that print ought to do its own locking. And
> print is a statement, although in this case there's no need to support
> anything older than 2.6, so something like this ought to work:
>
> from __future__ import print_function
>
> _print = print
> _rlock = threading.RLock()
> def print(*args, **kwargs):
>     with _rlock:
>         _print(*args, **kwargs)

Could this cause a deadlock if print were used in signal handlers?

Marko
| From | Cameron Simpson <cs@zip.com.au> |
|---|---|
| Date | 2014-08-12 16:15 +1000 |
| Message-ID | <mailman.12879.1407824124.18130.python-list@python.org> |
| In reply to | #76099 |
On 12Aug2014 08:01, Marko Rauhamaa <marko@pacujo.net> wrote:
>Steven D'Aprano <steve+comp.lang.python@pearwood.info>:
>> Personally, I believe that print ought to do its own locking. And
>> print is a statement, although in this case there's no need to support
>> anything older than 2.6, so something like this ought to work:
>>
>> from __future__ import print_function
>>
>> _print = print
>> _rlock = threading.RLock()
>> def print(*args, **kwargs):
>>     with _rlock:
>>         _print(*args, **kwargs)
>
>Could this cause a deadlock if print were used in signal handlers?
At the C level one tries to do as little as possible in a signal handler.
Typically setting a flag or putting something on a queue for later work.
In Python that may be a much smaller issue, since I imagine the handler runs in
the ordinary course of interpretation, outside the C-level handler context.
I personally wouldn't care if this might deadlock in a handler (lots of things
might; avoid as many things as possible). Also, the code above uses an RLock;
less prone to deadlock than a plain mutex Lock.
Cheers,
Cameron Simpson <cs@zip.com.au>
A host is a host from coast to coast
& no one will talk to a host that's close
Unless the host (that isn't close)
is busy, hung or dead
- David Lesher, wb8foz@skybridge.scl.cwru.edu
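The difference between an RLock and a plain Lock matters for the signal handler question because Python-level signal handlers run in the main thread. If the handler fires while the main thread is inside the locked print and then calls print itself, it is the same thread asking for the lock a second time. A minimal sketch of that re-acquisition, with no signals involved; `can_reacquire` is a hypothetical helper name.

```python
import threading


def can_reacquire(lock):
    """Return True if the thread already holding `lock` can take it again."""
    lock.acquire()
    try:
        # Non-blocking second attempt from the same thread.
        got_it = lock.acquire(False)
        if got_it:
            lock.release()
        return got_it
    finally:
        lock.release()


print(can_reacquire(threading.RLock()))  # True: re-entrant for its owner
print(can_reacquire(threading.Lock()))   # False: a blocking acquire would hang
```

This only covers re-entry by the owning thread. It says nothing about whether the output of the interrupted print and the handler's print stay cleanly separated.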
| From | Cameron Simpson <cs@zip.com.au> |
|---|---|
| Date | 2014-08-12 14:31 +1000 |
| Message-ID | <mailman.12880.1407826485.18130.python-list@python.org> |
| In reply to | #76090 |
On 12Aug2014 09:56, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>Cameron Simpson wrote:
>> On 12Aug2014 02:07, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>>>Is this documented somewhere?
>>
>> In python/2.7.6/reference/simple_stmts.html#index-22, "print" is described
>> in terms of a "write" for each object, and a "write" for the separators.
>> There is no mention of locking.
>
>Ah, thanks!
>
>> On that basis, I would find the interleaving described normal and
>> expected. And certainly not "broken".
>
>I personally didn't describe it as "broken",

Yes, sorry.

>but it is, despite the
>documentation. I just ran a couple of trials where I collected the output
>of sys.stdout while 50 threads blasted "Spam ABCD EFGH" (plus the implicit
>newline) to stdout as fast as possible using print. The result was that out
>of 248165 lines[1], 595 were mangled. Many of the mangled lines were the
>expected simple run-ons:
>
>    Spam ABCD EFGHSpam ABCD EFGH\n\n
>
>which makes sense given the documentation, but there were lots of anomalies.
>
>Mysterious spaces appearing in the strings:
>
>    Spam ABCD EFGH Spam ABCD EFGH\n\n
>    Spam ABCD EFGH Spam ABCD EFGH\n Spam ABCD EFGH\n
>
>occasional collisions mid-string:
>
>    Spam ABSpam ABCD EFGH\nCD EFGH\n
>
>letters disappearing:
>
>    Spam AB\nD EFGH\n
>
>and at least one utterly perplexing (to me) block of ASCII NULs appearing in
>the middle of the output:
>
>    \x00\x00\x00...\x00\x00\n
>
>This is with Python 2.7.2 on Linux.

Sounds like print is not thread safe. Which it does not promise to be. But I would normally expect most file.write methods to be thread safe. Naively.

>> Just use a lock! And rebind "print"! Or use the logging system!
>
>Personally, I believe that print ought to do its own locking.

I don't, but I kind of believe "file"s should have thread safe write calls. Again, not guaranteed AFAIR.

>And print is a
>statement, although in this case there's no need to support anything older
>than 2.6, so something like this ought to work:
>
>from __future__ import print_function
>
>_print = print
>_rlock = threading.RLock()
>def print(*args, **kwargs):
>    with _rlock:
>        _print(*args, **kwargs)
>
>Sadly, using print as a function alone isn't enough to fix this problem, but
>in my quick tests, using locking as above does fix it, and with no
>appreciable slowdown.

I would expect file.write to be fast enough that the lock would usually be free. With no evidence, just personal expectation. Taking a free lock should be almost instant.

>[1] Even the number of lines of output demonstrates a bug. I had fifty
>threads printing 5000 times each, which makes 250000 lines, not 248165.

Sounds like the file internals are unsafe. Ugh.

Cheers,
Cameron Simpson <cs@zip.com.au>

If it ain't broken, keep playing with it.
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-08-12 23:53 +1000 |
| Message-ID | <mailman.12886.1407851593.18130.python-list@python.org> |
| In reply to | #76090 |
On Tue, Aug 12, 2014 at 2:31 PM, Cameron Simpson <cs@zip.com.au> wrote:
> I would expect file.write to be fast enough that the lock would usually be
> free.

Until the day when it becomes really REALLY slow, because your program's piped into 'less' and the user's paging through it.

But even apart from that, writing to stdout can take a notable amount of time. Expecting the lock to usually be free will depend on the nature of the program - how much of it is spent in silent computation and how much in production of output. If that ratio is sufficiently skewed, then sure, the lock'll usually be free.

ChrisA
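The cost of a slow consumer can be made visible with a stream that sleeps on every write. The sketch below is written for Python 3 and is only a rough illustration: `SlowStream` and `timed_run` are hypothetical names, and the 10 ms delay is an arbitrary stand-in for a pipe whose reader has stopped reading.

```python
import threading
import time


class SlowStream(object):
    """Hypothetical stand-in for a stdout whose reader is slow."""

    def __init__(self, delay):
        self.delay = delay
        self.chunks = []

    def write(self, text):
        time.sleep(self.delay)  # pretend the consumer is not keeping up
        self.chunks.append(text)


def timed_run(stream, n_threads=4, n_writes=5):
    """Write from several threads under one lock; return elapsed seconds."""
    lock = threading.Lock()

    def worker():
        for _ in range(n_writes):
            with lock:
                stream.write("spam\n")

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


slow = SlowStream(delay=0.01)
elapsed = timed_run(slow)
print("%d writes took %.2f s" % (len(slow.chunks), elapsed))
```

Because every write happens while the lock is held, the delays add up across threads instead of overlapping, so each thread spends most of its time waiting for the lock whenever output dominates the program's work.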