Is print thread safe?

Started by: Steven D'Aprano <steve@pearwood.info>
First post: 2014-08-11 07:44 +0000
Last post: 2014-08-12 23:53 +1000
Articles: 10 (6 participants)



Contents

  Is print thread safe? Steven D'Aprano <steve@pearwood.info> - 2014-08-11 07:44 +0000
    Re: Is print thread safe? INADA Naoki <songofacandy@gmail.com> - 2014-08-11 19:19 +0900
      Re: Is print thread safe? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-08-12 02:07 +1000
        Re: Is print thread safe? Cameron Simpson <cs@zip.com.au> - 2014-08-12 07:53 +1000
          Re: Is print thread safe? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-08-12 09:56 +1000
            Re: Is print thread safe? Chris Angelico <rosuav@gmail.com> - 2014-08-12 10:14 +1000
            Re: Is print thread safe? Marko Rauhamaa <marko@pacujo.net> - 2014-08-12 08:01 +0300
              Re: Is print thread safe? Cameron Simpson <cs@zip.com.au> - 2014-08-12 16:15 +1000
            Re: Is print thread safe? Cameron Simpson <cs@zip.com.au> - 2014-08-12 14:31 +1000
            Re: Is print thread safe? Chris Angelico <rosuav@gmail.com> - 2014-08-12 23:53 +1000

#76029 — Is print thread safe?

From: Steven D'Aprano <steve@pearwood.info>
Date: 2014-08-11 07:44 +0000
Subject: Is print thread safe?
Message-ID: <53e87451$0$29890$c3e8da3$5496439d@news.astraweb.com>

Specifically for Python 2.6 and 2.7, but answers for 3.x appreciated as 
well.

Is print thread safe? That is, if I have two threads that each call 
print, say:

print "spam spam spam"  # thread 1
print "eggs eggs eggs"  # thread 2

I don't care which line prints first, but I do care if the two lines are 
mixed in together, something like this:

spam spaeggs eggs m seggspams


Does print perform its own locking to prevent this?



-- 
Steven



#76042

From: INADA Naoki <songofacandy@gmail.com>
Date: 2014-08-11 19:19 +0900
Message-ID: <mailman.12845.1407752374.18130.python-list@python.org>
In reply to: #76029

On Python 3, print is thread safe.

But Python 2 has a broken scenario:

print "spam", "spam", "spam"  # thread 1
print "eggs", "eggs", "eggs"  # thread 2

In this case, the two lines can be interleaved.

In your case, "spam spam spam" and "eggs eggs eggs" are not split apart,
but the newlines can still be misplaced, like this:

spam spam spameggs eggs eggs

eggs eggs eggsspam spam spam
eggs eggs eggs

spam spam spam


On Mon, Aug 11, 2014 at 4:44 PM, Steven D'Aprano <steve@pearwood.info> wrote:
> Specifically for Python 2.6 and 2.7, but answers for 3.x appreciated as
> well.
>
> Is print thread safe? That is, if I have two threads that each call
> print, say:
>
> print "spam spam spam"  # thread 1
> print "eggs eggs eggs"  # thread 2
>
> I don't care which line prints first, but I do care if the two lines are
> mixed in together, something like this:
>
> spam spaeggs eggs m seggspams
>
>
> Does print perform its own locking to prevent this?
>
>
>
> --
> Steven



-- 
INADA Naoki  <songofacandy@gmail.com>



#76065

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-08-12 02:07 +1000
Message-ID: <53e8ea27$0$29981$c3e8da3$5496439d@news.astraweb.com>
In reply to: #76042

INADA Naoki wrote:

> On Python 3, print is thread safe.
> 
> But Python 2 has broken scenario:

Is this documented somewhere?


-- 
Steven



#76082

From: Cameron Simpson <cs@zip.com.au>
Date: 2014-08-12 07:53 +1000
Message-ID: <mailman.12868.1407795799.18130.python-list@python.org>
In reply to: #76065

On 12Aug2014 02:07, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>INADA Naoki wrote:
>
>> On Python 3, print is thread safe.
>> But Python 2 has broken scenario:
>
>Is this documented somewhere?

In python/2.7.6/reference/simple_stmts.html#index-22, "print" is described in 
terms of a "write" for each object, and a "write" for the separators. There is 
no mention of locking.
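
The per-object "write" behaviour can be observed directly. The following is a minimal sketch for Python 3 (where print is a function); `RecordingStream` is a hypothetical helper invented for this illustration, not part of any library. On CPython 3 it records one write for each object, one for each separator, and one for the trailing newline, which is where another thread's output can slip in.

```python
# Python 3 sketch: record every write() call that a single print() issues.
class RecordingStream:
    """Hypothetical file-like object that logs each write call."""

    def __init__(self):
        self.calls = []

    def write(self, text):
        self.calls.append(text)
        return len(text)

    def flush(self):
        pass


rec = RecordingStream()
print("spam", "spam", "spam", file=rec)

# Show each chunk that print handed to the stream, one per line.
for call in rec.calls:
    print(repr(call))
```

Because the objects, separators and newline arrive as separate calls, a lock inside write alone would not keep a whole line together.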

On that basis, I would find the interleaving described normal and expected. And 
certainly not "broken".

Just use a lock! And rebind "print"! Or use the logging system!
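
As a rough illustration of the logging suggestion, here is a Python 3 sketch. The logger name `print_demo`, the thread counts and the in-memory buffer are assumptions chosen for the example. It relies on two documented facts: every `logging.Handler` carries its own lock, and `StreamHandler` emits each record as one message plus terminator.

```python
import io
import logging
import threading

# Write to an in-memory buffer so the result can be inspected afterwards;
# a real program would more likely pass sys.stderr or sys.stdout.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(message)s"))

log = logging.getLogger("print_demo")  # hypothetical logger name
log.setLevel(logging.INFO)
log.propagate = False  # keep the records out of the root logger
log.addHandler(handler)


def worker(count):
    for _ in range(count):
        log.info("Spam ABCD EFGH")


threads = [threading.Thread(target=worker, args=(200,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = buffer.getvalue().splitlines()
mangled = [line for line in lines if line != "Spam ABCD EFGH"]
print(len(lines), "lines,", len(mangled), "mangled")
```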

Cheers,
Cameron Simpson <cs@zip.com.au>

Wow!  Yet another place that I've been quoted...
         - Andy Beals <bandy@cinnamon.com>



#76090

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-08-12 09:56 +1000
Message-ID: <53e9583b$0$29973$c3e8da3$5496439d@news.astraweb.com>
In reply to: #76082

Cameron Simpson wrote:

> On 12Aug2014 02:07, Steven D'Aprano <steve+comp.lang.python@pearwood.info>
> wrote:
>>INADA Naoki wrote:
>>
>>> On Python 3, print is thread safe.
>>> But Python 2 has broken scenario:
>>
>>Is this documented somewhere?
> 
> In python/2.7.6/reference/simple_stmts.html#index-22, "print" is described
> in terms of a "write" for each object, and a "write" for the separators.
> There is no mention of locking.

Ah, thanks!

> On that basis, I would find the interleaving described normal and
> expected. And certainly not "broken".

I personally didn't describe it as "broken", but it is, despite the
documentation. I just ran a couple of trials where I collected the output
of sys.stdout while 50 threads blasted "Spam ABCD EFGH" (plus the implicit
newline) to stdout as fast as possible using print. The result was that out
of 248165 lines[1], 595 were mangled. Many of the mangled lines were the
expected simple run-ons:

    Spam ABCD EFGHSpam ABCD EFGH\n\n

which makes sense given the documentation, but there were lots of anomalies.

Mysterious spaces appearing in the strings:

    Spam ABCD EFGH Spam ABCD EFGH\n\n
    Spam ABCD EFGH Spam ABCD EFGH\n Spam ABCD EFGH\n

occasional collisions mid-string:

    Spam ABSpam ABCD EFGH\nCD EFGH\n

letters disappearing:

    Spam AB\nD EFGH\n

and at least one utterly perplexing (to me) block of ASCII NULs appearing in
the middle of the output:

    \x00\x00\x00...\x00\x00\n


This is with Python 2.7.2 on Linux. 


> Just use a lock! And rebind "print"! Or use the logging system!

Personally, I believe that print ought to do its own locking. And print is a
statement, although in this case there's no need to support anything older
than 2.6, so something like this ought to work:


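from __future__ import print_function
import threading

_print = print              # keep a reference to the original print
_rlock = threading.RLock()  # one lock shared by every caller
def print(*args, **kwargs):
    # Serialise whole print calls so their writes cannot interleave.
    with _rlock:
        _print(*args, **kwargs)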


Sadly, using print as a function alone isn't enough to fix this problem, but
in my quick tests, using locking as above does fix it, and with no
appreciable slowdown.
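
A small harness along these lines can check the locking approach. This is a Python 3 sketch under stated assumptions: `ListStream`, `locked_print` and `run` are hypothetical names, the thread and line counts are arbitrary, and the output goes to an in-memory list rather than the real stdout, so it exercises the print-level interleaving only, not the file-object internals of Python 2.

```python
import threading


class ListStream:
    """Hypothetical stream that stores each written chunk in order."""

    def __init__(self):
        self.parts = []

    def write(self, text):
        self.parts.append(text)
        return len(text)

    def flush(self):
        pass


_rlock = threading.RLock()


def locked_print(*args, **kwargs):
    # Hold the lock for the whole call so all of its writes stay together.
    with _rlock:
        print(*args, **kwargs)


def run(printer, n_threads=8, n_lines=200):
    """Print from several threads; return (total lines, mangled lines)."""
    stream = ListStream()

    def worker():
        for _ in range(n_lines):
            printer("Spam", "ABCD", "EFGH", file=stream)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    lines = "".join(stream.parts).split("\n")[:-1]  # drop the final empty piece
    mangled = sum(1 for line in lines if line != "Spam ABCD EFGH")
    return len(lines), mangled


total, mangled = run(locked_print)
print(total, "lines,", mangled, "mangled")
```

Calling `run(print)` instead exercises the unlocked path; whether it shows any mangled lines depends on thread scheduling, so no particular count should be expected from it.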


[1] Even the number of lines of output demonstrates a bug. I had fifty
threads printing 5000 times each, which makes 250000 lines, not 248165.
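
The bookkeeping behind those figures can be sketched as follows. The function name `count_mangled` and the sample string are made up for illustration (the sample is stitched together from the kinds of damage described above, not taken from the real capture), and the code is Python 3.

```python
EXPECTED = "Spam ABCD EFGH"


def count_mangled(output, expected=EXPECTED):
    """Return (number of lines, number of lines that differ from expected)."""
    lines = output.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # text ending in a newline leaves one empty trailing piece
    mangled = sum(1 for line in lines if line != expected)
    return len(lines), mangled


# Illustrative sample: one good line, a run-on, its stray blank line,
# and a line split in two with a letter missing.
sample = (
    "Spam ABCD EFGH\n"
    "Spam ABCD EFGHSpam ABCD EFGH\n\n"
    "Spam AB\nD EFGH\n"
)
sample_total, sample_mangled = count_mangled(sample)

# Figures quoted in the post.
expected_lines = 50 * 5000                  # 50 threads, 5000 prints each
observed_lines = 248165
missing_lines = expected_lines - observed_lines
mangled_share = 595 / observed_lines        # fraction of captured lines mangled

print(sample_total, sample_mangled, missing_lines, round(mangled_share * 100, 2))
```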

-- 
Steven



#76092

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-08-12 10:14 +1000
Message-ID: <mailman.12875.1407802477.18130.python-list@python.org>
In reply to: #76090

On Tue, Aug 12, 2014 at 9:56 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> from __future__ import print_function
>
> _print = print
> _rlock = threading.RLock()
> def print(*args, **kwargs):
>     with _rlock:
>         _print(*args, **kwargs)

You're conflating print and stdout here. Do you know which one is the
cause of your problems? Alternatively, can you be certain that you
never use either without the other?
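
One way to cover both paths is to put the lock on the stream and have the print wrapper borrow it. This is a Python 3 sketch only; `LockedStream` and `safe_print` are hypothetical names, and an `io.StringIO` stands in for stdout. The lock has to be re-entrant because `safe_print` holds it while print calls `write`, which takes it again.

```python
import io
import threading


class LockedStream:
    """Hypothetical wrapper: every write takes a re-entrant lock."""

    def __init__(self, raw):
        self.raw = raw
        self.lock = threading.RLock()

    def write(self, text):
        with self.lock:
            return self.raw.write(text)

    def flush(self):
        with self.lock:
            self.raw.flush()


def safe_print(*args, stream, **kwargs):
    # Hold the stream's own lock across all of print's separate writes.
    with stream.lock:
        print(*args, file=stream, **kwargs)


out = LockedStream(io.StringIO())  # stand-in for sys.stdout


def printer():
    for _ in range(200):
        safe_print("eggs", "eggs", "eggs", stream=out)


def direct_writer():
    # Code that bypasses print still goes through the same lock.
    for _ in range(200):
        out.write("spam spam spam\n")


threads = [threading.Thread(target=f) for f in (printer, direct_writer) * 4]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = out.raw.getvalue().splitlines()
print(len(lines), "lines,", len(set(lines)), "distinct")
```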

ChrisA



#76099

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2014-08-12 08:01 +0300
Message-ID: <87k36ee1xo.fsf@elektro.pacujo.net>
In reply to: #76090

Steven D'Aprano <steve+comp.lang.python@pearwood.info>:

> Personally, I believe that print ought to do its own locking. And
> print is a statement, although in this case there's no need to support
> anything older than 2.6, so something like this ought to work:
>
>
> from __future__ import print_function
>
> _print = print
> _rlock = threading.RLock()
> def print(*args, **kwargs):
>     with _rlock:
>         _print(*args, **kwargs)

Could this cause a deadlock if print were used in signal handlers?


Marko



#76100

From: Cameron Simpson <cs@zip.com.au>
Date: 2014-08-12 16:15 +1000
Message-ID: <mailman.12879.1407824124.18130.python-list@python.org>
In reply to: #76099

On 12Aug2014 08:01, Marko Rauhamaa <marko@pacujo.net> wrote:
>Steven D'Aprano <steve+comp.lang.python@pearwood.info>:
>> Personally, I believe that print ought to do its own locking. And
>> print is a statement, although in this case there's no need to support
>> anything older than 2.6, so something like this ought to work:
>>
>> from __future__ import print_function
>>
>> _print = print
>> _rlock = threading.RLock()
>> def print(*args, **kwargs):
>>     with _rlock:
>>         _print(*args, **kwargs)
>
>Could this cause a deadlock if print were used in signal handlers?

At the C level one tries to do as little as possible in a signal handler.  
Typically setting a flag or putting something on a queue for later work.

In Python that may be a much smaller issue, since I imagine the handler runs in 
the ordinary course of interpretation, outside the C-level handler context.

I personally wouldn't care if this might deadlock in a handler (lots of things 
might; avoid as many things as possible). Also, the code above uses an RLock; 
less prone to deadlock than a plain mutex Lock.
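
The difference between the two lock types can be shown without any signals. This is a Python 3 sketch with illustrative variable names. CPython runs Python-level signal handlers in the main thread, so a handler that prints while the main thread is already inside the locked print is the same thread asking for the lock a second time.

```python
import threading

plain = threading.Lock()
reentrant = threading.RLock()

# A plain Lock refuses a second acquire from the thread that already holds
# it; a blocking acquire here would hang forever.
with plain:
    plain_again = plain.acquire(blocking=False)

# An RLock lets the owning thread acquire it again.
with reentrant:
    reentrant_again = reentrant.acquire(blocking=False)
    if reentrant_again:
        reentrant.release()  # balance the extra acquire

print("Lock re-acquired by owner: ", plain_again)
print("RLock re-acquired by owner:", reentrant_again)
```

With the RLock the handler's print would go ahead rather than deadlock, although its output could then land in the middle of the line that was being written when the signal arrived.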

Cheers,
Cameron Simpson <cs@zip.com.au>

A host is a host from coast to coast
& no one will talk to a host that's close
Unless the host (that isn't close)
is busy, hung or dead
         - David Lesher, wb8foz@skybridge.scl.cwru.edu



#76101

From: Cameron Simpson <cs@zip.com.au>
Date: 2014-08-12 14:31 +1000
Message-ID: <mailman.12880.1407826485.18130.python-list@python.org>
In reply to: #76090

On 12Aug2014 09:56, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>Cameron Simpson wrote:
>> On 12Aug2014 02:07, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>>>Is this documented somewhere?
>>
>> In python/2.7.6/reference/simple_stmts.html#index-22, "print" is described
>> in terms of a "write" for each object, and a "write" for the separators.
>> There is no mention of locking.
>
>Ah, thanks!
>
>> On that basis, I would find the interleaving described normal and
>> expected. And certainly not "broken".
>
>I personally didn't describe it as "broken",

Yes, sorry.

>but it is, despite the
>documentation. I just ran a couple of trials where I collected the output
>of sys.stdout while 50 threads blasted "Spam ABCD EFGH" (plus the implicit
>newline) to stdout as fast as possible using print. The result was that out
>of 248165 lines[1], 595 were mangled. Many of the mangled lines were the
>expected simple run-ons:
>
>    Spam ABCD EFGHSpam ABCD EFGH\n\n
>
>which makes sense given the documentation, but there were lots of anomalies.
>
>Mysterious spaces appearing in the strings:
>
>    Spam ABCD EFGH Spam ABCD EFGH\n\n
>    Spam ABCD EFGH Spam ABCD EFGH\n Spam ABCD EFGH\n
>
>occasional collisions mid-string:
>
>    Spam ABSpam ABCD EFGH\nCD EFGH\n
>
>letters disappearing:
>
>    Spam AB\nD EFGH\n
>
>and at least one utterly perplexing (to me) block of ASCII NULs appearing in
>the middle of the output:
>
>    \x00\x00\x00...\x00\x00\n
>
>This is with Python 2.7.2 on Linux.

Sounds like print is not thread safe. Which it does not promise to be. But I 
would normally expect most file.write methods to be thread safe. Naively.

>> Just use a lock! And rebind "print"! Or use the logging system!
>
>Personally, I believe that print ought to do its own locking.

I don't, but I kind of believe "file"s should have thread safe write calls.  
Again, not guaranteed AFAIR.

>And print is a
>statement, although in this case there's no need to support anything older
>than 2.6, so something like this ought to work:
>
>from __future__ import print_function
>
>_print = print
>_rlock = threading.RLock()
>def print(*args, **kwargs):
>    with _rlock:
>        _print(*args, **kwargs)
>
>Sadly, using print as a function alone isn't enough to fix this problem, but
>in my quick tests, using locking as above does fix it, and with no
>appreciable slowdown.

I would expect file.write to be fast enough that the lock would usually be 
free. With no evidence, just personal expectation. Taking a free lock should be 
almost instant.

>[1] Even the number of lines of output demonstrates a bug. I had fifty
>threads printing 5000 times each, which makes 250000 lines, not 248165.

Sounds like the file internals are unsafe. Ugh.

Cheers,
Cameron Simpson <cs@zip.com.au>

If it ain't broken, keep playing with it.



#76114

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-08-12 23:53 +1000
Message-ID: <mailman.12886.1407851593.18130.python-list@python.org>
In reply to: #76090

On Tue, Aug 12, 2014 at 2:31 PM, Cameron Simpson <cs@zip.com.au> wrote:
> I would expect file.write to be fast enough that the lock would usually be
> free.

Until the day when it becomes really REALLY slow, because your
program's piped into 'less' and the user's paging through it. But even
apart from that, writing to stdout can take a notable amount of time.
Expecting the lock to usually be free will depend on the nature of the
program - how much of it is spent in silent computation and how much
in production of output. If that ratio is sufficiently skewed, then
sure, the lock'll usually be free.
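
If slow output is a concern, one possible alternative to holding a lock during the write (not something proposed in this thread) is to hand complete lines to a single writer thread through a queue. This is a Python 3 sketch; `writer_loop`, the sentinel object and the `io.StringIO` stand-in for stdout are all assumptions of the example.

```python
import io
import queue
import threading

_SENTINEL = object()  # tells the writer thread to stop


def writer_loop(q, stream):
    """Only this thread touches the stream, so lines cannot interleave."""
    while True:
        line = q.get()
        if line is _SENTINEL:
            break
        stream.write(line + "\n")


out = io.StringIO()  # stand-in for sys.stdout
q = queue.Queue()
writer = threading.Thread(target=writer_loop, args=(q, out))
writer.start()


def worker():
    for _ in range(200):
        q.put("Spam ABCD EFGH")  # returns at once, even if the output is slow


workers = [threading.Thread(target=worker) for _ in range(8)]
for t in workers:
    t.start()
for t in workers:
    t.join()

q.put(_SENTINEL)
writer.join()

lines = out.getvalue().splitlines()
print(len(lines), "lines written")
```

The trade-off is that an unbounded queue grows in memory while the reader is paging slowly; passing a `maxsize` to `queue.Queue` would bound it, at the cost of producers blocking again once it fills.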

ChrisA


