


MacOS 10.9.2: threading error using python.org 2.7.6 distribution

Started by: Matthew Pounsett <matt.pounsett@gmail.com>
First post: 2014-04-25 06:43 -0700
Last post: 2014-04-27 07:18 -0700
Articles: 8 (3 participants)



Contents

  MacOS 10.9.2: threading error using python.org 2.7.6 distribution Matthew Pounsett <matt.pounsett@gmail.com> - 2014-04-25 06:43 -0700
    Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution Chris Angelico <rosuav@gmail.com> - 2014-04-26 00:05 +1000
      Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution Matthew Pounsett <matt.pounsett@gmail.com> - 2014-04-27 07:16 -0700
        Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution Chris Angelico <rosuav@gmail.com> - 2014-04-28 00:33 +1000
          Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution Matthew Pounsett <matt.pounsett@gmail.com> - 2014-04-28 15:50 -0700
            Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution Chris Angelico <rosuav@gmail.com> - 2014-04-29 09:00 +1000
    Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution Ned Deily <nad@acm.org> - 2014-04-25 11:58 -0700
      Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution Matthew Pounsett <matt.pounsett@gmail.com> - 2014-04-27 07:18 -0700

#70591 — MacOS 10.9.2: threading error using python.org 2.7.6 distribution

From: Matthew Pounsett <matt.pounsett@gmail.com>
Date: 2014-04-25 06:43 -0700
Subject: MacOS 10.9.2: threading error using python.org 2.7.6 distribution
Message-ID: <a087ae04-a87c-4e77-a7dd-2d883d30a6f0@googlegroups.com>
I've run into a threading error in some code when I run it on MacOS; the same code works flawlessly on a *BSD system running the same version of python.  I'm running the python 2.7.6 for MacOS distribution from python.org's downloads page.

I have tried to reproduce the error with a simple example, but so far haven't been able to find the element of my code that triggers the error.  I'm hoping someone can suggest some things to try and/or look at.  Googling for "python" and the error returns exactly two pages, neither of which is any help.

When I run it through the debugger, I'm getting the following from inside threading.start().  python fails to provide a stack trace when I step into _start_new_thread(), which is a pointer to thread.start_new_thread().  It looks like threading.__bootstrap_inner() may be throwing an exception which thread.start_new_thread() is unable to handle, and for some reason the stack is missing so I get no stack trace explaining the error.

It looks like thread.start_new_thread() is in the binary object, so I can't actually step into it and find where the error is occurring.

> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py(745)start()
-> _start_new_thread(self.__bootstrap, ())
(Pdb) s
> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py(750)start()
-> self.__started.wait()
(Pdb) Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from

My test code (which works) follows the exact same structure as the failing code, making the same calls to the threading module's objects' methods:

----
import threading

class MyThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        print "MyThread runs and exits."

def main():
    try:
        t = MyThread()
        t.start()
    except Exception as e:
        print "Failed with {!r}".format(e)

if __name__ == '__main__':
    main()
----
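
One caveat about the structure of this test: in both Python 2 and Python 3, an exception raised inside run() is reported on stderr by the threading machinery and does not propagate to the caller of start(), so the try/except in main() can only catch a failure to start the thread. A minimal sketch of one way to surface such errors follows; the CapturingThread class, its error attribute, and the work() method are hypothetical names introduced for illustration, not part of the original code.

```python
import threading

class CapturingThread(threading.Thread):
    """Hypothetical helper: remembers an exception raised by its work."""
    def __init__(self):
        threading.Thread.__init__(self)
        self.error = None

    def run(self):
        try:
            self.work()
        except Exception as e:
            # Keep the exception so the main thread can inspect it later.
            self.error = e

    def work(self):
        # Stand-in for the real job; fails the way RTF2TXT.run() might.
        raise TypeError("supplied path must be a directory")

def main():
    t = CapturingThread()
    t.start()   # errors raised in run() never surface at this call
    t.join()    # wait for the thread to finish before reading t.error
    return t.error

if __name__ == '__main__':
    print("Thread failed with {!r}".format(main()))
```

With this arrangement the main thread sees the failure only after join(), which is the earliest point at which the outcome of run() is known.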

The actual thread object that's failing looks like this:

class RTF2TXT(threading.Thread):
    """
    Takes a directory path and a Queue as arguments.  The directory should be
    a collection of RTF files, which will be read one-by-one, converted to
    text, and each output line will be appended in order to the Queue.
    """
    def __init__(self, path, queue):
        threading.Thread.__init__(self)
        self.path = path
        self.queue = queue

    def run(self):
        logger = logging.getLogger('RTF2TXT')
        if not os.path.isdir(self.path):
            raise TypeError("supplied path must be a directory")
        for f in sorted(os.listdir(self.path)):
            ff = os.path.join(self.path, f)
            args = [ UNRTF_BIN, '-P', '.', '-t', 'unrtf.text',  ff ]
            logger.debug("Processing file {} with args {!r}".format(f, args))
            p1 = subprocess.Popen( args, stdout=subprocess.PIPE,
                    universal_newlines=True)
            output = p1.communicate()[0]
            try:
                output = output.decode('utf-8', 'ignore')
            except Exception as err:
                logger.error("Failed to decode output: {}".format(err))
                logger.error("Output was: {!r}".format(output))

            for line in output.split("\n"):
                line = line.strip()
                self.queue.put(line)
            self.queue.put("<EOF>")

Note: I only run one instance of this thread.  The Queue object is used to pass work off to another thread for later processing.
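
The hand-off described in the note can be sketched as a producer thread and a consumer that regroups lines at each "<EOF>" marker. This is an illustration under assumptions: the producer() and consumer() functions are hypothetical, and the trailing None that tells the consumer no more files are coming is an addition of this sketch, since the original post does not show how the receiving thread knows when to stop.

```python
import threading
try:
    import queue            # Python 3 name
except ImportError:
    import Queue as queue   # Python 2 name

EOF = "<EOF>"  # same per-file marker the RTF2TXT thread puts on the queue

def producer(q, documents):
    """Put each stripped line on the queue, then an EOF marker per document."""
    for doc in documents:
        for line in doc.split("\n"):
            q.put(line.strip())
        q.put(EOF)
    q.put(None)  # assumption of this sketch: None means "no more documents"

def consumer(q):
    """Collect lines back into one list per document."""
    docs, current = [], []
    while True:
        item = q.get()
        if item is None:
            break
        if item == EOF:
            docs.append(current)
            current = []
        else:
            current.append(item)
    return docs

def demo():
    q = queue.Queue()
    t = threading.Thread(target=producer, args=(q, ["a\n b ", "c"]))
    t.start()
    docs = consumer(q)
    t.join()
    return docs
```

Because the queue is a single FIFO with one producer, the consumer sees lines in the order they were put, so the per-file grouping survives the hand-off.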

If I insert that object into the test code and run it instead of MyThread(), I get the error.  I can't see anything in there that should cause problems for the threading module though... especially since this runs fine on another system with the same version of python.

Any thoughts on what's going on here?




#70593

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-04-26 00:05 +1000
Message-ID: <mailman.9495.1398434710.18130.python-list@python.org>
In reply to: #70591
On Fri, Apr 25, 2014 at 11:43 PM, Matthew Pounsett
<matt.pounsett@gmail.com> wrote:
> If I insert that object into the test code and run it instead of MyThread(), I get the error.  I can't see anything in there that should cause problems for the threading module though... especially since this runs fine on another system with the same version of python.
>
> Any thoughts on what's going on here?

First culprit I'd look at is the mixing of subprocess and threading.
It's entirely possible that something goes messy when you fork from a
thread.

Separately: You're attempting a very messy charset decode there. You
attempt to decode as UTF-8, errors ignored, and if that fails, you log
an error... and continue on with the original bytes. You're risking
shooting yourself in the foot there; I would recommend you have an
explicit fall-back (maybe re-decode as Latin-1??), so the next code is
guaranteed to be working with Unicode. Currently, it might get a
unicode or a str.
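
A minimal sketch of the explicit fall-back suggested here, assuming the raw subprocess output is a byte string. The helper name to_text() is hypothetical. It tries strict UTF-8 first and falls back to Latin-1, which maps every possible byte value to a character and so cannot fail, meaning the caller always gets text back.

```python
def to_text(raw):
    """Hypothetical helper: always return text for a byte string."""
    try:
        # Strict decode, so invalid UTF-8 raises instead of being dropped.
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        # Latin-1 assigns a character to every byte value.
        return raw.decode('latin-1')
```

The Latin-1 result may not be the text the file's author intended, but the type of the value handed to later code is no longer in doubt.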

ChrisA



#70651

From: Matthew Pounsett <matt.pounsett@gmail.com>
Date: 2014-04-27 07:16 -0700
Message-ID: <675725e3-38d2-4d81-bf64-f6d903d4a684@googlegroups.com>
In reply to: #70593
On Friday, 25 April 2014 10:05:03 UTC-4, Chris Angelico  wrote:
> First culprit I'd look at is the mixing of subprocess and threading.
> It's entirely possible that something goes messy when you fork from a
> thread.

I liked the theory, but I've run some tests and can't reproduce the error that way.  I'm using all the elements in my test code that the real code runs, and I can't get the same error.  Even when I deliberately break things I'm getting a proper exception with stack trace.

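import logging
import shlex
import subprocess
import threading
import Queue  # only needed for the commented-out RTF2TXT line in main()

class MyThread(threading.Thread):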
    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        logger = logging.getLogger("thread")
        p1 = subprocess.Popen( shlex.split( 'echo "MyThread calls echo."'),
                stdout=subprocess.PIPE, universal_newlines=True)
        logger.debug( p1.communicate()[0].decode('utf-8', 'ignore' ))
        logger.debug( "MyThread runs and exits." )

def main():
    console = logging.StreamHandler()
    console.setFormatter(
            logging.Formatter('%(asctime)s [%(name)-12s] %(message)s', '%T'))
    logger = logging.getLogger()
    logger.addHandler(console)
    logger.setLevel(logging.NOTSET)

    try:
        t = MyThread()
        #t = RTF2TXT("../data/SRD/rtf/", Queue.Queue())
        t.start()
    except Exception as e:
        logger.error( "Failed with {!r}".format(e))

if __name__ == '__main__':
    main()


> Separately: You're attempting a very messy charset decode there. You
> attempt to decode as UTF-8, errors ignored, and if that fails, you log
> an error... and continue on with the original bytes. You're risking
> shooting yourself in the foot there; I would recommend you have an
> explicit fall-back (maybe re-decode as Latin-1??), so the next code is
> guaranteed to be working with Unicode. Currently, it might get a
> unicode or a str.

Yeah, that was a logic error on my part that I hadn't got around to noticing, since I'd been concentrating on the stuff that was actively breaking.  That should have been in an else: block on the end of the try.
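
The else: arrangement mentioned above might look like the following sketch. The function enqueue_output() and its put argument (standing in for self.queue.put) are hypothetical names, and sending "<EOF>" even after a failed decode is an assumption of this sketch, not something stated in the thread.

```python
import logging

logger = logging.getLogger('RTF2TXT')

def enqueue_output(output, put):
    """Decode one file's output and hand each stripped line to put()."""
    try:
        text = output.decode('utf-8', 'ignore')
    except Exception as err:
        logger.error("Failed to decode output: {}".format(err))
        logger.error("Output was: {!r}".format(output))
    else:
        # Runs only when decode() succeeded, so text is decoded here.
        for line in text.split("\n"):
            put(line.strip())
    # Assumption: the end-of-file marker is still sent after a failure.
    put("<EOF>")
```

Undecoded data can no longer reach the queue, because the loop is skipped whenever the except branch runs.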



#70654

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-04-28 00:33 +1000
Message-ID: <mailman.9531.1398609228.18130.python-list@python.org>
In reply to: #70651
On Mon, Apr 28, 2014 at 12:16 AM, Matthew Pounsett
<matt.pounsett@gmail.com> wrote:
> On Friday, 25 April 2014 10:05:03 UTC-4, Chris Angelico  wrote:
>> First culprit I'd look at is the mixing of subprocess and threading.
>> It's entirely possible that something goes messy when you fork from a
>> thread.
>
> I liked the theory, but I've run some tests and can't reproduce the error that way.  I'm using all the elements in my test code that the real code runs, and I can't get the same error.  Even when I deliberately break things I'm getting a proper exception with stack trace.
>

In most contexts, "thread unsafe" simply means that you can't use the
same facilities simultaneously from two threads (eg a lot of database
connection libraries are thread unsafe with regard to a single
connection, as they'll simply write to a pipe or socket and then read
a response from it). But processes and threads are, on many systems,
linked. Just the act of spinning off a new thread and then forking can
potentially cause problems. Those are the exact sorts of issues that
you'll see when you switch OSes, as it's the underlying thread/process
model that's significant. (Particularly of note is that Windows is
*very* different from Unix-based systems, in that subprocess
management is not done by forking. But not applicable here.)

You may want to have a look at subprocess32, which Ned pointed out. I
haven't checked, but I would guess that its API is identical to
subprocess's, so it should be a drop-in replacement ("import
subprocess32 as subprocess"). If that produces the exact same results,
then it's (probably) not thread-safety that's the problem.
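
A sketch of that drop-in swap, written so it still runs where subprocess32 is not installed. The run_child() function and the use of sys.executable as the child command are stand-ins for illustration; only the Popen and communicate() calls already used in this thread are relied on.

```python
import sys

try:
    # Use the backport when it is installed...
    import subprocess32 as subprocess
except ImportError:
    # ...otherwise fall back to the standard library module.
    import subprocess

def run_child():
    """Run a trivial child process and report which module launched it."""
    p = subprocess.Popen([sys.executable, '-c', 'print("child ran")'],
                         stdout=subprocess.PIPE, universal_newlines=True)
    out = p.communicate()[0]
    return subprocess.__name__, p.returncode, out.strip()

if __name__ == '__main__':
    print(run_child())
```

The first element of the returned tuple shows which module was actually imported, which makes it easy to confirm that a comparison run really used the backport.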

>> Separately: You're attempting a very messy charset decode there. You
>> attempt to decode as UTF-8, errors ignored, and if that fails, you log
>> an error... and continue on with the original bytes. You're risking
>> shooting yourself in the foot there; I would recommend you have an
>> explicit fall-back (maybe re-decode as Latin-1??), so the next code is
>> guaranteed to be working with Unicode. Currently, it might get a
>> unicode or a str.
>
> Yeah, that was a logic error on my part that I hadn't got around to noticing, since I'd been concentrating on the stuff that was actively breaking.  That should have been in an else: block on the end of the try.
>

Ah good. Keeping bytes versus text separate is something that becomes
particularly important in Python 3, so I always like to encourage
people to get them straight even in Py2. It'll save you some hassle
later on.

ChrisA



#70699

From: Matthew Pounsett <matt.pounsett@gmail.com>
Date: 2014-04-28 15:50 -0700
Message-ID: <b442b64d-b79c-4c0e-9df4-e3c3ae47ee9e@googlegroups.com>
In reply to: #70654
On Sunday, 27 April 2014 10:33:38 UTC-4, Chris Angelico  wrote:
> In most contexts, "thread unsafe" simply means that you can't use the 
> same facilities simultaneously from two threads (eg a lot of database
> connection libraries are thread unsafe with regard to a single
> connection, as they'll simply write to a pipe or socket and then read
> a response from it). But processes and threads are, on many systems,
> linked. Just the act of spinning off a new thread and then forking can
> potentially cause problems. Those are the exact sorts of issues that
> you'll see when you switch OSes, as it's the underlying thread/process
> model that's significant. (Particularly of note is that Windows is
> *very* different from Unix-based systems, in that subprocess
> management is not done by forking. But not applicable here.)
> 

Thanks, I'll keep all that in mind.  I have to wonder how much of a problem it is here though, since I was able to demonstrate a functioning fork inside a new thread further up in the discussion.

I have a new development that I find interesting, and I'm wondering if you still think it's the same problem.

I have taken that threading object and turned it into a normal function definition.  It's still forking the external tool, but it's doing so in the main thread, and it is finished execution before any other threads are created.   And I'm still getting the same error.

Turns out it's not coming from the threading module, but from the subprocess module instead.  Specifically, line 709 of /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py
which is this:

        try:
            self._execute_child(args, executable, preexec_fn, close_fds,
                                cwd, env, universal_newlines,
                                startupinfo, creationflags, shell, to_close,
                                p2cread, p2cwrite,
                                c2pread, c2pwrite,
                                errread, errwrite)
        except Exception:

I get the "Warning: No stack to get attribute from" twice when that self._execute_child() call is made.  I've tried stepping into it to narrow it down further, but I'm getting weird behaviour from the debugger that I've never seen before once I do that.  It's making it hard to track down exactly where the error is occurring.

Interestingly, it's not actually raising an exception there.  The except block is not being run.



#70700

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-04-29 09:00 +1000
Message-ID: <mailman.9564.1398726034.18130.python-list@python.org>
In reply to: #70699
On Tue, Apr 29, 2014 at 8:50 AM, Matthew Pounsett
<matt.pounsett@gmail.com> wrote:
> Thanks, I'll keep all that in mind.  I have to wonder how much of a problem it is here though, since I was able to demonstrate a functioning fork inside a new thread further up in the discussion.
>

Yeah, it's really hard to pin down sometimes. I once discovered a
problem whereby I was unable to spin off subprocesses that did certain
things, but I could do a trivial subprocess (I think I fork/exec'd to
the echo command or something) and that worked fine. Turned out to be
a bug in one of my signal handlers, but the error was being reported
at the point of the forking.

> I have a new development that I find interesting, and I'm wondering if you still think it's the same problem.
>
> I have taken that threading object and turned it into a normal function definition.  It's still forking the external tool, but it's doing so in the main thread, and it is finished execution before any other threads are created.   And I'm still getting the same error.
>

Interesting. That ought to eliminate all possibility of
thread-vs-process issues. Can you post the smallest piece of code that
exhibits the same failure?
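
A starting point for such a reduction might be a script with no threads and no logging that launches one command and reports everything it gets back. This is only a sketch: run_once() is a hypothetical name, and the command shown is a placeholder using sys.executable; the real unrtf argument list from the original post would be substituted when chasing the actual failure.

```python
import subprocess
import sys

def run_once(args):
    """Run one command in the main thread and return (returncode, out, err)."""
    p = subprocess.Popen(args, stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE, universal_newlines=True)
    out, err = p.communicate()
    return p.returncode, out, err

if __name__ == '__main__':
    # Placeholder command; replace with the real external tool invocation.
    rc, out, err = run_once([sys.executable, '-c', 'print("hello")'])
    print("exit status: {!r}".format(rc))
    print("stdout: {!r}".format(out))
    print("stderr: {!r}".format(err))
```

Running it once normally and once under the debugger would show whether the warnings depend on the debugger being attached.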

ChrisA



#70615

From: Ned Deily <nad@acm.org>
Date: 2014-04-25 11:58 -0700
Message-ID: <mailman.9511.1398452352.18130.python-list@python.org>
In reply to: #70591
In article 
<CAPTjJmpXuj9N3cdQcH0oJaVkSfVrqJWHH1GSt3FafkcGyw54Ag@mail.gmail.com>,
 Chris Angelico <rosuav@gmail.com> wrote:

> On Fri, Apr 25, 2014 at 11:43 PM, Matthew Pounsett
> <matt.pounsett@gmail.com> wrote:
> > If I insert that object into the test code and run it instead of 
> > MyThread(), I get the error.  I can't see anything in there that should 
> > cause problems for the threading module though... especially since this 
> > runs fine on another system with the same version of python.
> >
> > Any thoughts on what's going on here?
> 
> First culprit I'd look at is the mixing of subprocess and threading.
> It's entirely possible that something goes messy when you fork from a
> thread.

FWIW, the Python 2 version of subprocess is known to be thread-unsafe.  
There is a Py2 backport available on PyPI of the improved Python 3 
subprocess module:

http://bugs.python.org/issue20318
https://pypi.python.org/pypi/subprocess32/

-- 
 Ned Deily,
 nad@acm.org



#70652

From: Matthew Pounsett <matt.pounsett@gmail.com>
Date: 2014-04-27 07:18 -0700
Message-ID: <a1bb450b-30d7-4559-9afb-f82932657446@googlegroups.com>
In reply to: #70615
On Friday, 25 April 2014 14:58:56 UTC-4, Ned Deily  wrote:
> FWIW, the Python 2 version of subprocess is known to be thread-unsafe.  
> There is a Py2 backport available on PyPI of the improved Python 3 
> subprocess module:

Since that's the only thread that calls anything in subprocess, and I'm only running one instance of the thread, I'm not too concerned about how threadsafe subprocess is.  In this case it shouldn't matter.  Thanks for the info though.. that might be handy at some future point.




