Groups > comp.lang.python > #68920 > unrolled thread
| Started by | Chris Angelico <rosuav@gmail.com> |
|---|---|
| First post | 2014-03-25 11:19 +1100 |
| Last post | 2014-03-26 01:00 +1100 |
| Articles | 5 — 2 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: advice on sub-classing multiprocessing.Process and multiprocessing.BaseManager Chris Angelico <rosuav@gmail.com> - 2014-03-25 11:19 +1100
Re: advice on sub-classing multiprocessing.Process and multiprocessing.BaseManager matt.newville@gmail.com - 2014-03-24 20:27 -0700
Re: advice on sub-classing multiprocessing.Process and multiprocessing.BaseManager Chris Angelico <rosuav@gmail.com> - 2014-03-25 14:44 +1100
Re: advice on sub-classing multiprocessing.Process and multiprocessing.BaseManager matt.newville@gmail.com - 2014-03-25 06:34 -0700
Re: advice on sub-classing multiprocessing.Process and multiprocessing.BaseManager Chris Angelico <rosuav@gmail.com> - 2014-03-26 01:00 +1100
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-25 11:19 +1100 |
| Subject | Re: advice on sub-classing multiprocessing.Process and multiprocessing.BaseManager |
| Message-ID | <mailman.8474.1395707125.18130.python-list@python.org> |
On Tue, Mar 25, 2014 at 7:24 AM, Matt Newville <newville@cars.uchicago.edu> wrote:
> I'm maintaining a python interface to a C library for a distributed
> control system (EPICS, sort of a SCADA system) that does a large
> amount of relatively light-weight network I/O. In order to keep many
> connections open and responsive, and to provide a simple interface,
> the python library keeps a global store of connection state.
>
> This works well for single processes and threads, but not so well for
> multiprocessing, where the global state causes trouble.

From the sound of things, a single process is probably what you want
here. Is there something you can't handle with one process?

ChrisA
| From | matt.newville@gmail.com |
|---|---|
| Date | 2014-03-24 20:27 -0700 |
| Message-ID | <80cf8fb7-d0c5-43a9-bc6f-c61ce6214f98@googlegroups.com> |
| In reply to | #68920 |
On Monday, March 24, 2014 7:19:56 PM UTC-5, Chris Angelico wrote:
> On Tue, Mar 25, 2014 at 7:24 AM, Matt Newville
>
> > I'm maintaining a python interface to a C library for a distributed
> > control system (EPICS, sort of a SCADA system) that does a large
> > amount of relatively light-weight network I/O. In order to keep many
> > connections open and responsive, and to provide a simple interface,
> > the python library keeps a global store of connection state.
> >
> > This works well for single processes and threads, but not so well for
> > multiprocessing, where the global state causes trouble.
>
> From the sound of things, a single process is probably what you want
> here. Is there something you can't handle with one process?

Thanks for the reply. I find that appreciation is greatly (perhaps infinitely) delayed whenever I reply "X is probably not what you want to do" without further explanation to a question of "can I get some advice on how to do X?". So, I do thank you for your willingness to reply, even such a guaranteed-to-be-under-appreciated reply.

There are indeed operations that can't be handled with a single process, such as simultaneously using multiple cores. This is why we want to use multiprocessing instead of (or, in addition to) threading.

We're trying to do real-time collection of scientific data from a variety of data sources, generally within a LAN. The data can get largish and fast, and intermediate processing occasionally requires non-trivial computation time. So being able to launch worker processes that can run independently on separate cores would be very helpful. Ideally, we'd like to let sub-processes make calls to the control system too, say, read new data.

I wasn't really asking "is multiprocessing appropriate?" but whether there was a cleaner way to subclass multiprocessing.BaseManager() to use a subclass of Process(). I can believe the answer is No, but thought I'd ask.

Thanks again,

--Matt
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-25 14:44 +1100 |
| Message-ID | <mailman.8484.1395719089.18130.python-list@python.org> |
| In reply to | #68938 |
On Tue, Mar 25, 2014 at 2:27 PM, <matt.newville@gmail.com> wrote:
> Thanks for the reply. I find that appreciation is greatly (perhaps infinitely) delayed whenever I reply "X is probably not what you want to do" without further explanation to a question of "can I get some advice on how to do X?". So, I do thank you for your willingness to reply, even such a guaranteed-to-be-under-appreciated reply.

Heh. I do see that side of it, but the problem is that sometimes a question will be asked that implies a completely wrong approach. Take this example:

"I'm having trouble passing a global variable to a function, how can I do it?"

This exact question came up recently (I may have the wording wrong), and some of the solutions offered were horrendously convoluted messes involving passing the name of a global to the function which then used 'exec' or 'eval'. While technically that answers the question, it's much more helpful to take a step back - no, let's take a step forward - now another step back - and we're cha-cha'ing! - well, unless you're a real genius, just take the step back, and look at what you're actually trying to achieve.

I wasn't trying to imply that you absolutely ought to use a single process, but more that the exact reasons for not using one process are significant in your style of coding the multi-process method.

> There are indeed operations that can't be handled with a single process, such as simultaneously using multiple cores. This is why we want to use multiprocessing instead of (or, in addition to) threading.
>
> We're trying to do real-time collection of scientific data from a variety of data sources, generally within a LAN. The data can get largish and fast, and intermediate processing occasionally requires non-trivial computation time. So being able to launch worker processes that can run independently on separate cores would be very helpful. Ideally, we'd like to let sub-processes make calls to the control system too, say, read new data.
>
> I wasn't really asking "is multiprocessing appropriate?" but whether there was a cleaner way to subclass multiprocessing.BaseManager() to use a subclass of Process(). I can believe the answer is No, but thought I'd ask.

I've never subclassed BaseManager like this. It might be simpler to spin off one or more workers and not have them do any network communication at all; that way, you don't need to worry about the cache. Set up a process tree with one at the top doing only networking and process management (so it's always fast), and then use a multiprocessing.Queue or somesuch to pass info to a subprocess and back. Then your global connection state is all stored within the top process, and none of the others need care about it. You might have a bit of extra effort to pass info back to the parent rather than simply writing it to the connection, but that's a common requirement in other areas (eg GUI handling - it's common to push all GUI manipulation onto the main thread), so it's a common enough model.

But if subclassing and tweaking is the easiest way, and if you don't mind your solution being potentially fragile (which subclassing like that is), then you could look into monkey-patching Process. Inject your code into it and then use the original. It's not perfect, but it may turn out easier than the "subclass everything" technique.

ChrisA
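
[Editor's sketch] The process-tree layout suggested in this post could look roughly like the following. This is only a minimal sketch, not code from the thread or from pyepics: `process_reading`, `worker` and `run` are hypothetical names, and the readings are passed in as plain numbers where a real parent would fetch them from the control system.

```python
import multiprocessing as mp


def process_reading(value):
    # Hypothetical stand-in for the "non-trivial computation" on collected data.
    return value * value


def worker(tasks, results):
    # Workers never touch the network or the global connection store;
    # they only see plain values handed over by the parent.
    for value in iter(tasks.get, None):  # None is the shutdown sentinel
        results.put((value, process_reading(value)))


def run(readings, n_workers=2):
    # The parent owns all connection state and does all network I/O.
    readings = list(readings)
    tasks, results = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=worker, args=(tasks, results))
             for _ in range(n_workers)]
    for proc in procs:
        proc.start()
    for value in readings:
        tasks.put(value)
    for _ in procs:
        tasks.put(None)  # one sentinel per worker
    # Drain the results before joining so no worker blocks on a full pipe.
    out = dict(results.get() for _ in readings)
    for proc in procs:
        proc.join()
    return out
```

Calling `run([1, 2, 3, 4])` from under an `if __name__ == "__main__":` guard returns a dict mapping each reading to its processed value. The cost is the one mentioned in the post: results travel back to the parent through a queue instead of being written straight to a connection.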
| From | matt.newville@gmail.com |
|---|---|
| Date | 2014-03-25 06:34 -0700 |
| Message-ID | <8d59e3a4-e6af-4633-869b-53568f6091cd@googlegroups.com> |
| In reply to | #68943 |
ChrisA -

>> I wasn't really asking "is multiprocessing appropriate?" but whether
>> there was a cleaner way to subclass multiprocessing.BaseManager() to
>> use a subclass of Process(). I can believe the answer is No, but
>> thought I'd ask.
>
> I've never subclassed BaseManager like this. It might be simpler to
> spin off one or more workers and not have them do any network
> communication at all; that way, you don't need to worry about the
> cache. Set up a process tree with one at the top doing only networking
> and process management (so it's always fast), and then use a
> multiprocessing.Queue or somesuch to pass info to a subprocess and
> back. Then your global connection state is all stored within the top
> process, and none of the others need care about it. You might have a
> bit of extra effort to pass info back to the parent rather than simply
> writing it to the connection, but that's a common requirement in other
> areas (eg GUI handling - it's common to push all GUI manipulation onto
> the main thread), so it's a common enough model.
>
> But if subclassing and tweaking is the easiest way, and if you don't
> mind your solution being potentially fragile (which subclassing like
> that is), then you could look into monkey-patching Process. Inject
> your code into it and then use the original. It's not perfect, but it
> may turn out easier than the "subclass everything" technique.
>
> ChrisA

Thanks, I agree that restricting network communications to a parent process would be a good recommended solution, but it's hard to enforce and easy to forget such a recommendation. It seems better to provide lightweight library-specific subclasses of Process (and Pool) and explain why they should be used. This library (pyepics) already does similar things for interaction with other libraries (notably providing decorators to avoid issues with wxPython).

Monkey-patching multiprocessing.Process seems more fragile than subclassing it. It turned out that multiprocessing.pool.Pool was also very easy to subclass. But cleanly subclassing the Managers in multiprocessing.managers looks much harder. I'm not sure if this is intentional or not, or if it should be filed as an issue for multiprocessing. For now, I'm willing to say that the multiprocessing managers are not yet available with the pyepics library.

Thanks again,

--Matt
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-26 01:00 +1100 |
| Message-ID | <mailman.8522.1395756056.18130.python-list@python.org> |
| In reply to | #69023 |
On Wed, Mar 26, 2014 at 12:34 AM, <matt.newville@gmail.com> wrote:
> Monkey-patching multiprocessing.Process seems more fragile than subclassing it. It turned out that multiprocessing.pool.Pool was also very easy to subclass. But cleanly subclassing the Managers in multiprocessing.managers looks much harder. I'm not sure if this is intentional or not, or if it should be filed as an issue for multiprocessing. For now, I'm willing to say that the multiprocessing managers are not yet available with the pyepics library.

Subclassing is actually more fragile than you might think. As you've found, you need to fidget with more and more classes to make your change "stick", and also, any small change to implementation details in the superclass could suddenly break things. It's not really any safer than monkeypatching, despite all the OO fanatics saying how easy it is to rework by subclassing. At least when you monkeypatch, you *know* you're fiddling with internals.

ChrisA