| Started by | Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> |
|---|---|
| First post | 2015-02-22 12:45 +0000 |
| Last post | 2015-02-23 23:23 +0000 |
| Articles | 20 on this page of 71 — 19 participants |
Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-22 12:45 +0000
Re: Future of Pypy? jkn <jkn_gg@nicorp.f9.co.uk> - 2015-02-22 04:58 -0800
Re: Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-22 15:30 +0000
[OT] - BASIC is still not a bad choice, was Re: Future of Pypy? Michael Torrie <torriem@gmail.com> - 2015-02-23 17:24 -0700
Re: Future of Pypy? Laura Creighton <lac@openend.se> - 2015-02-22 14:27 +0100
Re: Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-22 15:36 +0000
Re: Future of Pypy? Laura Creighton <lac@openend.se> - 2015-02-22 18:22 +0100
Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-22 11:02 -0800
Re: Future of Pypy? Laura Creighton <lac@openend.se> - 2015-02-22 20:51 +0100
Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-22 12:14 -0800
Re: Future of Pypy? Laura Creighton <lac@openend.se> - 2015-02-22 23:13 +0100
Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-22 18:45 -0800
Re: Future of Pypy? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-23 12:18 +1100
Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-22 18:04 -0800
Re: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-23 13:16 +1100
Re: Future of Pypy? Ryan Stuart <ryan.stuart.85@gmail.com> - 2015-02-23 03:16 +0000
Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-22 19:45 -0800
Re: Future of Pypy? Ryan Stuart <ryan.stuart.85@gmail.com> - 2015-02-23 04:00 +0000
Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-22 22:13 -0800
Re: Future of Pypy? Ryan Stuart <ryan.stuart.85@gmail.com> - 2015-02-23 07:32 +0000
Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-23 16:11 -0800
Re: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-24 11:31 +1100
Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-23 17:50 -0800
Re: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-24 13:03 +1100
Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-23 20:40 -0800
Re: Future of Pypy? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-24 17:57 +1100
Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-27 13:40 -0800
Re: Future of Pypy? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-02-27 18:47 -0500
Are threads bad? - was: Future of Pypy? Ryan Stuart <ryan.stuart.85@gmail.com> - 2015-02-24 00:35 +0000
Re: Are threads bad? - was: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-23 21:27 -0800
Re: Are threads bad? - was: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-24 16:57 +1100
Re: Are threads bad? - was: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-23 22:23 -0800
Re: Are threads bad? - was: Future of Pypy? Marko Rauhamaa <marko@pacujo.net> - 2015-02-24 10:08 +0200
Re: Are threads bad? - was: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-24 15:53 -0800
Re: Are threads bad? - was: Future of Pypy? Marko Rauhamaa <marko@pacujo.net> - 2015-02-25 07:25 +0200
Re: Are threads bad? - was: Future of Pypy? Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com> - 2015-02-25 13:34 +0800
Re: Are threads bad? - was: Future of Pypy? Marko Rauhamaa <marko@pacujo.net> - 2015-02-25 07:46 +0200
Re: Are threads bad? - was: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-25 16:54 +1100
Re: Are threads bad? - was: Future of Pypy? Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com> - 2015-02-25 13:58 +0800
Re: Are threads bad? - was: Future of Pypy? Ian Kelly <ian.g.kelly@gmail.com> - 2015-02-24 23:02 -0700
Re: Are threads bad? - was: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-25 17:07 +1100
Re: Are threads bad? - was: Future of Pypy? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-25 16:37 +0000
Re: Are threads bad? - was: Future of Pypy? Ian Kelly <ian.g.kelly@gmail.com> - 2015-02-25 10:00 -0700
Re: Are threads bad? - was: Future of Pypy? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-25 17:16 +0000
Re: Are threads bad? - was: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-26 04:22 +1100
Re: Are threads bad? - was: Future of Pypy? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-02-25 19:44 -0500
Re: Are threads bad? - was: Future of Pypy? Ryan Stuart <ryan.stuart.85@gmail.com> - 2015-02-25 00:59 +0000
Re: Are threads bad? - was: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-26 21:55 -0800
Re: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-23 14:25 +1100
Re: Future of Pypy? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-23 18:41 +1100
Re: Future of Pypy? Marko Rauhamaa <marko@pacujo.net> - 2015-02-23 10:16 +0200
Re: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-23 20:19 +1100
Re: Future of Pypy? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-24 17:56 +1100
Re: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-24 18:16 +1100
Re: Future of Pypy? wxjmfauth@gmail.com - 2015-02-23 23:57 -0800
Re: Future of Pypy? Ethan Furman <ethan@stoneleaf.us> - 2015-02-23 11:39 -0800
Re: Future of Pypy? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-24 13:15 +1100
Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-23 17:47 -0800
Re: Future of Pypy? Marko Rauhamaa <marko@pacujo.net> - 2015-02-24 10:12 +0200
Re: Future of Pypy? Emile van Sebille <emile@fenx.com> - 2015-02-24 09:57 -0800
Re: Future of Pypy? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-23 01:05 +1100
Re: Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-22 15:44 +0000
Re: Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-22 19:20 +0000
Re: Future of Pypy? Laura Creighton <lac@openend.se> - 2015-02-22 22:45 +0100
Re: Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-23 14:04 +0000
Re: Future of Pypy? Laura Creighton <lac@openend.se> - 2015-02-23 17:16 +0100
Re: Future of Pypy? Terry Reedy <tjreedy@udel.edu> - 2015-02-23 01:34 -0500
Re: Future of Pypy? Dave Cook <davecook@nowhere.net> - 2015-02-23 11:36 +0000
Re: Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-23 14:13 +0000
Cython - was: Future of Pypy? Stefan Behnel <stefan_ml@behnel.de> - 2015-02-23 16:43 +0100
Re: Future of Pypy? Dave Cook <davecook@nowhere.net> - 2015-02-23 23:23 +0000
| From | Paul Rubin <no.email@nospam.invalid> |
|---|---|
| Date | 2015-02-23 16:11 -0800 |
| Message-ID | <87bnkkb22u.fsf@jester.gateway.pace.com> |
| In reply to | #86199 |
Ryan Stuart <ryan.stuart.85@gmail.com> writes:
> Threads can also share read-only data and you can pass arbitrary
> objects (such as code callables that you want the other thread to
> execute--this is quite useful) through Queue.Queue. I don't think
> you can do that with the multiprocessing module.
>
> These things might be convenient but they are error prone for the
> reasons pointed out.

I don't see the error-proneness since nothing there seems to set off
mutation of shared data.

> Also, the majority can be achieved via the process approach. For
> example, using fork to take a copy of the current process (including
> the heap) you want to use will give you access to any callables on the
> heap.

What if you want to dynamically construct a callable and send it to
another process?

> Even if you are extra careful to not touch any shared state in your
> code, you can almost be guaranteed that code higher up the stack, like
> malloc for example, *will* be using shared state.

This isn't the 1980's any more--any serious malloc implementation these
days is thread safe. People write multi-threaded C programs all the
time and those programs use malloc in more than one thread.

> Even if you aren't sharing state in your code directly, code higher up
> the stack will be sharing state. That is the whole point of a thread,
> that's what they were invented for. Using threads safely might well
> be impossible much less verifiable.

You're basically saying it's impossible to write a reliable operating
system, since OS's by nature have to do that stuff. Of course there are
verified OS's, and some of the early pioneers in concurrency were the
same guys who worked in program verification, e.g. Dijkstra's
semaphores. Even Erlang uses data sharing under the hood (ETS tables
and large binaries) though their API makes it look like the data is
copied between processes.

What I'd say is that multi-threaded programs tend to have miniature OS's
inside them, so it helps to have had some exposure to OS implementation
techniques if you're going to write this kind of code. But if you've
had that exposure then it all becomes less scary.

> So when there are other options that are just as viable/functional,
> result in far less risk and are often much quicker to implement
> correctly, why wouldn't you use them?

I should give the multiprocessing module a try sometime (haven't used it
so far because it's relatively new and I'm comfortable with threads).
It has the disadvantages that I noted, though.

> If it were easy to use threads in a verifiably safe manner, then there
> probably wouldn't be a GIL.

Nah, the GIL is just a CPython artifact. As Steven says, IronPython and
Jython don't have GIL's. Java has no GIL, OCaml has no GIL, GHC has no
GIL, etc. Someone made a CPython version with no GIL some years ago and
it worked fine and it got a speedup on multiple cores. The only problem
was that on a single core, it was significantly slower than regular
CPython, specifically because of the overhead of having to lock all the
refcount updates, so it was considered a failure. Laura Creighton may
have more to say about this, but I've been under the impression that the
main obstacle to getting rid of the CPython GIL is the refcount system
(which is also easy to make mistakes with, by the way). That's why I
was surprised to hear that PyPy has a GIL.
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-02-24 11:31 +1100 |
| Message-ID | <mailman.19110.1424737905.18130.python-list@python.org> |
| In reply to | #86276 |
On Tue, Feb 24, 2015 at 11:11 AM, Paul Rubin <no.email@nospam.invalid> wrote:
> What if you want to dynamically construct a callable and send it to
> another process?

I'm not sure what that would actually mean. Do you try to construct it
out of code that already exists in the other process? Are you passing
actual code to the other process? Does the callable, when called,
actually execute in the calling process? And what about its context -
its globals, and possibly nonlocals (if it's a closure)?

ChrisA
| From | Paul Rubin <no.email@nospam.invalid> |
|---|---|
| Date | 2015-02-23 17:50 -0800 |
| Message-ID | <871tlgaxi7.fsf@jester.gateway.pace.com> |
| In reply to | #86280 |
Chris Angelico <rosuav@gmail.com> writes:
>> What if you want to dynamically construct a callable and send it to
>> another process?
> I'm not sure what that would actually mean. Do you try to construct it
> out of code that already exists in the other process? Are you passing
> actual code to the other process?

I gave an example in a reply to Steven, something like

    other_thread_queue.put(lambda x: x*x)

to tell the other thread it is supposed to square something. It
receives a callable and calls it in its own context.
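For readers following along, here is a minimal sketch of the pattern described above, written for Python 3, where the module is `queue` rather than the `Queue.Queue` mentioned earlier in the thread. The names `worker`, `tasks`, and `results` are hypothetical and stand in for `other_thread_queue` and whatever loop consumes it; this is one plausible shape, not the poster's actual code.

```python
import queue
import threading

def worker(tasks, results):
    """Pull (callable, argument) pairs off the queue until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:          # sentinel: stop the loop
            break
        func, arg = item
        results.put(func(arg))    # the callable runs in the worker thread

tasks = queue.Queue()             # hypothetical stand-in for other_thread_queue
results = queue.Queue()
t = threading.Thread(target=worker, args=(tasks, results))
t.start()

tasks.put((lambda x: x * x, 7))   # the lambda is passed by reference, not copied
tasks.put(None)
t.join()

squared = results.get()
print(squared)
```

Because both threads live in one address space, the queue only hands over a reference to the function object; nothing is serialized.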
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-02-24 13:03 +1100 |
| Message-ID | <mailman.19113.1424743389.18130.python-list@python.org> |
| In reply to | #86284 |
On Tue, Feb 24, 2015 at 12:50 PM, Paul Rubin <no.email@nospam.invalid> wrote:
> Chris Angelico <rosuav@gmail.com> writes:
>>> What if you want to dynamically construct a callable and send it to
>>> another process?
>> I'm not sure what that would actually mean. Do you try to construct it
>> out of code that already exists in the other process? Are you passing
>> actual code to the other process?
>
> I gave an example in a reply to Steven, something like
>
> other_thread_queue.put(lambda x: x*x)
>
> to tell the other thread it is supposed to square something. It
> receives a callable and calls it in its own context.
So, you would have to pass code to the other process, probably. What about this:

    y = 4
    other_thread_queue.put(lambda x: x*y)

Or this:

    y = [4]
    def next_y():
        y[0] += 1
        return y[0]
    other_thread_queue.put(next_y)

It may not be obvious with your squaring example, but every Python
function has its context (module globals, etc). You can't pass a
function around without also passing, or sharing, its data.

With threads in a single process, this isn't a problem. They all
access the same memory space, so they can all share state. As soon as
you go to separate processes, these considerations become serious.

ChrisA
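One concrete way the process boundary shows up: the standard `multiprocessing` queues serialize objects with `pickle`, and `pickle` stores functions by name rather than by code, so an anonymous lambda cannot be sent that way. The sketch below only probes `pickle` directly; the helper name `can_pickle` is made up for illustration.

```python
import pickle

def can_pickle(obj):
    """Return True if pickle can serialize obj, False if it refuses."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

y = 4
plain_lambda_ok = can_pickle(lambda x: x * x)    # no importable name to refer to
closure_lambda_ok = can_pickle(lambda x: x * y)  # same problem, plus the free variable y

print(plain_lambda_ok, closure_lambda_ok)
```

Third-party serializers exist that ship the code object itself, but even then the question raised above remains: the function's globals and closure cells have to be copied or shared somehow.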
| From | Paul Rubin <no.email@nospam.invalid> |
|---|---|
| Date | 2015-02-23 20:40 -0800 |
| Message-ID | <87mw43apmf.fsf@jester.gateway.pace.com> |
| In reply to | #86285 |
Chris Angelico <rosuav@gmail.com> writes:
> So, you would have to pass code to the other process, probably. What
> about this:
> y = 4
> other_thread_queue.put(lambda x: x*y)

the y in the lambda is a free variable that's a reference to the
surrounding mutable context, so that's at best dubious. You could use:

    other_thread_queue.put(lambda x, y=y: x*y)

> Or this:
>
> y = [4]
> def next_y():
>     y[0] += 1
>     return y[0]
> other_thread_queue.put(next_y)

There you have shared mutable data, which isn't allowed in this style.

> It may not be obvious with your squaring example, but every Python
> function has its context (module globals, etc). You can't pass a
> function around without also passing, or sharing, its data.

That is ok as long as the data can't change.

> With threads in a single process, this isn't a problem. They all
> access the same memory space, so they can all share state. As soon as
> you go to separate processes, these considerations become serious.

Right, that's a limitation of processes compared to threads.
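The difference between the two lambdas can be seen without any threads at all. In this small sketch (the function name `demo` is hypothetical), the first lambda looks `y` up when it is called, while the `y=y` default freezes the value at definition time:

```python
def demo():
    y = 4
    late = lambda x: x * y          # free variable: y is looked up at call time
    bound = lambda x, y=y: x * y    # default argument: current value of y is captured now

    y = 10                          # some other code rebinds y afterwards

    return late(3), bound(3)

late_result, bound_result = demo()
print(late_result, bound_result)
```

Note that the default-argument trick only snapshots the binding. If `y` referred to a mutable object such as a list, both lambdas would still observe in-place changes to it, which is the shared-mutable-data case ruled out above.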
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-02-24 17:57 +1100 |
| Message-ID | <54ec20d0$0$11103$c3e8da3@news.astraweb.com> |
| In reply to | #86291 |
Paul Rubin wrote:
>> With threads in a single process, this isn't a problem. They all
>> access the same memory space, so they can all share state. As soon as
>> you go to separate processes, these considerations become serious.
>
> Right, that's a limitation of processes compared to threads.
>

I think the point is that it's not a *limitation* of processes, but a
*feature* of processes that they don't share state. (Well, I think there
are explicit ways to have shared memory, but that's another story.)

An interesting point of view: threading is harmful because it removes
determinism from your program.

http://radar.oreilly.com/2007/01/threads-considered-harmful.html

As I once wrote:

    A programmer had a problem, and thought Now he has "I know, I'll solve
    two it with threads!" problems.

http://code.activestate.com/lists/python-list/634273/

Some discussion of the pros and cons of threading:

http://c2.com/cgi/wiki?ThreadsConsideredHarmful

--
Steven
| From | Paul Rubin <no.email@nospam.invalid> |
|---|---|
| Date | 2015-02-27 13:40 -0800 |
| Message-ID | <878ufj824g.fsf@jester.gateway.pace.com> |
| In reply to | #86301 |
Steven D'Aprano <steve+comp.lang.python@pearwood.info> writes:
> An interesting point of view: threading is harmful because it removes
> determinism from your program.
> http://radar.oreilly.com/2007/01/threads-considered-harmful.html

Concurrent programs are inherently nondeterministic because they respond
to i/o events that can happen in any order. I looked at the paper cited
in that article and it seemed like handwaving. Then it talks about
threaded programs being equivalent if they are the same over all
interleavings of input, and then goes on about that being horribly
difficult to establish. It talked about program inputs as infinite
sequences of bits. OK, a standard conceit in mathematical logic is to
call an infinite sequence of bits a "real number". So it seems to me
that such a proof would just be a theorem about real numbers or sets of
real numbers, and freshman calculus classes are already full of proofs
like that. The presence of large sets doesn't necessarily make math all
that much harder. The test suite for HOL Light actually uses an
inaccessible cardinal, if that means anything to you.

IOW he says it's difficult and maybe it is, but he doesn't make any
attempt to explain why it's difficult, at least once there's some tools
(synchronization primitives etc.) to control the concurrency. He seems
instead to ignore decades of work going back to Dijkstra and Wirth and
those guys. It would be a lot more convincing if he addressed that
existing literature and said why it wasn't good enough to help write
real programs that work.

He then advocates something he calls the "PN model" (processes
communicating by message passing) but that seems about the same as what
I've heard called communicating sequential processes (CSP), which are
the execution model of Erlang and is what I've been using with Python
threads and queues. Maybe there's some subtle difference. Anyway
there's again plenty of theory about CSP, which are modelled with
Pi-calculus (process calculus) which can be interpreted in lambda
calculus, so sequential verification techniques are still useful on it.

Hmm, I see there's a Wikipedia article "Kahn process networks" about PN
networks as mentioned, so I guess I'll look at it. I see it claims a
KPN is deterministic on its inputs, while I think CSP's might not be.

> Some discussion of the pros and cons of threading:
> http://c2.com/cgi/wiki?ThreadsConsideredHarmful

This wasn't very informative either.
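To make the determinism claim about Kahn process networks concrete, here is a rough sketch using Python threads and `queue.Queue` as the FIFO channels. It assumes the usual KPN restrictions: each channel has one writer and one reader, reads block on a single named channel, and a process may not poll a channel or wait on several at once. The function names and the `None` end marker are illustrative choices, not part of any library.

```python
import queue
import threading

def producer(out, values):
    """Write each value to the output channel, then a None end marker."""
    for v in values:
        out.put(v)
    out.put(None)

def merge_alternating(a, b, out):
    """Read strictly alternately from a and b; each get() blocks on one channel."""
    while True:
        x = a.get()
        y = b.get()
        if x is None or y is None:
            break
        out.append((x, y))

def run_once():
    a, b, out = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=producer, args=(a, [1, 2, 3])),
        threading.Thread(target=producer, args=(b, ["x", "y", "z"])),
        threading.Thread(target=merge_alternating, args=(a, b, out)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out

# The thread scheduler may interleave the producers differently on each run,
# but the consumer's output does not depend on that interleaving.
runs = [run_once() for _ in range(20)]
print(runs[0])
```

The nondeterminism returns as soon as the consumer is allowed to take "whichever channel has data first", for example with a non-blocking `get_nowait()` or a single queue shared by both producers, which is closer to what CSP-style choice permits.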
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2015-02-27 18:47 -0500 |
| Message-ID | <mailman.19325.1425080856.18130.python-list@python.org> |
| In reply to | #86595 |
On Fri, 27 Feb 2015 13:40:15 -0800, Paul Rubin <no.email@nospam.invalid>
declaimed the following:
>Steven D'Aprano <steve+comp.lang.python@pearwood.info> writes:
>> An interesting point of view: threading is harmful because it removes
>> determinism from your program.
>> http://radar.oreilly.com/2007/01/threads-considered-harmful.html
>
>Concurrent programs are inherently nondeterministic because they respond
>to i/o events that can happen in any order. I looked at the paper cited
And that aspect (nondeterministic) applies whether one is using threads
or processes to handle the I/O -- except, possibly, in the types of
architectures used for aircraft systems: fixed time slices for "partitions"
(which /may/ run threads internally on a partition OS, but the entire
partition gets scheduled as a chunk by an overarching OS); no dynamic
memory allocations (and no freeing either) once the system transitions from
"startup" to "running" (any message queues have all "entries" pre-allocated
in a list); dedicated message queues for data transfer between partitions
vs within a partition, etc. Oh, and some designs require a partition to
essentially complete all processing within some total period of CPU time
(say, three partition time slices) and then, in effect, start over from the
beginning.
>Hmm, I see there's a Wikipedia article "Kahn process networks" about PN
>networks as mentioned, so I guess I'll look at it. I see it claims a
>KPN is deterministic on its inputs, while I think CSP's might not be.
>
Oddly, KPN wouldn't fit the hard realtime of aircraft systems -- it
fails on the "unbounded FIFOs".
Wikipedia gives "deterministic" as (paraphrased) same inputs give the
same outputs -- but does not take into account the timing of the system.
KPN> Hence, timing of the processes does not affect outputs of the system.
Deterministic /timing/ is a factor in aircraft systems.
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/
| From | Ryan Stuart <ryan.stuart.85@gmail.com> |
|---|---|
| Date | 2015-02-24 00:35 +0000 |
| Subject | Are threads bad? - was: Future of Pypy? |
| Message-ID | <mailman.19111.1424738135.18130.python-list@python.org> |
| In reply to | #86276 |
On Tue Feb 24 2015 at 10:15:40 AM Paul Rubin <no.email@nospam.invalid> wrote:
>
> I don't see the error-proneness since nothing there seems to set off
> mutation of shared data.
>

I'm not sure what else to say really. It's just a fact of life that
Threads by definition run in the same memory space and hence always have
the possibility of nasty unforeseen problems. They are unforeseen
because it is extremely difficult (maybe impossible?) to try and map out
and understand all the different possible mutations to state.

Sure, your code might not be making any mutations (that you know of),
but malloc definitely is [1], and that's just the tip of the iceberg.
Other things like buffers for stdin and stdout, DNS resolution etc. all
have the same issue.

I have no doubt someone can come up with a scenario where they need to
use threads. I can't come up with one myself, but maybe someone else
can. But in the work I have done, processes have sufficed - even for the
example of dynamic callables you gave.

To borrow from the original article I linked - "Nevertheless I still
think it’s a bad idea to make things harder for ourselves if we can
avoid it."

Cheers

[1] Line 70 of glibc malloc -
https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/arena.c;h=8af51f05eb376ae2ba07e99c8c766a8ae8af425b;hb=bdf1ff052a8e23d637f2c838fa5642d78fcedc33#l70

>
> > Also, the majority can be achieved via the process approach. For
> > example, using fork to take a copy of the current process (including
> > the heap) you want to use will give you access to any callables on the
> > heap.
>
> What if you want to dynamically construct a callable and send it to
> another process?
>
> > Even if you are extra careful to not touch any shared state in your
> > code, you can almost be guaranteed that code higher up the stack, like
> > malloc for example, *will* be using shared state.
>
> This isn't the 1980's any more--any serious malloc implementation these
> days is thread safe. People write multi-threaded C programs all the
> time and those programs use malloc in more than one thread.
>
> > Even if you aren't sharing state in your code directly, code higher up
> > the stack will be sharing state. That is the whole point of a thread,
> > that's what they were invented for. Using threads safely might well
> > be impossible much less verifiable.
>
> You're basically saying it's impossible to write a reliable operating
> system, since OS's by nature have to do that stuff. Of course there are
> verified OS's, and some of the early pioneers in concurrency were the
> same guys who worked in program verification, e.g. Dijkstra's
> semaphores. Even Erlang uses data sharing under the hood (ETS tables
> and large binaries) though their API makes it look like the data is
> copied between processes.
>
> What I'd say is that multi-threaded programs tend to have miniature OS's
> inside them, so it helps to have had some exposure to OS implementation
> techniques if you're going to write this kind of code. But if you've
> had that exposure then it all becomes less scary.
>
> > So when there are other options that are just as viable/functional,
> > result in far less risk and are often much quicker to implement
> > correctly, why wouldn't you use them?
>
> I should give the multiprocessing module a try sometime (haven't used it
> so far because it's relatively new and I'm comfortable with threads).
> It has the disadvantages that I noted, though.
>
> > If it were easy to use threads in a verifiably safe manner, then there
> > probably wouldn't be a GIL.
>
> Nah, the GIL is just a CPython artifact. As Steven says, IronPython and
> Jython don't have GIL's. Java has no GIL, OCaml has no GIL, GHC has no
> GIL, etc. Someone made a CPython version with no GIL some years ago and
> it worked fine and it got a speedup on multiple cores. The only problem
> was that on a single core, it was significantly slower than regular
> CPython, specifically because of the overhead of having to lock all the
> refcount updates, so it was considered a failure. Laura Creighton may
> have more to say about this, but I've been under the impression that the
> main obstacle to getting rid of the CPython GIL is the refcount system
> (which is also easy to make mistakes with, by the way). That's why I
> was surprised to hear that PyPy has a GIL.
> --
> https://mail.python.org/mailman/listinfo/python-list
>
| From | Paul Rubin <no.email@nospam.invalid> |
|---|---|
| Date | 2015-02-23 21:27 -0800 |
| Subject | Re: Are threads bad? - was: Future of Pypy? |
| Message-ID | <87lhjnang1.fsf@jester.gateway.pace.com> |
| In reply to | #86281 |
Ryan Stuart <ryan.stuart.85@gmail.com> writes:
> I'm not sure what else to say really. It's just a fact of life that
> Threads by definition run in the same memory space and hence always
> have the possibility of nasty unforeseen problems. They are unforeseen
> because it is extremely difficult (maybe impossible?) to try and map
> out and understand all the different possible mutations to state.

Sure, the shared memory introduces the possibility of some bad errors,
I'm just saying that I've found that by staying with a certain
straightforward style, it doesn't seem difficult in practice to avoid
those errors.

> Sure, your code might not be making any mutations (that you know of),
> but malloc definitely is [1], and that's just the tip of the iceberg.
> Other things like buffers for stdin and stdout, DNS resolution etc.
> all have the same issue.

I don't understand what you mean about malloc. I looked at that code
and there's a mutex to make multi-threaded programs work right, and an
ifdef (maybe for better performance) to use different code if there are
no threads. IOW they spent a bunch of time handling threads. Are you
saying there's a bug?

Re stdin/stdout: obviously you can't have multiple threads messing with
the same fd's; that's the same thing as data sharing.

Re DNS: if gethostbyname isn't thread-safe I'd think of that as a pretty
bad bug. But I'm having a vague memory of having had an issue with this
though, and IIRC it took part of a morning to figure out what was going
on, annoying but not a multi-month bug-hunt or anything like that. It
didn't happen on my workstation, but only on the embedded target that
was probably running an old or weird libc.

> To borrow from the original article I linked - "Nevertheless I still
> think it’s a bad idea to make things harder for ourselves if we can
> avoid it."

That article was interesting in some ways but confused in others. One
way it was interesting is it said various non-thread approaches (such as
coroutines) had about the same problems as threads. Some ways it was
confused were:

1) thinking Haskell threads were like processes with separate address
spaces. In fact they are in the same address space and programming with
them isn't all that different from Python threads, though the
synchronization primitives are a bit different. There is also an STM
library available that is ingenious though apparently somewhat slow.

2) it has a weird story about the brass cockroach, that basically
signified that they didn't have a robust enough testing system to be
able to reproduce the bug. That is what they should have worked on.

3) It goes into various hazards of the balance transfer example not
mentioning that STM (available in Haskell and Clojure) completely solves
it.

4) It says: "eventually a system which communicates exclusively through
non-blocking queues effectively becomes a set of communicating event
loops, and its problems revert to those of an event-driven system; it
doesn’t look like regular programming with threads any more." That is
essentially what an Erlang program is, and it misses the fact that those
low-level event loops can use blocking operations to their heart's
content, without the inversion of control (callback spaghetti) of
traditional evented systems (I haven't used asyncio yet). Also, the
low-level loops can run in parallel on multiple cores, while an
asyncio-style coroutine loop is sequential under the skin. In
Erlang/OTP, you don't even see the event loops directly, since they are
abstracted away by the OTP framework and it looks like RPC calls at the
application level. But, it helps to know what is going on underneath.

I'm realizing some people program Python in an ultra-dynamic style where
the mutability of modules, functions, etc. really comes into play, so
that makes threads much more dangerous. I've tended to write Python with
much less dynamism or even as if it were statically typed, so maybe that
helps.

Anyway, I got one thing out of this, which is that the multiprocessing
module looks pretty nice and I should try it even when I don't need
multicore parallelism, so thanks for that.

In reality though, Python is still my most productive language for
throwaway, non-concurrent scripting, but for more complicated concurrent
programs, alternatives like Haskell, Erlang, and Go all have significant
attractions.
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-02-24 16:57 +1100 |
| Subject | Re: Are threads bad? - was: Future of Pypy? |
| Message-ID | <mailman.19118.1424757423.18130.python-list@python.org> |
| In reply to | #86294 |
On Tue, Feb 24, 2015 at 4:27 PM, Paul Rubin <no.email@nospam.invalid> wrote:
>> Sure, your code might not be making any mutations (that you know of),
>> but malloc definitely is [1], and that's just the tip of the iceberg.
>> Other things like buffers for stdin and stdout, DNS resolution etc.
>> all have the same issue.
>
> Re stdin/stdout: obviously you can't have
> multiple threads messing with the same fd's; that's the same thing as
> data sharing.

Actually, you can quite happily have multiple threads messing with the
underlying file descriptors, that's not a problem. (Though you will
tend to get interleaved output. But if you always produce output in
single blocks of text that each contain one line with a trailing
newline, you should see interleaved lines that are each individually
correct. I'm also not sure of any sane way to multiplex stdin -
merging output from multiple threads is fine, but dividing input
between multiple threads is messy.) The problem is *buffers* for stdin
and stdout, where you have to be absolutely sure that you're not
trampling all over another thread's data structures. If you unbuffer
your output, it's probably going to be thread-safe.

ChrisA
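As a rough illustration of the "one complete line per write" idea, the sketch below has several threads append whole lines to a shared stream. It swaps in a different technique from the unbuffered output suggested above: an explicit `threading.Lock` guards a buffered in-memory stream (`io.StringIO`, used here only so the result can be inspected). The names `emit`, `worker`, and `out_lock` are hypothetical.

```python
import io
import threading

out = io.StringIO()            # stand-in for sys.stdout so the result can be checked
out_lock = threading.Lock()

def emit(line):
    """Write one complete line while holding the lock, so lines never split."""
    with out_lock:
        out.write(line + "\n")

def worker(name, count):
    for i in range(count):
        emit(f"{name}: message {i}")

threads = [threading.Thread(target=worker, args=(n, 100)) for n in ("a", "b", "c")]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = out.getvalue().splitlines()
print(len(lines))
```

The order in which lines from different threads appear is still up to the scheduler; the lock only guarantees that each line arrives intact and that the stream's internal state is touched by one thread at a time.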
| From | Paul Rubin <no.email@nospam.invalid> |
|---|---|
| Date | 2015-02-23 22:23 -0800 |
| Subject | Re: Are threads bad? - was: Future of Pypy? |
| Message-ID | <87h9ubakul.fsf@jester.gateway.pace.com> |
| In reply to | #86297 |
Chris Angelico <rosuav@gmail.com> writes:
> Actually, you can quite happily have multiple threads messing with the
> underlying file descriptors,... The problem is *buffers* for stdin
> and stdout, where you have to be absolutely sure that you're not
> trampling all over another thread's data structures.

Oh ok, sure, yeah, the distinction is valid. I guess the classic
interleaved "print 'a'" "print 'b'" loops could crash if the stdio
structures get corrupted through conflicting updates somehow.
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2015-02-24 10:08 +0200 |
| Subject | Re: Are threads bad? - was: Future of Pypy? |
| Message-ID | <87bnkjenpp.fsf@elektro.pacujo.net> |
| In reply to | #86297 |
Chris Angelico <rosuav@gmail.com>:
> Actually, you can quite happily have multiple threads messing with the
> underlying file descriptors, that's not a problem. (Though you will
> tend to get interleaved output. But if you always produce output in
> single blocks of text that each contain one line with a trailing
> newline, you should see interleaved lines that are each individually
> correct. I'm also not sure of any sane way to multiplex stdin -
> merging output from multiple threads is fine, but dividing input
> between multiple threads is messy.) The problem is *buffers* for stdin
> and stdout, where you have to be absolutely sure that you're not
> trampling all over another thread's data structures. If you unbuffer
> your output, it's probably going to be thread-safe.
Here's an anecdote describing one real-life threading problem. We had a
largish multithreading framework (in Java, but I'm setting it in Python
and in a much simplified form).
We were mindful of deadlocks caused by lock reversal so we had come up
with a policy whereby objects form a layered hierarchy. An object higher
up in the hierarchy was allowed to call methods of objects below while
holding locks. The opposite was not allowed; if an object desired to
call a method of an object above it (through a registered callback), it
had to relinquish all locks before doing so.
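A minimal sketch of the "relinquish all locks before calling upward" rule, under stated assumptions: the class `Lower` and its methods `register`, `fire`, and `event_count` are hypothetical names invented for illustration, and the lock is deliberately non-reentrant so that a violation would hang. State is changed and the callback list is copied while the lock is held; the callbacks themselves run only after the lock is released.

```python
import threading


class Lower:
    """Hypothetical lower-layer object that never calls upward under its lock."""

    def __init__(self):
        self._lock = threading.Lock()  # non-reentrant on purpose
        self._listeners = []
        self._events = 0

    def register(self, callback):
        with self._lock:
            self._listeners.append(callback)

    def event_count(self):
        with self._lock:
            return self._events

    def fire(self):
        # Mutate state and snapshot the callbacks while holding the lock...
        with self._lock:
            self._events += 1
            pending = list(self._listeners)
        # ...then call upward only after the lock has been released.
        for callback in pending:
            callback(self)


if __name__ == "__main__":
    lower = Lower()
    # The callback re-enters Lower; this would deadlock if fire() still held the lock.
    lower.register(lambda obj: print("events so far:", obj.event_count()))
    lower.fire()
```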
However, a situation like this arose:
    class App:
        def send_stream(self, sock):
            with self.lock:
                self.register_socket(sock)
                class SocketWrapper:
                    def read(_, count):
                        return sock.recv(count)
                    def close(_):
                        sock.close()
                        with self.lock:
                            self.unregister_socket(sock)
                self.transport.forward_and_close(SocketWrapper(sock))

    class Transport:
        def forward_and_close(self, readable):
            with self.lock:
                more = readable.read(1000)
                if more is WOULDBLOCK:
                    self.reschedule(readable)
                elif more:
                    ... # out of scope for the anecdote
                else:
                    # EOF reached
                    readable.close()
Now the dreaded lock reversal arises when the App object calls
self.transport.forward_and_close() and Transport calls readable.close()
at the same time.
So why lock categorically like that? Java has a handy "synchronized"
keyword that wraps the whole method in "with self.lock". Ideally, that
handy idiom could be employed methodically. More importantly, to avoid
locking problems, the methodology should be rigorous and mindless. If
the developer must perform a deep locking analysis at every turn, they
are bound to make mistakes, especially when more than one developer is
involved, with differing intuitions.
Unfortunately, that deep locking analysis *is* required at every turn,
and mistakes *are* bound to happen.
Marko
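The lock reversal in the anecdote can be reproduced in isolation. The following is a sketch under stated assumptions, not the framework's code: two plain locks stand in for `App.lock` and `Transport.lock`, the function name `demonstrate_lock_reversal` is made up, and `acquire(timeout=...)` is used so that the demonstration reports the conflict instead of hanging. Barriers force the unlucky interleaving: each thread holds its first lock before either tries for the second.

```python
import threading


def demonstrate_lock_reversal(timeout=0.2):
    app_lock = threading.Lock()        # stands in for App.lock
    transport_lock = threading.Lock()  # stands in for Transport.lock
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    results = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()        # both threads now hold one lock each
            got = second.acquire(timeout=timeout)
            results[name] = got           # False means "would have deadlocked"
            if got:
                second.release()
            both_tried_second.wait()      # keep holding `first` until both have tried

    threads = [
        threading.Thread(target=worker, args=("app", app_lock, transport_lock)),
        threading.Thread(target=worker, args=("transport", transport_lock, app_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(demonstrate_lock_reversal())
```

With blocking acquires in place of the timeouts, both threads would wait forever, which is the failure described above.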
| From | Paul Rubin <no.email@nospam.invalid> |
|---|---|
| Date | 2015-02-24 15:53 -0800 |
| Subject | Re: Are threads bad? - was: Future of Pypy? |
| Message-ID | <8761aqamss.fsf@jester.gateway.pace.com> |
| In reply to | #86304 |
Marko Rauhamaa <marko@pacujo.net> writes:
> So why lock categorically like that? Java has a handy "synchronized"
> keyword that wraps the whole method in "with self.lock". ...
> Unfortunately, that deep locking analysis *is* required at every turn,
> and mistakes *are* bound to happen.
I wonder if synchronized was a mistake in Java. It confused me a lot
when I tried to use it, but I never had to mess with it that much. I
can't quite tell what your code is doing (why is it attempting a
socket read with a lock held) and I'd be interested to see what an STM
version would look like.
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2015-02-25 07:25 +0200 |
| Subject | Re: Are threads bad? - was: Future of Pypy? |
| Message-ID | <87zj82bm15.fsf@elektro.pacujo.net> |
| In reply to | #86362 |
Paul Rubin <no.email@nospam.invalid>:
> Marko Rauhamaa <marko@pacujo.net> writes:
>> So why lock categorically like that? Java has a handy "synchronized"
>> keyword that wraps the whole method in "with self.lock". ...
>> Unfortunately, that deep locking analysis *is* required at every turn,
>> and mistakes *are* bound to happen.
>
> I wonder if synchronized was a mistake in Java. It confused me a lot
> when I tried to use it, but I never had to mess with it that much. I
> can't quite tell what your code is doing (why is it attempting a
> socket read with a lock held) and I'd be interested to see what an STM
> version would look like.
Synchronized methods are actually quite a powerful idea, but a bit
underappreciated by many Java developers.
The main point of my example is this:
 * Effective thread-safe programming would require some guidelines, a
   routine, a methodology so the code would be free of locking
   anomalies as a matter of course.
 * The industry hasn't formed such guidelines that are generally
   followed. That makes it difficult to glue together systems from
   components that come from different sources.
 * Even the best of guidelines have significant, surprising corner
   cases that require special treatment and are easy to get wrong.
 * As a corollary, you cannot implement thread-safety locally in a
   class. Instead, your locking decisions must be made
   application-wide. That violates encapsulation, which is a
   cornerstone of object-oriented programming (and large-system
   manageability).
Marko
| From | Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com> |
|---|---|
| Date | 2015-02-25 13:34 +0800 |
| Subject | Re: Are threads bad? - was: Future of Pypy? |
| Message-ID | <mailman.19167.1424842476.18130.python-list@python.org> |
| In reply to | #86374 |
On Wed, Feb 25, 2015 at 1:25 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Paul Rubin <no.email@nospam.invalid>:
>
> > Marko Rauhamaa <marko@pacujo.net> writes:
> >> So why lock categorically like that? Java has a handy "synchronized"
> >> keyword that wraps the whole method in "with self.lock". ...
> >> Unfortunately, that deep locking analysis *is* required at every turn,
> >> and mistakes *are* bound to happen.
> >
> > I wonder if synchronized was a mistake in Java. It confused me a lot
> > when I tried to use it, but I never had to mess with it that much. I
> > can't quite tell what your code is doing (why is it attempting a
> > socket read with a lock held) and I'd be interested to see what an STM
> > version would look like.
>
> Synchronized methods are actually quite a powerful idea, but a bit
> underappreciated by many Java developers.
Synchronized methods in Java really make programming life simpler. But
I think it is standard practice to avoid this if there is a lighter
alternative as synchronized methods are slow. Worst case I used double
checked locking.
> Marko
> --
> https://mail.python.org/mailman/listinfo/python-list
--
Marcos | I love PHP, Linux, and Java
<http://javadevnotes.com/java-integer-to-string-examples>
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2015-02-25 07:46 +0200 |
| Subject | Re: Are threads bad? - was: Future of Pypy? |
| Message-ID | <87pp8ybl14.fsf@elektro.pacujo.net> |
| In reply to | #86375 |
Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com>:
> Synchronized methods in Java really make programming life simpler.
> But I think it is standard practice to avoid this if there is a
> lighter alternative as synchronized methods are slow. Worst case I
> used double checked locking.
I have yet to see code whose performance suffers from too much locking.
However, I have seen plenty of code that suffers from anomalies caused
by incorrect locking.
Marko
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-02-25 16:54 +1100 |
| Subject | Re: Are threads bad? - was: Future of Pypy? |
| Message-ID | <mailman.19168.1424843651.18130.python-list@python.org> |
| In reply to | #86376 |
On Wed, Feb 25, 2015 at 4:46 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com>:
>
>> Synchronized methods in Java really make programming life simpler.
>> But I think it is standard practice to avoid this if there is a
>> lighter alternative as synchronized methods are slow. Worst case I
>> used double checked locking.
>
> I have yet to see code whose performance suffers from too much locking.
> However, I have seen plenty of code that suffers from anomalies caused
> by incorrect locking.
Uhh, I have seen *heaps* of code whose performance suffers from too
much locking. At the coarsest and least intelligent level, a database
program that couldn't handle concurrency at all, so I wrote an
application-level semaphore that stopped two people from running it at
once. You want to use that program? Ask the other guy to close it.
THAT is a performance problem. And there are plenty of narrower cases,
where it ends up being a transactions-per-second throughput limiter.
ChrisA
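A rough sketch of the throughput point, with everything in it assumed for illustration: `run_jobs` is a made-up helper, and `time.sleep()` stands in for work (such as I/O) that could otherwise overlap. When every job takes the same coarse lock, the jobs run one after another and the elapsed time grows with the number of workers.

```python
import threading
import time


def run_jobs(workers, job_seconds, lock=None):
    """Run `workers` simulated jobs in threads; return elapsed wall time."""

    def job():
        if lock is None:
            time.sleep(job_seconds)      # jobs overlap freely
        else:
            with lock:                   # one coarse lock serializes every job
                time.sleep(job_seconds)

    threads = [threading.Thread(target=job) for _ in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("no lock:     %.2fs" % run_jobs(4, 0.05))
    print("coarse lock: %.2fs" % run_jobs(4, 0.05, threading.Lock()))
```

Exact timings depend on the machine; the serialized run should take at least roughly four times the single-job duration.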
| From | Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com> |
|---|---|
| Date | 2015-02-25 13:58 +0800 |
| Subject | Re: Are threads bad? - was: Future of Pypy? |
| Message-ID | <mailman.19169.1424843892.18130.python-list@python.org> |
| In reply to | #86376 |
On Wed, Feb 25, 2015 at 1:46 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com>:
>
> > Synchronized methods in Java really make programming life simpler.
> > But I think it is standard practice to avoid this if there is a
> > lighter alternative as synchronized methods are slow. Worst case I
> > used double checked locking.
>
> I have yet to see code whose performance suffers from too much locking.
> However, I have seen plenty of code that suffers from anomalies caused
> by incorrect locking.
Of course code with locking is slower than code without. The locking
mechanism itself is the overhead. So use it only when necessary. There
is a reason why double checked locking was invented by clever
programmers.
> Marko
> --
> https://mail.python.org/mailman/listinfo/python-list
--
Marcos | I love PHP, Linux, and Java
<http://javadevnotes.com/java-integer-to-string-examples>
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2015-02-24 23:02 -0700 |
| Subject | Re: Are threads bad? - was: Future of Pypy? |
| Message-ID | <mailman.19170.1424844176.18130.python-list@python.org> |
| In reply to | #86376 |
On Tue, Feb 24, 2015 at 10:54 PM, Chris Angelico <rosuav@gmail.com> wrote:
> On Wed, Feb 25, 2015 at 4:46 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
>> Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com>:
>>
>>> Synchronized methods in Java really make programming life simpler.
>>> But I think it is standard practice to avoid this if there is a
>>> lighter alternative as synchronized methods are slow. Worst case I
>>> used double checked locking.
>>
>> I have yet to see code whose performance suffers from too much locking.
>> However, I have seen plenty of code that suffers from anomalies caused
>> by incorrect locking.
>
> Uhh, I have seen *heaps* of code whose performance suffers from too
> much locking. At the coarsest and least intelligent level, a database
> program that couldn't handle concurrency at all, so I wrote an
> application-level semaphore that stopped two people from running it at
> once. You want to use that program? Ask the other guy to close it.
> THAT is a performance problem. And there are plenty of narrower cases,
> where it ends up being a transactions-per-second throughput limiter.
Is the name of that database program "Microsoft Access" perchance?