Groups > comp.lang.python > #86105 > unrolled thread

Future of Pypy?

Started by: Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk>
First post: 2015-02-22 12:45 +0000
Last post: 2015-02-23 23:23 +0000
Articles: 20 on this page of 71; 19 participants


Contents

  Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-22 12:45 +0000
    Re: Future of Pypy? jkn <jkn_gg@nicorp.f9.co.uk> - 2015-02-22 04:58 -0800
      Re: Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-22 15:30 +0000
        [OT] - BASIC is still not a bad choice, was Re: Future of Pypy? Michael Torrie <torriem@gmail.com> - 2015-02-23 17:24 -0700
    Re: Future of Pypy? Laura Creighton <lac@openend.se> - 2015-02-22 14:27 +0100
      Re: Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-22 15:36 +0000
        Re: Future of Pypy? Laura Creighton <lac@openend.se> - 2015-02-22 18:22 +0100
          Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-22 11:02 -0800
            Re: Future of Pypy? Laura Creighton <lac@openend.se> - 2015-02-22 20:51 +0100
              Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-22 12:14 -0800
                Re: Future of Pypy? Laura Creighton <lac@openend.se> - 2015-02-22 23:13 +0100
                  Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-22 18:45 -0800
            Re: Future of Pypy? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-23 12:18 +1100
              Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-22 18:04 -0800
                Re: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-23 13:16 +1100
                Re: Future of Pypy? Ryan Stuart <ryan.stuart.85@gmail.com> - 2015-02-23 03:16 +0000
                  Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-22 19:45 -0800
                    Re: Future of Pypy? Ryan Stuart <ryan.stuart.85@gmail.com> - 2015-02-23 04:00 +0000
                      Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-22 22:13 -0800
                        Re: Future of Pypy? Ryan Stuart <ryan.stuart.85@gmail.com> - 2015-02-23 07:32 +0000
                          Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-23 16:11 -0800
                            Re: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-24 11:31 +1100
                              Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-23 17:50 -0800
                                Re: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-24 13:03 +1100
                                  Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-23 20:40 -0800
                                    Re: Future of Pypy? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-24 17:57 +1100
                                      Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-27 13:40 -0800
                                        Re: Future of Pypy? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-02-27 18:47 -0500
                            Are threads bad? - was: Future of Pypy? Ryan Stuart <ryan.stuart.85@gmail.com> - 2015-02-24 00:35 +0000
                              Re: Are threads bad? - was: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-23 21:27 -0800
                                Re: Are threads bad? - was: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-24 16:57 +1100
                                  Re: Are threads bad? - was: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-23 22:23 -0800
                                  Re: Are threads bad? - was: Future of Pypy? Marko Rauhamaa <marko@pacujo.net> - 2015-02-24 10:08 +0200
                                    Re: Are threads bad? - was: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-24 15:53 -0800
                                      Re: Are threads bad? - was: Future of Pypy? Marko Rauhamaa <marko@pacujo.net> - 2015-02-25 07:25 +0200
                                        Re: Are threads bad? - was: Future of Pypy? Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com> - 2015-02-25 13:34 +0800
                                          Re: Are threads bad? - was: Future of Pypy? Marko Rauhamaa <marko@pacujo.net> - 2015-02-25 07:46 +0200
                                            Re: Are threads bad? - was: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-25 16:54 +1100
                                            Re: Are threads bad? - was: Future of Pypy? Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com> - 2015-02-25 13:58 +0800
                                            Re: Are threads bad? - was: Future of Pypy? Ian Kelly <ian.g.kelly@gmail.com> - 2015-02-24 23:02 -0700
                                            Re: Are threads bad? - was: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-25 17:07 +1100
                                            Re: Are threads bad? - was: Future of Pypy? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-25 16:37 +0000
                                            Re: Are threads bad? - was: Future of Pypy? Ian Kelly <ian.g.kelly@gmail.com> - 2015-02-25 10:00 -0700
                                            Re: Are threads bad? - was: Future of Pypy? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-25 17:16 +0000
                                            Re: Are threads bad? - was: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-26 04:22 +1100
                                            Re: Are threads bad? - was: Future of Pypy? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-02-25 19:44 -0500
                                Re: Are threads bad? - was: Future of Pypy? Ryan Stuart <ryan.stuart.85@gmail.com> - 2015-02-25 00:59 +0000
                                  Re: Are threads bad? - was: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-26 21:55 -0800
                Re: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-23 14:25 +1100
                Re: Future of Pypy? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-23 18:41 +1100
                  Re: Future of Pypy? Marko Rauhamaa <marko@pacujo.net> - 2015-02-23 10:16 +0200
                  Re: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-23 20:19 +1100
                    Re: Future of Pypy? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-24 17:56 +1100
                      Re: Future of Pypy? Chris Angelico <rosuav@gmail.com> - 2015-02-24 18:16 +1100
                        Re: Future of Pypy? wxjmfauth@gmail.com - 2015-02-23 23:57 -0800
                  Re: Future of Pypy? Ethan Furman <ethan@stoneleaf.us> - 2015-02-23 11:39 -0800
                    Re: Future of Pypy? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-24 13:15 +1100
                  Re: Future of Pypy? Paul Rubin <no.email@nospam.invalid> - 2015-02-23 17:47 -0800
                    Re: Future of Pypy? Marko Rauhamaa <marko@pacujo.net> - 2015-02-24 10:12 +0200
        Re: Future of Pypy? Emile van Sebille <emile@fenx.com> - 2015-02-24 09:57 -0800
    Re: Future of Pypy? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-23 01:05 +1100
      Re: Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-22 15:44 +0000
        Re: Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-22 19:20 +0000
          Re: Future of Pypy? Laura Creighton <lac@openend.se> - 2015-02-22 22:45 +0100
            Re: Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-23 14:04 +0000
              Re: Future of Pypy? Laura Creighton <lac@openend.se> - 2015-02-23 17:16 +0100
    Re: Future of Pypy? Terry Reedy <tjreedy@udel.edu> - 2015-02-23 01:34 -0500
    Re: Future of Pypy? Dave Cook <davecook@nowhere.net> - 2015-02-23 11:36 +0000
      Re: Future of Pypy? Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-02-23 14:13 +0000
        Cython - was: Future of Pypy? Stefan Behnel <stefan_ml@behnel.de> - 2015-02-23 16:43 +0100
        Re: Future of Pypy? Dave Cook <davecook@nowhere.net> - 2015-02-23 23:23 +0000


#86276

From: Paul Rubin <no.email@nospam.invalid>
Date: 2015-02-23 16:11 -0800
Message-ID: <87bnkkb22u.fsf@jester.gateway.pace.com>
In reply to: #86199
Ryan Stuart <ryan.stuart.85@gmail.com> writes:
>     Threads can also share read-only data and you can pass arbitrary
>     objects (such as code callables that you want the other thread to
>     execute--this is quite useful) through Queue.Queue. I don't think
>     you can do that with the multiprocessing module.
>
> These things might be convenient but they are error prone for the
> reasons pointed out.

I don't see the error-proneness since nothing there seems to set off
mutation of shared data.

> Also, the majority can be achieved via the process approach. For
> example, using fork to take a copy of the current process (including
> the heap) you want to use will give you access to any callables on the
> heap.

What if you want to dynamically construct a callable and send it to
another process?

> Even if you are extra careful to not touch any shared state in your
> code, you can almost be guaranteed that code higher up the stack, like
> malloc for example, *will* be using shared state.

This isn't the 1980's any more--any serious malloc implementation these
days is thread safe.  People write multi-threaded C programs all the
time and those programs use malloc in more than one thread.

> Even if you aren't sharing state in your code directly, code higher up
> the stack will be sharing state.  That is the whole point of a thread,
> that's what they were invented for.  Using threads safely might well
> be impossible much less verifiable.

You're basically saying it's impossible to write a reliable operating
system, since OS's by nature have to do that stuff.  Of course there are
verified OS's, and some of the early pioneers in concurrency were the
same guys who worked in program verification, e.g. Dijkstra's
semaphores.  Even Erlang uses data sharing under the hood (ETS tables
and large binaries) though their API makes it look like the data is
copied between processes.

What I'd say is that multi-threaded programs tend to have miniature OS's
inside them, so it helps to have had some exposure to OS implementation
techniques if you're going to write this kind of code.  But if you've
had that exposure then it all becomes less scary.

> So when there are other options that are just as viable/functional,
> result in far less risk and are often much quicker to implement
> correctly, why wouldn't you use them?

I should give the multiprocessing module a try sometime (haven't used it
so far because it's relatively new and I'm comfortable with threads).
It has the disadvantages that I noted, though.

> If it were easy to use threads in a verifiably safe manner, then there
> probably wouldn't be a GIL.

Nah, the GIL is just a CPython artifact.  As Steven says, IronPython and
Jython don't have GIL's.  Java has no GIL, OCaml has no GIL, GHC has no
GIL, etc.  Someone made a CPython version with no GIL some years ago and
it worked fine and it got a speedup on multiple cores.  The only problem
was that on a single core, it was significantly slower than regular
CPython, specifically because of the overhead of having to lock all the
refcount updates, so it was considered a failure.  Laura Creighton may
have more to say about this, but I've been under the impression that the
main obstacle to getting rid of the CPython GIL is the refcount system
(which is also easy to make mistakes with, by the way).  That's why I
was surprised to hear that PyPy has a GIL.


#86280

From: Chris Angelico <rosuav@gmail.com>
Date: 2015-02-24 11:31 +1100
Message-ID: <mailman.19110.1424737905.18130.python-list@python.org>
In reply to: #86276
On Tue, Feb 24, 2015 at 11:11 AM, Paul Rubin <no.email@nospam.invalid> wrote:
> What if you want to dynamically construct a callable and send it to
> another process?

I'm not sure what that would actually mean. Do you try to construct it
out of code that already exists in the other process? Are you passing
actual code to the other process? Does the callable, when called,
actually execute in the calling process? And what about its context -
its globals, and possibly nonlocals (if it's a closure)?

ChrisA


#86284

From: Paul Rubin <no.email@nospam.invalid>
Date: 2015-02-23 17:50 -0800
Message-ID: <871tlgaxi7.fsf@jester.gateway.pace.com>
In reply to: #86280
Chris Angelico <rosuav@gmail.com> writes:
>> What if you want to dynamically construct a callable and send it to
>> another process?
> I'm not sure what that would actually mean. Do you try to construct it
> out of code that already exists in the other process? Are you passing
> actual code to the other process?

I gave an example in a reply to Steven, something like

  other_thread_queue.put(lambda x: x*x)

to tell the other thread it is supposed to square something.  It
receives a callable and calls it in its own context.
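
A minimal runnable sketch of that pattern, assuming Python 3 (where the
module is `queue` rather than the `Queue` of Python 2). The names `worker`,
`tasks` and `results` are illustrative and do not come from the thread:

```python
import queue
import threading


def worker(tasks, results):
    # Receive (callable, argument) pairs and run each callable in this thread.
    while True:
        item = tasks.get()          # blocks until something arrives
        if item is None:            # sentinel: no more work
            break
        func, arg = item
        results.put(func(arg))


tasks = queue.Queue()
results = queue.Queue()
t = threading.Thread(target=worker, args=(tasks, results))
t.start()

tasks.put((lambda x: x * x, 7))     # a callable constructed on the fly
tasks.put(None)
t.join()

squared = results.get()
print(squared)
```

The lambda here closes over nothing, so the only thing the two threads
share is the pair of queues.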


#86285

From: Chris Angelico <rosuav@gmail.com>
Date: 2015-02-24 13:03 +1100
Message-ID: <mailman.19113.1424743389.18130.python-list@python.org>
In reply to: #86284
On Tue, Feb 24, 2015 at 12:50 PM, Paul Rubin <no.email@nospam.invalid> wrote:
> Chris Angelico <rosuav@gmail.com> writes:
>>> What if you want to dynamically construct a callable and send it to
>>> another process?
>> I'm not sure what that would actually mean. Do you try to construct it
>> out of code that already exists in the other process? Are you passing
>> actual code to the other process?
>
> I gave an example in a reply to Steven, something like
>
>   other_thread_queue.put(lambda x: x*x)
>
> to tell the other thread it is supposed to square something.  It
> receives a callable and calls it in its own context.

So, you would have to pass code to the other process, probably. What about this:

y = 4
other_thread_queue.put(lambda x: x*y)

Or this:

y = [4]
def next_y():
    y[0] += 1
    return y[0]
other_thread_queue.put(next_y)

It may not be obvious with your squaring example, but every Python
function has its context (module globals, etc). You can't pass a
function around without also passing, or sharing, its data.

With threads in a single process, this isn't a problem. They all
access the same memory space, so they can all share state. As soon as
you go to separate processes, these considerations become serious.
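
To make the process case concrete, here is a hedged sketch using `pickle`,
the serialization that `multiprocessing` uses by default to move objects
between processes. It only checks what can be serialized; it does not start
any processes. The helper `can_pickle` and the use of
`functools.partial(operator.mul, y)` as a stand-in for the lambda are
assumptions for illustration, not something proposed in the thread:

```python
import functools
import operator
import pickle


def can_pickle(obj):
    # True if obj can be serialized with pickle, False if pickle rejects it.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


y = 4

# A lambda has no importable name, so pickle has nothing to refer to.
lambda_ok = can_pickle(lambda x: x * y)

# A partial over an importable function carries its data (y) along by value.
times_y = functools.partial(operator.mul, y)
partial_ok = can_pickle(times_y)
restored = pickle.loads(pickle.dumps(times_y))

print(lambda_ok, partial_ok, restored(5))
```

Note that the partial ships a copy of `y`; the receiving side never sees
later changes to it, which is exactly the passing-versus-sharing question
raised above.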

ChrisA


#86291

From: Paul Rubin <no.email@nospam.invalid>
Date: 2015-02-23 20:40 -0800
Message-ID: <87mw43apmf.fsf@jester.gateway.pace.com>
In reply to: #86285
Chris Angelico <rosuav@gmail.com> writes:

> So, you would have to pass code to the other process, probably. What
> about this:
> y = 4
> other_thread_queue.put(lambda x: x*y)

the y in the lambda is a free variable that's a reference to the
surrounding mutable context, so that's at best dubious.  You could use:

  other_thread_queue.put(lambda x, y=y: x*y)
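
The difference between the two lambdas can be seen without any threads at
all. This is a small sketch; the function `demo` and the names `late` and
`early` are illustrative only:

```python
def demo():
    y = 4
    late = lambda x: x * y          # looks y up each time it is called
    early = lambda x, y=y: x * y    # captures the value of y at definition time
    y = 10                          # rebinding after both lambdas exist
    return late(2), early(2)


late_result, early_result = demo()
print(late_result, early_result)
```

The first lambda sees the rebinding and the second does not, which is why
the default-argument form is the safer one to hand to another thread.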

> Or this:
>
> y = [4]
> def next_y():
>     y[0] += 1
>     return y[0]
> other_thread_queue.put(next_y)

There you have shared mutable data, which isn't allowed in this style.

> It may not be obvious with your squaring example, but every Python
> function has its context (module globals, etc). You can't pass a
> function around without also passing, or sharing, its data.

That is ok as long as the data can't change.

> With threads in a single process, this isn't a problem. They all
> access the same memory space, so they can all share state. As soon as
> you go to separate processes, these considerations become serious.

Right, that's a limitation of processes compared to threads.


#86301

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2015-02-24 17:57 +1100
Message-ID: <54ec20d0$0$11103$c3e8da3@news.astraweb.com>
In reply to: #86291
Paul Rubin wrote:

>> With threads in a single process, this isn't a problem. They all
>> access the same memory space, so they can all share state. As soon as
>> you go to separate processes, these considerations become serious.
> 
> Right, that's a limitation of processes compared to threads.
> 

I think the point is that it's not a *limitation* of processes, but a 
*feature* of processes that they don't share state. (Well, I think there are 
explicit ways to have shared memory, but that's another story.)

An interesting point of view: threading is harmful because it removes 
determinism from your program.

http://radar.oreilly.com/2007/01/threads-considered-harmful.html

As I once wrote:

A programmer had a problem, and thought Now he has "I know, I'll solve 
two it with threads!" problems.

http://code.activestate.com/lists/python-list/634273/


Some discussion of the pros and cons of threading:

http://c2.com/cgi/wiki?ThreadsConsideredHarmful




-- 
Steven


#86595

From: Paul Rubin <no.email@nospam.invalid>
Date: 2015-02-27 13:40 -0800
Message-ID: <878ufj824g.fsf@jester.gateway.pace.com>
In reply to: #86301
Steven D'Aprano <steve+comp.lang.python@pearwood.info> writes:
> An interesting point of view: threading is harmful because it removes 
> determinism from your program.
> http://radar.oreilly.com/2007/01/threads-considered-harmful.html

Concurrent programs are inherently nondeterministic because they respond
to i/o events that can happen in any order.  I looked at the paper cited
in that article and it seemed like handwaving.  Then it talks about
threaded programs being equivalent if they are the same over all
interleavings of input, and then goes on about that being horribly
difficult to establish.  It talked about program inputs as infinite
sequences of bits.  OK, a standard conceit in mathematical logic is to
call an infinite sequence of bits a "real number".  So it seems to me
that such a proof would just be a theorem about real numbers or sets of
real numbers, and freshman calculus classes are already full of proofs
like that.  The presence of large sets doesn't necessarily make math all
that much harder.  The test suite for HOL Light actually uses an
inaccessible cardinal, if that means anything to you.

IOW he says it's difficult and maybe it is, but he doesn't make any
attempt to explain why it's difficult, at least once there's some tools
(synchronization primitives etc.) to control the concurrency.  He seems
instead to ignore decades of work going back to Dijkstra and Wirth and
those guys.  It would be a lot more convincing if he addressed that
existing literature and said why it wasn't good enough to help write
real programs that work.

He then advocates something he calls the "PN model" (processes
communicating by message passing) but that seems about the same as what
I've heard called communicating sequential processes (CSP), which are
the execution model of Erlang and are what I've been using with Python
threads and queues.  Maybe there's some subtle difference.  Anyway
there's again plenty of theory about CSP, which are modelled with
Pi-calculus (process calculus) which can be interpreted in lambda
calculus, so sequential verification techniques are still useful on it.

Hmm, I see there's a Wikipedia article "Kahn process networks" about PN
networks as mentioned, so I guess I'll look at it.  I see it claims a
KPN is deterministic on its inputs, while I think CSP's might not be.
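
A rough sketch of what "deterministic on its inputs" could look like with
Python threads and queues, assuming the defining restriction is that each
process reads from one named channel at a time with a blocking read and
never polls. The functions `producer`, `alternate` and `run_network` are
made up for illustration, and the random sleeps are only there to vary the
scheduling between runs:

```python
import queue
import random
import threading
import time


def producer(values, out):
    # Emit values with random delays so the thread interleaving varies.
    for v in values:
        time.sleep(random.uniform(0, 0.005))
        out.put(v)
    out.put(None)                   # end-of-stream marker


def alternate(a, b, out):
    # Blocking reads in a fixed order (a, then b) until either stream ends.
    while True:
        x = a.get()
        y = b.get()
        if x is None or y is None:
            break
        out.append((x, y))


def run_network():
    a, b, out = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=producer, args=([1, 2, 3], a)),
        threading.Thread(target=producer, args=(["x", "y", "z"], b)),
        threading.Thread(target=alternate, args=(a, b, out)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


runs = [run_network() for _ in range(5)]
print(runs[0])
```

Every run should produce the same pairing regardless of timing. A consumer
that instead took whichever item arrived first from a single shared queue
would lose that property.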

> Some discussion of the pros and cons of threading:
> http://c2.com/cgi/wiki?ThreadsConsideredHarmful

This wasn't very informative either.  


#86602

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2015-02-27 18:47 -0500
Message-ID: <mailman.19325.1425080856.18130.python-list@python.org>
In reply to: #86595
On Fri, 27 Feb 2015 13:40:15 -0800, Paul Rubin <no.email@nospam.invalid>
declaimed the following:

>Steven D'Aprano <steve+comp.lang.python@pearwood.info> writes:
>> An interesting point of view: threading is harmful because it removes 
>> determinism from your program.
>> http://radar.oreilly.com/2007/01/threads-considered-harmful.html
>
>Concurrent programs are inherently nondeterministic because they respond
>to i/o events that can happen in any order.  I looked at the paper cited

	And that aspect (nondeterministic) applies whether one is using threads
or processes to handle the I/O -- except, possibly, in the types of
architectures used for aircraft systems: fixed time slices for "partitions"
(which /may/ run threads internally on a partition OS, but the entire
partition gets scheduled as a chunk by an overarching OS); no dynamic
memory allocations (and no freeing either) once the system transitions from
"startup" to "running" (any message queues have all "entries" pre-allocated
in a list); dedicated message queues for data transfer between partitions
vs within a partition, etc. Oh, and some designs require a partition to
essentially complete all processing within some total period of CPU time
(say, three partition time slices) and then, in effect, start over from the
beginning.


>Hmm, I see there's a Wikipedia article "Kahn process networks" about PN
>networks as mentioned, so I guess I'll look at it.  I see it claims a
>KPN is deterministic on its inputs, while I think CSP's might not be.
>
	Oddly, KPN wouldn't fit the hard realtime of aircraft systems -- it
fails on the "unbounded FIFOs".

	Wikipedia gives "deterministic" as (paraphrased) same inputs give the
same outputs -- but does not take into account the timing of the system.

KPN> Hence, timing of the processes does not affect outputs of the system.

	Deterministic /timing/ is a factor in aircraft systems.
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
    wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/


#86281 — Are threads bad? - was: Future of Pypy?

From: Ryan Stuart <ryan.stuart.85@gmail.com>
Date: 2015-02-24 00:35 +0000
Subject: Are threads bad? - was: Future of Pypy?
Message-ID: <mailman.19111.1424738135.18130.python-list@python.org>
In reply to: #86276

On Tue Feb 24 2015 at 10:15:40 AM Paul Rubin <no.email@nospam.invalid>
wrote:
>
> I don't see the error-proneness since nothing there seems to set off
> mutation of shared data.
>

I'm not sure what else to say really. It's just a fact of life that Threads
by definition run in the same memory space and hence always have the
possibility of nasty unforeseen problems. They are unforeseen because it is
extremely difficult (maybe impossible?) to try and map out and understand
all the different possible mutations to state. Sure, your code might not be
making any mutations (that you know of), but malloc definitely is [1], and
that's just the tip of the iceberg. Other things like buffers for stdin and
stdout, DNS resolution etc. all have the same issue.

I have no doubt someone can come up with a scenario where they need to use
threads. I can't come up with one myself, but maybe someone else can. But
in the work I have done, processes have sufficed - even for the example of
dynamic callables you gave.

To borrow from the original article I linked - "Nevertheless I still think
it’s a bad idea to make things harder for ourselves if we can avoid it."

Cheers

[1] Line 70 of glibc malloc -
https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/arena.c;h=8af51f05eb376ae2ba07e99c8c766a8ae8af425b;hb=bdf1ff052a8e23d637f2c838fa5642d78fcedc33#l70



#86294 — Re: Are threads bad? - was: Future of Pypy?

From: Paul Rubin <no.email@nospam.invalid>
Date: 2015-02-23 21:27 -0800
Subject: Re: Are threads bad? - was: Future of Pypy?
Message-ID: <87lhjnang1.fsf@jester.gateway.pace.com>
In reply to: #86281
Ryan Stuart <ryan.stuart.85@gmail.com> writes:
> I'm not sure what else to say really. It's just a fact of life that
> Threads by definition run in the same memory space and hence always
> have the possibility of nasty unforeseen problems. They are unforeseen
> because it is extremely difficult (maybe impossible?) to try and map
> out and understand all the different possible mutations to state.

Sure, the shared memory introduces the possibility of some bad errors,
I'm just saying that I've found that by staying with a certain
straightforward style, it doesn't seem difficult in practice to avoid
those errors.

> Sure, your code might not be making any mutations (that you know of),
> but malloc definitely is [1], and that's just the tip of the iceberg.
> Other things like buffers for stdin and stdout, DNS resolution etc.
> all have the same issue. 

I don't understand what you mean about malloc.  I looked at that code
and there's a mutex to make multi-threaded programs work right, and an
ifdef (maybe for better performance) to use different code if there are
no threads.  IOW they spent a bunch of time handling threads.  Are you
saying there's a bug?  Re stdin/stdout: obviously you can't have
multiple threads messing with the same fd's; that's the same thing as
data sharing.  Re DNS: if gethostbyname isn't thread-safe I'd think of
that as a pretty bad bug.  I do have a vague memory of having had
an issue with this, though, and IIRC it took part of a morning to figure
out what was going on, annoying but not a multi-month bug-hunt or
anything like that.  It didn't happen on my workstation, but only on the
embedded target that was probably running an old or weird libc.

> To borrow from the original article I linked - "Nevertheless I still
> think it’s a bad idea to make things harder for ourselves if we can
> avoid it."

That article was interesting in some ways but confused in others.  One
way it was interesting is it said various non-thread approaches (such as
coroutines) had about the same problems as threads.  Some ways it was
confused were:

  1) thinking Haskell threads were like processes with separate address
  spaces.  In fact they are in the same address space and programming
  with them isn't all that different from Python threads, though the
  synchronization primitives are a bit different.  There is also an STM
  library available that is ingenious though apparently somewhat slow.

  2) it has a weird story about the brass cockroach, that basically
  signified that they didn't have a robust enough testing system to be
  able to reproduce the bug.  That is what they should have worked on.

  3) It goes into various hazards of the balance transfer example not
  mentioning that STM (available in Haskell and Clojure) completely
  solves it.

  4) It says: "eventually a system which communicates exclusively through
  non-blocking queues effectively becomes a set of communicating event
  loops, and its problems revert to those of an event-driven system;
  it doesn’t look like regular programming with threads any more."

  That is essentially what an Erlang program is, and it misses the fact
  that those low-level event loops can use blocking operations to their
  heart's content, without the inversion of control (callback spaghetti)
  of traditional evented systems (I haven't used asyncio yet).  Also,
  the low-level loops can run in parallel on multiple cores, while an
  asyncio-style coroutine loop is sequential under the skin.

  In Erlang/OTP, you don't even see the event loops directly, since they
  are abstracted away by the OTP framework and it looks like RPC calls
  at the application level.  But, it helps to know what is going on
  underneath.
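
For the balance transfer hazard in point 3, Python's standard library has
no STM, so the following is not STM. It is a sketch of the more manual
alternative, taking both account locks in a fixed order, to show what the
hazard is and what a transactional system would be doing for you. The
`Account` class and the `transfer` and `shuffle` functions are hypothetical:

```python
import threading


class Account:
    def __init__(self, ident, balance):
        self.ident = ident
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    # Take both locks in ident order so opposing transfers cannot deadlock.
    first, second = sorted((src, dst), key=lambda acct: acct.ident)
    with first.lock, second.lock:
        if src.balance < amount:
            return False            # insufficient funds, nothing changes
        src.balance -= amount
        dst.balance += amount
        return True


def shuffle(src, dst):
    for _ in range(1000):
        transfer(src, dst, 1)


a = Account(1, 100)
b = Account(2, 100)
threads = [
    threading.Thread(target=shuffle, args=(a, b)),
    threading.Thread(target=shuffle, args=(b, a)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = a.balance + b.balance
print(total)
```

The individual balances depend on scheduling, but the total is conserved
and neither balance goes negative, because the check and both updates
happen while both locks are held.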

I'm realizing some people program Python in an ultra-dynamic style where
the mutability of modules, functions, etc. really comes into play, so
that makes threads much more dangerous.  I've tended to write Python with
much less dynamism or even as if it were statically typed, so maybe that
helps.

Anyway, I got one thing out of this, which is that the multiprocessing
module looks pretty nice and I should try it even when I don't need
multicore parallelism, so thanks for that.

In reality though, Python is still my most productive language for
throwaway, non-concurrent scripting, but for more complicated concurrent
programs, alternatives like Haskell, Erlang, and Go all have significant
attractions.


#86297 — Re: Are threads bad? - was: Future of Pypy?

From: Chris Angelico <rosuav@gmail.com>
Date: 2015-02-24 16:57 +1100
Subject: Re: Are threads bad? - was: Future of Pypy?
Message-ID: <mailman.19118.1424757423.18130.python-list@python.org>
In reply to: #86294
On Tue, Feb 24, 2015 at 4:27 PM, Paul Rubin <no.email@nospam.invalid> wrote:
>> Sure, your code might not be making any mutations (that you know of),
>> but malloc definitely is [1], and that's just the tip of the iceberg.
>> Other things like buffers for stdin and stdout, DNS resolution etc.
>> all have the same issue.
>
> Re stdin/stdout: obviously you can't have
> multiple threads messing with the same fd's; that's the same thing as
> data sharing.

Actually, you can quite happily have multiple threads messing with the
underlying file descriptors, that's not a problem. (Though you will
tend to get interleaved output. But if you always produce output in
single blocks of text that each contain one line with a trailing
newline, you should see interleaved lines that are each individually
correct. I'm also not sure of any sane way to multiplex stdin -
merging output from multiple threads is fine, but dividing input
between multiple threads is messy.) The problem is *buffers* for stdin
and stdout, where you have to be absolutely sure that you're not
trampling all over another thread's data structures. If you unbuffer
your output, it's probably going to be thread-safe.
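
A rough sketch of the unbuffered case, assuming a pipe as a stand-in for
stdout so the result can be inspected. The helper names write_lines and
collect_interleaved are hypothetical. Each thread emits one complete line
per os.write() call, so no userspace buffer is shared between threads.
POSIX guarantees that pipe writes of at most PIPE_BUF bytes are atomic,
which is why the lines may arrive in any order but each one stays intact.

```python
import os
import threading


def write_lines(fd, tag, count):
    """Write `count` complete lines, one os.write() call per line."""
    for i in range(count):
        os.write(fd, f"{tag} line {i}\n".encode())


def collect_interleaved(tags=("a", "b", "c"), count=20):
    """Let several threads share one descriptor and return what arrived."""
    read_fd, write_fd = os.pipe()
    threads = [
        threading.Thread(target=write_lines, args=(write_fd, tag, count))
        for tag in tags
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    os.close(write_fd)           # so the reader sees end-of-file
    with os.fdopen(read_fd, "rb") as reader:
        return reader.read().decode().splitlines()
```

The total output is kept small on purpose so that the writes never block
on a full pipe before anything reads from it.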

ChrisA


#86299 — Re: Are threads bad? - was: Future of Pypy?

From: Paul Rubin <no.email@nospam.invalid>
Date: 2015-02-23 22:23 -0800
Subject: Re: Are threads bad? - was: Future of Pypy?
Message-ID: <87h9ubakul.fsf@jester.gateway.pace.com>
In reply to: #86297
Chris Angelico <rosuav@gmail.com> writes:
> Actually, you can quite happily have multiple threads messing with the
> underlying file descriptors,... The problem is *buffers* for stdin
> and stdout, where you have to be absolutely sure that you're not
> trampling all over another thread's data structures.

Oh ok, sure, yeah, the distinction is valid.  I guess the classic
interleaved "print 'a'" "print 'b'" loops could crash if the stdio
structures get corrupted through conflicting updates somehow.


#86304 — Re: Are threads bad? - was: Future of Pypy?

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2015-02-24 10:08 +0200
Subject: Re: Are threads bad? - was: Future of Pypy?
Message-ID: <87bnkjenpp.fsf@elektro.pacujo.net>
In reply to: #86297
Chris Angelico <rosuav@gmail.com>:

> Actually, you can quite happily have multiple threads messing with the
> underlying file descriptors, that's not a problem. (Though you will
> tend to get interleaved output. But if you always produce output in
> single blocks of text that each contain one line with a trailing
> newline, you should see interleaved lines that are each individually
> correct. I'm also not sure of any sane way to multiplex stdin -
> merging output from multiple threads is fine, but dividing input
> between multiple threads is messy.) The problem is *buffers* for stdin
> and stdout, where you have to be absolutely sure that you're not
> trampling all over another thread's data structures. If you unbuffer
> your output, it's probably going to be thread-safe.

Here's an anecdote describing one real-life threading problem. We had a
largish multithreading framework (in Java, but I'm setting it in Python
and in a much simplified form).

We were mindful of deadlocks caused by lock reversal so we had come up
with a policy whereby objects form a layered hierarchy. An object higher
up in the hierarchy was allowed to call methods of objects below while
holding locks. The opposite was not allowed; if an object desired to
call a method of an object above it (through a registered callback), it
had to relinquish all locks before doing so.
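
One way to picture the "relinquish all locks before calling upward" rule
is the sketch below. It is not the framework from the anecdote: the class
Lower and its methods are invented for illustration. State is snapshotted
under the lock, and the registered callback runs only after the lock has
been released, so the callback is free to call back down.

```python
import threading


class Lower:
    """A lower-layer object that never calls upward while holding its lock."""

    def __init__(self):
        self._lock = threading.Lock()    # deliberately non-reentrant
        self._pending = []
        self._callback = None

    def register(self, callback):
        with self._lock:
            self._callback = callback

    def submit(self, item):
        with self._lock:
            self._pending.append(item)

    def flush(self):
        # Snapshot the state under the lock...
        with self._lock:
            items, self._pending = self._pending, []
            callback = self._callback
        # ...then call upward with no lock held.
        if callback is not None:
            for item in items:
                callback(item)
        return len(items)
```

Had flush() invoked the callback inside the with block, a callback that
calls submit() would block forever on the same non-reentrant lock.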

However, a situation like this arose:

    class App:
        def send_stream(self, sock):
            with self.lock:                      # App lock is taken first
                self.register_socket(sock)

                class SocketWrapper:
                    # "_" is the wrapper instance; "self" is still the App
                    def read(_, count):
                        return sock.recv(count)
                    def close(_):
                        sock.close()
                        with self.lock:          # re-enters App from below
                            self.unregister_socket(sock)

                # Downward call made while the App lock is still held
                self.transport.forward_and_close(SocketWrapper())

    class Transport:
        def forward_and_close(self, readable):
            with self.lock:                      # Transport lock
                more = readable.read(1000)
                if more is WOULDBLOCK:
                    self.reschedule(readable)
                elif more:
                    ... # out of scope for the anecdote
                else:
                    # EOF reached: this calls upward with the lock held
                    readable.close()

Now the dreaded lock reversal arises when the App object calls
self.transport.forward_and_close() and Transport calls readable.close()
at the same time.
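
The reversal can be reproduced in miniature. The sketch below is not the
original framework: the two locks merely stand in for the App and
Transport locks, and the barriers and the timeout are illustrative
additions so that the demonstration reports the stall instead of hanging.

```python
import threading


def demonstrate_lock_reversal(timeout=0.2):
    """Two threads take the same two locks in opposite order."""
    app_lock = threading.Lock()
    transport_lock = threading.Lock()
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    outcome = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()       # each side now holds one lock
            got_second = second.acquire(timeout=timeout)
            outcome[name] = got_second
            if got_second:
                second.release()
            both_tried_second.wait()     # keep `first` until both have tried

    threads = [
        threading.Thread(target=worker,
                         args=("app", app_lock, transport_lock)),
        threading.Thread(target=worker,
                         args=("transport", transport_lock, app_lock)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outcome
```

With a plain blocking acquire in place of the timed one, both threads
would wait on each other indefinitely.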

So why lock categorically like that? Java has a handy "synchronized"
keyword that wraps the whole method in "with self.lock". Ideally, that
handy idiom could be employed methodically. More importantly, to avoid
locking problems, the methodology should be rigorous and mindless. If
the developer must perform a deep locking analysis at every turn, they
are bound to make mistakes, especially when more than one developer is
involved, with differing intuitions.

Unfortunately, that deep locking analysis *is* required at every turn,
and mistakes *are* bound to happen.
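
For readers who have not seen the Java idiom, a rough Python
approximation might look like the following. The decorator name
synchronized and the Counter class are assumptions for illustration, and
the sketch expects each instance to carry a self.lock attribute. An RLock
is used because Java's intrinsic locks are reentrant.

```python
import functools
import threading


def synchronized(method):
    """Run the decorated method while holding the instance's self.lock."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        with self.lock:
            return method(self, *args, **kwargs)
    return wrapper


class Counter:
    def __init__(self):
        self.lock = threading.RLock()    # reentrant, like a Java monitor
        self.value = 0

    @synchronized
    def increment(self):
        self.value += 1
        return self.value

    @synchronized
    def increment_twice(self):
        self.increment()                 # re-entry is fine with an RLock
        return self.increment()
```

The decorator makes the locking mechanical within one object, but it does
nothing about the order in which locks on different objects are taken.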


Marko


#86362 — Re: Are threads bad? - was: Future of Pypy?

From: Paul Rubin <no.email@nospam.invalid>
Date: 2015-02-24 15:53 -0800
Subject: Re: Are threads bad? - was: Future of Pypy?
Message-ID: <8761aqamss.fsf@jester.gateway.pace.com>
In reply to: #86304
Marko Rauhamaa <marko@pacujo.net> writes:
> So why lock categorically like that? Java has a handy "synchronized"
> keyword that wraps the whole method in "with self.lock". ...
> Unfortunately, that deep locking analysis *is* required at every turn,
> and mistakes *are* bound to happen.

I wonder if synchronized was a mistake in Java.  It confused me a lot
when I tried to use it, but I never had to mess with it that much.  I
can't quite tell what your code is doing (why is it attempting a socket
read with a lock held) and I'd be interested to see what an STM version
would look like.


#86374 — Re: Are threads bad? - was: Future of Pypy?

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2015-02-25 07:25 +0200
Subject: Re: Are threads bad? - was: Future of Pypy?
Message-ID: <87zj82bm15.fsf@elektro.pacujo.net>
In reply to: #86362
Paul Rubin <no.email@nospam.invalid>:

> Marko Rauhamaa <marko@pacujo.net> writes:
>> So why lock categorically like that? Java has a handy "synchronized"
>> keyword that wraps the whole method in "with self.lock". ...
>> Unfortunately, that deep locking analysis *is* required at every turn,
>> and mistakes *are* bound to happen.
>
> I wonder if synchronized was a mistake in Java. It confused me a lot
> when I tried to use it, but I never had to mess with it that much. I
> can't quite tell what your code is doing (why is it attempting a
> socket read with a lock held) and I'd be interested to see what an STM
> version would look like.

Synchronized methods are actually quite a powerful idea, but a bit
underappreciated by many Java developers.

The main point of my example is this:

 * Effective thread-safe programming would require some guidelines, a
   routine, a methodology so the code would be free of locking anomalies
   as a matter of course.

 * The industry hasn't formed such guidelines that are generally
   followed. That makes it difficult to glue together systems from
   components that come from different sources.

 * Even the best of guidelines have significant, surprising corner cases
   that require special treatment and are easy to get wrong.

 * As a corollary, you cannot implement thread-safety locally in a
   class. Instead, your locking decisions must be made application-wide.
   That violates encapsulation, which is a cornerstone of
   object-oriented programming (and large-system manageability).


Marko


#86375 — Re: Are threads bad? - was: Future of Pypy?

From: Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com>
Date: 2015-02-25 13:34 +0800
Subject: Re: Are threads bad? - was: Future of Pypy?
Message-ID: <mailman.19167.1424842476.18130.python-list@python.org>
In reply to: #86374

On Wed, Feb 25, 2015 at 1:25 PM, Marko Rauhamaa <marko@pacujo.net> wrote:

> Paul Rubin <no.email@nospam.invalid>:
>
> > Marko Rauhamaa <marko@pacujo.net> writes:
> >> So why lock categorically like that? Java has a handy "synchronized"
> >> keyword that wraps the whole method in "with self.lock". ...
> >> Unfortunately, that deep locking analysis *is* required at every turn,
> >> and mistakes *are* bound to happen.
> >
> > I wonder if synchronized was a mistake in Java. It confused me a lot
> > when I tried to use it, but I never had to mess with it that much. I
> > can't quite tell what your code is doing (why is it attempting a
> > socket read with a lock held) and I'd be interested to see what an STM
> > version would look like.
>
> Synchronized methods are actually quite a powerful idea, but a bit
> underappreciated by many Java developers.
>
>
Synchronized methods in Java really make programming life simpler.  But I
think it is standard practice to avoid them if there is a lighter
alternative, as synchronized methods are slow.  Worst case, I used
double-checked locking.




-- 
Marcos | I love PHP, Linux, and Java
<http://javadevnotes.com/java-integer-to-string-examples>


#86376 — Re: Are threads bad? - was: Future of Pypy?

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2015-02-25 07:46 +0200
Subject: Re: Are threads bad? - was: Future of Pypy?
Message-ID: <87pp8ybl14.fsf@elektro.pacujo.net>
In reply to: #86375
Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com>:

> Synchronized methods in Java really make programming life simpler.
> But I think it is standard practice to avoid this if there is a
> lighter alternative as synchronized methods are slow. Worst case, I
> used double-checked locking.

I have yet to see code whose performance suffers from too much locking.
However, I have seen plenty of code that suffers from anomalies caused
by incorrect locking.


Marko


#86378 — Re: Are threads bad? - was: Future of Pypy?

From: Chris Angelico <rosuav@gmail.com>
Date: 2015-02-25 16:54 +1100
Subject: Re: Are threads bad? - was: Future of Pypy?
Message-ID: <mailman.19168.1424843651.18130.python-list@python.org>
In reply to: #86376
On Wed, Feb 25, 2015 at 4:46 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com>:
>
>> Synchronized methods in Java really make programming life simpler.
>> But I think it is standard practice to avoid this if there is a
>> lighter alternative as synchronized methods are slow. Worst case, I
>> used double-checked locking.
>
> I have yet to see code whose performance suffers from too much locking.
> However, I have seen plenty of code that suffers from anomalies caused
> by incorrect locking.

Uhh, I have seen *heaps* of code whose performance suffers from too
much locking. At the coarsest and least intelligent level, a database
program that couldn't handle concurrency at all, so I wrote an
application-level semaphore that stopped two people from running it at
once. You want to use that program? Ask the other guy to close it.
THAT is a performance problem. And there are plenty of narrower cases,
where it ends up being a transactions-per-second throughput limiter.

ChrisA


#86379 — Re: Are threads bad? - was: Future of Pypy?

From: Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com>
Date: 2015-02-25 13:58 +0800
Subject: Re: Are threads bad? - was: Future of Pypy?
Message-ID: <mailman.19169.1424843892.18130.python-list@python.org>
In reply to: #86376

On Wed, Feb 25, 2015 at 1:46 PM, Marko Rauhamaa <marko@pacujo.net> wrote:

> Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com>:
>
> > Synchronized methods in Java really make programming life simpler.
> > But I think it is standard practice to avoid this if there is a
> > lighter alternative as synchronized methods are slow. Worst case, I
> > used double-checked locking.
>
> I have yet to see code whose performance suffers from too much locking.
> However, I have seen plenty of code that suffers from anomalies caused
> by incorrect locking.
>
>
>
Of course code with locking is slower than code without. The locking
mechanism itself is the overhead.  So use it only when necessary.  There is
a reason why double-checked locking was invented by clever programmers.
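
For reference, a minimal sketch of double-checked locking, written in
Python instead of Java to match the rest of the thread. LazyResource and
its factory argument are hypothetical names. The first check skips the
lock once the value exists; the second check, made under the lock, stops
two threads from both running the factory. In Java the idiom is generally
said to need a volatile field to be correct, and this sketch assumes
CPython's usual attribute-assignment behaviour, so treat it as an
illustration and not a portable recipe.

```python
import threading


class LazyResource:
    """Build a value on first use, taking the lock only while it is unset."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._value = None
        self._ready = False

    def get(self):
        if not self._ready:                  # first check, no lock taken
            with self._lock:
                if not self._ready:          # second check, under the lock
                    self._value = self._factory()
                    self._ready = True       # set last, after the value
        return self._value
```

After the first call completes, later calls to get() return without
touching the lock at all, which is the overhead the idiom tries to avoid.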





-- 
Marcos | I love PHP, Linux, and Java
<http://javadevnotes.com/java-integer-to-string-examples>


#86380 — Re: Are threads bad? - was: Future of Pypy?

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2015-02-24 23:02 -0700
Subject: Re: Are threads bad? - was: Future of Pypy?
Message-ID: <mailman.19170.1424844176.18130.python-list@python.org>
In reply to: #86376
On Tue, Feb 24, 2015 at 10:54 PM, Chris Angelico <rosuav@gmail.com> wrote:
> On Wed, Feb 25, 2015 at 4:46 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
>> Marcos Almeida Azevedo <marcos.al.azevedo@gmail.com>:
>>
>>> Synchronized methods in Java really make programming life simpler.
>>> But I think it is standard practice to avoid this if there is a
>>> lighter alternative as synchronized methods are slow. Worst case, I
>>> used double-checked locking.
>>
>> I have yet to see code whose performance suffers from too much locking.
>> However, I have seen plenty of code that suffers from anomalies caused
>> by incorrect locking.
>
> Uhh, I have seen *heaps* of code whose performance suffers from too
> much locking. At the coarsest and least intelligent level, a database
> program that couldn't handle concurrency at all, so I wrote an
> application-level semaphore that stopped two people from running it at
> once. You want to use that program? Ask the other guy to close it.
> THAT is a performance problem. And there are plenty of narrower cases,
> where it ends up being a transactions-per-second throughput limiter.

Is the name of that database program "Microsoft Access" perchance?
