


Re: Processing large CSV files - how to maximise throughput?

References (1 earlier) <mailman.1494.1382667030.18130.python-list@python.org> <5269e6f6$0$29972$c3e8da3$5496439d@news.astraweb.com> <l4cq6t$oq6$1@ger.gmane.org> <CAPTjJmqvjMMqd-JzaL3BtVu3=bwgYCdpFdMSHEa8kf5kdpVJyA@mail.gmail.com> <l4d3mg$hsf$1@ger.gmane.org>
Date 2013-10-25 18:26 +1100
Subject Re: Processing large CSV files - how to maximise throughput?
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.1500.1382686313.18130.python-list@python.org>



On Fri, Oct 25, 2013 at 5:39 PM, Stefan Behnel <stefan_ml@behnel.de> wrote:
> Basically, with multiple processes, you start with independent systems and
> add connections specifically where needed, whereas with threads, you start
> with completely shared state and then prune away interdependencies and
> concurrency until it seems to work safely. That approach makes it
> essentially impossible to prove that threading is safe in a given setup,
> except for the really trivial cases.

Not strictly true. With multiple threads, you start with completely
shared global state and completely independent local state (in
assembly language, shared data segment and separate stack). If you
treat your globals as either read-only or carefully controlled, then
it makes little difference whether you're forking processes or
spinning off threads, except that with threads you don't need special
data structures (IPC-based ones) for the global state.
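
A minimal sketch of that discipline, not taken from the thread: the names WEIGHTS, score and run_all are hypothetical. The global table is only ever read once the threads are running, every mutable value lives in a local, and each thread writes to its own pre-assigned slot, so no locking is needed for this particular layout.

```python
import threading

# Shared global state, treated as read-only once the threads start.
# (Hypothetical lookup table, for illustration only.)
WEIGHTS = {"a": 2, "b": 3}

def score(values, key, results, slot):
    # All mutable state below is local to this call, and so to this thread.
    weight = WEIGHTS[key]          # read-only access to the shared global
    total = 0
    for value in values:
        total += value * weight
    # Each thread writes only to its own slot, so the writes never collide.
    results[slot] = total

def run_all(jobs):
    # jobs is a list of (values, key) pairs; one thread per job.
    results = [None] * len(jobs)
    threads = [
        threading.Thread(target=score, args=(values, key, results, slot))
        for slot, (values, key) in enumerate(jobs)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With processes, the same WEIGHTS table would have to be copied or placed in an IPC-backed structure before the workers could see it; with threads it is simply visible, which is the convenience being described.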

For me, threading largely grew out of the same sorts of concerns as
recursion - as long as all your internal state is in locals, nothing
can hurt you. Of course, it's still far easier to shoot yourself in
the foot with threads than with processes, but for the tasks I've used
them for, I've never found footholes; that may, however, be inherent
to the simplicity of the two main jobs I used threads for: socket
handling (where nearly everything's I/O bound) and worker threads spun
off to let the GUI remain responsive (posting a message back to the
main thread when there's a result).
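
The worker-thread pattern can be sketched roughly as follows, assuming a plain queue.Queue stands in for the GUI toolkit's message mechanism. The names slow_job, worker and main_loop are hypothetical, and a real GUI would post through its own event system rather than poll a queue like this.

```python
import queue
import threading

def slow_job(n):
    # Stand-in for work that would otherwise block the main thread.
    return sum(i * i for i in range(n))

def worker(n, outbox):
    # Runs off the main thread; its only contact with the main thread
    # is the message it posts when the result is ready.
    outbox.put((n, slow_job(n)))

def main_loop(jobs):
    outbox = queue.Queue()
    for n in jobs:
        threading.Thread(target=worker, args=(n, outbox), daemon=True).start()

    results = []
    pending = len(jobs)
    while pending:
        try:
            message = outbox.get(timeout=0.1)
        except queue.Empty:
            continue  # a real GUI would handle its own events here
        results.append(message)
        pending -= 1
    return results
```

Only the main thread touches results, so the workers share nothing with it except the queue, which is thread-safe by design.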

ChrisA


