To: python-list@python.org
From: Stefan Behnel <stefan_ml@behnel.de>
Subject: Re: String concatenation - which is the fastest way ?
Date: Fri, 12 Aug 2011 11:25:06 +0200
References: <20110810111754.GD5045@host.pgf.com.pl>	<CAPTjJmrF0GcVs0onfoCHAbMe38b5iLXgFX1R7G_RXcKxjPH5wQ@mail.gmail.com>	<20110810133146.GE5045@host.pgf.com.pl> <j1ub3q$65c$1@dough.gmane.org>	<20110811064030.GB4990@host.pgf.com.pl>	<CAPTjJmqiiKcT7Zg-XkbrQ4r-keU9+1Kj=+sEwa4xp==2EG-5aQ@mail.gmail.com>	<20110811134613.GE4990@host.pgf.com.pl>	<CAPTjJmr7qDeGU+ttEXsdBOASqZ5w1mYgkg18o+RtFp55HHp8Ew@mail.gmail.com> <20110811143923.GG4990@host.pgf.com.pl>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.18) Gecko/20110617 Lightning/1.0b2 Thunderbird/3.1.11
In-Reply-To: <20110811143923.GG4990@host.pgf.com.pl>
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.2213.1313141121.1164.python-list@python.org>

przemolicc@poczta.fm, 11.08.2011 16:39:
> On Thu, Aug 11, 2011 at 02:48:43PM +0100, Chris Angelico wrote:
>> On Thu, Aug 11, 2011 at 2:46 PM,<przemolicc@poczta.fm>  wrote:
>>> This is the way I am going to use.
>>> But what is the best data type to hold so many rows and then operate on them ?
>>>
>>
>> List of strings. Take it straight from your Oracle interface and work
>> with it directly.
>
> Can I use this list in the following way ?
> subprocess_1 - run on list between 1 and 10000
> subprocess_2 - run on list between 10001 and 20000
> subprocess_3 - run on list between 20001 and 30000
> etc
> ...

Sure. Just read the data as it comes in from the database and fill up a
chunk, then hand that on to a process. You can also distribute it in
smaller packets; just measure which size gives the best throughput.
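
A minimal sketch of the chunking step, assuming the rows arrive as any
Python iterable (a database cursor, say). The name read_chunks and its
parameters are made up for illustration, not part of any Oracle
interface:

```python
from itertools import islice


def read_chunks(rows, chunk_size):
    """Yield lists of up to chunk_size rows from any iterable of rows."""
    it = iter(rows)
    while True:
        # Pull the next chunk_size rows without loading everything at once.
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk
```

Since chunk_size is just a parameter, it is easy to time a few different
values against the real data.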

Still, I'd give each work parcel a number and then reorder the results
while collecting them. That lets you vary the chunk size and the number
of processes independently, without having to wait for a process that
happens to take longer.
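
Roughly like this, as a sketch only. It assumes the rows are already in
a list of strings and uses the standard multiprocessing module; work,
in_order and run are hypothetical names, and the join inside work is
just a placeholder for whatever each process really has to do:

```python
from multiprocessing import Pool


def work(parcel):
    """Process one numbered parcel; runs in a worker process."""
    number, chunk = parcel
    # Placeholder work (an assumption): join the strings of the chunk.
    return number, "".join(chunk)


def in_order(numbered_results):
    """Yield results by parcel number, buffering any that arrive early."""
    pending = {}
    next_number = 0
    for number, result in numbered_results:
        pending[number] = result
        # Release every result whose turn has come.
        while next_number in pending:
            yield pending.pop(next_number)
            next_number += 1


def run(rows, chunk_size, processes):
    """Split rows into numbered parcels and process them in parallel."""
    parcels = enumerate(
        rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)
    )
    with Pool(processes) as pool:
        # imap_unordered hands back results as soon as any worker finishes.
        return list(in_order(pool.imap_unordered(work, parcels)))
```

Call run(rows, chunk_size=10000, processes=4) from under an
"if __name__ == '__main__':" guard. The parcel numbers are what make the
final order independent of which process finishes first.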

Stefan