| Header | Value |
|---|---|
| Date | Wed, 10 Aug 2011 15:31:46 +0200 |
| From | przemolicc@poczta.fm |
| To | python-list@python.org |
| Subject | Re: String concatenation - which is the fastest way ? |
| References | <20110810111754.GD5045@host.pgf.com.pl> <CAPTjJmrF0GcVs0onfoCHAbMe38b5iLXgFX1R7G_RXcKxjPH5wQ@mail.gmail.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.2105.1312983110.1164.python-list@python.org> |
On Wed, Aug 10, 2011 at 01:32:06PM +0100, Chris Angelico wrote:
> On Wed, Aug 10, 2011 at 12:17 PM, <przemolicc@poczta.fm> wrote:
> > Hello,
> >
> > I'd like to write a Python (2.6/2.7) script which connects to a database, fetches
> > hundreds of thousands of rows, concatenates them (basically: creates XML)
> > and then puts the result into another table. Do I have any choice
> > regarding string concatenation in Python from a performance point of view?
> > Since the number of rows is big, I'd like to use the fastest possible library
> > (if there is any choice). Can you recommend something?
>
> First off, I have no idea why you would want to create an XML dump of
> hundreds of thousands of rows, only to store it in another table.
> However, if that is your intention, list joining is about as efficient
> as you're going to get in Python:
>
> lst = ["asdf", "qwer", "zxcv"]  # feel free to add 399,997 more list entries
> xml = "<foo>" + "</foo><foo>".join(lst) + "</foo>"
>
> This sets xml to '<foo>asdf</foo><foo>qwer</foo><foo>zxcv</foo>' which
> may or may not be what you're after.
Chris,
this process (building the XML) currently runs inside the database (using native SQL commands),
and since it is a single-threaded task it is quite slow. What I want to do is spawn several Python subprocesses in parallel,
each of which concatenates a subset of the whole table, and then merge all the results at the end.
Basically:
- fetch all rows from the database (up to 1 million): what is the recommended data type?
- spawn X Python processes, each of which:
  - concatenates its own subset
- merge the results from all the subprocesses
This task runs on a server which has many but slow cores, so I am trying to divide it
into many subtasks.
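
A minimal sketch of the plan above, assuming the rows have already been fetched into a plain list of strings (one string per row) and that every row becomes one `<foo>` element, as in Chris's example. The names `build_fragment`, `chunked` and `build_xml_parallel` are hypothetical, and escaping the row text with `xml.sax.saxutils.escape` is an added assumption, not part of the original task:

```python
import multiprocessing
from xml.sax.saxutils import escape


def build_fragment(rows):
    """Concatenate one subset of rows into an XML fragment using str.join."""
    # Assumption: each row is a string and maps to a single <foo> element.
    return "".join("<foo>%s</foo>" % escape(row) for row in rows)


def chunked(rows, n_chunks):
    """Split rows into at most n_chunks contiguous slices, keeping the order."""
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def build_xml_parallel(rows, workers=4):
    """Build one fragment per worker process, then merge them in order."""
    pool = multiprocessing.Pool(workers)
    try:
        # Pool.map returns results in the same order as the input chunks.
        fragments = pool.map(build_fragment, chunked(rows, workers))
    finally:
        pool.close()
        pool.join()
    return "".join(fragments)
```

`build_xml_parallel(rows, workers=X)` should be called from under an `if __name__ == "__main__":` guard, so that worker processes can import the module safely. Each chunk has to be pickled and sent to its worker and each fragment sent back, so whether this beats a single `join` in one process depends on the data and is worth measuring first.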
Regards
Przemyslaw Bak (przemol)