Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #42300

Re: How to find bad row with db api executemany()?

Path csiph.com!usenet.pasdenom.info!weretis.net!feeder1.news.weretis.net!feeder.erje.net!eu.feeder.erje.net!newsfeed.freenet.ag!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
Return-Path <rosuav@gmail.com>
X-Original-To python-list@python.org
Delivered-To python-list@mail.python.org
X-Spam-Status OK 0.032
X-Spam-Evidence '*H*': 0.94; '*S*': 0.00; 'essentially': 0.04; 'argument': 0.05; 'purpose.': 0.07; 'delimited': 0.09; 'mysql.': 0.09; 'newline': 0.09; 'statements': 0.09; 'subject:How': 0.10; 'creates': 0.14; '(everything': 0.16; '(postgresql': 0.16; 'comparisons,': 0.16; 'dump': 0.16; 'dump,': 0.16; 'from:addr:rosuav': 0.16; 'from:name:chris angelico': 0.16; 'roy': 0.16; 'worst': 0.16; 'sat,': 0.16; 'wrote:': 0.18; '(but': 0.19; 'memory': 0.22; 'case.': 0.24; 'initial': 0.24; 'versions': 0.24; 'nearly': 0.26; 'header:In-Reply-To:1': 0.27; 'appear': 0.29; 'testing': 0.29; 'chris': 0.29; 'am,': 0.29; 'strongly': 0.30; 'message-id:@mail.gmail.com': 0.30; 'went': 0.31; 'code': 0.31; '(maybe': 0.31; 'away.': 0.31; 'probably': 0.32; 'compatible': 0.32; "we're": 0.32; 'worked': 0.33; 'maybe': 0.34; "i'd": 0.34; 'could': 0.34; 'subject:with': 0.35; 'received:209.85': 0.35; 'received:209.85.220': 0.35; 'transaction': 0.35; 'something': 0.35; 'but': 0.35; 'received:google.com': 0.35; 'there': 0.35; 'doing': 0.36; 'two': 0.37; 'project': 0.37; 'received:209': 0.37; 'performance': 0.37; 'server': 0.38; 'to:addr:python-list': 0.38; 'pm,': 0.38; 'does': 0.39; 'bad': 0.39; 'help,': 0.39; 'skip:. 10': 0.39; 'delete': 0.39; 'sure': 0.39; 'to:addr:python.org': 0.39; 'either': 0.39; 'how': 0.40; 'even': 0.60; 'back': 0.62; "you'll": 0.62; 'complete': 0.62; 'day.': 0.63; 'more': 0.64; 'chance': 0.65; '30,': 0.65; 'mar': 0.68; 'smith': 0.68; 'guaranteed': 0.75; 'article': 0.77; 'gain': 0.79; 'fails,': 0.84; 'partial': 0.84; 'subject:find': 0.84; 'try,': 0.84; 'differences': 0.93; 'hand,': 0.93; '2013': 0.98
DKIM-Signature v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:content-type; bh=Wd4s1pLUROqV56fha6S/ZqYyYIcbdhzvZfswlfOaAYo=; b=SRm2flJddhAabv2jgKCe+jU0IsrQctQGG+8XI5TXtmlylLwcMwD976ox+eUuEmO2I7 +S9MeRwo7SHt/yauHsIyhixyjoWa/KzkAZ3arK5RgTbmEmTGiEsSugRgSBY1lHa96xcd GYi3P+nmJ9GCZgUUIUAL2UXCZd7/oOmeBWWvZbqobKE16YbGAaKzo6nJaPOTLD3chP48 NlRXjgFlS98TqIx84YdhiMABW3a6lpdHiPdqVyg/TGb4Uu2df9UIxHWazFn1jsw6deGx kknMURhV1Uof5R4AkeRgK+TIX2DjSUBrNwRURdRtUoRvwheppc09oKcMXDZl7mFWIDIV W6fQ==
MIME-Version 1.0
X-Received by 10.220.103.7 with SMTP id i7mr3469697vco.7.1364609108957; Fri, 29 Mar 2013 19:05:08 -0700 (PDT)
In-Reply-To <roy-AE1A29.21192229032013@news.panix.com>
References <F40EFF8D-9F4F-4ACB-8671-450F3CD761CA@panix.com> <5155E32A.1000403@davea.name> <mailman.3971.1364595940.2939.python-list@python.org> <roy-A61512.20410329032013@news.panix.com> <mailman.3977.1364605026.2939.python-list@python.org> <roy-AE1A29.21192229032013@news.panix.com>
Date Sat, 30 Mar 2013 13:05:08 +1100
Subject Re: How to find bad row with db api executemany()?
From Chris Angelico <rosuav@gmail.com>
To python-list@python.org
Content-Type text/plain; charset=ISO-8859-1
X-BeenThere python-list@python.org
X-Mailman-Version 2.1.15
Precedence list
List-Id General discussion list for the Python programming language <python-list.python.org>
List-Unsubscribe <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive <http://mail.python.org/pipermail/python-list/>
List-Post <mailto:python-list@python.org>
List-Help <mailto:python-list-request@python.org?subject=help>
List-Subscribe <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe>
Newsgroups comp.lang.python
Message-ID <mailman.3979.1364609118.2939.python-list@python.org> (permalink)
Lines 45
NNTP-Posting-Host 2001:888:2000:d::a6
X-Trace 1364609118 news.xs4all.nl 6884 [2001:888:2000:d::a6]:59517
X-Complaints-To abuse@xs4all.nl
Xref csiph.com comp.lang.python:42300

Show key headers only | View raw


On Sat, Mar 30, 2013 at 12:19 PM, Roy Smith <roy@panix.com> wrote:
> In article <mailman.3977.1364605026.2939.python-list@python.org>,
>  Chris Angelico <rosuav@gmail.com> wrote:
>
>> On Sat, Mar 30, 2013 at 11:41 AM, Roy Smith <roy@panix.com> wrote:
>> > In article <mailman.3971.1364595940.2939.python-list@python.org>,
>> >  Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:
>> >
>> >> If using MySQLdb, there isn't all that much difference... MySQLdb is
>> >> still compatible with MySQL v4 (and maybe even v3), and since those
>> >> versions don't have "prepared statements", .executemany() essentially
>> >> turns into something that creates a newline delimited "list" of
>> >> "identical" (but for argument substitution) statements and submits that
>> >> to MySQL.
>> >
>> > Shockingly, that does appear to be the case.  I had thought during my
>> > initial testing that I was seeing far greater throughput, but as I got
>> > more into the project and started doing some side-by-side comparisons,
>> > it the differences went away.
>>
>> How much are you doing per transaction? The two extremes (everything
>> in one transaction, or each line in its own transaction) are probably
>> the worst for performance. See what happens if you pepper the code
>> with 'begin' and 'commit' statements (maybe every thousand or ten
>> thousand rows) to see if performance improves.
>>
>> ChrisA
>
> We're doing it all in one transaction, on purpose.  We start with an
> initial dump, then get updates about once a day.  We want to make sure
> that the updates either complete without errors, or back out cleanly.
> If we ever had a partial daily update, the result would be a mess.
>
> Hmmm, on the other hand, I could probably try doing the initial dump the
> way you describe.  If it fails, we can just delete the whole thing and
> start again.

One transaction for the lot isn't nearly as bad as one transaction per
row, but it can consume a lot of memory on the server - or at least,
that's what I found last time I worked with MySQL. (PostgreSQL works
completely differently, and I'd strongly recommend doing it all as one
transaction if you switch.) It's not guaranteed to help, but if it
won't hurt to try, there's a chance you'll gain some performance.

ChrisA

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Re: How to find bad row with db api executemany()? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-03-29 18:25 -0400
  Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-29 20:41 -0400
    Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 11:57 +1100
      Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-29 21:19 -0400
        Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 13:05 +1100
        Re: How to find bad row with db api executemany()? Tim Chase <python.list@tim.thechases.com> - 2013-03-29 22:17 -0500
        Re: How to find bad row with db api executemany()? (PS) Tim Chase <python.list@tim.thechases.com> - 2013-03-29 22:38 -0500
        Re: How to find bad row with db api executemany()? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-03-29 23:38 -0400
    Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-29 22:44 -0400
      Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 13:49 +1100
        Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-29 23:09 -0400
          Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 14:14 +1100
            Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-29 23:36 -0400
              Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 14:57 +1100
                Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-30 00:10 -0400
                Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 15:21 +1100
                Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-30 10:19 -0400
        Re: How to find bad row with db api executemany()? rusi <rustompmody@gmail.com> - 2013-03-29 20:13 -0700
          Re: How to find bad row with db api executemany()? rusi <rustompmody@gmail.com> - 2013-03-29 20:15 -0700
          Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-29 23:40 -0400
      Re: How to find bad row with db api executemany()? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-03-29 23:53 -0400
      Re: How to find bad row with db api executemany()? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-03-30 00:19 -0400
      Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 15:24 +1100
      Re: How to find bad row with db api executemany()? Tim Chase <python.list@tim.thechases.com> - 2013-03-30 06:38 -0500

csiph-web