Re: How to find bad row with db api executemany()?

References (1 earlier) <5155E32A.1000403@davea.name> <mailman.3971.1364595940.2939.python-list@python.org> <roy-A61512.20410329032013@news.panix.com> <mailman.3977.1364605026.2939.python-list@python.org> <roy-AE1A29.21192229032013@news.panix.com>
Date 2013-03-30 13:05 +1100
Subject Re: How to find bad row with db api executemany()?
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.3979.1364609118.2939.python-list@python.org>


On Sat, Mar 30, 2013 at 12:19 PM, Roy Smith <roy@panix.com> wrote:
> In article <mailman.3977.1364605026.2939.python-list@python.org>,
>  Chris Angelico <rosuav@gmail.com> wrote:
>
>> On Sat, Mar 30, 2013 at 11:41 AM, Roy Smith <roy@panix.com> wrote:
>> > In article <mailman.3971.1364595940.2939.python-list@python.org>,
>> >  Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:
>> >
>> >> If using MySQLdb, there isn't all that much difference... MySQLdb is
>> >> still compatible with MySQL v4 (and maybe even v3), and since those
>> >> versions don't have "prepared statements", .executemany() essentially
>> >> turns into something that creates a newline delimited "list" of
>> >> "identical" (but for argument substitution) statements and submits that
>> >> to MySQL.
>> >
>> > Shockingly, that does appear to be the case.  I had thought during my
>> > initial testing that I was seeing far greater throughput, but as I got
>> > more into the project and started doing some side-by-side comparisons,
>> > the differences went away.
>>
>> How much are you doing per transaction? The two extremes (everything
>> in one transaction, or each line in its own transaction) are probably
>> the worst for performance. See what happens if you pepper the code
>> with 'begin' and 'commit' statements (maybe every thousand or ten
>> thousand rows) to see if performance improves.
>>
>> ChrisA
>
> We're doing it all in one transaction, on purpose.  We start with an
> initial dump, then get updates about once a day.  We want to make sure
> that the updates either complete without errors, or back out cleanly.
> If we ever had a partial daily update, the result would be a mess.
>
> Hmmm, on the other hand, I could probably try doing the initial dump the
> way you describe.  If it fails, we can just delete the whole thing and
> start again.

One transaction for the lot isn't nearly as bad as one transaction per
row, but it can consume a lot of memory on the server - or at least,
that's what I found last time I worked with MySQL. (PostgreSQL works
completely differently, and I'd strongly recommend doing it all as one
transaction if you switch.) It's not guaranteed to help, but if it
won't hurt to try, there's a chance you'll gain some performance.
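
As a rough illustration of the "commit every thousand rows" idea for the initial dump, here is a minimal sketch. It assumes the standard-library sqlite3 module as a stand-in so that it runs anywhere; the table name `items`, the helper `insert_in_batches` and the batch size are all hypothetical. With MySQLdb the shape should be much the same (placeholders become %s, and you would go through a cursor), but I have not measured anything there.

```python
import sqlite3
from itertools import islice

def insert_in_batches(conn, rows, batch_size=1000):
    """Insert rows with executemany(), committing once per batch.

    Hypothetical helper: `items` and its columns are made up for this sketch.
    """
    it = iter(rows)
    total = 0
    while True:
        # Pull the next chunk of at most batch_size rows from the iterator.
        batch = list(islice(it, batch_size))
        if not batch:
            break
        conn.executemany("INSERT INTO items (id, name) VALUES (?, ?)", batch)
        # Ending the transaction here keeps each one bounded in size.
        conn.commit()
        total += len(batch)
    return total

# sqlite3 in-memory database used only so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
inserted = insert_in_batches(
    conn, ((i, f"row{i}") for i in range(2500)), batch_size=1000
)
```

Note that this only suits the initial dump, where a failure means deleting everything and starting over. For the daily updates, which must either complete or back out cleanly, a single transaction is still what you want.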

ChrisA
