


Re: How to find bad row with db api executemany()?

Started by: Dennis Lee Bieber <wlfraed@ix.netcom.com>
First post: 2013-03-29 18:25 -0400
Last post: 2013-03-30 06:38 -0500
Articles 4 on this page of 24 — 5 participants


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: How to find bad row with db api executemany()? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-03-29 18:25 -0400
    Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-29 20:41 -0400
      Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 11:57 +1100
        Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-29 21:19 -0400
          Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 13:05 +1100
          Re: How to find bad row with db api executemany()? Tim Chase <python.list@tim.thechases.com> - 2013-03-29 22:17 -0500
          Re: How to find bad row with db api executemany()? (PS) Tim Chase <python.list@tim.thechases.com> - 2013-03-29 22:38 -0500
          Re: How to find bad row with db api executemany()? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-03-29 23:38 -0400
      Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-29 22:44 -0400
        Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 13:49 +1100
          Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-29 23:09 -0400
            Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 14:14 +1100
              Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-29 23:36 -0400
                Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 14:57 +1100
                  Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-30 00:10 -0400
                    Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 15:21 +1100
                      Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-30 10:19 -0400
          Re: How to find bad row with db api executemany()? rusi <rustompmody@gmail.com> - 2013-03-29 20:13 -0700
            Re: How to find bad row with db api executemany()? rusi <rustompmody@gmail.com> - 2013-03-29 20:15 -0700
            Re: How to find bad row with db api executemany()? Roy Smith <roy@panix.com> - 2013-03-29 23:40 -0400
        Re: How to find bad row with db api executemany()? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-03-29 23:53 -0400
        Re: How to find bad row with db api executemany()? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-03-30 00:19 -0400
        Re: How to find bad row with db api executemany()? Chris Angelico <rosuav@gmail.com> - 2013-03-30 15:24 +1100
        Re: How to find bad row with db api executemany()? Tim Chase <python.list@tim.thechases.com> - 2013-03-30 06:38 -0500



#42313

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2013-03-29 23:53 -0400
Message-ID: <mailman.3985.1364615647.2939.python-list@python.org>
In reply to: #42301

On Fri, 29 Mar 2013 22:44:53 -0400, Roy Smith <roy@panix.com> declaimed
the following in gmane.comp.python.general:

> 
> OMG, this is amazing.
> 
> http://stackoverflow.com/questions/3945642/
> 
> It turns out, the MySQLdb executemany() runs a regex over your SQL and 
> picks one of two algorithms depending on whether it matches or not.
> 
	Hmm, I never tracked deeper than the cursors.executemany() block...
Then again, I don't do regex's (for my uses I've often been able to code
a simple parser using .split(), "string" in list-or-string, etc. faster
than reading the help system for regex syntax).

> restr = (r"\svalues\s*"
>         r"(\(((?<!\\)'[^\)]*?\)[^\)]*(?<!\\)?'"
>         r"|[^\(\)]|"
>         r"(?:\([^\)]*\))"
>         r")+\))")
> 
> Leaving aside the obvious line-noise aspects, the operative problem here 
> is that it only looks for "values" (in lower case).
>
	Well, I suppose if one's application documentation is complete
enough to mention it, one could maybe edit that part of MySQLdb to be
case insensitive (or do a .lower() on the query string only where this
check is performed). Documented so the next time one updates the adapter
one has a reminder to adjust that functionality.
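
A minimal sketch of the case-sensitivity problem and the suggested fix. The pattern below is a deliberately simplified stand-in for the quoted `restr`, not the real MySQLdb regex, and `takes_fast_path` and the sample queries are hypothetical names used only for illustration. It shows that a pattern written around lower-case "values" misses "VALUES" unless it is compiled with `re.IGNORECASE`.

```python
import re

# Simplified stand-in (an assumption, NOT the actual MySQLdb pattern):
# whitespace, the keyword "values", then a parenthesised group.
VALUES_PATTERN = r"\svalues\s*(\(.+\))"

case_sensitive = re.compile(VALUES_PATTERN)
case_insensitive = re.compile(VALUES_PATTERN, re.IGNORECASE)

# Hypothetical queries differing only in keyword case.
lower_query = "insert into t (a, b) values (%s, %s)"
upper_query = "INSERT INTO t (a, b) VALUES (%s, %s)"


def takes_fast_path(pattern, query):
    """Return True if the query matches, i.e. would qualify for the bulk path."""
    return pattern.search(query) is not None


if __name__ == "__main__":
    for name, pattern in (("case-sensitive", case_sensitive),
                          ("case-insensitive", case_insensitive)):
        for query in (lower_query, upper_query):
            print(name, repr(query), takes_fast_path(pattern, query))
```

Compiling with `re.IGNORECASE` keeps the original query text untouched, which is arguably safer than lower-casing the whole query, since that would also alter any string literals embedded in it.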
  
> I've lost my initial test script which convinced me that executemany() 
> would be a win; I'm assuming I used lower case for that.  Our production 
> code uses "VALUES".
>
	I'd not have been affected -- I don't write SQL with uppercase
keywords in practice; maybe only when discussing in a forum will I take
the time to uppercase them...
 
> The slow way (i.e. "VALUES"), I'm inserting 1000 rows about every 2.4 
> seconds.  When I switch to "values", I'm getting more like 1000 rows in 
> 100 ms!
>
	The slow algorithm literally is "one .execute() per record". Looking
at my MySQL references, MySQL does support a form of INSERT that takes a
"list" of value tuples in a single statement (not as a "prepared
statement" with placeholders, which is what I suspect PostgreSQL's
adapters use).
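
The two strategies can be sketched roughly as follows. This is an illustration under stated assumptions, not MySQLdb's code: it uses the standard-library `sqlite3` module as a stand-in database (multi-row VALUES needs SQLite 3.7.11 or later), a hypothetical table `t`, and hypothetical helper names. It also binds values through placeholders, whereas the description above suggests MySQLdb builds the value list itself.

```python
import sqlite3


def insert_one_by_one(cursor, rows):
    """Slow path: one execute() call (one statement) per record."""
    for row in rows:
        cursor.execute("INSERT INTO t (a, b) VALUES (?, ?)", row)


def insert_multi_row(cursor, rows):
    """Fast path: a single INSERT statement carrying every record."""
    group = "(?, ?)"
    sql = "INSERT INTO t (a, b) VALUES " + ", ".join([group] * len(rows))
    # Flatten [(1, "x"), (2, "y")] into [1, "x", 2, "y"] to match the placeholders.
    flat = [value for row in rows for value in row]
    cursor.execute(sql, flat)


# Hypothetical in-memory table used only for this demonstration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (a INTEGER, b TEXT)")

rows = [(1, "x"), (2, "y"), (3, "z")]
insert_one_by_one(cur, rows)   # three statements
insert_multi_row(cur, rows)    # one statement
conn.commit()

count = cur.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

Both helpers store the same data; the difference is the number of statements sent, which is where the per-row overhead comes from. Databases cap the number of bound parameters per statement, so a real batch would need to be split into chunks.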

> A truly breathtaking bug.

	Especially given how long MySQLdb has existed (the multi-record
INSERT goes back to MySQL 3.something).
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/



#42316

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2013-03-30 00:19 -0400
Message-ID: <mailman.3987.1364617199.2939.python-list@python.org>
In reply to: #42301

On Sat, 30 Mar 2013 13:49:56 +1100, Chris Angelico <rosuav@gmail.com>
declaimed the following in gmane.comp.python.general:


> Especially facepalm because there's some way to do this that's faster
> than straight INSERT statements, and it's not clearly documented as
> "hey, guys, if you want to dump loads of data in, use COPY instead"
> (it might be that, I don't know, but usually COPY isn't directly
> transliterable with INSERT).
>

	While COPY is found in my ancient PostgreSQL book, I don't find it
in MySQL. I suspect the equivalent is LOAD DATA INFILE. Firebird goes
completely differently: one does a CREATE TABLE exttable EXTERNAL FILE
"fixed-length.txt" (table definition), then creates a matching internal
table, and finally performs an INSERT INTO internal SELECT fields FROM
exttable.

	I think MySQL is the only common DBMS with an extension on INSERT of
allowing multiple records (I've not checked my Access 2010 docs, and my
MSDE/SQL-Server books are in storage -- but SQLite3, Firebird, and
PostgreSQL all seem to be "one INSERT = one record").



-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/



#42318

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-03-30 15:24 +1100
Message-ID: <mailman.3989.1364617501.2939.python-list@python.org>
In reply to: #42301

On Sat, Mar 30, 2013 at 3:19 PM, Dennis Lee Bieber
<wlfraed@ix.netcom.com> wrote:
>         I think MySQL is the only common DBMS with an extension on INSERT of
> allowing multiple records (I've not checked my Access 2010 docs, and my
> MSDE/SQL-Server books are in storage -- but SQLite3, Firebird, and
> PostgreSQL all seem to be "one INSERT = one record").

I don't know about performance, but syntactically at least, an INSERT
is certainly allowed to do multiple records. I do this all the time
for database dump/recreation, something like:

INSERT INTO table (field,field,field) VALUES
(value,value,value),(value,value,value);

I've done this in PostgreSQL, and I'm pretty sure also in MySQL. That
might be identical in performance to two separate statements, but at
least it's clearer.

ChrisA



#42334

From: Tim Chase <python.list@tim.thechases.com>
Date: 2013-03-30 06:38 -0500
Message-ID: <mailman.3996.1364643439.2939.python-list@python.org>
In reply to: #42301

On 2013-03-30 00:19, Dennis Lee Bieber wrote:
> 	I think MySQL is the only common DBMS with an extension on
> INSERT of allowing multiple records (I've not checked my Access
> 2010 docs, and my MSDE/SQL-Server books are in storage -- but
> SQLite3, Firebird, and PostgreSQL all seem to be "one INSERT = one
> record").

MS SQL Server supports the "INSERT INTO ... SELECT" syntax as well.
Strangely, it *also* supports "SELECT ... INTO tblNew" but that
creates the tblNew, rather than inserting into an existing one.

I've seen Chris's suggestion about "INSERT INTO (f1,f2,f3) VALUES
(v1,v2,v3),(v4,v5,v6);" syntax, but that hasn't worked cross-RDBMS
for me, so I always have to resurrect the syntax.

-tkc

 


