From: Dennis Lee Bieber
To: python-list@python.org
Subject: Re: How to find bad row with db api executemany()?
Date: Fri, 29 Mar 2013 23:38:18 -0400
Newsgroups: comp.lang.python

On Fri, 29 Mar 2013 21:19:22 -0400, Roy Smith declaimed the following
in gmane.comp.python.general:

> Chris Angelico wrote:
>
> > How much are you doing per transaction? The two extremes (everything
> > in one transaction, or each line in its own transaction) are probably
> > the worst for performance. See what happens if you pepper the code
> > with 'begin' and 'commit' statements (maybe every thousand or ten
> > thousand rows) to see if performance improves.
> >
> > ChrisA
>
> We're doing it all in one transaction, on purpose. We start with an
> initial dump, then get updates about once a day. We want to make sure
> that the updates either complete without errors, or back out cleanly.
> If we ever had a partial daily update, the result would be a mess.
>

	As I recall, DB-API compliance should have started a transaction
when you submit a statement, so an explicit "begin" should not be
required.

	My suggestion would be to first try just breaking the
.executemany() into chunks rather than submitting everything at once.
Submit maybe 500-1000 records at a time via .executemany(), checking
for errors after each block. If a block errors, you can still
rollback() the entire transaction AND you have a smaller set of data
to examine. Then at the end, if there were no errors, do the commit().

-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/
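
	A rough sketch of the chunked approach described above, in case it
helps. This is not tested against your setup: it uses the standard
library sqlite3 module as a stand-in for whatever DB-API driver you
actually have, and the table "t", the helper names chunked() and
load_in_chunks(), and the "?" parameter style are all assumptions made
for illustration. With another driver you would catch that module's
Error class and use its paramstyle.

```python
import sqlite3
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows (hypothetical helper)."""
    it = iter(rows)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def load_in_chunks(conn, sql, rows, chunk_size=500):
    """Run executemany() block by block inside ONE transaction.

    Returns None after a successful commit(), or a tuple
    (block_index, block, error) after rolling everything back.
    """
    cur = conn.cursor()
    for index, block in enumerate(chunked(rows, chunk_size)):
        try:
            # The driver opens the transaction implicitly on the first
            # INSERT; no explicit "begin" is issued here.
            cur.executemany(sql, block)
        except sqlite3.Error as exc:
            # Undo every block submitted so far, then hand back the
            # small block that contains the offending row.
            conn.rollback()
            return index, block, exc
    conn.commit()
    return None


# Demonstration with an in-memory database and one deliberately bad row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

rows = [(i, "row %d" % i) for i in range(10)]
rows[7] = (7, None)  # violates the NOT NULL constraint

result = load_in_chunks(
    conn, "INSERT INTO t (id, name) VALUES (?, ?)", rows, chunk_size=3
)
if result is not None:
    index, block, exc = result
    print("block", index, "failed:", exc)
    print("rows to examine:", block)
```

	The update stays all-or-nothing, since nothing is committed until
every block has gone through, but on failure you only have chunk_size
rows to inspect instead of the whole daily update.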