From: Erland Sommarskog
Newsgroups: comp.databases.ms-sqlserver
Subject: Re: Break Up Large Table Query Into Results of N Rows
Date: Wed, 01 Feb 2012 23:54:36 +0100

pbd22 (dushkin@gmail.com) writes:
> The specifics is that we are doing email deployments but google is
> moving all of the email sent to gmail users to their spam boxes. As a
> result, we have to "chunk" the gmail users out of the total amount and
> send in manageable batches. We have figured that 80,000 per batch out of
> the total gmail users in the table is possible.
> ...
> Bob, the table we are querying against is pretty simple. Essentially,
> it has one column - "email_address" which is a varchar. Its data is
> about 1 million email addresses (but that number changes often). The
> result set table(s) should only have two columns, the count (INT) and
> the email_address (varchar). Please see below.
The simple-minded solution would be:

   WITH numbered AS (
      SELECT email, row_number() OVER(ORDER BY email) AS rowno
      FROM   addresses
      WHERE  email LIKE '%@gmail.com'
   )
   SELECT email
   FROM   numbered
   WHERE  rowno >  (@batchno - 1) * @batchsize
     AND  rowno <= @batchno * @batchsize

But it would be far more efficient to do:

   CREATE TABLE gmail_addresses
        (rowno int           NOT NULL,
         gmail nvarchar(255) NOT NULL,
         CONSTRAINT pk_gmail PRIMARY KEY CLUSTERED (rowno),
         CONSTRAINT pk_unique UNIQUE NONCLUSTERED(gmail))

   INSERT gmail_addresses(rowno, gmail)
      SELECT row_number() OVER(ORDER BY (SELECT 1)), email
      FROM   (SELECT DISTINCT email
              FROM   addresses
              WHERE  email LIKE '%@gmail.com') AS x

Then you have materialised the row number once and for all, rather than
recomputing it for every batch, and a selection of 80000 accounts will
be quick.

-- 
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
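
For illustration, picking one batch out of the materialised gmail_addresses table could look roughly like the sketch below. It is not part of the original reply: the variables @batchno and @batchsize are assumed to mean the same as in the row_number() query, and the values assigned to them are only examples. With a batch size of 80000, batch 3 covers rowno 160001 through 240000.

```sql
-- Sketch only. @batchno and @batchsize are assumed parameters;
-- the values below are illustrative, not taken from the thread.
DECLARE @batchno int, @batchsize int;
SELECT  @batchno = 3, @batchsize = 80000;

-- One batch: a range seek on the clustered primary key (rowno).
SELECT rowno, gmail
FROM   gmail_addresses
WHERE  rowno >  (@batchno - 1) * @batchsize
  AND  rowno <= @batchno * @batchsize
ORDER  BY rowno;

-- Number of batches needed, rounding up (int division truncates).
SELECT (MAX(rowno) + @batchsize - 1) / @batchsize AS batches
FROM   gmail_addresses;
```

Since rowno is the clustered key, each batch should be a plain range scan. Note that the numbering is a snapshot: if the addresses table changes, gmail_addresses has to be reloaded before the next deployment.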