Path: csiph.com!x330-a1.tempe.blueboxinc.net!newsfeed.hal-mli.net!feeder1.hal-mli.net!feeder.news-service.com!feeder.news-service.com!85.214.198.2.MISMATCH!eternal-september.org!feeder.eternal-september.org!.POSTED!not-for-mail From: "Bob Barrows" Newsgroups: comp.databases.ms-sqlserver Subject: Re: SELECT DISTINCT and compound primary key. Date: Tue, 7 Jun 2011 08:54:24 -0400 Organization: A noiseless patient Spider Lines: 34 Message-ID: References: <5cafddb1-1d17-4a9c-bed5-fbb18cf49311@34g2000pru.googlegroups.com> Injection-Date: Tue, 7 Jun 2011 12:54:29 +0000 (UTC) Injection-Info: mx04.eternal-september.org; posting-host="NUmboHSNYG4cX5Ys9j8TNw"; logging-data="22994"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/vWs63BEebu2DiNlut7iUUAU4FY2WJjd8=" X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.2001 X-Antivirus-Status: Clean X-Newsreader: Microsoft Outlook Express 6.00.2800.2001 X-Antivirus: avast! (VPS 110607-0, 06/07/2011), Outbound message Cancel-Lock: sha1:WAMhvrLTZGIV83cdXei2zjuG+QY= X-Priority: 3 X-MSMail-Priority: Normal Xref: x330-a1.tempe.blueboxinc.net comp.databases.ms-sqlserver:402 Richard wrote: > Hello, > > I have a table (Main) with compound key and second, temporary table > (Temp) with same fields but without PK. Main table key is 3 columns > ( SourceFileHash, FileHash, PathHash ). There is also 4 other columns > in each table. > My intent is to use temp table to store batches of records with > duplicates and inserting only unique records to main table after every > batch is finished ( batches are inserted using SqlBulkCopy ). > > How can I select distinct records on Temp table but using only 3 PK > columns ? > > Here is query that I created - it won't work because it is trying to > insert duplicate records to main: > > SELECT temp.sourceFileHash, temp.pathHash, temp.fileHash, temp. > [length], temp.creationDate, temp.modificationDate, > temp.lastAccessDate, temp.line > FROM CatalogEntriesTemp temp > LEFT OUTER JOIN CatalogEntries main ON main.pathHash = temp.pathHash > and main.fileHash = temp.fileHash AND main.sourceFileHash = > temp.sourceFileHash > WHERE main.pathHash IS NULL AND main.fileHash IS NULL AND > main.sourceFileHash is null > > Thanks for any help in advance. You need to create a CTE or view that uses GROUP BY to select the unique rows from CatalogEntriesTemp, aggregating the non-key columns so that a single row is retrieved.