Path: csiph.com!usenet.pasdenom.info!weretis.net!feeder4.news.weretis.net!eternal-september.org!feeder.eternal-september.org!mx04.eternal-september.org!.POSTED!not-for-mail From: "aminer" Newsgroups: comp.programming.threads,comp.programming Subject: Re: ParallelSort version 3.0 Date: Wed, 24 Oct 2012 11:25:19 -0500 Organization: A noiseless patient Spider Lines: 339 Message-ID: References: Injection-Date: Wed, 24 Oct 2012 15:21:27 +0000 (UTC) Injection-Info: mx04.eternal-september.org; posting-host="c43ca82f9e8d62a602307fe9d2e9b807"; logging-data="9697"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/KcZmqz9lSYNkeKIVjDW20" X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 X-RFC2646: Format=Flowed; Response X-Newsreader: Microsoft Outlook Express 6.00.2900.5931 Cancel-Lock: sha1:dOpNGz9ovepIS/MheW9tF6E6u+8= X-Priority: 3 X-MSMail-Priority: Normal Xref: csiph.com comp.programming.threads:1185 comp.programming:2384 Hello, Here is what have changed in version 3.0: I have implemented a Binary search like this: == function BinSearch(var a: TTabPointer; value:pointer;left, right:long;SCompare: TSortCompare): Integer; var First: long; Last: long; Pivot: long; Found: Boolean; begin First := left; Last := right; Found := False; Result := -1; while (First <= Last) and (not Found) do begin Pivot := (First+Last)div 2; if scompare(a[Pivot],value)=0 then begin Found := True; Result := Pivot; end else if scompare(a[Pivot],value)>0 then Last := Pivot - 1 else First := Pivot + 1; end; end; == And i have added a ne SequentialMerge like this: === Procedure SequentialMerge(var to1:TTabPointer; var temp:TTabPointer; lowX, highX,lowY, highY, lowTo:long;SCompare: TSortCompare); var highto:long; begin highTo := lowTo + highX - lowX + highY - lowY + 1; for lowto:=lowto to highto do begin if (lowX > highX) then begin to1[lowTo] := temp[lowY]; inc(lowY); end else if (lowY > highY) then begin to1[lowTo] := temp[lowX]; inc(lowX); end else begin if scompare(temp[lowX],temp[lowY])<0 then begin to1[lowTo]:=temp[lowX]; inc(lowX); end else begin to1[lowTo]:=temp[lowY]; inc(lowY); end; end; end; end; === And i have added the TParallelSort.merge_parallel1() method that will be called in parallel. like this: === procedure TParallelSort.merge_parallel1(job:pointer); begin merge_parallel(tjob2(job).t,tjob2(job).p1,tjob2(job).r1,tjob2(job).p2,tjob2(job).r2,tjob2(job).a,tjob2(job).p3,tjob2(job).comp,tjob2(job).value,tjob2(job).event); lockeddeclong1(tjob2(job).value); if long(tjob2(job).value^)=0 then tjob2(job).event.setevent; tjob2(job).free; end; == And i have added a TParallelSort.merge_parallel() method that uses my threadpool engine to call the TParallelSort.merge_parallel1 method: === procedure TParallelSort.merge_parallel(var t:TTabPointer; p1, r1:long; p2, r2:long;var a:TTabPointer; p3:long;SCompare: TSortCompare;value:pointer;event:TSimpleEvent); var length1,length2,tmp,q1,q2,q3:long; mergeLeft,mergeRight:TJob2; begin length1 := (r1 - p1) + 1; length2 := (r2 - p2) + 1; if ( length1 = 0 ) then exit; //if ((( length1 + length2 ) <= 8000) ) or (long(value^)>=(nbrprocs1-1)) then if (( length1 + length2 ) <= 8000) then begin SequentialMerge(a, t, p1, r1, p2, r2, p3,scompare); exit; end; if (length1 < length2) then begin merge_parallel(t, p2,r2,p1, r1, a,p3,SCompare,value,event); exit; end; q1 := (p1 +r1) div 2; q2 :=my_binary_search( t,t[q1 ], p2, r2,SCompare); // q2 := binsearch( t,t[q1 ], p2, r2,SCompare); {if (q2=-1) then begin writeln('titi'); SequentialMerge(a, t, p1, r1, p2, r2, p3,scompare); exit; end;} q3 := p3 + ((q1 - p1)) + ((q2 - p2)) ; a[q3] := t[q1]; //mergeLeft:=TParallelMerge.create(); mergeLeft:=TJob2.create(); mergeLeft.t:=t; mergeLeft.p1:=p1; mergeLeft.r1:=q1-1; mergeLeft.p2:=p2; mergeLeft.r2:=q2-1; mergeLeft.a:=a; mergeLeft.p3:=p3; //mergeLeft.obj:=self; mergeLeft.comp:=scompare; mergeleft.value:=value; mergeLeft.event:=event; lockedinclong1(value); { if (long(value^)>=(nbrprocs1-2)) then begin lockeddeclong1(value); SequentialMerge(a, t, p1, r1, p2, r2, p3,scompare); mergeleft.free; exit; end;} mergeRight:=TJob2.create(); mergeRight.t:=t; mergeRight.p1:=q1+1; mergeRight.r1:=r1; mergeRight.p2:=q2; mergeRight.r2:=r2; mergeRight.a:=a; mergeRight.p3:=q3+1; mergeRight.comp:=scompare; mergeRight.value:=value; mergeRight.event:=event; lockedinclong1(value); {if (long(value^)>=(nbrprocs1-2)) then begin lockeddeclong1(value); SequentialMerge(a, t, p1, r1, p2, r2, p3,scompare); mergeRight.free; exit; end;} TP.execute(self.merge_parallel1,pointer(mergeLeft)); TP.execute(self.merge_parallel1,pointer(mergeRight)); end; === And i have changed the TParallelSort.parallelmerge() method , now it calls TParallelSort.merge_parallel((), not merge1(), like this: == procedure TParallelSort.parallelmerge(obj:pointer) ; var local_count:long; value:long; event1:TSimpleEvent; i:long; begin event1:=tsimpleevent.create; value:=0; //Merge1(Tjob1(obj).a1,Tjob1(obj).a2,Tjob1(obj).Low,Tjob1(obj).Mid,Tjob1(obj).Hi,Tjob1(obj).comp); merge_parallel(Tjob1(obj).a1,Tjob1(obj).Low,Tjob1(obj).Mid,Tjob1(obj).Mid+1,Tjob1(obj).Hi,Tjob1(obj).a2,0,Tjob1(obj).comp,@value,event1); event1.waitfor(INFINITE); event1.resetevent; for i:=Tjob1(obj).Low to Tjob1(obj).Hi do begin Tjob1(obj).a1[i]:=Tjob1(obj).a2[i-Tjob1(obj).Low] end; setlength(Tjob1(obj).a2,0); local_count:=LockedIncLong(count1); if local_count=TJob1(obj).count then Tjob1(obj).event.setevent; TJob1(obj).free; event1.free; end; === Thank you, Amine Moulay Ramdane. "aminer" wrote in message news:k68sg9$c66$1@dont-email.me... > Hello, > > Parallel Sort Library was updated to version 3.0. > > In this new version i have implemented a Parallel hybrid > divide-and-conquer merge algorithm that performs 0.9-5.8 times better than > sequential merge, on a quad-core processor, with larger arrays > outperforming by over 5 times. Parallel processing combined with a hybrid > algorithm approach provides a powerful high performance result. > > > Description: > > Parallel Sort Library that supports Parallel Quicksort, Parallel HeapSort > and Parallel MergeSort on Multicores systems. > > Parallel Sort Library uses my Thread Pool Engine and quicksort many array > parts - of your array - in parallel using Quicksort or HeapSort or > MergeSort and after that it finally merge them - with the merge() > procedure - > > In the previous parallelsort version i have parallelized only the sort > part, but in this new parallelsort version i have parallelized also the > merge procedure part and it gives better performance. > > I have implemented a Parallel hybrid divide-and-conquer merge algorithm > that performs 0.9-5.8 times better than sequential merge, on a quad-core > processor, with larger arrays outperforming by over 5 times. Parallel > processing combined with a hybrid algorithm approach provides a powerful > high performance result. > > And as you know , Quicksort is a divide and conquer algorithm that have > the following best case performance: > > T(n) = T(n/2) + T(n/2) + O(n) > = 2T(n/2) + (n) > > And from case 2 of Master theorem, the recurrence equation gives: > > T(n) = (n lg n) > > > Parallelizing the Sorts > > One way to parallelize the sorts is: > > Divide the data among the processors > Sort the data on the individual processors. > Merge the various data > > Note that the merge operation is a reduction operation ! > > > And please look at test.pas inside the zip, a parallel quicksort on many > cores, you have to compile and run gendata.pas before running test.pas, > and please set the number of threads to 8 times the number of cores to be > more optimal, i have giving you an exemple on how to do it , look inside > test.pas... > > You can download ParallelSort library from: > > http://pages.videotron.com/aminer/ > > > > Thank you, > Amine Moulay Ramdane. > > >