Groups > comp.programming.threads > #1212

Re: Parallel Sort Library was updated to version 3.02

From	"aminer" <aminer@toto.com>
Newsgroups	comp.programming.threads, comp.programming
Subject	Re: Parallel Sort Library was updated to version 3.02
Date	2012-10-31 21:59 -0500
Organization	A noiseless patient Spider
Message-ID	<k6skvs$tgj$2@dont-email.me> (permalink)
References	<k6sj8u$lp0$2@dont-email.me>

Cross-posted to 2 groups.

Show all headers | View raw

I wrote:
>I have done some tests with my ParallelSort library and
>i have noticed that it can give up to 3.8x scalability with
>strings, and it give  3x scalability with integers..

The tests was done on a quad cores system.


Amine Moulay Ramdane.



"aminer" <aminer@toto.com> wrote in message 
news:k6sj8u$lp0$2@dont-email.me...
>
> Hello,
>
> Parallel Sort Library ws updated to version 3.02
>
>
> Description:
>
> Parallel Sort Library that supports Parallel Quicksort, Parallel HeapSort 
> and Parallel MergeSort on Multicores systems.
>
> Parallel Sort Library uses my Thread Pool Engine and sort many array 
> parts - of your array -  in parallel using Quicksort or HeapSort or 
> MergeSort and after that it finally merge them - with the merge() 
> procedure -
>
> In the previous parallelsort version i have parallelized only the sort 
> part, but in this new parallelsort version i have parallelized also the 
> merge procedure part and it gives better performance.
>
> I have implemented a Parallel hybrid divide-and-conquer merge algorithm 
> that performs 0.9-5.8 times better than sequential merge, on a quad-core 
> processor, with larger arrays outperforming by over 5 times. Parallel 
> processing combined with a hybrid algorithm approach provides a powerful 
> high performance result.
> The best case complexity of ParallelSort using mergesort  is:
>
> ((n/p)* log(n/p)) + O(n/p)
>
> p: is the number of cores
>
> the ((n/p)* log(n/p)) is the complexity of the sorting part.
>
> O(n/p) is the best case complexity of the merging part.
>
> so the best case complexity  is:  ((n/p)* log(n/p))
>
> The worst case complexity of parallel sort library using mergesort is:
>
> ((n/p)* log(n/p)) +  O(n)
>
> the ((n/p)* log(n/p)) is the complexity of the sorting part.
>
> O(n) is the worst  case complexity of the merging part.
>
> so the worst case complexity of parallelsort using mergesort is 
> approximatly: O(n)
>
> I have done some tests with my ParallelSort library and i have noticed 
> that it can give up to 3.8x scalability with strings, and it give 3x 
> scalability with integers.
> So, why it scale to 3.8x with strings and only 3x with integers ?
>
>
> I explain:
>
> In the SequentialMerge() method and QSort() method inside Parallel Sort 
> library, i am calling the Scompare() method and also in both of them i am 
> copying to the memory system.
>
> So when i am using strings the SCompare() method is more expensive, so the 
> parallel part p in the Amdahl equation 1/ S + P/N (S: the serial part, P: 
> parallel part and N: the number of cores) is bigger than with integers so 
> the Amdahl equation will scale better, but when we are using integers the 
> SCompare() method is less expensive than the SCompare() with strings, so 
> the parallel part p in the Amdahl equation is less bigger than with 
> strings. so this is why parallel sorting with strings scales better than 
> with integers.
>
> I have implemented mergsort and quicksort, but as you know the complexity 
> of mergesort in the worst case is better than quicksort , and the 
> mergesort that i have implemented is faster than quicksort, but mergesort 
> takes more space..
>
> The reccurence equation of the complexity of mergesort is:
>
> T(n) = 2 * T(n/2) + n
>
> cause it take O(n) for the merging part.
> It gives:
>
> T(n) = 2 * T(n/2) + n
> = 2 (2T(n/4) +n/2) + n
> =4T(n/4)+n+n
> =4T(n/4)+2*n
> =4 (2T(n/8) +n/4) + 2*n
> =8T(n/8)+n+2n
> =8T(n/8)+3*n
> =2k T(n/2^k) + k*n
>
> We want:
>
> n/2k = 1
> n = 2k
> log n = k
>
> so the reccurence equation gives:
>
> = nT(1) +n*log(n)
> = n+ (n * log(n))
>
> So the mergesort complexity in the best case and in the worst case is:
> n * log(n)
>
> But the complexity of the quicksort in the worst case is:
>
> T(n)= n + T(n-1)
>
> it gives:
>
> T(n) = n + (n-1) + T(n-2)
> T(n) = n + (n-1) + (n-2)+ T(n-3)
> T(n) = 1 + 2+ 3+.+N
> .T(n) = O(n^2)
>
> And Quicksort in best case performance is:
>
> T(n) = T(n/2) + T(n/2) + O(n)
>       = 2T(n/2) + (n)
>
> this gives: T(n) = (n lg n)
>
> Parallelizing the Sorts
>
> One way to parallelize the sorts is:
>
> - Divide the data among the processors
> - Sort the data on the individual processors.
> - Merge the various data
>
> Note that the merge operation is a reduction operation !
>
> I have done some scalability tests on my parallelsort library and i have 
> come to the conclusion that parallel heapsort is better on scalability 
> than parallel quicksort cause the P part (of the Amdahl equation) is 
> bigger in parallel heapsort
> than in parallel quicksort, the parallel heapsort is doing more on the 
> parallel part, it's why it scales better than parallel quicksort, but 
> parallel quicksort is still faster than parallel heapsort on my tests on a 
> quad core processor.
> And please look at test.pas a parallel quicksort on many cores - compile 
> and execute it...
>
> Language: FPC Pascal v2.2.0+ / Delphi 7+: http://www.freepascal.org/
>
> Operating Systems: Win , Linux and Mac (x86).
>
> Required FPC switches: -O3 -Sd -dFPC -dWin32 -dFreePascal
>
> -Sd for delphi mode....
>
> Required Delphi switches: -DMSWINDOWS -$H+
>
> For Delphi 5,6,7 use -DDelphi
>
> For Delphi 2005,2006,2007,2009,2010+ use the switch -DDELPHI2005+
>
>
>
> You can download ParallelSort library from:
>
> http://pages.videotron.com/aminer/
>
>
> Thank you,
> Amine Moulay Ramdane.
>
>
>
>
>

Back to comp.programming.threads | Previous | Next — Previous in thread | Find similar

Thread

Parallel Sort Library was updated to version 3.02 "aminer" <aminer@toto.com> - 2012-10-31 21:29 -0500
  Re: Parallel Sort Library was updated to version 3.02 "aminer" <aminer@toto.com> - 2012-10-31 21:59 -0500

csiph-web