Groups | Search | Server Info | Keyboard shortcuts | Login | Register
Groups > comp.programming.threads > #1212
| From | "aminer" <aminer@toto.com> |
|---|---|
| Newsgroups | comp.programming.threads, comp.programming |
| Subject | Re: Parallel Sort Library was updated to version 3.02 |
| Date | 2012-10-31 21:59 -0500 |
| Organization | A noiseless patient Spider |
| Message-ID | <k6skvs$tgj$2@dont-email.me> (permalink) |
| References | <k6sj8u$lp0$2@dont-email.me> |
Cross-posted to 2 groups.
I wrote: >I have done some tests with my ParallelSort library and >i have noticed that it can give up to 3.8x scalability with >strings, and it give 3x scalability with integers.. The tests was done on a quad cores system. Amine Moulay Ramdane. "aminer" <aminer@toto.com> wrote in message news:k6sj8u$lp0$2@dont-email.me... > > Hello, > > Parallel Sort Library ws updated to version 3.02 > > > Description: > > Parallel Sort Library that supports Parallel Quicksort, Parallel HeapSort > and Parallel MergeSort on Multicores systems. > > Parallel Sort Library uses my Thread Pool Engine and sort many array > parts - of your array - in parallel using Quicksort or HeapSort or > MergeSort and after that it finally merge them - with the merge() > procedure - > > In the previous parallelsort version i have parallelized only the sort > part, but in this new parallelsort version i have parallelized also the > merge procedure part and it gives better performance. > > I have implemented a Parallel hybrid divide-and-conquer merge algorithm > that performs 0.9-5.8 times better than sequential merge, on a quad-core > processor, with larger arrays outperforming by over 5 times. Parallel > processing combined with a hybrid algorithm approach provides a powerful > high performance result. > The best case complexity of ParallelSort using mergesort is: > > ((n/p)* log(n/p)) + O(n/p) > > p: is the number of cores > > the ((n/p)* log(n/p)) is the complexity of the sorting part. > > O(n/p) is the best case complexity of the merging part. > > so the best case complexity is: ((n/p)* log(n/p)) > > The worst case complexity of parallel sort library using mergesort is: > > ((n/p)* log(n/p)) + O(n) > > the ((n/p)* log(n/p)) is the complexity of the sorting part. > > O(n) is the worst case complexity of the merging part. > > so the worst case complexity of parallelsort using mergesort is > approximatly: O(n) > > I have done some tests with my ParallelSort library and i have noticed > that it can give up to 3.8x scalability with strings, and it give 3x > scalability with integers. > So, why it scale to 3.8x with strings and only 3x with integers ? > > > I explain: > > In the SequentialMerge() method and QSort() method inside Parallel Sort > library, i am calling the Scompare() method and also in both of them i am > copying to the memory system. > > So when i am using strings the SCompare() method is more expensive, so the > parallel part p in the Amdahl equation 1/ S + P/N (S: the serial part, P: > parallel part and N: the number of cores) is bigger than with integers so > the Amdahl equation will scale better, but when we are using integers the > SCompare() method is less expensive than the SCompare() with strings, so > the parallel part p in the Amdahl equation is less bigger than with > strings. so this is why parallel sorting with strings scales better than > with integers. > > I have implemented mergsort and quicksort, but as you know the complexity > of mergesort in the worst case is better than quicksort , and the > mergesort that i have implemented is faster than quicksort, but mergesort > takes more space.. > > The reccurence equation of the complexity of mergesort is: > > T(n) = 2 * T(n/2) + n > > cause it take O(n) for the merging part. > It gives: > > T(n) = 2 * T(n/2) + n > = 2 (2T(n/4) +n/2) + n > =4T(n/4)+n+n > =4T(n/4)+2*n > =4 (2T(n/8) +n/4) + 2*n > =8T(n/8)+n+2n > =8T(n/8)+3*n > =2k T(n/2^k) + k*n > > We want: > > n/2k = 1 > n = 2k > log n = k > > so the reccurence equation gives: > > = nT(1) +n*log(n) > = n+ (n * log(n)) > > So the mergesort complexity in the best case and in the worst case is: > n * log(n) > > But the complexity of the quicksort in the worst case is: > > T(n)= n + T(n-1) > > it gives: > > T(n) = n + (n-1) + T(n-2) > T(n) = n + (n-1) + (n-2)+ T(n-3) > T(n) = 1 + 2+ 3+.+N > .T(n) = O(n^2) > > And Quicksort in best case performance is: > > T(n) = T(n/2) + T(n/2) + O(n) > = 2T(n/2) + (n) > > this gives: T(n) = (n lg n) > > Parallelizing the Sorts > > One way to parallelize the sorts is: > > - Divide the data among the processors > - Sort the data on the individual processors. > - Merge the various data > > Note that the merge operation is a reduction operation ! > > I have done some scalability tests on my parallelsort library and i have > come to the conclusion that parallel heapsort is better on scalability > than parallel quicksort cause the P part (of the Amdahl equation) is > bigger in parallel heapsort > than in parallel quicksort, the parallel heapsort is doing more on the > parallel part, it's why it scales better than parallel quicksort, but > parallel quicksort is still faster than parallel heapsort on my tests on a > quad core processor. > And please look at test.pas a parallel quicksort on many cores - compile > and execute it... > > Language: FPC Pascal v2.2.0+ / Delphi 7+: http://www.freepascal.org/ > > Operating Systems: Win , Linux and Mac (x86). > > Required FPC switches: -O3 -Sd -dFPC -dWin32 -dFreePascal > > -Sd for delphi mode.... > > Required Delphi switches: -DMSWINDOWS -$H+ > > For Delphi 5,6,7 use -DDelphi > > For Delphi 2005,2006,2007,2009,2010+ use the switch -DDELPHI2005+ > > > > You can download ParallelSort library from: > > http://pages.videotron.com/aminer/ > > > Thank you, > Amine Moulay Ramdane. > > > > >
Back to comp.programming.threads | Previous | Next — Previous in thread | Find similar
Parallel Sort Library was updated to version 3.02 "aminer" <aminer@toto.com> - 2012-10-31 21:29 -0500 Re: Parallel Sort Library was updated to version 3.02 "aminer" <aminer@toto.com> - 2012-10-31 21:59 -0500
csiph-web