Groups | Search | Server Info | Login | Register


Groups > comp.lang.pascal.borland > #201

Re: Speeding up code - am I missing something?

From Robert Prins <robert@nospicedham.prino.org>
Newsgroups comp.lang.pl1, comp.lang.asm.x86, comp.lang.pascal.borland, comp.lang.pascal.misc
Subject Re: Speeding up code - am I missing something?
Date 2018-08-16 19:51 +0000
Organization A noiseless patient Spider
Message-ID <pl4a51$pit$1@dont-email.me> (permalink)
References <pl1s0v$r0c$1@dont-email.me> <378956968.556062427.636213.peter_flass-yahoo.com@news.eternal-september.org>

Cross-posted to 4 groups.

Show all headers | View raw


On 2018-08-15 21:52, Peter Flass wrote:
> Robert Prins <robert@nospicedham.prino.org> wrote:
>> Hi all,
>>
>> Not really specific x86, Virtual Pascal, or PL/I, but as I've got the same code
>> written in all three languages, I'll give it a try here:
>>
>> I have an array (actually three, but that's not very relevant for the issue) of
>> pointers to items in a linked list. The array is sorted into a specific order,
>> and after the sort, I use it to print Top-N tables. For the main Top-50 table (x
>> 3) that's easy enough, the first 50 items are selected.
>>
>> However, at this time the linked list also contains 5 sub-lists, each containing
>> additional sub-lists for a total of 390 (yes, three-hunderd-and-ninety)
>> sub-lists for which I want to print Top-10 tables (or if a sub-list contains
>> just 4 items, a Top-4 table). Because I don't really want to sort the main array
>> six (x 3) times, I just scan the array from the top, and pick out the items for
>> each sub-list, until I have 10 (or the 4 mentioned before) items. To speed up
>> the process a little, I make an initial pass over the array to find the first
>> item for each sub-list, but that doesn't really improve things a great deal.
>>
>> So given that "this" is the list, sorted on its main key, with the second and
>> third columns identifying additional sub-lists,
>>
>> 100, 'a', '1'
>> 099, 'b', '9'
>> 098, 'c', '1'
>> 097, 'b', '8'
>> 096, 'd', '3'
>> 095, 'a', '2'
>> 094, 'c', '8'
>> 093, 'c', '1'
>> 092, 'a', '3'
>> 091, 'f', '4'
>> 090, 'a', '2'
>>
>> the "Top-50" would contain the above.
>>
>> For sub-list 'a', I would scan the array, and the resulting "Top-10" would be
>>
>> 100, 'a', '1'
>> 095, 'a', '2'
>> 092, 'a', '3'
>> 090, 'a', '2'
>>
>> Likewise, for sub-list 'b', I would scan the array, and the resulting "Top-10"
>> would be, and the initial fill-the-cache scan would have set the starting index
>> for the 'b' sub-list to the cell containing 099, saving me one superfluous compare.
>>
>> 099, 'b', '9'
>> 097, 'b', '8'
>>
>> etc, and the same would happen for the sub-lists defined by column 3, the first
>> being
>>
>> 100, 'a', '1'
>> 098, 'c', '1'
>> 093, 'c', '1'
>>
>> As I wrote, I make an initial pass over the data to cache the first item for
>> each sub-list, some of them consist of a single item and are nearly at the end
>> of the sorted full array, and this saves some time, the count of executed
>> statements is reduced by around 10%, but if anyone can think of a way to speed
>> up things a bit more, without having to sort the array(s) five more times (in an
>> additional 10,000,000 executed statements, see below), please share it with me.
>>
>> Some numbers:
>>
>> - the caching code executes around 120,000 PL/I statements
>> - sorting the three arrays executes around in 2,000,000 PL/I statements (using a
>> Heapsort)
>> - printing the 390 top-n tables executes around 7,600,000 PL/I statements, and
>> the "IF" statement that selects the data to be printed has a hit-percentage of
>> just 0.5%:
>>
>> IF STY_TCN = 'Tr' & STY = LIST.TR | /* 208 type 1 */ 1,841,754 executed IF's
>>     STY_TCN = 'Na' & TCN = LIST.NA | /*  93 type 2 */
>>     STY_TCN = 'Yr' & STY = LIST.YR | /*  37 type 3 */
>>     STY_TCN = 'Co' & TCN = LIST.CO | /*  33 type 4 */
>>     STY_TCN = 'Ty' & TCN = LIST.TY | /*  19 type 5 */
>>     STY_TCN = '**' THEN              /*   1 main   */
>>    DO;
>>      #T = #T + 1;                                         9,594 - the actual
>> number of Top-N lines printed
>>
>> Robert
> 
> Assuming mainframe, the rule-of-thumb I learned was "let the sort do the
> work" since DFSORT is always more efficient than any user code. I can't
> visualize what you're doing from your description, but if possible I'd make
> one pass of the data and build records that can be sorted into the desired
> sequence by DFSORT. Then read the sorted records and print the listings. I
> would use PLISRTD to build and retrieve the records, but some would
> probably prefer two separate programs with the sort in between invoked by
> JCL, or build the records and ATTACH the sort, or...

The program processes my hitchhike data (yes, really), and I have three 
versions, one written in PL/I, which runs on z/OS, one written in Virtual 
Pascal, running on Windows, and one where most of the Pascal has been converted 
to in-line assembler. All three should produce the same output, and when there 
are differences (and there were some recently), I know that I'm doing something 
wrong.

The full set of hitchhike data is kept in a single linked list, but every item 
in the list contains a load of extra pointers to form additional lists,

1) per trip,
2) per type of driver,
3) per country,
4) per nationality of driver, and
5) per year.

The very first (Turbo) Pascal (V3.01a) version of the program produced just a 
few tables (see 
<https://prino.neocities.org/miscellaneous/keeping-statistics.html> for a longer 
description), but over the years, most as a result of in-ride conversations with 
my drivers "Did you ever consider determining ...?", I've added a "few" more tables.

This post refers to a relatively new set of tables, the Top-n ones. The program 
produces a Top-50 table of the longest rides (actually three Top-50 tables, for 
longest rides (km), longest rides (time), and fastest rides (km/h)) for the 
whole set, and those three tables, but then limited to Top-10 (or Top-<10) are 
then repeated for every trip (208 at the moment), type of driver (19), country 
(1+33), nationality of the driver (93) and year (37) for a total of 390 tables.

Sorting the array of pointers to the items of the list in distance order gives 
me all data in that order, and to obtain the top-10 fort a specific sub-list, I 
just skip the unwanted data, and which saves me, for example for case 1) above, 
three additional sorts (distance/time/velocity) where the trip-number is the 
primary key (and ditto for all other sub-lists), at the, as is obvious from the 
data presented, expense of rather a hell of a lot of extra compares.

And yes, on z/OS I could use the PLISRTx interface to DFSORT, and
using Virtual Pascal I could, with rather a lot more hassle, spawn M$' sort.exe, 
but that would mean writing out the set of data five times in pure text format, 
as SORT.EXE still cannot sort on anything other than a single starting column. 
Hell, even using PLISRTD requires me to write out the data over and over again...

So, unless someone comes up with some kind of brilliant suggestion (like Paul 
Green way back in 1998, when I asked how to speed-up another procedure(*)), I'm 
probably stuck with what I have now, whether I like it or not. (I don't, but 
c'est la vie...)

And lest you say, "Why the flipping 'ell are we discussing a program that deals 
with data relating to hitchhiking?"

My answer to that question?

The very first time (October 2006) I compiled an older it with Enterprise PL/I 
(V3.5.0) the compiler abended, not being able to handle the INIT of a based 
structure with a power ("INIT ((dont-remember-what**2))"), and it's directly 
responsible for

- https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=31632
- https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=84512
- https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=100073
- https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=115540

'nuff said?

Robert
-- 
PS:

(*) Executed PL/I statement counts:

- pre  Paul Green: NAT_SCAN 5,407,617,546
- post Paul Green: NAT_SCAN     1,237,239

Old 2.3.0 compiler actually requires compilation with (NOFOFL): prefix to avoid 
a decimal overflow exception abend in the PL/I library code.
-- 
Robert AH Prins
robert(a)prino(d)org

Back to comp.lang.pascal.borland | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

Speeding up code - am I missing something? Robert Prins <robert@nospicedham.prino.org> - 2018-08-15 21:37 +0000
  Re: Speeding up code - am I missing something? Peter Flass <peter_flass@nospicedham.yahoo.com> - 2018-08-15 17:52 -0400
    Re: Speeding up code - am I missing something? Robert Prins <robert@nospicedham.prino.org> - 2018-08-16 19:51 +0000
  Re: Speeding up code - am I missing something? Robert Prins <robert@nospicedham.prino.org> - 2018-08-16 20:38 +0000
    Re: Speeding up code - am I missing something? Robert Prins <robert@nospicedham.prino.org> - 2018-08-17 19:20 +0000
      Re: Speeding up code - am I missing something? Robert Prins <robert@nospicedham.prino.org> - 2018-08-23 17:21 +0000
        Re: Speeding up code - am I missing something? Peter Flass <peter_flass@nospicedham.yahoo.com> - 2018-08-23 11:47 -0400
        Re: Speeding up code - am I missing something? rugxulo@gmail.com - 2018-08-31 09:47 -0700
          Re: Speeding up code - am I missing something? Robert Prins <robert@prino.org> - 2018-09-01 10:08 +0000
            Re: Speeding up code - am I missing something? Marco van de Voort <marcov@toad.stack.nl> - 2018-09-01 11:20 +0000
            Re: Speeding up code - am I missing something? rugxulo@gmail.com - 2018-09-03 17:32 -0700

csiph-web