Re: Fastest method to copy/process a range of bytes?

From "Anton Treuenfels" <teamtempest@yahoo.com>
Newsgroups comp.sys.apple2.programmer
References <jvs69d$qa1$1@news.xmission.com> <mdGdnWkVEqOuvb7NnZ2dnUVZ_vednZ2d@earthlink.com> <mmphosis-1344492595@macgui.com>
Subject Re: Fastest method to copy/process a range of bytes?
Date 2012-08-09 18:54 -0500
Message-ID <B8ydnWQSV9RG0LnNnZ2dnUVZ_vSdnZ2d@earthlink.com> (permalink)


"mmphosis" <mmphosis@macgui.com> wrote in message 
news:mmphosis-1344492595@macgui.com...
> save a few more cycles by stuffing the entire routine in the zero page, and
> rather than lda/sta (indirect),y instead make the last copy lda/sta
> absolute,y
>
> ptr = :5+1 ; source ptr is one byte past the lda absolute,y instruction
> ptr_mp = :d+1 ; destination ptr_mp is one byte past the sta absolute,y
> instruction
>   org $0080 ; * = $0080
>
> .
> .
> .
>
> :5   lda ptr,y           ; move odd byte (1, 3, ..., 255)
> :d   sta ptr_mp,y
>
> .
> .
> .
>
> * your assembler directives may vary
> ** and, every assembler seems to do labels differently
> *** I am using xa and ended up having to hand assemble this because xa
> couldn't resolve the ptr indirection to assemble to zero page instructions
>
>

Mmm, yes, well, I'm not a huge fan of self-modifying code in the first place.
In the second, "absolute,y" addressing is faster than "(indirect),y" by one
cycle each for the load and the store, a net gain of two cycles per byte. But
using only one LDA/STA pair in the main loop requires a three-cycle branch for
every byte moved. Using "(indirect),y" makes the main loop much easier to
unroll: to use "absolute,y" in an unrolled loop, the operands of every LDA/STA
pair have to be modified. Unrolling "(indirect),y" is just a matter of adding
LDA/STA/INY to the main loop and making sure it is entered at the right spot
the first time (a consideration that an unrolled "absolute,y" approach has to
account for as well).
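
To illustrate the "entered at the right spot" point, here is a minimal Python model, not 6502 code. The function name `unrolled_copy` and its `unroll` parameter are made up for this sketch. It assumes the leftover bytes are handled by a shortened first pass, which stands in for jumping into the middle of the unrolled body; a real routine would also have to set up Y and the pointers to match.

```python
def unrolled_copy(src, unroll=2):
    """Model of an unrolled copy loop; returns (copy, passes)."""
    n = len(src)
    dst = bytearray(n)
    if n == 0:
        return bytes(dst), 0
    # Entering mid-body: the first pass moves only the leftover bytes,
    # so every later pass is a full one.
    step = n % unroll or unroll
    y = 0
    passes = 0
    while y < n:
        for _ in range(step):      # each iteration models one LDA/STA/INY group
            dst[y] = src[y]
            y += 1
        passes += 1                # one loop-closing branch per pass
        step = unroll
    return bytes(dst), passes

copy, passes = unrolled_copy(bytes(range(7)), unroll=4)
print(copy == bytes(range(7)), passes)   # prints: True 2
```

Seven bytes with four per pass means the first pass moves three and the second moves four, so the loop-closing branch runs twice rather than seven times.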

An "(indirect),y" loop unrolled once (as I showed) takes four cycles longer
than "absolute,y" to load and store two bytes but branches only half as often,
for a net disadvantage of only one cycle per pair of bytes. Unrolling the loop
further, say to move four bytes in the main loop, takes eight cycles longer
than absolute to load and store them but eliminates three branches taking
nine cycles - a net gain of one cycle over absolute addressing for every four
bytes moved. Eight bytes in the main loop is plus 16 on loads and stores,
minus 21 on branches, for a net gain of five cycles. You could go further.
But by this time you really, really have to want to go fast in order to put
up with the code size required.
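
As a rough check on the arithmetic above, the comparison can be written out in a few lines of Python. The constants are the figures used in this post (one extra cycle each on the load and the store, three cycles per taken branch) and assume no page crossings. The name `net_gain` is made up for the sketch.

```python
EXTRA_PER_BYTE = 2   # "(indirect),y" costs 1 more cycle on LDA and 1 more on STA
BRANCH_TAKEN = 3     # cycles for the loop-closing branch when taken

def net_gain(unroll):
    """Cycles saved per pass by an "(indirect),y" loop moving `unroll` bytes,
    compared with a one-pair "absolute,y" loop moving the same bytes."""
    extra_addressing = EXTRA_PER_BYTE * unroll
    branches_saved = BRANCH_TAKEN * (unroll - 1)
    return branches_saved - extra_addressing

for unroll in (2, 4, 8):
    print(unroll, net_gain(unroll))   # prints: 2 -1, then 4 1, then 8 5
```

Under these assumptions the gain works out to `unroll - 3`, so the two approaches break even at three bytes per pass.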

- Anton Treuenfels 
