| From | "Anton Treuenfels" <teamtempest@yahoo.com> |
|---|---|
| Newsgroups | comp.sys.apple2.programmer |
| References | <jvs69d$qa1$1@news.xmission.com> <mdGdnWkVEqOuvb7NnZ2dnUVZ_vednZ2d@earthlink.com> <mmphosis-1344492595@macgui.com> |
| Subject | Re: Fastest method to copy/process a range of bytes? |
| Date | 2012-08-09 18:54 -0500 |
| Message-ID | <B8ydnWQSV9RG0LnNnZ2dnUVZ_vSdnZ2d@earthlink.com> |
"mmphosis" <mmphosis@macgui.com> wrote in message news:mmphosis-1344492595@macgui.com... > save a few more cycles by stuffing the entire routine in the zero page, > and > rather than lda/sta (indirect),y instead make the last copy lda/sta > absolute,y > > ptr = :5+1 ; source ptr is one byte past the lda absolute,y instruction > ptr_mp = :d+1 ; destination ptr_mp is one byte past the lda absolute,y > instruction > org $0080 ; * = $0080 > > . > . > . > > :5 lda ptr,y ; move odd byte (1, 3, ..., 255) > :d sta ptr_mp,y > > . > . > . > > * your assembler directives may vary > ** and, every assembler seems to do labels differently > *** I am using xa and ended up having to hand assemble this because xa > couldn't resolve the ptr indirection to assemble to zero page instructions > > Mmm, yes, well I'm not a huge fan of self-modifying code in the first place. In the second, "absolute,y" addressing is faster by one cycle each in loading and storing over "(indirect),y" a net gain of two cycles. But using only one LDA/STA pair in the main loop requires a three-cycle branch for every byte moved. Using "(indirect),y" makes the main loop much easier to unroll - to use "absolute,y" in an unrolled loop, every LDA/STA pair has to be modified. Unrolling "(indirect),y" is just a matter of adding LDA/STA/INY to the main loop and making it is entered at the right spot the first time (the latter consideration also having to be accounted for by an unrolled "absolute,y" approach as well). An "(indirect),y" loop unrolled once (as I showed) takes four cycles longer to load and store two bytes but branches only half as often, for a net disadvantage of only one cycle per pair of bytes. Unrolling the loop further, say to move four bytes in the main loop, takes eight cycles longer than absolute to load and store them but eliminates three branches taking nine cycles - a net gain of one cycle faster than absolute addressing to move four bytes. 
Eight bytes in the main loop is plus 16 on loads and stores, minus 21 on branches, net gain five cycles. You could go further. But by this time you really really want to have to go fast in order to put up with the code size required. - Anton Treuenfels
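
The cycle accounting above can be checked with a few lines of arithmetic. The sketch below is only a model, not code from the thread. It assumes the standard 6502 timings (LDA absolute,Y 4 cycles, STA absolute,Y 5, LDA (indirect),Y 5, STA (indirect),Y 6, INY 2, taken branch 3), no page-crossing penalties, and that the "absolute,y" loop moves one byte per pass while the "(indirect),y" loop is unrolled to move several. The function names are made up for illustration.

```python
# Assumed 6502 cycle counts (no page crossings, branch taken within a page).
LDA_ABS_Y = 4   # LDA absolute,Y
STA_ABS_Y = 5   # STA absolute,Y
LDA_IND_Y = 5   # LDA (indirect),Y
STA_IND_Y = 6   # STA (indirect),Y
INY = 2
BNE_TAKEN = 3


def absolute_rolled_cycles(n_bytes):
    """Cycles to move n_bytes with a one-byte-per-pass absolute,Y loop.

    Each byte costs LDA/STA/INY plus one taken branch.
    """
    return n_bytes * (LDA_ABS_Y + STA_ABS_Y + INY + BNE_TAKEN)


def indirect_unrolled_cycles(n_bytes):
    """Cycles for one pass of an (indirect),Y loop unrolled to n_bytes.

    Each byte costs LDA/STA/INY; the whole pass pays for a single branch.
    """
    return n_bytes * (LDA_IND_Y + STA_IND_Y + INY) + BNE_TAKEN


def net_gain(n_bytes):
    """Cycles saved by the unrolled (indirect),Y loop (negative means slower)."""
    return absolute_rolled_cycles(n_bytes) - indirect_unrolled_cycles(n_bytes)


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(f"{n} bytes per pass: net gain {net_gain(n)} cycles")
    # 2 bytes per pass: net gain -1 cycles
    # 4 bytes per pass: net gain 1 cycles
    # 8 bytes per pass: net gain 5 cycles
```

Under these assumptions the gain works out to n - 3 cycles per pass of n bytes, which matches the figures in the post: break-even sits between two and four bytes per pass, and each additional unrolled byte buys one more cycle.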