| From | Egan Ford <datajerk@gmail.com> |
|---|---|
| Newsgroups | comp.sys.apple2.programmer |
| Subject | Re: Fastest method to copy/process a range of bytes? |
| Date | 2012-08-11 15:58 -0600 |
| Organization | XMission http://xmission.com/ |
| Message-ID | <k06khd$p8t$1@news.xmission.com> |
| References | <jvs69d$qa1$1@news.xmission.com> |
On 8/7/12 4:53 PM, Egan Ford wrote:
> This is the fastest I could come up with. It aligns with what I have
> read online as well as in books. The draw back is that I have to use x
> and y, it's long, and the (copy in this case) code has to be declared
> twice.
>
> Any suggestions or tricks on doing this faster?
Gents,
Thank you all for the pointers. My consolidated replies are below.
On 8/7/12 7:57 PM, Antoine Vignau wrote:
> - use absolute addressing
My problem with absolute addressing is that it is, well, absolute. I
probably should have stated more clearly what my program does and what
my goals are.
I am writing multi-precision arithmetic code, and the sizes of my arrays
are not known until run time. Pointers (indirect addressing) seem more
natural for that. To use absolute addressing I would need
self-modifying code. I am not strictly opposed to that (I do use it for
fast absolute table lookups), but one objective is to illustrate
conventional practices while still optimizing for speed.
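To make the trade-off concrete, here is a minimal sketch of the two styles side by side. It is not code from my program: src, dst and count are hypothetical names (src and dst are zero-page pointers, count is a byte count of 1..255), and the syntax is ca65.

```asm
; (1) Indirect indexed: the pointers are set at run time, no patching.
        ldy #0
copy1:  lda (src),y     ; 5 cycles (+1 if indexing crosses a page)
        sta (dst),y     ; 6 cycles
        iny
        cpy count       ; count is assumed to be 1..255
        bne copy1

; (2) Self-modifying absolute: patch the operands first, then run.
;     This code must live in RAM for the patching to work.
        lda src
        sta load+1      ; operand low byte of the lda below
        lda src+1
        sta load+2      ; operand high byte
        lda dst
        sta store+1
        lda dst+1
        sta store+2
        ldx #0
load:   lda $FFFF,x     ; 4 cycles (+1 if indexing crosses a page); $FFFF is a placeholder
store:  sta $FFFF,x     ; 5 cycles
        inx
        cpx count
        bne load
```

The absolute version saves about two cycles per byte, but it pays for the patching up front and is harder to follow, which is why I would rather keep the pointer version for illustration.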
On 8/8/12 11:49 AM, Michael J. Mahon wrote:
> (I've changed the nomenclature: "page" has the usual meaning
> and I use "block" to refer to the entire memory range being
> copied.)
For the readability of my comments I have changed all instances of block
to page. I used the term block because it is how I visualized the
memory. Thanks for the tip.
On 8/8/12 11:49 AM, Michael J. Mahon wrote:
> This modification copies the final partial page in a downward
> direction, which could be an issue if the source and destination
> blocks overlap.
It is also an issue with my mp math code. I have to process the bytes
in order (LSB to MSB for add, sub, asl, mult; reverse for div). I
should not have used copy as the speed-up example, since copy is free of
the ordering restrictions that my other processing code has.
However, I will try to leverage inc ptr+1 to free up x or y. The
problem with my mult and div code is that I have another loop that needs
to run very fast inside the x/y loop. Right now I have to back up x/y or
try to find a way to merge the loops.
On 8/8/12 9:28 AM, Daniel Kruszyna wrote:
> Another idea is to unroll the first inner loop (lda sta iny).
This simple idea just shaved off 0.5 sec (out of 6.5). I call copy a
lot (931 instances). I unrolled 4x.
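For anyone following along, a 4x unrolled full-page loop looks roughly like the sketch below. This is an illustration rather than my exact routine: src and dst are hypothetical zero-page pointers, and x is assumed to hold a nonzero count of full pages on entry (ca65 syntax).

```asm
        ldy #0
loop:   lda (src),y
        sta (dst),y
        iny
        lda (src),y
        sta (dst),y
        iny
        lda (src),y
        sta (dst),y
        iny
        lda (src),y
        sta (dst),y
        iny
        bne loop        ; 256 is a multiple of 4, so y wraps to 0 on a group boundary
        inc src+1       ; next page
        inc dst+1
        dex
        bne loop
```

Ignoring page-crossing penalties, the rolled loop costs about 16 cycles per byte (5 + 6 + 2 + 3), while this version costs about 13.75, because three of every four taken branches disappear.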
On 8/8/12 8:27 PM, Anton Treuenfels wrote:
> Here's an approach that uses pointer adjustments so as to have only one
> main loop. Also that loop is unrolled a bit. Setup takes around 50
> cycles worst case but the last two high byte pointer increments are
> avoided, saving 10 cycles. So net 40 cycles. The unrolled loop saves
> three cycles each time through ("bne" not executed). So a net gain if
> moving 14 or more bytes in a partial page. At least I think that's right.
Anton, this is brilliant. My old (forward-processing) code unrolled 4x
and your code unrolled only 2x perform almost the same. I'll be
experimenting with this further. I also have to see if I can do this
in reverse as well. Here is the current version of my reverse (n .. 0)
array-processing code:
add_mp:
        sta ptr         ; store ptr lo from A
        tya
        clc
        adc arrayend+1  ; add number of pages since we have to
        sta ptr+1       ; go backwards for add/sub/asl
        lda ptr_mp+1
        clc
        adc arrayend+1  ; add number of pages since we have to
        sta ptr_mp+1    ; go backwards for add/sub/asl
        ldx arrayend+1  ; full pages
        ldy arrayend    ; partial
        clc
        bcc :+++
:       dex
        dec ptr+1       ; previous page of 256
        dec ptr_mp+1    ; previous page of 256
:       dey
:       lda (ptr),y
        adc (ptr_mp),y
        sta (ptr),y
        tya
        bne :--
        txa
        bne :---
        rts
The above code is just a bit slower than two loops (one for full pages,
one for the partial page), but it is shorter, and since I do not call
add, sub, etc. as often as div, any further optimization here would
gain little.
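For reference, a call to add_mp would look something like the sketch below. The labels num1 and num2 are hypothetical, and the conventions are read off the routine itself: A/Y carry the low/high byte of the destination, ptr_mp is set by the caller, and arrayend is assumed to hold the index of the last byte (low byte = offset in the partial page, high byte = number of full pages). Since the loop walks from the highest index down and add must go LSB to MSB, this assumes the least significant byte sits at the highest address.

```asm
        lda #<num2
        sta ptr_mp      ; second operand, low byte of address
        lda #>num2
        sta ptr_mp+1    ; second operand, high byte of address
        lda #<num1      ; A = destination address lo
        ldy #>num1      ; Y = destination address hi
        jsr add_mp      ; num1 = num1 + num2
```

Nothing between the last adc and the rts touches the carry flag, so the carry out of the most significant byte should still be available to the caller on return.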
Thanks again.