Re: Ideas for a portable Forth

From "Rod Pemberton" <do_not_have@noavailemail.cmm>
Newsgroups comp.lang.forth
Subject Re: Ideas for a portable Forth
Date 2012-01-19 17:33 -0500
Organization Aioe.org NNTP Server
Message-ID <jfa5rh$j9q$1@speranza.aioe.org>
References (2 earlier) <59885a2e-666d-45f3-b872-d9b5e5e1e0e0@m4g2000pbc.googlegroups.com> <jf6om9$uq5$1@speranza.aioe.org> <261b0427-52d2-43bf-9ad2-9f64b2f23031@s18g2000vby.googlegroups.com> <jf8mao$us3$1@speranza.aioe.org> <9af39726-2641-4674-98e3-f05bff830204@m4g2000vbc.googlegroups.com>


"Alex McDonald" <blog@rivadpm.com> wrote in message
news:9af39726-2641-4674-98e3-f05bff830204@m4g2000vbc.googlegroups.com...
> On Jan 19, 8:06 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > "Alex McDonald" <b...@rivadpm.com> wrote in message
> > news:261b0427-52d2-43bf-9ad2-9f64b2f23031@s18g2000vby.googlegroups.com...
> > > On Jan 18, 2:34 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> > > wrote:
...

> > > > XCHG should be avoided since it's slow.
>
> > > XCHG reg, reg is not expensive; you mean XCHG reg, mem which
> > > is expensive due to the implied LOCK.
>
> > I meant XCHG reg, reg is slow because it's non-pairable (or was) ...
> > He should be able to find and use other instructions which are faster
> > in combination.
>
> On PII and above XCHG is pairable.

Is this a test?

From "Opt. Intel 32-Bit Proc." 1995 for 386, 486, Pentium, and P6, XCHG is
"NP" (non-pairable).

From "Intel Arch. Opt. Man." 1997 for Pentium and P6, XCHG is "NP"
(non-pairable).

From "Intel Arch. Opt. Ref. Man." 1998 for PII and PIII, XCHG's "# of uops"
is 3 for reg,reg or "complex" for memory.  It takes more uops than other
instructions even for reg, reg ... i.e., probably non-pairable, but then
again maybe not.  I'd like to assume "complex" means non-pairable ...

From "IA-32 Intel arch. Opt." 2003 for P4 and PM, "minimize the use of xchg
instructions on memory locations."  XCHG's latency is 1.5.  That's probably
for register operands ...  I.e., XCHG is non-pairable.  The pairable
instructions have a latency of 0.5.

From "Intel 64 & IA-32 Opt. Ref. Man." 2006 for Netburst, PM, Intel Core,
"minimize the use of xchg instructions on memory locations."  XCHG's latency
is 1.5 or 1, depending on the CPU.  I.e., XCHG is non-pairable.  The pairable
instructions have a latency of 1 or 0.5, again depending on the CPU.

From "AMD Soft. Opt. Guide AMD64 Processors" 2005, "VectorPath instructions
block the decoding of DirectPath instructions."  I think that means
VectorPath instructions are non-pairable ...  XCHG is a VectorPath
instruction, i.e., non-pairable.  There is one exception for an encoding of
XCHG: the NOP variant, which has special hardware support.  XCHG's latency is
2 for register operands, or 16 if there is a memory operand.

From "AMD Soft. Opt. Guide 10h Family" 2007 for Opteron and Phenom,
"VectorPath instructions block the decoding of DirectPath instructions."
XCHG is either VectorPath for 8-bit operands or DirectPath Double for other
operands.  XCHG's latency is 1 or 2 for register encodings, 15 or 16 for
memory encodings.

So, maybe XCHG is pairable on PII and PIII, or maybe it isn't.  On Opteron
and Phenom, XCHG is what I'd like to assume is "partially pairable"
(DirectPath Double, except for 8-bit operands).

> > > Just use EBP as the data stack, and manipulate it directly instead
> > > of XCHGing and using POP/PUSH. The instructions are longer
> > > than using POP/PUSH/XCHG etc, but the speed of stack access
> > > isn't compromised.
>
> > Both ESP and EBP use SS by default for all 32-bit address forms. If
> > you're using ESP on SS and EBP on SS, then there is a chance that
> > both stacks could collide, depending on the space available and
> > consumed and where EBP and ESP initially point.
>
> I can't see why you mention it, since that's equally true in an
> XCHG EBP, ESP scheme.
>

The speed of stack access *is* compromised if one uses a segment override.
Yes, you're correct that this can apply to the "XCHG EBP,ESP scheme" too.
However, in an "XCHG EBP,ESP scheme", one is not likely to use overrides.  If
one did, then every PUSH and POP would need an override (and I assume there
are many).  So, one would more likely ensure the stacks are separate for an
"XCHG EBP,ESP scheme", and so avoid overrides in order to use PUSH and POP
easily.

> 32bit Windows, Linux, BSD and so on don't use
> segment registers in userland; [...]

Ok, a very polite "wtf" are you talking about?!?

If they don't use segment registers in userland for 32-bits, then they can't
address anything!

*ALL* protected-mode addressing on x86 is formed from a base address plus an
offset.  The offset is what the programmer generally considers to be the
code or data address.  It's relative to the base address for that selector.
Simplified, each segment register holds a selector, which points to a
descriptor in a table, and that descriptor contains the base address.  For
*ANY* "userland" code to execute, it must have at least two selectors: code
and data.  I.e., at least two segment registers are used in "userland", CS
and DS, and most probably SS too, since you don't want the stack overwriting
code or data ...
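
To make the selector -> descriptor -> base + offset chain concrete, here is a
minimal sketch in Python.  It's a simplified model, not how any particular OS
sets things up: paging, the LDT, expand-down segments and privilege checks are
ignored, and the table contents and selector values are made up for
illustration.

```python
# Simplified model of 32-bit protected-mode address formation.
# GDT contents and selector values are hypothetical, for illustration only.

def decode_selector(selector):
    """Split a 16-bit selector into (table index, table indicator, RPL)."""
    index = selector >> 3       # bits 15..3: index into the descriptor table
    ti = (selector >> 2) & 1    # bit 2: 0 = GDT, 1 = LDT
    rpl = selector & 3          # bits 1..0: requested privilege level
    return index, ti, rpl

# Hypothetical GDT: index -> (base, limit in bytes, kind)
GDT = {
    1: (0x00000000, 0xFFFFFFFF, "code"),
    2: (0x00000000, 0xFFFFFFFF, "data"),
    3: (0x00400000, 0x0000FFFF, "data"),  # a separate 64 KiB stack segment
}

def linear_address(selector, offset):
    """Return base + offset for the segment named by the selector."""
    index, ti, _rpl = decode_selector(selector)
    if ti != 0:
        raise ValueError("only the GDT is modelled in this sketch")
    base, limit, _kind = GDT[index]
    if offset > limit:
        # Real hardware would raise a fault here instead.
        raise ValueError("offset beyond segment limit")
    return (base + offset) & 0xFFFFFFFF

# Same offset, different selectors, different linear addresses:
ds_addr = linear_address(0x13, 0x1000)  # index 2, flat data segment
ss_addr = linear_address(0x1B, 0x1000)  # index 3, separate stack segment
```

With the made-up table above, the same offset 0x1000 lands at a different
linear address depending on which selector is used, and an offset past the
limit of the small stack segment is rejected.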

> [...] apart from specialised OS uses of FS and GS, the segment
> registers CS, SS, DS and ES are all equal.

No, they aren't.  For 32-bit PM, CS cannot be equal to the others.  It's the
selector of a code segment.  CS's descriptor must be set up with different
information.  All the others are selectors for data segments and can be the
same.  They don't have to be, though.  They can have a whole host of
differences: segment start, segment size, size of data (granularity),
rights, etc.  64-bit x86 makes some simplifications so that essentially only
FS and GS are used.  I'm not up to date on the 64-bit changes.
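
As an illustration of how a code descriptor and a data descriptor can differ,
here is a rough Python sketch that unpacks an 8-byte segment descriptor.  The
two raw values are the flat ring-0 descriptors commonly seen in tutorials; they
are assumptions for the example, not taken from any specific OS, and the
function name is made up.

```python
# Sketch: unpack selected fields of a 64-bit x86 segment descriptor.
# FLAT_CODE / FLAT_DATA are commonly seen tutorial values, used here
# purely as illustrative inputs.

FLAT_CODE = 0x00CF9A000000FFFF
FLAT_DATA = 0x00CF92000000FFFF

def decode_descriptor(raw):
    """Return base, limit and a few attribute bits of a descriptor."""
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)            # 20-bit limit
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)  # 32-bit base
    access = (raw >> 40) & 0xFF   # present, DPL, S, type
    flags = (raw >> 52) & 0xF     # G, D/B, L, AVL
    if flags & 0x8:               # G set: limit counts 4 KiB pages
        limit = (limit << 12) | 0xFFF
    return {
        "base": base,
        "limit": limit,
        "executable": bool(access & 0x08),  # type bit 3: code vs. data
        "dpl": (access >> 5) & 3,           # descriptor privilege level
        "default_32bit": bool(flags & 0x4), # D/B bit
    }

code = decode_descriptor(FLAT_CODE)
data = decode_descriptor(FLAT_DATA)
```

For these two inputs the base and limit come out identical (0 and 0xFFFFFFFF),
while the "executable" bit differs: the descriptors cover the same range of
addresses but are not the same descriptor.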


Rod Pemberton

