Re: Register file splits, a new look.

From MitchAlsup <MitchAlsup@aol.com>
Newsgroups comp.arch
Subject Re: Register file splits, a new look.
Date 2012-02-09 15:52 -0800
Organization http://groups.google.com
Message-ID <18962822.1178.1328831552908.JavaMail.geo-discussion-forums@yqks7>
References <ggtgp-1DA77E.06332706022012@netnews.mchsi.com> <7z1uq4qpf4.fsf@ask.diku.dk> <21254419.964.1328818372560.JavaMail.geo-discussion-forums@yqad38> <d89e4a91-57c5-4ffb-9350-ca0f22af4991@x19g2000yqh.googlegroups.com>


On Thursday, February 9, 2012 5:27:20 PM UTC-6, Paul A. Clayton wrote:
> On Feb 9, 3:12 pm, MitchAlsup <MitchAl...@aol.com> wrote:
> > On Thursday, February 9, 2012 4:19:59 AM UTC-6, Torben Ægidius Mogensen wrote:
> > > A disadvantage of a single unified register file is that you need more
> > > read/write ports to support the same total number of reads and writes
> > > per cycle.
> >
> > Important side note: One can increase the number of read ports
> > rather easily by replication. It is much more difficult to
> > increase the number of write ports.
> 
> OTOH, writes are more tolerant of buffering--which can not only
> average overall demand over time but also average bank usage
> over time<snip>

OTOOH, the scheduler is probably already involved by the time the write data is placed on the result bus, so while the register file might be tolerant, the input demands of the now-launching dependent instruction are definitely not.

That is, you can buffer the writes, but you cannot buffer the forwarding.
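
A minimal sketch of that distinction, assuming a toy model that is not from the thread: the class BufferedRegFile, its single write port, and the method names are all hypothetical. Results queue up and drain into the array at the write-port rate, while a dependent instruction has to see the newest value immediately through the forwarding path.

```python
from collections import deque


class BufferedRegFile:
    """Toy register file with a write buffer (illustrative assumption only)."""

    def __init__(self, nregs, write_ports=1):
        self.regs = [0] * nregs
        self.pending = deque()          # results waiting for a write port
        self.write_ports = write_ports

    def result(self, reg, value):
        # A function unit places a result on the result bus.
        self.pending.append((reg, value))

    def tick(self):
        # Drain at most `write_ports` buffered writes into the array per cycle.
        for _ in range(min(self.write_ports, len(self.pending))):
            reg, value = self.pending.popleft()
            self.regs[reg] = value

    def read_array(self, reg):
        # What the array alone would deliver; may be stale.
        return self.regs[reg]

    def read_forwarded(self, reg):
        # What a dependent instruction needs: the youngest pending value.
        for r, v in reversed(self.pending):
            if r == reg:
                return v
        return self.regs[reg]
```

With three results queued behind one write port, the array still holds the old value of the last register for several cycles, yet the forwarded read returns the new value at once. The array write can be late; the forwarded operand cannot.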
 

> Another factor might be that the front-end map might be
> read for write register names 

If Store instructions use the destination register bit position in their encoding, the register containing the value to be stored must be read through its MAP. {Most instruction sets do this.} Thus, the destination register might need a read operation through its MAP, unless you can determine way ahead of time that the instruction is not a Store.

Instruction sets that require write insertion (like x86) require either that the destinations be read or that all uses of the destination register be serialized.
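
A rough sketch of the MAP port pressure described above, assuming a toy renamer: RenameMap, rename(), the "store" opcode string, and the partial flag are hypothetical names for illustration, not any real design. A Store reads the MAP through its destination field instead of allocating, and a write-insertion (partial write) reads the old mapping before allocating a new one.

```python
class RenameMap:
    """Toy architectural-to-physical register MAP (illustrative assumption)."""

    def __init__(self, num_arch, num_phys):
        self.table = list(range(num_arch))          # arch reg -> phys reg
        self.free = list(range(num_arch, num_phys))
        self.reads = 0                              # MAP read-port uses

    def read(self, arch):
        self.reads += 1
        return self.table[arch]

    def allocate(self, arch):
        phys = self.free.pop(0)
        self.table[arch] = phys
        return phys


def rename(rmap, op, rd, rs1, rs2, partial=False):
    """Return (dest_phys or None, list of source phys regs)."""
    srcs = [rmap.read(rs1), rmap.read(rs2)]
    if op == "store":
        # The rd field names the data to be stored: a MAP read, no allocation.
        srcs.append(rmap.read(rd))
        return None, srcs
    if partial:
        # Write insertion: the old value of rd is merged, so read its mapping.
        srcs.append(rmap.read(rd))
    return rmap.allocate(rd), srcs
```

In this model an ordinary two-source operation costs two MAP reads, while a Store or a partial write costs three, which is the extra read port in question.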
 
> Presumably clever (pre)decoding tricks could reduce the
> throughput requirements for renaming (at least a little)?

In a packet cache (or trace cache) one can arrange that only memory reference instructions need read MAP access on the destination registers. This can save some MAP ports.
 
> [snip]
> > In the separate approach, it still requires 2 files of 128
> > registers each to support the integer throughput, and then it
> > takes another 2 files of 96-entries to support the FP
> > throughput requirements. And you require 2 MAPs. The reason
> > it takes 2 files here is that you still have to absorb and
> > deliver memory (2 Load 1 store per cycle) requests to the FP
> > files as the FP units (Add, Mul, Div; or 2 MACs and a DIV)
> > are crunching along.
> 
> I suspect having much different sizes of entries may also
> make separate files more attractive (though a two-cluster
> RF might act as a double-width single register file).

Yes! It is finally here that separate register files make sense. That is, have a unified register file for int, mem, and FP, and a separate register file for short vector activities.
 
> Side thought/question: Could a two bit per cell RF be
> relatively easily double-pumped to support SIMD?

Not with today's wire delay.
 
> Having architecturally separate registers also opens the
> possibility of sharing the FP/SIMD resources among multiple
> cores (like the SPARC T1) or 'cores' (like AMD's Bulldozer).

While SIMD things can contain FP data, they are also likely to have all sorts of short vector data types suitable for graphics, ... These are throughput-oriented parts of the data path and should be properly anointed with componentry like swizzling multiplexers and multi-operand function units, and they don't need back-to-back forwarding to meet their throughput requirements.

One wants something like a 256-bit register, and 16 to 32 of them, so that 4 DP FP values can be manipulated at once (big machines like x86, not so much the cell-phone machines).
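
The arithmetic behind those numbers, as a small sketch: the helper names lanes() and file_bytes() are hypothetical, and the 64-bit width of a DP value is the usual IEEE 754 double.

```python
def lanes(register_bits, element_bits):
    """How many elements of a given width fit in one register."""
    return register_bits // element_bits


def file_bytes(num_registers, register_bits):
    """Architectural state held by the register file, in bytes."""
    return num_registers * register_bits // 8


dp_lanes = lanes(256, 64)          # 64-bit DP values per 256-bit register
small_file = file_bytes(16, 256)   # 16 registers
large_file = file_bytes(32, 256)   # 32 registers
```

So a 256-bit register carries 4 DP values, and a file of 16 to 32 such registers holds 512 to 1024 bytes of architectural state, before any renaming registers are counted.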
 
> Are there differences in latency vs. throughput optimization
> for register files?  

Other than size and select line wire delay, the register files are essentially independent of throughput versus latency. Once you surround the file with forwarding you DO pick up the throughput versus latency issues.

> If no FP/SIMD operations are used and no state is
> maintained, it seems that aggressive power saving mechanisms
> could be implemented.  Even if state was retained, moving
> such into a particular subset of the registers [only 32
> registers vs. 96] would seem to allow power saving
> mechanisms to be used.

Unlikely, unless you have already sacrificed latency in the register file access because you have time in some other pipe stage leading towards computation.
 
> Alternately, the otherwise unused register state might be
> used to hold additional contexts for switch-on-event-MT.

ACK!

> Would it be practical to have the register files be
> separate but aligned to allow very fast transfer of data?

If your architecture explicitly expresses threads, maybe.

Mitch
