More Thoughts on Green Arrays

Started by: rickman <gnuarm@gmail.com>
First post: 2011-04-09 10:44 -0700
Last post: 2011-04-26 11:03 +0000
Articles 20 on this page of 40 — 10 participants

Contents

  More Thoughts on Green Arrays rickman <gnuarm@gmail.com> - 2011-04-09 10:44 -0700
    Re: More Thoughts on Green Arrays "Greg Bailey" <greg@GreenArrayChips.com> - 2011-04-09 17:53 -0700
      Re: More Thoughts on Green Arrays Paul Rubin <no.email@nospam.invalid> - 2011-04-10 14:53 -0700
        Re: More Thoughts on Green Arrays "Greg Bailey" <greg@greenarraychips.com> - 2011-04-11 10:18 -0700
    Re: More Thoughts on Green Arrays Brad <hwfwguy@gmail.com> - 2011-04-11 14:44 -0700
      Re: More Thoughts on Green Arrays Brad <hwfwguy@gmail.com> - 2011-04-11 17:11 -0700
        Re: More Thoughts on Green Arrays rickman <gnuarm@gmail.com> - 2011-04-11 21:41 -0700
          Re: More Thoughts on Green Arrays Paul Rubin <no.email@nospam.invalid> - 2011-04-12 00:06 -0700
            Re: More Thoughts on Green Arrays rickman <gnuarm@gmail.com> - 2011-04-12 06:22 -0700
              Re: More Thoughts on Green Arrays Charley Shattuck <cshattuck@surewest.net> - 2011-04-17 21:11 +0000
                Re: More Thoughts on Green Arrays rickman <gnuarm@gmail.com> - 2011-04-17 14:53 -0700
                Re: More Thoughts on Green Arrays Paul Rubin <no.email@nospam.invalid> - 2011-04-19 21:13 -0700
                  Re: More Thoughts on Green Arrays Albert van der Horst <albert@spenarnc.xs4all.nl> - 2011-04-20 17:42 +0000
                    Re: More Thoughts on Green Arrays Paul Rubin <no.email@nospam.invalid> - 2011-04-23 04:03 -0700
                      Re: More Thoughts on Green Arrays foxchip <fox@ultratechnology.com> - 2011-04-23 23:10 -0700
                        Re: More Thoughts on Green Arrays Paul Rubin <no.email@nospam.invalid> - 2011-04-25 09:00 -0700
                          Re: More Thoughts on Green Arrays rickman <gnuarm@gmail.com> - 2011-04-25 17:10 -0700
                          Re: More Thoughts on Green Arrays foxchip <fox@ultratechnology.com> - 2011-05-01 10:37 -0700
                            Re: More Thoughts on Green Arrays Paul Rubin <no.email@nospam.invalid> - 2011-05-02 00:14 -0700
                              Re: More Thoughts on Green Arrays foxchip <fox@ultratechnology.com> - 2011-05-02 08:20 -0700
                                Re: More Thoughts on Green Arrays Paul Rubin <no.email@nospam.invalid> - 2011-05-03 21:26 -0700
                                  Re: More Thoughts on Green Arrays "Greg Bailey" <greg@greenarraychips.com> - 2011-05-07 20:11 -0700
                                  Re: More Thoughts on Green Arrays "Greg Bailey" <greg@greenarraychips.com> - 2011-05-08 22:35 -0700
                                    Re: More Thoughts on Green Arrays Paul Rubin <no.email@nospam.invalid> - 2011-05-09 08:11 -0700
                        Re: More Thoughts on Green Arrays Albert van der Horst <albert@spenarnc.xs4all.nl> - 2011-04-26 11:22 +0000
                          Re: More Thoughts on Green Arrays foxchip <fox@ultratechnology.com> - 2011-05-01 10:05 -0700
                            Re: More Thoughts on Green Arrays Albert van der Horst <albert@spenarnc.xs4all.nl> - 2011-05-02 12:01 +0000
                              Re: More Thoughts on Green Arrays foxchip <fox@ultratechnology.com> - 2011-05-02 07:51 -0700
                                Re: More Thoughts on Green Arrays "Greg Bailey" <greg@greenarraychips.com> - 2011-05-08 22:23 -0700
                Re: More Thoughts on Green Arrays Paul Rubin <no.email@nospam.invalid> - 2011-04-27 23:17 -0700
              Re: More Thoughts on Green Arrays Paul Rubin <no.email@nospam.invalid> - 2011-04-17 20:41 -0700
                Re: More Thoughts on Green Arrays rickman <gnuarm@gmail.com> - 2011-04-18 21:38 -0700
                  Re: More Thoughts on Green Arrays Paul Rubin <no.email@nospam.invalid> - 2011-04-19 22:20 -0700
                    Re: More Thoughts on Green Arrays rickman <gnuarm@gmail.com> - 2011-04-20 00:56 -0700
                      Re: More Thoughts on Green Arrays Paul Rubin <no.email@nospam.invalid> - 2011-04-20 23:02 -0700
                        Re: More Thoughts on Green Arrays rickman <gnuarm@gmail.com> - 2011-04-22 10:32 -0700
                          Re: More Thoughts on Green Arrays Bernd Paysan <bernd.paysan@gmx.de> - 2011-04-22 22:06 +0200
                            Re: More Thoughts on Green Arrays rickman <gnuarm@gmail.com> - 2011-04-23 18:42 -0700
                          Re: More Thoughts on Green Arrays "Paul E. Bennett" <Paul_E.Bennett@topmail.co.uk> - 2011-04-23 12:17 +0100
            Re: More Thoughts on Green Arrays Albert van der Horst <albert@spenarnc.xs4all.nl> - 2011-04-26 11:03 +0000


#1088 — More Thoughts on Green Arrays

From: rickman <gnuarm@gmail.com>
Date: 2011-04-09 10:44 -0700
Subject: More Thoughts on Green Arrays
Message-ID: <853c5ae6-879d-4e83-ba88-ebaee2894ea7@a12g2000yqk.googlegroups.com>
I had a conversation about the Green Arrays chips in an FPGA
discussion group where I said the devices should be thought of as
processor arrays with the emphasis on array rather than processor.
The point is that no one thinks about a LUT in an FPGA in terms of
keeping it "busy" or utilizing every gate, etc.  They know that there
are lots of LUTs and FFs and they are cheap, so there is no need to
optimize every gate.

Another poster was very concerned about the sorts of issues you have
when you are multitasking on typical processors, like preventing
deadlocks.  I can't remember ever worrying about deadlocks when
designing logic although I'm sure it is possible.  But I couldn't
explain how they are avoided when designing hardware.

In fact, that is how I see the Green Array devices: as programmable
hardware, not software engines.  But I'm not sure I fully understand
how to use all the features in this way. Consider the memory
interface.  It uses three processors to control memory and has to be
accessible from many nodes in the chip.  I suppose one will be the
primary node to handle access requests.  I suppose some of the other
nodes would need to pass messages around and act as multiplexors.

The one thing I am having trouble getting around is the tiny 64 word
memories.  I can accept a small stack and I can see where minimal data
memory is needed, but I am stuck thinking about how to use a node for
processing and also use it for passing data between the other nodes.

Do the tools have a simulator so that code can be tested before the
hardware is available?  That is another way I see these as being
similar to hardware, I will feel a lot more comfortable using a
simulator that lets me "see" what is going on in the chip.  How else
will you be able to debug 144 processors?

Rick


#1100

From: "Greg Bailey" <greg@GreenArrayChips.com>
Date: 2011-04-09 17:53 -0700
Message-ID: <wYSdnZehW5zeYj3QnZ2dnUVZ_uudnZ2d@giganews.com>
In reply to: #1088
Darn right they include a simulator, Rick!  One can find that out by
RTFM but on the assumption you aren't the only fellow who has not
done that we just updated the website so that it is a little bit harder
to miss now.

In addition, we just today released a new version of arrayForth that
incorporates a lot of improvements John, Jeff, Charley and others have
made over the past months in softsim as a result of having used it
extensively themselves.  So the timing is good for you to get
acquainted with it.  hotline@greenarraychips.com is the right email
address to talk with about usage issues if any.  Downloadable right
now, free of charge, http://www.greenarraychips.com  where there
is a bunch of other news being announced.

Cheers - Greg

"rickman"  wrote in message 
news:853c5ae6-879d-4e83-ba88-ebaee2894ea7@a12g2000yqk.googlegroups.com...

<snip>
Do the tools have a simulator so that code can be tested before the
hardware is available?  That is another way I see these as being
similar to hardware, I will feel a lot more comfortable using a
simulator that lets me "see" what is going on in the chip.  How else
will you be able to debug 144 processors?

Rick


#1135

From: Paul Rubin <no.email@nospam.invalid>
Date: 2011-04-10 14:53 -0700
Message-ID: <7xipulept0.fsf@ruckus.brouhaha.com>
In reply to: #1100
"Greg Bailey" <greg@GreenArrayChips.com> writes:
> Darn right they include a simulator, Rick!  One can find that out by
> RTFM but on the assumption you aren't the only fellow who has not
> done that we just updated the website so that it is a little bit harder
> to miss now.
  
Hey Greg, good that you're here, and I hope things are going well at GA.

Have you released the source code for your Eforth port, including the
virtual machine that runs on the GA nodes?  That might be re-usable for
purposes of targeting a C compiler to the GA, like you suggested in your
Forth Day video.  I don't feel likely to be the guy to do such a port,
but it would be interesting to look at, and probably helpful to someone
wanting to pursue running C on the chip.


#1156

From: "Greg Bailey" <greg@greenarraychips.com>
Date: 2011-04-11 10:18 -0700
Message-ID: <vtSdnb31QpdUqj7QnZ2dnUVZ_hWdnZ2d@giganews.com>
In reply to: #1135
"Paul Rubin"  wrote in message news:7xipulept0.fsf@ruckus.brouhaha.com...

"Greg Bailey" <greg@GreenArrayChips.com> writes:
> Darn right they include a simulator, Rick!  One can find that out by
> RTFM but on the assumption you aren't the only fellow who has not
> done that we just updated the website so that it is a little bit harder
> to miss now.

Hey Greg, good that you're here, and I hope things are going well at GA.

Have you released the source code for your Eforth port, including the
virtual machine that runs on the GA nodes?  That might be re-usable for
purposes of targeting a C compiler to the GA, like you suggested in your
Forth Day video.  I don't feel likely to be the guy to do such a port,
but it would be interesting to look at, and probably helpful to someone
wanting to pursue running C on the chip.

--------------------------------------------------------------------------------

Howdy!

Source for the current VM is included in the arrayForth release and 
documentation is underway, look for a prelim release soon. 


#1163

From: Brad <hwfwguy@gmail.com>
Date: 2011-04-11 14:44 -0700
Message-ID: <b3deb02e-38f9-425e-8809-8dd118688247@w36g2000vbi.googlegroups.com>
In reply to: #1088
On Apr 9, 10:44 am, rickman <gnu...@gmail.com> wrote:
> I had a conversation about the Green Arrays chips in an FPGA
> discussion group where I said the devices should be thought of as
> processor arrays with the emphasis on array rather than processor.

The FPGA analogy is pretty good IMHO, with GreenArrays using a 180nm
process so you would have to compare them to 180nm FPGAs. FPGAs have
embedded RAM blocks and hard multipliers these days, and those will
hopefully come to GreenArrays.

> ... I suppose some of the other
> nodes would need to pass messages around and act as multiplexors.

Yep, the nodes are often used for routing. They can do things to the
signal as they pass it along. It takes very little power to act as a
smart wire.

> The one thing I am having trouble getting around is the tiny 64 word
> memories.  I can accept a small stack and I can see where minimal data
> memory is needed, but I am stuck thinking about how to use a node for
> processing and also use it for passing data between the other nodes.

I don't agree with the one-size-fits-all philosophy.  There are sure
to be tasks that would like double or quadruple the RAM so there
should be some cores in the mix (that would take up double the
footprint of a normal C18) that are RAM-intensive. It's okay if they
run a little slower, since they're async cores. IMHO, with factoring
the power of code grows exponentially up to a point of diminishing
returns. 64 words is probably well before that point.

-Brad


#1167

From: Brad <hwfwguy@gmail.com>
Date: 2011-04-11 17:11 -0700
Message-ID: <5eb2c674-b0c4-47f4-83a7-bae99790898a@p16g2000vbo.googlegroups.com>
In reply to: #1163
On Apr 11, 2:44 pm, Brad <hwfw...@gmail.com> wrote:
> I don't agree with the one-size-fits-all philosophy.

Here's a picture of what I'd like to see in an array:
https://sites.google.com/site/forthtoychest/sea.png

-Brad


#1170

From: rickman <gnuarm@gmail.com>
Date: 2011-04-11 21:41 -0700
Message-ID: <456746e6-3be5-455d-855b-14695498cae0@24g2000yqk.googlegroups.com>
In reply to: #1167
On Apr 11, 8:11 pm, Brad <hwfw...@gmail.com> wrote:
> On Apr 11, 2:44 pm, Brad <hwfw...@gmail.com> wrote:
>
> > I don't agree with the one-size-fits-all philosophy.
>
> Here's a picture of what I'd like to see in an array:https://sites.google.com/site/forthtoychest/sea.png
>
> -Brad

They may not have provided you with CPUs with 8x the memory, but they
did provide the option of shutting off one node to allow an adjacent
node to use the RAM giving that node double the amount.  I am pretty
sure I read that somewhere.

There are a number of features of the GA devices that make me a bit
uncomfortable, but I'm not ready to start recommending changes until I
have tried using the parts.

My biggest concern is that there won't be enough app notes to provide
the techniques to solve practical problems.  For example, the
oscillator app note does not give enough info to add a crystal to the
chip for an oscillator.  Then of course, the big difference of these
chips from others is the software and I have no idea how much will be
required to explain how to properly work with that.

I'm pretty convinced these parts have a lot of applications if I can
understand how to properly use them.  So I'll be interested in
reviewing the docs they come out with.

Rick


#1171

From: Paul Rubin <no.email@nospam.invalid>
Date: 2011-04-12 00:06 -0700
Message-ID: <7xk4f0ymmp.fsf@ruckus.brouhaha.com>
In reply to: #1170
rickman <gnuarm@gmail.com> writes:
> They may not have provided you with CPUs with 8x the memory, but they
> did provide the option of shutting off one node to allow an adjacent
> node to use the RAM giving that node double the amount.  I am pretty
> sure I read that somewhere.

As I understand, you can use the adjacent nodes' ram as (essentially)
I/O devices, so it's slower than using local ram, and uses some of your
code space. 

> I'm pretty convinced these parts have a lot of applications if I can
> understand how to properly use them.  So I'll be interested in
> reviewing the docs they come out with.

It does seem cool to get so much computing speed, at so small power
consumption and low cost.  But I spent a while trying to think of
applications, and almost everything I could want to do turns out to not
really be workable.  Most everything I do wants 8-bit bytes and 32-bit
arithmetic, or else wants gobs of ram, or else wants fast DSP blocks
or floating point arithmetic, etc.


#1177

From: rickman <gnuarm@gmail.com>
Date: 2011-04-12 06:22 -0700
Message-ID: <2fe6c13d-10f3-4e79-bd03-ef2fad3b50cf@n10g2000yqf.googlegroups.com>
In reply to: #1171
On Apr 12, 3:06 am, Paul Rubin <no.em...@nospam.invalid> wrote:
> rickman <gnu...@gmail.com> writes:
> > They may not have provided you with CPUs with 8x the memory, but they
> > did provide the option of shutting off one node to allow an adjacent
> > node to use the RAM giving that node double the amount.  I am pretty
> > sure I read that somewhere.
>
> As I understand, you can use the adjacent nodes' ram as (essentially)
> I/O devices, so it's slower than using local ram, and uses some of your
> code space.

I had the impression it was a direct connection, but perhaps it is
just a message protocol that requires software on both nodes.  That
would not be as fast, but it would be completely extensible to all
nodes you wish to make part of the "RAM pool".

Rather than speculate, I guess we should wait for more details.  This
is the sort of thing that I am waiting for.  We won't have an idea of
what problems these parts may be applied to until we have more details
on how to make the part do useful things.


> > I'm pretty convinced these parts have a lot of applications if I can
> > understand how to properly use them.  So I'll be interested in
> > reviewing the docs they come out with.
>
> It does seem cool to get so much computing speed, at so small power
> consumption and low cost.  But I spent a while trying to think of
> applications, and almost everything I could want to do turns out to not
> really be workable.  Most everything I do wants 8-bit bytes and 32-bit
> arithmetic, or else wants gobs of ram, or else wants fast DSP blocks
> or floating point arithmetic, etc.

I don't understand how needing 8 bit data can be a problem.  Are you
saying that you want the device to automatically limit the data range
somehow?  I don't see how wanting 32 bit arithmetic can be a problem
either.  Can't multiple precision arithmetic do the job?  The RAM
issue can be dealt with by adding external memory, either static or
dynamic RAM.  Is that not fast enough?  I don't follow the fast DSP
concern at all.  666 MIPS is not fast enough?  Ok, so the serial
multiply slows it to say 30 MMACS... times 144 is... well you do the
math.
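Taking up the "you do the math" invitation, here is a minimal sketch of that arithmetic. The 30 MMACS per-node figure and the 144-node count are the rough numbers from the paragraph above, not measured values; the function name and the `busy_fraction` parameter are hypothetical and exist only for illustration.

```python
# Rough estimates quoted in the post above, not measured figures.
MMACS_PER_NODE = 30   # million multiply-accumulates per second, per node
NODES = 144           # nodes on one chip


def aggregate_mmacs(per_node, nodes, busy_fraction=1.0):
    """Upper bound on chip-wide MMACS when `busy_fraction` of the nodes do MACs.

    `busy_fraction` is a hypothetical knob: in practice some nodes would be
    spent on routing and I/O rather than arithmetic.
    """
    return per_node * nodes * busy_fraction


peak = aggregate_mmacs(MMACS_PER_NODE, NODES)
print(f"{peak:.0f} MMACS peak, about {peak / 1000:.2f} GMACS")
```

This is a ceiling, since it assumes every node multiplies all the time; passing a smaller `busy_fraction` gives a more cautious figure.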

For signal processing these devices don't seem suited to the high end,
RADAR sample rates that FPGAs can handle.  But they seem very capable
of handling audio processing.

Could it be that you are used to thinking in terms of the available
solutions which all fit a "standard" mold?  I find the unique features
of this part to be applicable to many applications.  I just don't know
enough to understand how to properly apply them to an application.  I
do think that the low power of this device will be what will put this
device into new applications that other parts can't do.

Rick


#1269

From: Charley Shattuck <cshattuck@surewest.net>
Date: 2011-04-17 21:11 +0000
Message-ID: <a2eb6$4dab5769$42cd8ada$954@EVERESTKC.NET>
In reply to: #1177
Take a look at the appnote on MD5,
http://www.greenarraychips.com/home/documents/pub/AP001-MD5.html
on the Green Arrays website. Skip down to 
the source blocks 842, 844, and 850. The block 842 puts a 64 word table 
of numbers into one node. Block 844 does the same for another node. In 
this case we're doing 32 bit arithmetic by letting one row of nodes have 
the high 16 bits and an adjacent row have the low 16 bits. Once in a while 
two nodes will share carry information.
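The high/low split can be sketched in a few lines. This is a plain Python model under stated assumptions, not the app note's code: each "row" is reduced to a single 16-bit value, and `split`, `join` and `add32_split` are hypothetical helper names. It only shows why carry information has to cross between the two halves on an addition.

```python
MASK16 = 0xFFFF


def split(x):
    """Split a 32-bit value into (high 16 bits, low 16 bits)."""
    return (x >> 16) & MASK16, x & MASK16


def join(hi, lo):
    """Recombine the two 16-bit halves into one 32-bit value."""
    return (hi << 16) | lo


def add32_split(a_hi, a_lo, b_hi, b_lo):
    """Add two 32-bit numbers held as separate 16-bit halves (mod 2**32).

    The low halves are added first. Their sum needs at most 17 bits, so it
    fits in an 18-bit word, and bit 16 is the carry the high half must receive.
    """
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 16                   # the shared carry information
    hi_sum = (a_hi + b_hi + carry) & MASK16
    return hi_sum, lo_sum & MASK16


a, b = 0x0001FFFF, 0x00000001
hi, lo = add32_split(*split(a), *split(b))
print(hex(join(hi, lo)))
```

Bitwise operations such as AND, OR and XOR need no carry at all, so in a model like this the two halves would only have to talk to each other for additions.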

But for now just look at one 64 word table. That node jumps to the port 
shared with a neighbor and waits for instructions to be provided by that 
neighbor. Once the data node has jumped to or called its code provider 
the code in block 850's GET will read consecutive numbers from the data 
table as needed. It looks like:

: get   right b! @p ..  @+ !p ..  !b @b ;

So, "right b!" causes the B register to point to the data node neighbor. 
"@p .." fetches the next word in memory to the data stack and pads the 
rest of this word with nops. The word fetched is the instruction word 
compiled by "@+ !p ..". "@+" makes the data node fetch a word of data from 
where its A register points. (The A register was initialized to 0 in the 
word GO). "!p" writes the value back into the port. Meanwhile the code 
node has executed "!b" to send the instruction word to the data node and 
is waiting for that data with a "@b".

Note that the data node has absolutely no code in its RAM, just data. The 
neighbor with the code simply writes instruction words into the shared 
comm port to tell the data node what to do.
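The exchange described above can be mimicked with a toy model. This is a sketch under loud assumptions, not a simulator of the real chip: the class names are hypothetical, the shared port is reduced to a direct method call, only the three opcodes mentioned above are modelled, and timing and blocking are ignored.

```python
class DataNode:
    """Holds only data; runs whatever instruction words arrive on its port."""

    def __init__(self, table):
        if len(table) > 64:
            raise ValueError("a node has only 64 words of RAM")
        self.ram = list(table)
        self.a = 0        # A register; the post says GO initialises it to 0
        self.stack = []   # data stack (depth limits are not modelled)

    def execute_from_port(self, instruction_word):
        """Execute one word received over the port; return what '!p' wrote."""
        written = None
        for op in instruction_word:
            if op == "@+":      # fetch ram[A] onto the stack, then increment A
                self.stack.append(self.ram[self.a])
                self.a += 1
            elif op == "!p":    # write the top of the stack back into the port
                written = self.stack.pop()
            elif op == "..":    # nop padding
                pass
            else:
                raise ValueError(f"opcode not modelled: {op}")
        return written


class CodeNode:
    """Holds the code; treats its right-hand neighbour as a data table."""

    FETCH_NEXT = ("@+", "!p", "..")   # the word that '@p ..' pushes as a literal

    def __init__(self, right_neighbour):
        self.b = right_neighbour      # 'right b!': B points at the neighbour

    def get(self):
        # '!b' sends the instruction word; '@b' then reads the reply.
        return self.b.execute_from_port(self.FETCH_NEXT)


table = DataNode([0x1234, 0x5678, 0x9ABC])
code = CodeNode(table)
print([hex(code.get()) for _ in range(3)])
```

Each call to `get` returns the next table entry because the data node's A register advances on every `@+`, while the table node itself never stores a single instruction.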

I think that if you read enough of the website you'll see that there *is* 
some info about how to do things. It will only get better as time goes on.

And, as Greg said, the appropriate place to ask such questions is the 
hotline@GreenArrayChips.com if you'd like to get an answer from someone 
who knows it.

Charley.


#1273

From: rickman <gnuarm@gmail.com>
Date: 2011-04-17 14:53 -0700
Message-ID: <699fce67-9396-4bd7-88a1-b8720ed5b941@l6g2000vbn.googlegroups.com>
In reply to: #1269
On Apr 17, 5:11 pm, Charley Shattuck <cshatt...@surewest.net> wrote:
> Take a look at the appnote on MD5,  http://www.greenarraychips.com/home/
> documents/pub/AP001-MD5.html  on the Green Arrays website. Skip down to
> the source blocks 842, 844, and 850. The block 842 puts a 64 word table
> of numbers into one node. Block 844 does the same for another node. In
> this case we're doing 32 bit arithmetic by letting one row of nodes have
> the high 16 bits and an adjacent row have the low 16 bits. Once in awhile
> two nodes will share carry information.
>
> But for now just look at one 64 word table. That node jumps to the port
> shared with a neighbor and waits for instructions to be provided by that
> neighbor. Once the data node has jumped to or called its code provider
> the code in block 850's GET will read consecutive numbers from the data
> table as needed. It looks like:
>
> : get   right b! @p ..  @+ !p ..  !b @b ;
>
> So, "right b!" causes the B register to point to the data node neighbor.
> "@p .." fetches the next word in memory to the data stack and pads the
> rest of this word with nops. The word fetched is the instruction word
> compiled by "@+ !p .." "@+" makes the data node fetch a word of data from
> where its A register points. (The A register was initialized to 0 in the
> word GO). "!p" writes the value back into the port. Meanwhile the code
> node has executed "!b" to send the instruction word to the data node and
> is waiting for that data with a "@b".
>
> Note that the data node has absolutely no code in its RAM, just data. The
> neighbor with the code simply writes instruction words into the shared
> comm port to tell the data node what to do.
>
> I think that if you read enough of the website you'll see that there *is*
> some info about how to do things. It will only get better as time goes on.
>
> And, as Greg said, the appropriate place to ask such questions is the
> hotl...@GreenArrayChips.com if you'd like to get an answer from someone
> who knows it.

Charley,

The last thing I want to do is to knock Green Arrays.  I like your
ideas and I like the possibilities of the products.  But I think your
main difficulty is going to be getting past the bias many engineers
will have against a part like this and educating them on how to
properly use the chip in fruitful ways.  I get that you may have tons
of stuff on your web site but expecting a potential user to reverse
engineer code to learn how to use these parts is a bit of a reach.

Again, I'm not trying to knock your company's efforts, but my one
attempt to evaluate the parts was to look at an app which would be
doing ADC and DAC conversions and so would need a very stable clock.
I tried to read the app note on a 32 kHz crystal oscillator and was
not able to verify in a simulation that I could make it work.  I heard
from someone at your company, who told me this would be fleshed out
in detail later and indicated that a 10 MHz oscillator was working in
the lab as proof that it was practical.  Working "in the lab" is not
at all the same thing as being ready for prime time.  Maybe it was
more the comments of others in this group insisting that this constituted
"proof" that it was good to go.  But to be used in a design an
oscillator has to have a lot of properties that can't be verified by
"it works in the lab".

I'm still positive on the GA devices.  But I will need much better
docs and app notes before I will be able to create useful designs with
them.  I'm looking forward to that time.

Rick


#1337

From: Paul Rubin <no.email@nospam.invalid>
Date: 2011-04-19 21:13 -0700
Message-ID: <7xtydtsgp1.fsf@ruckus.brouhaha.com>
In reply to: #1269
Charley Shattuck <cshattuck@surewest.net> writes:
> I think that if you read enough of the website you'll see that there *is* 
> some info about how to do things. It will only get better as time goes on.

Yes, there is a reasonable amount of docs now (at least for software)
and it was also helpful to look at the Seaforth docs which are also
still online.  There are some parts that I'm a bit confused by
conceptually, like how one gets code to the GA144's interior nodes, or
what the rom of those nodes contains.  I see node 105(?) containing the
eforth vm is rather far from the memory controller--does that mean using
several wire nodes between the vm and external memory?  I guess those
delays of a few ns are still much faster than external dram.

I spent a little time last night looking at the vm and related code in
the arrayforth zip file.  I didn't understand much of it, and it was
clearly anachronistic (I'd never seen code as "blocks" before though I'd
heard of it), but it was in a certain way beautiful.  I'd like to try to
figure it out some more.


#1349

From: Albert van der Horst <albert@spenarnc.xs4all.nl>
Date: 2011-04-20 17:42 +0000
Message-ID: <ljyp6c.1cj@spenarnc.xs4all.nl>
In reply to: #1337
In article <7xtydtsgp1.fsf@ruckus.brouhaha.com>,
Paul Rubin  <no.email@nospam.invalid> wrote:
>Charley Shattuck <cshattuck@surewest.net> writes:
>> I think that if you read enough of the website you'll see that there *is*
>> some info about how to do things. It will only get better as time goes on.
>
>Yes, there is a reasonable amount of docs now (at least for software)
>and it was also helpful to look at the Seaforth docs which are also
>still online.  There are some parts that I'm a bit confused by
>conceptually, like how one gets code to the GA144's interior nodes, or
>what the rom of those nodes contains.  I see node 105(?) containing the
>eforth vm is rather far from the memory controller--does that mean using
>several wire nodes between the vm and external memory?  I guess those
>delays of a few ns are still much faster than external dram.

Getting code to interior nodes? That is like being worried about not
knowing what is going on inside a compiler.

I had parpi running on actual chips, you know. The Intellasys chips.
There were a couple of pieces of code, and then you tell the
system which code runs on what node, and how they are connected.
The system communicated with a PC through a serial line.
in : lN
out : the number of primes under N (after a couple of mS.)

We didn't have to write the code to handle a serial line either,
would you have expected otherwise?

>
>I spent a little time last night looking at the vm and related code in
>the arrayforth zip file.  I didn't understand much of it, and it was
>clearly anachronistic (I'd never seen code as "blocks" before though I'd
>heard of it), but it was in a certain way beautiful.  I'd like to try to
>figure it out some more.

We (Leon Konings and I) have moved away from blocks,
and now use ascii source, with some (own) tools.
(Been mentioned before on this forum.)

Groetjes Albert

-- 
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst


#1456

From: Paul Rubin <no.email@nospam.invalid>
Date: 2011-04-23 04:03 -0700
Message-ID: <7xd3kd9qmm.fsf@ruckus.brouhaha.com>
In reply to: #1349
Albert van der Horst <albert@spenarnc.xs4all.nl> writes:
> Getting code to interior nodes? That is like being worried about not
> knowing what is going on inside a compiler.

As a compiler hacker, of course I want to know what compilers do.

> There were a couple of pieces of code, and then you tell the
> system which code runs on what node, and how they are connected.
> The system communicated with a PC through a serial line.

OK, so there is some kind of boot loader that moves the code around.
Can you replace code on the fly on individual nodes, during different
phases of a computation?

> We (Leon Konings and I) have moved away from blocks,
> and now use ascii source, with some (own) tools.
> (Been mentioned before on this forum.)

The blocks thing has its charms, but yeah, ascii seems more practical.


#1478

From: foxchip <fox@ultratechnology.com>
Date: 2011-04-23 23:10 -0700
Message-ID: <dd08c476-df9c-4480-84b4-410623a0dbc9@j16g2000pro.googlegroups.com>
In reply to: #1456
On Apr 23, 3:03 am, Paul Rubin <no.em...@nospam.invalid> wrote:
> OK, so there is some kind of boot loader that moves the code around.
> Can you replace code on the fly on individual nodes, during different
> phases of a computation?

There is an SPI flash boot node, an autobaud asynchronous serial
boot node, a synchronous serial boot node, a 1-wire serial boot
node, and two serdes boot nodes with boot code in ROM. Any of
these nodes can boot any or all nodes by loading code into their
RAM, and then any node with pins can be used to boot in other ways.  Of course
you can replace code on the fly on individual nodes by writing
to RAM.

You can also boot from a combination of nodes, loading a program
from SPI flash to all nodes on the chip, or loading an external
memory driver and then booting from parallel flash, then you
can load programs interactively or run debug diagnostic software
from any node with pins.  This is all very basic stuff that has
been discussed for years.

I found this very helpful when I spent a couple of days creating
the talking voltmeter demo that used five nodes.  I loaded programs
and ran them interactively from the PC, then I wrote some of it
to flash after that was debugged.  Then I booted it from SPI flash
and continued to run it interactively from the PC, added a little
more code and wrote that to flash to boot the full application.
I picked it as an example because I noticed that the parts it
used, booting from flash, reading data from flash, using analog
output pins or doing pwm analog output on a digital pin to play
speech, doing analog input, linearizing analog input, converting
A/D input into digits, indexing into SPI flash to play the
recorded spoken analog digits, buffering the streamed voice output,
and running from flash in a stand-alone mode or offering various
ways to interact with it at a Forth command line did not take
much code.  These were the sorts of beginners' tutorials I saw
being offered on other microcontrollers and I wanted to show
how simple it all was to partition and implement and how the
whole thing could be done in about 175 words of memory.  I
wanted to show that that sort of program only needed a tiny
fraction of the resources on a 24-node or 40-node design.

> > We (Leon Konings and I) have moved away from blocks,
> > and now use ascii source, with some (own) tools.
> > (Been mentioned before on this forum.)

VentureForth also used ascii source in files so that programmers
who had abandoned Forth blocks or never appreciated the advantages
they have over files could play using the tools that they used
for other things.  Some people write editors, some say that they
would kill if someone tried to take away the editor that they
are used to using that was written by someone else. ;-)

Chuck never used VentureForth or the ascii files.  I made the
mistake of talking about blocks in past tense once around Chuck
and he corrected me.  I was the person who felt that most
people had never appreciated the advantages of blocks for
simplicity of software, speed, throughput, editing of source,
and data storage and that we could offer an environment that
required only using a small fraction of Forth.  I also did
CAD and software development work in colorforth so I was in
a pretty good position to compare the details of the use of
both environments.

I did feel sort of like a traitor for offering the VentureForth
tools to the public in ANS Forth, using files, etc. when I knew
what really worked best.  But I also had seen how many people had
abandoned the best features of Forth and how many never grasped
what they were about and had seen how many people in c.l.f just
absolutely hated how different colorforth was from things they
already knew.

> The blocks thing has its charms, but yeah, ascii seems more practical.

VentureForth did have some features that were not implemented in
colorforth blocks at the time like generating code templates and
doing dataflow algebra to generate test code templates, automate
place and route of functions, and prove correctness.  Of course
those things are not hard but many people complained that they
didn't understand those things, and were sure they were hard and
would be the source of their worst errors or just scare them
away from wanting to try simple stuff kids learn to do.

I found it amusing when Dave Guzeman would tell people that they
were silly to be so worried about place and route.  He would say that
from everything he had seen it was about like deciding how to
place your groceries in your refrigerator and he doubted if people
really had to rely on computer programs to put their food in their
fridge.  I was also offering tools designed for when one connected
up a few hundred or a few thousand cluster chips together which
does make place and route, data flow correctness, and interactive
debugging in Forth a little more complicated than on chips with
only a few small nodes to deal with.

Best Wishes



#1519

From: Paul Rubin <no.email@nospam.invalid>
Date: 2011-04-25 09:00 -0700
Message-ID: <7xpqoauxr1.fsf@ruckus.brouhaha.com>
In reply to: #1478

> I found this very helpful when I spent a couple of days creating
> the talking voltmeter demo that used five nodes.  ...

That sounds like it would make a nice tutorial/application note.

> These were the sorts of beginners tutorials I saw
> being offered on other microcontrollers and I wanted to show
> how simple it all was to partition and implement and how the
> whole thing could be done in about 175 words of memory.  I
> wanted to show that that sort of program only needed a tiny
> fraction of the resources on a 24node or 40node design.

See, here's the issue I'm wondering about.  175 words of code plus some
data can fit on 5 nodes, which could be 1 central node doing something
like remote procedure calls to its 4 neighbors using ports, fine.  But
what happens if you need 10 or 20 nodes?  Now you need nodes
communicating with other nodes that are somewhat distant on the chip, so
there has to be code on the intermediate nodes to route traffic around
while still getting on with their own functions, and it starts to seem
really cramped with so few channels and so little code space.  It's not
like a PC where you can just plop in a multi-threaded TCP server.  How
do you handle this?

> VentureForth did have some features that were not implemented in
> colorforth blocks at the time like generating code templates and
> doing dataflow algebra to generate test code templates, automate
> place and route of functions, and prove correctness.  Of course
> those things are not hard

Are you saying colorforth has this stuff now?  Maybe you're not using
the terminology the same way as me, but I think of those as nontrivial
problems.

> I found it amusing when Dave Guzeman would tell people that they
> were silly to be so worried about place and route.  He would say that
> from everything he had seen it was about like deciding how to
> place your groceries in your refrigerator and he doubted if people
> really had to rely on computer programs to put their food in their
> fridge.

I know this is a big issue for CAD programs that tend to be one of the
driving applications for SAT solvers.



#1528

From: rickman <gnuarm@gmail.com>
Date: 2011-04-25 17:10 -0700
Message-ID: <3493f4c0-e53c-493d-9def-033bde39ee5c@u15g2000vby.googlegroups.com>
In reply to: #1519

On Apr 25, 12:00 pm, Paul Rubin <no.em...@nospam.invalid> wrote:
> > I found this very helpful when I spent a couple of days creating
> > the talking voltmeter demo that used five nodes.  ...
>
> That sounds like it would make a nice tutorial/application note.
>
> > These were the sorts of beginners tutorials I saw
> > being offered on other microcontrollers and I wanted to show
> > how simple it all was to partition and implement and how the
> > whole thing could be done in about 175 words of memory.  I
> > wanted to show that that sort of program only needed a tiny
> > fraction of the resources on a 24node or 40node design.
>
> See, here's the issue I'm wondering about.  175 words of code plus some
> data can fit on 5 nodes, which could be 1 central node doing something
> like remote procedure calls to its 4 neighbors using ports, fine.  But
> what happens if you need 10 or 20 nodes?  Now you need nodes
> communicating with other node that are somewhat distant on the chip, so
> there has to be code on the intermediate nodes to route traffic around
> while still getting on with its own functions, and it starts to seem
> really cramped with so few channels and so little code space.  It's not
> like a PC where can you just plop in a multi-threaded TCP server.  How
> do you handle this?

Not sure how you get 175 words of code.  It is 64 18-bit words per
node, no?  Each word has up to... I forget, four instructions?  So it
would be up to 5 * 64 or 320 words or over 1000 instructions in five
nodes.
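
A quick check of that arithmetic, as a sketch only.  The 64 words per node, the five nodes and the 175 words used are from the thread; the figure of four instructions per word is the uncertain recollection above and is treated here as an assumption.

```python
WORDS_PER_NODE = 64   # 18-bit words of RAM per node, as stated above
NODES = 5             # nodes used by the talking voltmeter demo
INSTR_PER_WORD = 4    # assumed upper bound ("up to... four")
WORDS_USED = 175      # size quoted for the whole demo

total_words = NODES * WORDS_PER_NODE
max_instructions = total_words * INSTR_PER_WORD
used_fraction = WORDS_USED / total_words

print(total_words, max_instructions, round(used_fraction, 2))
```

So the 175 words would be what the demo used, not the capacity of five nodes, and it comes to a bit over half of the 320 words available.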

If you can't design an app to split the task between a number of nodes
so that each one fits in the memory available, I don't think I would
use the GA devices.  Typically for these devices, I would consider the
data path first and allocate processors along the line of the data
flow; typically data flow is linear.  Then if there will be processors
that need to interpret commands I would allocate a flow for the
control path.  Depending on the app, that may be more of a tree
structure, or it may suit to pass the control along with the data in
packets.
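
A minimal sketch of that allocation step, assuming a purely linear data path with one stage per node.  The function name, the stage names and the per-stage sizes are hypothetical; only the 64-word node size and the 175-word total come from the thread.

```python
NODE_CAPACITY = 64  # words of RAM per node, from the thread


def allocate_linear(stages, capacity=NODE_CAPACITY):
    """Assign each (name, words) stage to the next node along a line.

    Returns a list of (node_index, name, free_words).
    Raises ValueError if any single stage does not fit in one node.
    """
    placement = []
    for node_index, (name, words) in enumerate(stages):
        if words > capacity:
            raise ValueError(f"{name} needs {words} words, node holds {capacity}")
        placement.append((node_index, name, capacity - words))
    return placement


# Hypothetical stage sizes; only their sum (175 words) is from the thread.
stages = [("adc_in", 30), ("linearize", 40), ("to_digits", 35),
          ("flash_index", 45), ("dac_out", 25)]

for node_index, name, free_words in allocate_linear(stages):
    print(node_index, name, free_words)
```

A control path, if needed, would be laid out the same way afterwards, as a second line or a tree.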

Just consider how you want to decompose the design and then allocate
to nodes... like groceries in your refrigerator... I like that.


> > I found it amusing when Dave Guzeman would tell people that they
> > were silly to be so worried about place and route.  He would say that
> > from everything he had seen it was about like deciding how to
> > place your groceries in your refrigerator and he doubted if people
> > really had to rely on computer programs to put their food in their
> > fridge.
>
> I know this is a big issue for CAD programs that tend to be one of the
> driving applications for SAT solvers.

I don't know about the rest of it, but this is a question I can
answer.  The difference between place and route in an ASIC or FPGA and
putting away your groceries is the order of magnitude.  I can picture
everything in my refrigerator, even if that is not an altogether
pleasant thought, while it is virtually impossible for a human to
manage place and route for 10,000 or 100,000 LUT/FFs with some 6 or
more inputs and up to two outputs each.  144 nodes with just four
channels each is a piece of cake... in the fridge.

Rick



#1681

From: foxchip <fox@ultratechnology.com>
Date: 2011-05-01 10:37 -0700
Message-ID: <dec18823-c359-4d26-86d3-9cc920014454@q12g2000prb.googlegroups.com>
In reply to: #1519

On Apr 25, 8:00 am, Paul Rubin <no.em...@nospam.invalid> wrote:
> See, here's the issue I'm wondering about.  175 words of code plus some
> data can fit on 5 nodes, which could be 1 central node doing something
> like remote procedure calls to its 4 neighbors using ports, fine.  But
> what happens if you need 10 or 20 nodes?  

The design began by looking at the required dataflow to have analog
readings coming in on one end, the SPI flash serving up recorded sound
on the other end, and a node doing analog out or pwm analog out on a
digital pin to one side, and a few nodes doing some conversions along
the way.  Once a simple dataflow pattern was determined a very simple
code template for the data flow was generated and a little conversion
code was added to that to define the application.

When laying it out any nodes that needed to be wires just used the
code for the dataflow template instantiated with IN and OUT ports
specified by its placement.  The dataflow code was generated and
the instantiated IN and OUT on each node was generated by placement.
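
To illustrate what "IN and OUT specified by placement" could look like, here is a rough sketch in Python, not the VentureForth tooling itself.  The grid model (rows grow downward, columns grow to the right), the function name and the port naming are assumptions made for the example; real chips have their own port orientation rules.

```python
# Step between adjacent grid cells -> name of the port that faces that way.
STEP_TO_PORT = {(0, 1): "RIGHT", (1, 0): "DOWN", (0, -1): "LEFT", (-1, 0): "UP"}
OPPOSITE = {"RIGHT": "LEFT", "LEFT": "RIGHT", "UP": "DOWN", "DOWN": "UP"}


def wire_ports(path):
    """For each interior node on a path of (row, col) cells, return (cell, IN, OUT).

    Consecutive cells must be grid neighbours; anything else raises KeyError.
    """
    result = []
    for prev, here, nxt in zip(path, path[1:], path[2:]):
        step_in = (here[0] - prev[0], here[1] - prev[1])
        step_out = (nxt[0] - here[0], nxt[1] - here[1])
        in_port = OPPOSITE[STEP_TO_PORT[step_in]]  # data arrives from the previous node
        out_port = STEP_TO_PORT[step_out]          # data leaves toward the next node
        result.append((here, in_port, out_port))
    return result


path = [(0, 0), (0, 1), (0, 2), (1, 2)]
for cell, in_port, out_port in wire_ports(path):
    print(cell, "IN", in_port, "OUT", out_port)
```

The two endpoints are the producer and the consumer; only the nodes between them act as wires, and each gets the same template with different ports filled in.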

The idea was that this simple approach worked for 5, 10, or 10,000
nodes quite easily.  Doing 5 or 10 nodes manually is easy but
doing 10,000 or 100,000 nodes individually one by one would not
be very productive.

The SEAforth24 design done first had SPI FLASH more distant from
analog so it needed a few simple wire nodes instantiated with the
data flow pattern template code. The SEAforth40 had SPI flash close
to analog nodes so did not need the simple wire nodes with the
simple dataflow template code.  It was a simple demo after all
meant to show it was at least as simple and maybe smaller than
the same demo done on other small embedded micros.

> Now you need nodes
> communicating with other node that are somewhat distant on the chip, so
> there has to be code on the intermediate nodes to route traffic around
> while still getting on with its own functions, and it starts to seem
> really cramped with so few channels and so little code space.  

That was not the case. You are overly concerned about something about
as simple as something could be.  How complicated is a line or two of
machine generated code plopped down on a node?

> It's not like a PC where can you just plop in a multi-threaded
> TCP server.  How do you handle this?

You plop in a multi-threaded TCP server if you have one in the
library just like on a PC.  Whenever I mention something like
that Greg always says, you are just talking about a TCP/IP server
as he has lots of experience with those.

It was also a demo of how people who plop things in from a
library to create an application could do just that in a GUI
based drag and drop environment just like they like to do on a
PC or with some embedded programming tools. ;-)

> > VentureForth did have some features that were not implemented in
> > colorforth blocks at the time like generating code templates and
> > doing dataflow algebra to generate test code templates, automate
> > place and route of functions, and prove correctness.  Of course
> > those things are not hard
>
> Are you saying colorforth has this stuff now?  Maybe you're not using
> the terminology the same way as me, but I think of those as nontrivial
> problems.

Did you read the paper explaining it?  I did a little more reading
on Haskell and how it approaches parallel programming and got more
insight into why you see so many things as being so complicated.

No. I was saying that VentureForth was designed for drag and drop
programmers who want to plop things in from a library and have the
compiler make sure that the details are correct and the dataflow
is verified correct with the generated templates instantiated by
just doing a drag and drop onto a picture.  It was not hard stuff
to do but it was not the sort of thing the programmers who were
doing work in colorforth were doing.  And in those days colorforth
did not even have a softsim module.

> I know this is a big issue for CAD programs that tend to be one of the
> driving applications for SAT solvers.

Most of the colorforth users back then were doing VLSI CAD designing
simple circuits that other CAD programs could not comprehend.  They
were not writing application programs and the code they were using
was not designed to work like the code used by drag and drop
CAD programmers.  GA being a smaller company has focused on using
their colorforth tools.

Best Wishes



#1695

From: Paul Rubin <no.email@nospam.invalid>
Date: 2011-05-02 00:14 -0700
Message-ID: <7xhb9dshf2.fsf@ruckus.brouhaha.com>
In reply to: #1695

foxchip <fox@ultratechnology.com> writes:
> When laying it out any nodes that needed to be wires just used the
> code for the dataflow template instantiated with IN and OUT ports
> specified by its placement.  The dataflow code was generated and
> the instantiated IN and OUT on each node was generated by placement.
>
> The idea was the this simple approach worked for 5, 10, or 10,000
> nodes quite easily.  Doing 5 or 10 nodes manually is easy but
> doing 10,000 or 100,000 nodes individually one by one would not
> be very productive.

Thanks, maybe one issue is that I haven't seen your tools in action, so
I have to just go by the chip data sheet.  Do your tools have manuals
online?  If yes, I haven't seen them.

An example of what I'm asking: suppose you want to implement a 4Kbyte
lookup table in ram.  With 2 bytes per word that takes 32 nodes, with no
ram space left for holding code.  So if you have a 4*8 block of nodes
for the ram, how do you get data out of the interior ones?  Can a code
word written to an exterior node (on one edge) route all the way to the
other side of the block somehow?  Is there around 5ns delay for each
node that the data has to traverse?
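
The numbers behind that question, as a back-of-envelope sketch.  The table size, 2 bytes per word, 64 words per node and the 4*8 block are from the paragraph above; the 5 ns per hop is only the guess being asked about, and entering the block at one corner is an assumption for the example.

```python
TABLE_BYTES = 4 * 1024  # the 4 Kbyte lookup table
BYTES_PER_WORD = 2      # two bytes packed into each 18-bit word
WORDS_PER_NODE = 64
ROWS, COLS = 4, 8       # the block of RAM nodes
HOP_DELAY_NS = 5        # guessed per-node delay, not a measured value

nodes_needed = TABLE_BYTES // BYTES_PER_WORD // WORDS_PER_NODE


def hops(entry, target):
    """Manhattan distance between two (row, col) nodes, one hop per neighbour link."""
    return abs(entry[0] - target[0]) + abs(entry[1] - target[1])


worst_hops = hops((0, 0), (ROWS - 1, COLS - 1))
print(nodes_needed, worst_hops, worst_hops * HOP_DELAY_NS)
```

Under those assumptions the far corner is 10 hops away, so about 50 ns each way if the 5 ns guess holds.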

I also wonder if there's a way to 1) asynchronously check whether input
is available on a port; 2) listen to all 4 ports simultaneously and
sleep until input appears on one of them (like "select" in unix).

> You plop in a mult-threaded TCP server if you have one in the
> library just like on a PC. 

That is pretty impressive with just 64 words of ram...

>> Are you saying colorforth has this stuff now? 
> Did you read the paper explaining it? 

Probably not--which paper? 

> No. I was saying that VentureForth was designed for drag and drop
> programmers who want to plop things in from a library

Oh I see, I didn't realize that.  



#1699

From: foxchip <fox@ultratechnology.com>
Date: 2011-05-02 08:20 -0700
Message-ID: <8adc3ef4-17f6-428c-8f24-4825abc929f2@i39g2000prd.googlegroups.com>
In reply to: #1699

On May 1, 11:14 pm, Paul Rubin <no.em...@nospam.invalid> wrote:
> Thanks, maybe one issue is that I haven't seen your tools in action, so
> I have to just go by the chip data sheet.  Do your tools have manuals
> online?  If yes, I haven't seen them.

I think this is where I came in when over a year ago you were
complaining that the megabytes of documentation, gigabytes of
videos, and the many tutorials and explanations I had given
didn't exist at all.  I do get tired of repeating myself to
people who prefer to argue that they can keep their eyes and
ears and mind closed.

It would be so much easier if people would listen to answers
or read documentation or even watch videos rather than just
ask the same questions over and over.

> I also wonder if there's a way to 1) asynchronously check whether input
> is available on a port;

Sure, that's basic stuff documented a hundred times in a hundred
places.  The IOCS register contains the status of neighbor
ports.  One can read it without halting and see if any neighbors are
sleeping waiting for responses to port reads or writes.

> 2) listen to all 4 ports simultaneously and
> sleep until input appears on one of them (like "select" in unix).

Sure, that's basic stuff documented a hundred times in a hundred
places.  It's called multiport read to address RDLU which stands
for Right, Down, Left, Up.  This name is also the order in which
the status bits appear in the IOCS register.
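
As a toy illustration of the RDLU ordering only: the sketch below decodes a hypothetical 4-bit field, most significant bit first, into port names.  The 4-bit packing and the function name are assumptions for this example; the actual IOCS bit layout has to be taken from the chip documentation.

```python
PORT_ORDER = ("RIGHT", "DOWN", "LEFT", "UP")  # the RDLU order named above


def pending_ports(status_bits):
    """Decode a 4-bit value, MSB first in R, D, L, U order, into port names.

    The packing is an assumption made for this sketch, not the real register.
    """
    if not 0 <= status_bits < 16:
        raise ValueError("expected a 4-bit value")
    return [name for i, name in enumerate(PORT_ORDER)
            if status_bits & (1 << (3 - i))]


print(pending_ports(0b1010))
```

A poll would test such a field and carry on; the multiport read instead suspends the node until one of the addressed ports has something.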

I get the impression more and more that you still haven't read
any documentation.

> > Did you read the paper explaining it?
>
> Probably not--which paper?

As I say, I think this is where I began discussions with you
about a year ago when you were complaining that the explanations,
documentation, and videos didn't exist.  I do get tired of
repeating myself to answer your same basic questions when you
simply don't bother to look at documentation.

The answer to that question is that I posted that explanation and
further documentation to my 2008 Forth Day presentation in my
blog at http://www.ultratechnology.com/blog.htm#ForthDay2008

where there is also a link to

A Transformational Algebra for Communicating Sequential Process
Data-Flow Diagram Statements in Classes of Parallel Forthlet
Objects for Design, Automated Place and Route, and Application
Development on the SEAforth Architecture.

http://www.ultratechnology.com/CSP-data-flow_diagrams.doc

and where there was also code to "plop down" a talking
voltmeter demo.  The example given did not show the GUI
based drag and drop interface done by one of the Russian
programmers but it did mention it.

> > No. I was saying that VentureForth was designed for drag and drop
> > programmers who want to plop things in from a library
>
> Oh I see, I didn't realize that.  

Yes.  Now I would prefer to move on from the 2008 documentation
of the work done before that.

Best Wishes
