Groups > comp.lang.java.programmer > #9434 > unrolled thread

particle container in Java

Started by	bob <bob@coolgroups.com>
First post	2011-11-03 01:05 -0700
Last post	2011-11-03 19:03 -0400
Articles	15 — 6 participants

Back to article view | Back to comp.lang.java.programmer

  particle container in Java bob <bob@coolgroups.com> - 2011-11-03 01:05 -0700
    Re: particle container in Java Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2011-11-03 08:33 +0000
    Re: particle container in Java Roedy Green <see_website@mindprod.com.invalid> - 2011-11-03 08:12 -0700
      Re: particle container in Java BGB <cr88192@hotmail.com> - 2011-11-03 13:51 -0700
        Re: particle container in Java Roedy Green <see_website@mindprod.com.invalid> - 2011-11-04 15:54 -0700
          Re: particle container in Java BGB <cr88192@hotmail.com> - 2011-11-04 19:48 -0700
            Re: particle container in Java Roedy Green <see_website@mindprod.com.invalid> - 2011-11-05 06:42 -0700
    Re: particle container in Java Giovanni Azua <bravegag@hotmail.com> - 2011-11-03 23:38 +0100
      Re: particle container in Java BGB <cr88192@hotmail.com> - 2011-11-03 22:41 -0700
        Re: particle container in Java Roedy Green <see_website@mindprod.com.invalid> - 2011-11-05 06:49 -0700
          Re: particle container in Java BGB <cr88192@hotmail.com> - 2011-11-05 09:16 -0700
            Re: particle container in Java Roedy Green <see_website@mindprod.com.invalid> - 2011-11-06 03:07 -0800
              Re: particle container in Java Giovanni Azua <bravegag@hotmail.com> - 2011-11-06 15:15 +0100
                Re: particle container in Java BGB <cr88192@hotmail.com> - 2011-11-06 11:06 -0700
    Re: particle container in Java Arne Vajhøj <arne@vajhoej.dk> - 2011-11-03 19:03 -0400

#9434 — particle container in Java

From	bob <bob@coolgroups.com>
Date	2011-11-03 01:05 -0700
Subject	particle container in Java
Message-ID	<17fed3e9-15e0-466c-bb24-10e74633ea1b@t8g2000yql.googlegroups.com>

What is the best data structure to use for a particle container in
Java?

It needs the following ops to be fast:

insert

delete

iteration

It will be used for fx like explosions, smoke, and fire.

[toc] | [next] | [standalone]

#9436

From	Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at>
Date	2011-11-03 08:33 +0000
Message-ID	<slrnjb4ker.fvg.avl@gamma.logic.tuwien.ac.at>
In reply to	#9434

bob <bob@coolgroups.com> wrote:
> What is the best data structure to use for a particle container in
> Java?
> It needs the following ops to be fast:
> insert
> delete
> iteration
> It will be used for fx like explosions, smoke, and fire.

For "insert" and "delete" it depends by what criteria the 
position would be determined.

Maybe LinkedList does it, if deletion and insertion happen
at iterator-positions already obtained.

If you need to "delete by object", then TreeSet might be
worth considering.

[toc] | [prev] | [next] | [standalone]

#9447

From	Roedy Green <see_website@mindprod.com.invalid>
Date	2011-11-03 08:12 -0700
Message-ID	<hlb5b7pfb0kanf6knl8l813gjr8m4q6ckt@4ax.com>
In reply to	#9434

On Thu, 3 Nov 2011 01:05:59 -0700 (PDT), bob <bob@coolgroups.com>
wrote, quoted or indirectly quoted someone who said :

>What is the best data structure to use for a particle container in
>Java?

Possibly a hanging moss structure.  See
http://mindprod.com/jgloss/hangingmoss.html

I assume your problem is you have objects larger than a pixel that
move around in some pseudorandom way and you have to keep computing
their new positions and rendering the result?

The advantage of hanging moss is you can find objects quickly within a
region, and you can rapidly find objects close to a given one -- so
can do collision-rebound logic.
-- 
Roedy Green Canadian Mind Products
http://mindprod.com
Capitalism has spurred the competition that makes CPUs faster and 
faster each year, but the focus on money makes software manufacturers 
do some peculiar things like deliberately leaving bugs and deficiencies
in the software so they can soak the customers for upgrades later.
Whether software is easy to use, or never loses data, when the company
has a near monopoly, is almost irrelevant to profits, and therefore 
ignored. The manufacturer focuses on cheap gimicks like dancing paper 
clips to dazzle naive first-time buyers. The needs of existing 
experienced users are almost irrelevant. I see software rental as the 
best remedy.

[toc] | [prev] | [next] | [standalone]

#9465

From	BGB <cr88192@hotmail.com>
Date	2011-11-03 13:51 -0700
Message-ID	<j8uutg$dhb$1@news.albasani.net>
In reply to	#9447

On 11/3/2011 8:12 AM, Roedy Green wrote:
> On Thu, 3 Nov 2011 01:05:59 -0700 (PDT), bob<bob@coolgroups.com>
> wrote, quoted or indirectly quoted someone who said :
>
>> What is the best data structure to use for a particle container in
>> Java?
>
> Possibly a hanging moss structure.  See
> http://mindprod.com/jgloss/hangingmoss.html
>
> I assume your problem is you have objects larger than a pixel that
> move around in some pseudorandom way and you have to keep computing
> their new positions and rendering the result?
>
> The advantage of hanging moss is you can find objects quickly within a
> region, and you can rapidly find objects close to a given one -- so
> can do collision-rebound logic.

yep...

in my 3D engine (mostly written in plain C) I use a BSP-tree style 
structure.

in this case, it will form an approximate binary tree of all of the 
particles (via an estimated centroid and balance normal, which is used 
to calculate the current division plane), and will occasionally 
re-balance the tree as new particles spawn and die.

the tree can also serve to sort the particles (more or less) from 
back-to-front, allowing drawing them with Z-buffering disabled and 
getting the desired blending effects (if one draws alpha-blended 
particles with Z-buffering enabled, they may clip in nasty looking ways, 
so best results are to disable Z-buffering).

(sadly, particles/models/scene-geometry are not themselves interlaced 
and drawn in proper back-to-front ordering, mostly creating issues with 
alpha-blended objects/particles/sprites/... interacting badly with windows).

I had before experimented with adding inter-particle interactions 
(attractive and repulsive forces), but this dropped the framerate too 
much and caused lots of funky behavior, so generally particles don't 
interact (most particle effects neither really need nor benefit from 
inter-particle forces).

the issue is mostly the large numbers of particles needed to make things 
like like fire effects, ... look decent.

I also have some particle effects which use fragment shaders (such as 
trying to fake flowing water), but sadly particles+shaders does bad 
things to the framerate (so, presently each is drawn as a largish disk). 
it looked better with spheres, but this did too much damage to the 
framerate.

sadly, making "realistic" flowing water which is more than simply a 
visual effect (IOW: has physics, interacts with the player, ...) would 
be a somewhat more complex problem.

the more "traditional" way of representing fluids is by using large 
brushes/polyhedra representing the fluid (giant cubes of "water" and 
similar). however, these can't "flow", and although particle-based water 
can flow, it is much more computationally expensive and expensive to 
render. in this case, fluids would likely exist as semi-solid 
frictionless spheres.

or such...

[toc] | [prev] | [next] | [standalone]

#9537

From	Roedy Green <see_website@mindprod.com.invalid>
Date	2011-11-04 15:54 -0700
Message-ID	<89r8b7964oa0l75o7cvgh06r53l3f4t96d@4ax.com>
In reply to	#9465

On Thu, 03 Nov 2011 13:51:38 -0700, BGB <cr88192@hotmail.com> wrote,
quoted or indirectly quoted someone who said :

>in my 3D engine (mostly written in plain C) I use a BSP-tree style 
>structure.

I had never heard of Binary Space Partitioning trees before.
 http://en.wikipedia.org/wiki/Binary_space_partitioning

I invented Hanging Moss myself back in the 1970s to optimise the
motions of pen-plotters.  The problem was to rapidly find a nearby
thing to draw next.  It was astoundingly faster that computing the
distance to everything else and finding the minimum.

It is just a classifying grid.  At each square you have a chain of
objects.  To find a point near another you chase the chain in the
appropriate square. If you don't find anything you search the next
outer ring of squares. 

If you doubly link the chain, you can efficiently delete and insert.
If you singly link, you can do LIFO/FIFO.

-- 
Roedy Green Canadian Mind Products
http://mindprod.com
Capitalism has spurred the competition that makes CPUs faster and 
faster each year, but the focus on money makes software manufacturers 
do some peculiar things like deliberately leaving bugs and deficiencies
in the software so they can soak the customers for upgrades later.
Whether software is easy to use, or never loses data, when the company
has a near monopoly, is almost irrelevant to profits, and therefore 
ignored. The manufacturer focuses on cheap gimicks like dancing paper 
clips to dazzle naive first-time buyers. The needs of existing 
experienced users are almost irrelevant. I see software rental as the 
best remedy.

[toc] | [prev] | [next] | [standalone]

#9559

From	BGB <cr88192@hotmail.com>
Date	2011-11-04 19:48 -0700
Message-ID	<j92868$df4$1@news.albasani.net>
In reply to	#9537

On 11/4/2011 3:54 PM, Roedy Green wrote:
> On Thu, 03 Nov 2011 13:51:38 -0700, BGB<cr88192@hotmail.com>  wrote,
> quoted or indirectly quoted someone who said :
>
>> in my 3D engine (mostly written in plain C) I use a BSP-tree style
>> structure.
>
> I had never heard of Binary Space Partitioning trees before.
>   http://en.wikipedia.org/wiki/Binary_space_partitioning
>

I use BSP tree variants for lots of things (rendering, physics, 
particles, scene-graph, ...).

early on (before about 2005 or so), I just used them mostly for 
geometry, but then later discovered algorithms which allowed updating 
and rebuilding them in real-time (or, IOW, a dynamic BSP), which I have 
been making use of since (my 3D engine does not use a prebuilt BSP).

there are a few drawbacks (the BSPs are less optimal, thus far my 3D 
engine doesn't do inter-brush clipping/CSG, ...), but in general it is a 
reasonable tradeoff (not having to pre-build the scene allows a lot more 
nifty stuff).

> I invented Hanging Moss myself back in the 1970s to optimise the
> motions of pen-plotters.  The problem was to rapidly find a nearby
> thing to draw next.  It was astoundingly faster that computing the
> distance to everything else and finding the minimum.
>

yes, it would be...
a BSP is much faster than a linear search as well, FWIW.

another (common) option is oct-trees, but a drawback of oct-trees is 
that one either needs a bounding-box for the region, or to decide upon a 
maximum region size, before one can build an oct-tree.

FWIW, I did not exist in the 70s (my existence began in the 80s).

> It is just a classifying grid.  At each square you have a chain of
> objects.  To find a point near another you chase the chain in the
> appropriate square. If you don't find anything you search the next
> outer ring of squares.
>
> If you doubly link the chain, you can efficiently delete and insert.
> If you singly link, you can do LIFO/FIFO.
>

so, like a chain-hash?...

I guess it could work.

I guess a downside of BSPs is that they can really be built 
asynchronously and/or scaled to an arbitrary size (at least not with my 
current algorithms).

a partial solution (in my 3D engine) may be (or may have been with) the 
introduction of "regions" (loosely influenced by the regions in 
Minecraft), which are currently cubes with a size of 16384 units (1.5 
inches/unit) on a side (albeit they are only 4096 units tall). however, 
currently the "region" system exists independently of the BSP (it mostly 
manages voxel terrain and similar).

the idea would be to allow an arbitrarily large yet continuous world 
(like that in Minecraft) with piecewise loading/unloading as the player 
moves around the world.

as-is, there had been some work at trying to integrate voxels with the 
pre-existing Quake-style worlds.

[toc] | [prev] | [next] | [standalone]

#9570

From	Roedy Green <see_website@mindprod.com.invalid>
Date	2011-11-05 06:42 -0700
Message-ID	<c0eab71ffcvagl7p5ov7t4m7hbai4rl47v@4ax.com>
In reply to	#9559

On Fri, 04 Nov 2011 19:48:12 -0700, BGB <cr88192@hotmail.com> wrote,
quoted or indirectly quoted someone who said :

>so, like a chain-hash?...
there is no hashing in the sense that HashMap does. In hanging moss,
you select the grid cell by dividing x and y by the x and y grid
resolution giving two ints to index into an array of chains.

It is a similar concept to a HashMap.  HashMap categories elements of
its big set into subcategories that it can linearly seach.  I do the
same thing, except instead of hashing, I divide to home in on the
subcategory.

It has the same problem as your oct-tree in that you need a bounding
box to start.

I have added some comments to the entry at
http://mindprod.com/jgloss/hangingmoss.html


-- 
Roedy Green Canadian Mind Products
http://mindprod.com
Capitalism has spurred the competition that makes CPUs faster and 
faster each year, but the focus on money makes software manufacturers 
do some peculiar things like deliberately leaving bugs and deficiencies
in the software so they can soak the customers for upgrades later.
Whether software is easy to use, or never loses data, when the company
has a near monopoly, is almost irrelevant to profits, and therefore 
ignored. The manufacturer focuses on cheap gimicks like dancing paper 
clips to dazzle naive first-time buyers. The needs of existing 
experienced users are almost irrelevant. I see software rental as the 
best remedy.

[toc] | [prev] | [next] | [standalone]

#9475

From	Giovanni Azua <bravegag@hotmail.com>
Date	2011-11-03 23:38 +0100
Message-ID	<CAD8D679.891A%bravegag@hotmail.com>
In reply to	#9434

Hi,

On 11/3/11 9:05 AM, in article
17fed3e9-15e0-466c-bb24-10e74633ea1b@t8g2000yql.googlegroups.com, "bob"
<bob@coolgroups.com> wrote:
> What is the best data structure to use for a particle container in
> Java?
> 
> It needs the following ops to be fast:
> insert
> delete
> iteration
> 
> It will be used for fx like explosions, smoke, and fire.
> 
Storing the particles is easy, the questions are:

1) how many particles: 10 to the what?
2) particle-particle interactions are O(N^2) so ...

you might want to have a look at the Multipole O(n log n) and Fast Multipole
O(n) methods where you reduce the interaction computation by operating
either at: Box-Particle or Box-Box interactions rather than always
Particle-Particle.

I implemented this in C++ as a homework couple of weeks ago for simulating
fluid dynamics. Basically you create a Quad Tree where every node contains
all Particles within its boundaries, then it will have 4 children Nodes that
contain the subset of Particles in those children Nodes and so on. You split
recursively until every leaf Node contains at most M Particles. Then to do
the simulation you traverse the Tree in pre- or post order and find out
whether you can optimize computation using Box-Box, Box-Particle, or
Particle-Particle interactions. You have to read the algorithm in detail.

Here you find the details and a sample implementation in C++ (HW solution
1):
http://cse-lab.ethz.ch/teaching/classes/mmc2011.html

Regarding the insert/deletes it is log_4 to insert/delete, you do a contains
check top-bottom starting from the root Node ... that's pretty fast I would
say.

HTH,
Best regards,
Giovanni

[toc] | [prev] | [next] | [standalone]

#9509

From	BGB <cr88192@hotmail.com>
Date	2011-11-03 22:41 -0700
Message-ID	<j8vtuh$qhm$1@news.albasani.net>
In reply to	#9475

On 11/3/2011 3:38 PM, Giovanni Azua wrote:
> Hi,
>
> On 11/3/11 9:05 AM, in article
> 17fed3e9-15e0-466c-bb24-10e74633ea1b@t8g2000yql.googlegroups.com, "bob"
> <bob@coolgroups.com>  wrote:
>> What is the best data structure to use for a particle container in
>> Java?
>>
>> It needs the following ops to be fast:
>> insert
>> delete
>> iteration
>>
>> It will be used for fx like explosions, smoke, and fire.
>>
> Storing the particles is easy, the questions are:
>
> 1) how many particles: 10 to the what?

relevant question...

in my case, particles are generally in the 10^3 range (my 3D engine 
currently has a hard-coded limit of 10k particles, but I am not sure how 
many is "typical").

I am aware that probably around 1000-5000 particles are used for most 
particle effects (such as explosions), where one will have maybe 2500 
particles for the "fire and sparks", another 1000 or so for smoke 
particles, ...

IIRC, flame particles are approx 3 inches, and smoke particles 5 inches.

IIRC, most fire sources generally emit 250 particles per second, with 
particles having an approx 5 second lifespan (with a variance of +-3s), 
so around 1250 for a typical fire-source (such as a torch).

there may be a number of such torches in a scene (however, a lot of 
times many may start sputtering, indicating that likely the particle 
limit is being reached).

> 2) particle-particle interactions are O(N^2) so ...
>

in a naive case, yes.
with a BSP or similar it is around O(n log2 n).

however, when n=10k, this may still put a dent in the framerate.

it is also like when writing rigid body physics code:
the vast majority of the time is not often going into actual "physics 
stuff", so much as it is trying to cull away everything that is not 
moving and can't possibly collide.

it is sad in a way:
3D stuff is a good way to show just how little CPU power and RAM one 
really has sometimes.

granted, there is a little more fancy-looking stuff one can do when they 
don't need to try keep things at 40-60 fps, or when one can do things 
with the otherwise absurdly small scenes typical in a tech demo (for 
example, being able to simulate a small box of water doesn't really mean 
it will scale to a lake).

> you might want to have a look at the Multipole O(n log n) and Fast Multipole
> O(n) methods where you reduce the interaction computation by operating
> either at: Box-Particle or Box-Box interactions rather than always
> Particle-Particle.
>
> I implemented this in C++ as a homework couple of weeks ago for simulating
> fluid dynamics. Basically you create a Quad Tree where every node contains
> all Particles within its boundaries, then it will have 4 children Nodes that
> contain the subset of Particles in those children Nodes and so on. You split
> recursively until every leaf Node contains at most M Particles. Then to do
> the simulation you traverse the Tree in pre- or post order and find out
> whether you can optimize computation using Box-Box, Box-Particle, or
> Particle-Particle interactions. You have to read the algorithm in detail.
>

I use a BSP variant personally.

oct-tree would also make sense, but I have seen little to suggest that 
an oct-tree is either particularly faster or simpler than a BSP.

IMHO, a quad-tree seems like a bad idea in 3D for managing particles (it 
makes sense to separate them on all 3 axes if possible).

> Here you find the details and a sample implementation in C++ (HW solution
> 1):
> http://cse-lab.ethz.ch/teaching/classes/mmc2011.html
>
> Regarding the insert/deletes it is log_4 to insert/delete, you do a contains
> check top-bottom starting from the root Node ... that's pretty fast I would
> say.
>

yeah, although IME, inserting/deleting is not so much the main time 
eater, at least vs possible interactions between particles and/or 
rendering them.

if I were doing a fluid, I would likely consider 16 or 32 inch water 
particles. I don't think much smaller would be ideal for the framerate 
(assuming non-trivial amounts of water).

probably the particles would try to maintain an equilibrium distance, 
becoming attractive or repulsive depending on their variation from said 
ideal distance.

also, probably they would need the ability to interact with scene 
geometry, ...

but, it could be cool.

I guess the main issue is how many such particles one can reasonably 
have in a scene while keeping a good frame-rate (I can't entirely 
predict, not having tried doing this).

> HTH,
> Best regards,
> Giovanni
>

[toc] | [prev] | [next] | [standalone]

#9571

From	Roedy Green <see_website@mindprod.com.invalid>
Date	2011-11-05 06:49 -0700
Message-ID	<4mfab7hjme9lrklbo9k3415fljail4cgl4@4ax.com>
In reply to	#9509

On Thu, 03 Nov 2011 22:41:05 -0700, BGB <cr88192@hotmail.com> wrote,
quoted or indirectly quoted someone who said :

>I am aware that probably around 1000-5000 particles are used for most 
>particle effects (such as explosions), where one will have maybe 2500 
>particles for the "fire and sparks", another 1000 or so for smoke 
>particles, ...

are you doing this rendering in "real" time?  In other words fast
enough for game rendering, or are you producing movie effects where
you can take days to produce a few seconds of rendering?
-- 
Roedy Green Canadian Mind Products
http://mindprod.com
Capitalism has spurred the competition that makes CPUs faster and 
faster each year, but the focus on money makes software manufacturers 
do some peculiar things like deliberately leaving bugs and deficiencies
in the software so they can soak the customers for upgrades later.
Whether software is easy to use, or never loses data, when the company
has a near monopoly, is almost irrelevant to profits, and therefore 
ignored. The manufacturer focuses on cheap gimicks like dancing paper 
clips to dazzle naive first-time buyers. The needs of existing 
experienced users are almost irrelevant. I see software rental as the 
best remedy.

[toc] | [prev] | [next] | [standalone]

#9576

From	BGB <cr88192@hotmail.com>
Date	2011-11-05 09:16 -0700
Message-ID	<j93ngj$a4n$1@news.albasani.net>
In reply to	#9571

On 11/5/2011 6:49 AM, Roedy Green wrote:
> On Thu, 03 Nov 2011 22:41:05 -0700, BGB<cr88192@hotmail.com>  wrote,
> quoted or indirectly quoted someone who said :
>
>> I am aware that probably around 1000-5000 particles are used for most
>> particle effects (such as explosions), where one will have maybe 2500
>> particles for the "fire and sparks", another 1000 or so for smoke
>> particles, ...
>
> are you doing this rendering in "real" time?  In other words fast
> enough for game rendering, or are you producing movie effects where
> you can take days to produce a few seconds of rendering?

real time...

more so, I try where possible to keep things in the 40-60fps range, and 
try some to keep it still usable on older HW (a laptop of mine from 
2003, and another laptop of mine from 2009 but with a lame Intel GPU, 
although the frame-rates are not very good on these).

if I were doing it offline, I could have many more particles than this.

the basic strategy for making a particle explosion in my case is 
basically to have lots of fire and smoke particles, which all "explode" 
outward from a central point. most of these are small sprites.

most particles are also non-interacting.

for offline rendering, one would likely have much larger numbers of 
particles.

this is also why one will limit the max number of particles in a scene, 
as otherwise too make particle effects would hurt the framerate (as-is, 
if there are too many particle effects, they will start sputtering).
the current limit is 10k particles, but IIRC this limit was set probably 
back when I was using a Radeon 9200 or similar. on modern HW, a higher 
limit would probably work, but I have no immediate need to change it.

[toc] | [prev] | [next] | [standalone]

#9647

From	Roedy Green <see_website@mindprod.com.invalid>
Date	2011-11-06 03:07 -0800
Message-ID	<8dqcb7dpnonje061cp69hr944tinltg0eq@4ax.com>
In reply to	#9576

On Sat, 05 Nov 2011 09:16:03 -0700, BGB <cr88192@hotmail.com> wrote,
quoted or indirectly quoted someone who said :

>I try where possible to keep things in the 40-60fps range,

Another totally different approach is to fob the work off on the GPU.
I have not coded at low level for a very long time, so I don't know
what modern video cards can do, and how standard the APIs are.

But, it seem to me what you are doing is common to many games, and
surely modern video cards have some way of helping out.  They have
some skookum processing power.  I would think some parallel processing
hardware would really make this sort of thing fly.
-- 
Roedy Green Canadian Mind Products
http://mindprod.com
Capitalism has spurred the competition that makes CPUs faster and 
faster each year, but the focus on money makes software manufacturers 
do some peculiar things like deliberately leaving bugs and deficiencies
in the software so they can soak the customers for upgrades later.
Whether software is easy to use, or never loses data, when the company
has a near monopoly, is almost irrelevant to profits, and therefore 
ignored. The manufacturer focuses on cheap gimicks like dancing paper 
clips to dazzle naive first-time buyers. The needs of existing 
experienced users are almost irrelevant. I see software rental as the 
best remedy.

[toc] | [prev] | [next] | [standalone]

#9652

From	Giovanni Azua <bravegag@hotmail.com>
Date	2011-11-06 15:15 +0100
Message-ID	<CADC54F8.8D01%bravegag@hotmail.com>
In reply to	#9647

Hi Roedy,

On 11/6/11 12:07 PM, in article 8dqcb7dpnonje061cp69hr944tinltg0eq@4ax.com,
"Roedy Green" <see_website@mindprod.com.invalid> wrote:
> On Sat, 05 Nov 2011 09:16:03 -0700, BGB <cr88192@hotmail.com> wrote,
> quoted or indirectly quoted someone who said :
> 
>> I try where possible to keep things in the 40-60fps range,
> 
> Another totally different approach is to fob the work off on the GPU.
> I have not coded at low level for a very long time, so I don't know
> what modern video cards can do, and how standard the APIs are.
> 
> But, it seem to me what you are doing is common to many games, and
> surely modern video cards have some way of helping out.  They have
> some skookum processing power.  I would think some parallel processing
> hardware would really make this sort of thing fly.
>
Absolutely :) when I saw this OP, the first thing that crossed my mind was
"why is he using Java for this ... ?". These are the reasons I would stay
away from Java and rather use C++ to do such a job:

1) Manual optimization for the memory hierarchies e.g. Scalar replacement;
   optimizing loops with the level of unrolling that maximizes the cache hit
   for the underlying system cache specification. This needs "manual
   analysis and manual doing" since e.g. memory aliasing and function calls
   prevent C++ compilers from doing such optimizations automatically. Same
   concepts apply to CUDA 3) below.

2) Same reason as 1) to do manual vectorization i.e. SSE, SSE2, SSE3, SSE4
   etc e.g.
    float a[4] = {1.0, 2.0, 3.0, 4.0};
    __m128 t = _mm_load_ps(a);
    b = _mm_mul_ps(a, a);

3) CUDA nVidia C++ extensions that allows embedding GPU code and invoking it
   from CPU. Here an example that matches the OP:
   http://http.developer.nvidia.com/GPUGems3/gpugems3_ch31.html

   Needs a lot of brainwork to make get it right i.e. To be able
   to outperform CPU in numerical methods e.g. MMM, MVM.
   Trying to implement a Fast Multipole with CUDA would be a daunting task
   ... but is not impossible:

<http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CCEQFjA
A&url=http%3A%2F%2Farxiv.org%2Fpdf%2F1010.1482&ei=wZW2TpHAI4mJ4gTX5f37Aw&usg
=AFQjCNGln1eVVemPaPZRP3RtOxKcYeNWjw&sig2=WONUfOoU5zSEYjzUfWLR3w>

<http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CCoQFjA
B&url=http%3A%2F%2Farxiv.org%2Fabs%2F1010.1482&ei=wZW2TpHAI4mJ4gTX5f37Aw&usg
=AFQjCNH8rO4vBk9E-p8W2Le11kDPiQYjkg&sig2=x0t31s43D22f6yN7cq8l6Q>

Best regards,
Giovanni

[toc] | [prev] | [next] | [standalone]

#9662

From	BGB <cr88192@hotmail.com>
Date	2011-11-06 11:06 -0700
Message-ID	<j96ibo$d8r$1@news.albasani.net>
In reply to	#9652

On 11/6/2011 7:15 AM, Giovanni Azua wrote:
> Hi Roedy,
>
> On 11/6/11 12:07 PM, in article 8dqcb7dpnonje061cp69hr944tinltg0eq@4ax.com,
> "Roedy Green"<see_website@mindprod.com.invalid>  wrote:
>> On Sat, 05 Nov 2011 09:16:03 -0700, BGB<cr88192@hotmail.com>  wrote,
>> quoted or indirectly quoted someone who said :
>>
>>> I try where possible to keep things in the 40-60fps range,
>>
>> Another totally different approach is to fob the work off on the GPU.
>> I have not coded at low level for a very long time, so I don't know
>> what modern video cards can do, and how standard the APIs are.
>>

except:
GPGPU requires newer cards (excludes older cards and many laptops and 
entry-level PCs with "Intel GMA" GPUs or similar, ...);
doesn't really work as well when the card is already busy doing 
"everything else";
it is at-present, a bit of a hassle;
...

more so, it is not so relevant when ones' engine is mostly getting 
bogged down trying to shove stuff through the video card (in this case, 
the raw performance on the CPU is not nearly as important...).

>> But, it seem to me what you are doing is common to many games, and
>> surely modern video cards have some way of helping out.  They have
>> some skookum processing power.  I would think some parallel processing
>> hardware would really make this sort of thing fly.
>>
> Absolutely :) when I saw this OP, the first thing that crossed my mind was
> "why is he using Java for this ... ?". These are the reasons I would stay
> away from Java and rather use C++ to do such a job:
>
> 1) Manual optimization for the memory hierarchies e.g. Scalar replacement;
>     optimizing loops with the level of unrolling that maximizes the cache hit
>     for the underlying system cache specification. This needs "manual
>     analysis and manual doing" since e.g. memory aliasing and function calls
>     prevent C++ compilers from doing such optimizations automatically. Same
>     concepts apply to CUDA 3) below.
>

above: probably overkill, FWIW...

though, in my case, pretty much all of my 3D rendering and "core 
functionality" is mostly in C (a few things use C++, but these are 
elsewhere in the project).

I also use Java and BGBScript (my own language) in a few places, but 
this is not anywhere near the 3D renderer (both were used in my case 
with the intention of being for "server side" functionality, which 
mostly means AI/NPC behavior world entities/events and similar).

as-is, much of the "server side" functionality is still in C as well though.

a major downside of Java here was mostly that JNI is not terribly 
usable, and that with a naive VM, it tends to spew a lot of garbage (not 
so good if ones' GC is also not very good).

I was able to make my own language much nicer WRT cross-language 
interfacing.

decided to leave out a glob of history here (interactions and 
developments in my own project WRT the creation of VMs and relations 
between programming languages...).

> 2) Same reason as 1) to do manual vectorization i.e. SSE, SSE2, SSE3, SSE4
>     etc e.g.
>      float a[4] = {1.0, 2.0, 3.0, 4.0};
>      __m128 t = _mm_load_ps(a);
>      b = _mm_mul_ps(a, a);
>

I also do this, but mostly wrapped the intrinsics to make them look more 
like a nicer (GLSL-style) vector API (with fallback logic for if SSE 
wasn't enabled as a compiler switch, where plain structs and scalar math 
are used instead).

sadly, the way it was implemented doesn't currently allow operator 
overloading (in C++), so I have to be more like:
vec3 u, v;
float f, g;
u=vec3(1,2,3,4);
v=v3mul(u, u);
f=v3dot(u, v);
g=v3dot(u, v3scale(v, f));
...

but, it works.

it is implemented as a mix of macros and "static inline" functions.

gradually, this is replacing my "older strategy", which was to represent 
all vectors directly as arrays of floats, and perform operations by 
passing pointers to these float arrays.

the new style can somewhat outperform the old style with SSE enabled, 
but is slightly slower without SSE enabled (mostly due to the added 
memory-copying in passing/returning structs). either way, it is more 
convenient.

> 3) CUDA nVidia C++ extensions that allows embedding GPU code and invoking it
>     from CPU. Here an example that matches the OP:
>     http://http.developer.nvidia.com/GPUGems3/gpugems3_ch31.html
>
>     Needs a lot of brainwork to make get it right i.e. To be able
>     to outperform CPU in numerical methods e.g. MMM, MVM.
>     Trying to implement a Fast Multipole with CUDA would be a daunting task
>     ... but is not impossible:
>
> <http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CCEQFjA
> A&url=http%3A%2F%2Farxiv.org%2Fpdf%2F1010.1482&ei=wZW2TpHAI4mJ4gTX5f37Aw&usg
> =AFQjCNGln1eVVemPaPZRP3RtOxKcYeNWjw&sig2=WONUfOoU5zSEYjzUfWLR3w>
>
> <http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CCoQFjA
> B&url=http%3A%2F%2Farxiv.org%2Fabs%2F1010.1482&ei=wZW2TpHAI4mJ4gTX5f37Aw&usg
> =AFQjCNH8rO4vBk9E-p8W2Le11kDPiQYjkg&sig2=x0t31s43D22f6yN7cq8l6Q>
>

FWIW, there is also OpenCL.

OpenCL is to CUDA sort of like what OpenGL is to DirectX...

[toc] | [prev] | [next] | [standalone]

#9476

From	Arne Vajhøj <arne@vajhoej.dk>
Date	2011-11-03 19:03 -0400
Message-ID	<4eb31dd9$0$288$14726298@news.sunsite.dk>
In reply to	#9434

On 11/3/2011 4:05 AM, bob wrote:
> What is the best data structure to use for a particle container in
> Java?
>
> It needs the following ops to be fast:
>
> insert
>
> delete
>
> iteration
>
> It will be used for fx like explosions, smoke, and fire.

Some collection from java.util.

ArrayList, LinkedList or one of the other depending on specifics.

Arne

[toc] | [prev] | [standalone]

csiph-web

particle container in Java

Contents

#9434 — particle container in Java

#9436

#9447

#9465

#9537

#9559

#9570

#9475

#9509

#9571

#9576

#9647

#9652

#9662

#9476