Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!feeder.erje.net!eu.feeder.erje.net!newsfeed.xs4all.nl!newsfeed4a.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <mco23o$edu$1@ger.gmane.org>
References: <82642f3a-49e8-4982-b135-66ffc04d67d9@googlegroups.com> <54ee8ce2$0$11109$c3e8da3@news.astraweb.com> <1424963166.30927.73.camel@gmail.com> <1915907417446661989.682673sturla.molden-gmail.com@news.gmane.org> <1424972883.30927.138.camel@gmail.com> <mco23o$edu$1@ger.gmane.org>
Date: Thu, 26 Feb 2015 17:28:56 -0500
Subject: Re: Parallelization of Python on GPU?
From: Jason Swails <jason.swails@gmail.com>
To: python list <python-list@python.org>
Content-Type: multipart/alternative; boundary=001a11c1044e07ab0d051005478c
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.19294.1424989738.18130.python-list@python.org>
Lines: 101
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:86552

--001a11c1044e07ab0d051005478c
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Thu, Feb 26, 2015 at 4:10 PM, Sturla Molden <sturla.molden@gmail.com>
wrote:

> On 26/02/15 18:48, Jason Swails wrote:
>
>> On Thu, 2015-02-26 at 16:53 +0000, Sturla Molden wrote:
>>
>>> GPU computing is great if you have the following:
>>>
>>> 1. Your data structures are arrays of floating point numbers.
>>>
>>
>> It actually works equally well, if not better, for integers.
>>
>
> Right, but not complicated data structures with a lot of references or
> pointers. It requires that the data be laid out in regular arrays, and then
> it acts on these arrays in a data-parallel manner. It is designed to process
> vertices in parallel for computer graphics, and that is a limitation which
> is always there. It is not a CPU with 1024 cores. It is a "floating point
> monster" which can process 1024 vectors in parallel. You write a tiny
> kernel in a C-like language (CUDA, OpenCL) to process one vector, and then
> it will apply the kernel to all the vectors in an array of vectors. It is
> very comparable to how GLSL and Direct3D vertex and fragment shaders work.
> (The reason for which should be obvious.) The GPU is actually great for a
> lot of things in science, but it is not a CPU. The biggest mistake in the
> GPGPU hype is the idea that the GPU will behave like a CPU with many cores.


Very well summarized.  At least in my field, though, it is well known that
GPUs are not 'uber-fast CPUs'.  Algorithms have been redesigned and programs
rewritten to take advantage of the GPU architecture.  It has been a *massive*
investment of time and resources, but (unlike the Xeon Phi coprocessor [1])
it has reaped most of the promised rewards.
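
To make the kernel model quoted above concrete, here is a minimal sketch in
plain Python.  It only emulates the programming model on the CPU and is not
real GPU code; the names saxpy_kernel and launch are made up for
illustration and do not come from CUDA, OpenCL, or any Python GPU library.

```python
# CPU-only emulation of the "one tiny kernel, applied to every element"
# model.  Nothing here runs on a GPU; saxpy_kernel and launch are
# hypothetical names chosen for this sketch.

def saxpy_kernel(i, alpha, x, y, out):
    """Tiny kernel: handles exactly one element, selected by index i."""
    out[i] = alpha * x[i] + y[i]


def launch(kernel, n, *args):
    """Stand-in for a kernel launch: invoke the kernel once per index.

    A GPU would run these invocations in parallel; this loop runs them
    one after another, which gives the same result for independent work.
    """
    for i in range(n):
        kernel(i, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

launch(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)
```

Each invocation reads and writes only through its own index into flat
arrays, which is why the model suits regular arrays far better than
pointer-heavy data structures.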

--Jason

[1] I couldn't resist the jab.  The Xeon Phi costs several times as much as
the top-of-the-line NVidia gaming card, yet the GPU is about 15-20x faster...

--001a11c1044e07ab0d051005478c--