Re: Parallelization of Python on GPU?

From	Jason Swails <jason.swails@gmail.com>
To	python-list@python.org
Date	Thu, 26 Feb 2015 12:48:03 -0500
Newsgroups	comp.lang.python


On Thu, 2015-02-26 at 16:53 +0000, Sturla Molden wrote:
> GPU computing is great if you have the following:
> 
> 1. Your data structures are arrays of floating point numbers.

It actually works equally well, if not better, for integers.

> 2. You have a data-parallel problem.

This is the biggest one, IMO. ^^^

> 3. You are happy with single precision.

NVidia GPUs have had double-precision maths in hardware since compute
capability 1.3 (GTX 280).  That's ca. 2008.  In optimized CPU code, you
can get up to ~50% benefit going from double to single precision (it's
rarely ever that high, but 20-30% is commonplace in my experience of
optimized code).  It's admittedly a bigger hit on most GPUs, but there
are ways to work around it (e.g., fixed precision), and you can still do
double precision work where it's needed.  One of the articles I linked
previously demonstrates that a hybrid precision model (based on fixed
precision) provides exactly the same numerical stability as double
precision (which is much better than pure single precision) for that
application.
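
To make the fixed-precision idea a bit more concrete, here is a minimal
sketch in plain Python.  It is not the code from the linked article: the
helper names and the choice of 20 fractional bits (SCALE) are assumptions
made up for this illustration, and single precision is emulated by
rounding through the struct module.  Each term is kept in single
precision, but the running total is held as a scaled integer, so the sum
no longer depends on the order of the additions.

```python
import struct

def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_f32(values):
    """Sum left to right, rounding to single precision after every add."""
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total

# Hypothetical choice for this sketch: 20 fractional bits.
SCALE = 2 ** 20

def sum_fixed(values):
    """Convert each single-precision term to a scaled integer, then add
    the integers.  Integer addition is exact, so order does not matter.
    (On a GPU this would be a 64-bit integer; Python ints are unbounded.)"""
    total = sum(int(round(to_f32(v) * SCALE)) for v in values)
    return total / SCALE

# One large term (2**24) and two small ones.
values = [16777216.0, 1.0, 1.0]

f32_forward = sum_f32(values)
f32_reverse = sum_f32(values[::-1])
fixed_forward = sum_fixed(values)
fixed_reverse = sum_fixed(values[::-1])

print("single precision:", f32_forward, f32_reverse)
print("fixed precision: ", fixed_forward, fixed_reverse)
```

The pure single-precision sums differ depending on the order of the
terms, while the fixed-precision sums agree with each other.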

Double precision can often be avoided in many parts of a calculation and
used only where those bits matter (like accumulators with
potentially small contributions, subtractions of two numbers of similar
magnitude, etc.).
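
A small sketch of both cases, in plain Python with single precision
emulated by rounding through the struct module.  The helper names and
the specific numbers are assumptions picked for illustration so that the
effect is exact and easy to check; real codes lose bits more gradually.

```python
import struct

def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def accumulate(start, term, n, single):
    """Add `term` to `start` n times, optionally rounding the running
    total to single precision after every addition."""
    total = start
    for _ in range(n):
        total = total + term
        if single:
            total = to_f32(total)
    return total

# Case 1: an accumulator receiving many small contributions.
# 2**-25 is less than half a single-precision ulp of 1.0, so every
# single-precision addition rounds straight back to 1.0.
tiny = 2.0 ** -25
n = 2 ** 16
acc_single = accumulate(1.0, tiny, n, single=True)
acc_double = accumulate(1.0, tiny, n, single=False)

# Case 2: subtracting two numbers of similar magnitude.
# The difference lives in bits that single precision never stored.
a = 1.0 + 2.0 ** -30
diff_single = to_f32(to_f32(a) - to_f32(1.0))
diff_double = a - 1.0

print("accumulator:", acc_single, acc_double)
print("difference: ", diff_single, diff_double)
```

Here the single-precision accumulator never moves off 1.0 and the
single-precision difference is zero, while the double-precision versions
keep the small contributions.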

> 4. You have time to code everything in CUDA or OpenCL.

This is the second biggest one, IMO. ^^^

> 5. You have enough video RAM to store your data.

Again, it can be worked around, but if you can't fit everything on the
GPU, the frequent GPU<->CPU transfers can have devastating effects on
performance, and limiting them takes painstaking work.

> 
> For Python the easiest solution is to use NumbaPro.

Agreed, although I've never actually tried PyCUDA before...

All the best,
Jason
