To: python-list@python.org
From: Sturla Molden <sturla.molden@gmail.com>
Subject: Re: Parallelization of Python on GPU?
Date: Thu, 26 Feb 2015 16:40:29 +0000 (UTC)
References: <82642f3a-49e8-4982-b135-66ffc04d67d9@googlegroups.com>
Newsgroups: comp.lang.python
Message-ID: <mailman.19268.1424968864.18130.python-list@python.org>

If you are doing SVM regression with scikit-learn, you are using LIBSVM.
There is a CUDA-accelerated version of this C/C++ library here:
http://mklab.iti.gr/project/GPU-LIBSVM

You can presumably reuse the wrapping code from scikit-learn.

Sturla


John Ladasky <john_ladasky@sbcglobal.net> wrote:
> I've been working with machine learning for a while.  Many of the
> standard packages (e.g., scikit-learn) have fitting algorithms which run
> in single threads.  These algorithms are not themselves parallelized.
> Perhaps, due to their unique mathematical requirements, they cannot be
> parallelized.
> 
> When one is investigating several potential models of one's data with
> various settings for free parameters, it is still sometimes possible to
> speed things up.  On a modern machine, one can use Python's
> multiprocessing.Pool to run separate instances of scikit-learn fits.  I
> am currently using ten of the twelve 3.3 GHz CPU cores on my machine to
> do just that.  And I can still browse the web with no observable lag.  :^)
> 
> Still, I'm waiting hours for jobs to finish.  Support vector regression fitting is hard.
> 
> What I would REALLY like to do is to take advantage of my GPU.  My NVIDIA
> graphics card has 1152 cores and a 1.0 GHz clock.  I wouldn't mind
> borrowing a few hundred of those GPU cores at a time to see what they
> can do.  In theory, I calculate that I can speed up the job by another
> factor of five.
> 
> The trick is that each process would need to run some PYTHON code, not
> CUDA or OpenCL.  The child process code isn't particularly fancy.  (I
> should, for example, be able to switch that portion of my code to static typing.)
> 
> What is the most effective way to accomplish this task?
> 
> I came across a reference to a package called "Urutu" which may be what I
> need; however, it doesn't look like it is widely supported.
> 
> I would love it if the Python developers themselves added the ability to
> spawn GPU processes to the multiprocessing module!
> 
> Thanks for any advice and comments.
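
For reference, the multiprocessing.Pool approach described in the quoted
message can be sketched roughly as follows. This is only an illustration:
fit_one is a hypothetical stand-in that computes a toy score, not a real
SVR fit, and the parameter names (C, epsilon) and grid values are
assumptions chosen for the example. In real use, fit_one would build and
fit a scikit-learn estimator with the given settings.

```python
import itertools
import math
from multiprocessing import Pool


def param_grid(c_values, epsilon_values):
    """Return every (C, epsilon) combination to try."""
    return list(itertools.product(c_values, epsilon_values))


def fit_one(params):
    """Hypothetical stand-in for one single-threaded fit.

    Returns (params, score). The score is a toy function that peaks at
    C=10, epsilon=0.1; it is NOT a real model fit.
    """
    c, epsilon = params
    score = -(math.log10(c) - 1.0) ** 2 - (epsilon - 0.1) ** 2
    return params, score


def best_of(results):
    """Pick the (params, score) pair with the highest score."""
    return max(results, key=lambda r: r[1])


def run_grid(grid, workers=10):
    """Run one independent fit per worker process."""
    with Pool(processes=workers) as pool:
        return pool.map(fit_one, grid)


if __name__ == "__main__":
    grid = param_grid([0.1, 1.0, 10.0, 100.0], [0.01, 0.1, 1.0])
    params, score = best_of(run_grid(grid, workers=4))
    print("best settings:", params, "score:", score)
```

Each fit stays single-threaded; the speedup comes only from running
independent parameter settings side by side, which is why it scales with
CPU cores but does not by itself make use of a GPU.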