
Improving performance in matrix operations

Started by: Drimades <e.zhupa@gmail.com>
First post: 2016-03-09 12:09 -0800
Last post: 2016-03-14 18:35 +0000
Articles: 4 (4 participants)



Contents

  Improving performance in matrix operations Drimades <e.zhupa@gmail.com> - 2016-03-09 12:09 -0800
    Re: Improving performance in matrix operations Fabien <fabien.maussion@gmail.com> - 2016-03-09 21:16 +0100
    Re: Improving performance in matrix operations Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-03-10 19:25 +1100
    Re: Improving performance in matrix operations Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-03-14 18:35 +0000

#104451 — Improving performance in matrix operations

From: Drimades <e.zhupa@gmail.com>
Date: 2016-03-09 12:09 -0800
Subject: Improving performance in matrix operations
Message-ID: <1b1ef48f-c60f-4c56-ae55-376e8a117337@googlegroups.com>

I'm doing some tests with operations on numpy matrices in Python. As an example, it takes about 3000 seconds to compute eigenvalues and eigenvectors using scipy.linalg.eig(a) for a matrix 6000x6000. Is it an acceptable time? Any suggestions to improve? Does C++ perform better with matrices? Another thing to consider is that matrices I'm processing are heavily sparse.
Do they implement any parallelism? While my code is running, one of my cores is 100% busy, the other one 30% busy.



#104452

From: Fabien <fabien.maussion@gmail.com>
Date: 2016-03-09 21:16 +0100
Message-ID: <nbq0bp$1sqg$1@gioia.aioe.org>
In reply to: #104451

On 03/09/2016 09:09 PM, Drimades wrote:
> Another thing to consider is that matrices I'm processing are heavily sparse.

Did you look at scipy.sparse.linalg?

http://docs.scipy.org/doc/scipy/reference/sparse.linalg.html
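
As a rough sketch of what the sparse route could look like (not from the original post): `scipy.sparse.linalg.eigs` computes only a few eigenpairs with ARPACK rather than the full spectrum. The test matrix, the value of `k`, and the choice of `which="LM"` are all assumptions made for illustration; the question does not say how many eigenvalues are actually needed.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigs

n = 6000  # same order as the matrix in the question
k = 6     # number of eigenpairs wanted (assumption)

# Stand-in test matrix (assumption): a diagonal with well-separated leading
# entries 1, 1/2, 1/3, ... plus a small random sparse perturbation.
diagonal = sp.diags(1.0 / np.arange(1, n + 1))
noise = 1e-3 * sp.random(n, n, density=0.001, format="csr", random_state=0)
a = (diagonal + noise).tocsr()

# k eigenvalues of largest magnitude and their eigenvectors.
# A fixed starting vector makes the run reproducible.
vals, vecs = eigs(a, k=k, which="LM", v0=np.ones(n))

# Sanity check: A v should equal lambda v for every returned pair.
residual = np.linalg.norm(a @ vecs - vecs * vals)
print("eigenvalues:", vals)
print("residual norm:", residual)
```

If the full spectrum really is required, this approach does not apply, since `eigs` needs `k` to be well below `n`.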



#104487

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2016-03-10 19:25 +1100
Message-ID: <56e12f93$0$11098$c3e8da3@news.astraweb.com>
In reply to: #104451

On Thursday 10 March 2016 07:09, Drimades wrote:

> I'm doing some tests with operations on numpy matrices in Python. As an
> example, it takes about 3000 seconds to compute eigenvalues and
> eigenvectors using scipy.linalg.eig(a) for a matrix 6000x6000. Is it an
> acceptable time? 

I don't know what counts as acceptable. Do you have a thousand of these 
systems to solve by next Tuesday? Or one a month? Can you adjust your 
workflow to start the calculation and then go off to lunch, or do you 
require interactive use?


> Any suggestions to improve? 

Use smaller matrices? :-) Use a faster computer?


This may give you some ideas:

https://www.ibm.com/developerworks/community/blogs/jfp/entry/A_Comparison_Of_C_Julia_Python_Numba_Cython_Scipy_and_BLAS_on_LU_Factorization?lang=en



> Does C++ perform better with matrices? 

Specifically on your computer? I don't know, you'll have to try it. The 
actual time taken by a program will depend on the hardware you run it on, 
not just the language it is written in.
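
One way to "try it" on a given machine, sketched under the assumption that a dense eigendecomposition scales roughly as n**3: time `scipy.linalg.eig` on a small random matrix and extrapolate to 6000x6000. The helper names and the small size of 300 are illustrative; small matrices tend to underestimate the real cost, so a larger sample size gives a more representative estimate.

```python
import time

import numpy as np
import scipy.linalg


def time_eig(n, seed=0):
    """Return the wall-clock seconds taken by scipy.linalg.eig on a random n x n matrix."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    start = time.perf_counter()
    scipy.linalg.eig(a)
    return time.perf_counter() - start


def extrapolate_cubic(t_small, n_small, n_large):
    """Scale a measured time to a larger size, assuming O(n**3) cost."""
    return t_small * (n_large / n_small) ** 3


n_small = 300
t_small = time_eig(n_small)
estimate = extrapolate_cubic(t_small, n_small, 6000)
print(f"eig on {n_small}x{n_small}: {t_small:.3f} s")
print(f"rough estimate for 6000x6000: {estimate:.0f} s")
```

Comparing that estimate with the 3000 seconds reported gives a first hint as to whether the installation is unusually slow.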


> Another thing to consider is that matrices I'm processing are
> heavily sparse. Do they implement any parallelism? While my code is
> running, one of my cores is 100% busy, the other one 30% busy.

You might get better answers for technical questions like that from 
dedicated numpy and scipy mailing lists.



-- 
Steve



#104853

From: Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date: 2016-03-14 18:35 +0000
Message-ID: <mailman.125.1457980527.12893.python-list@python.org>
In reply to: #104451

On 9 March 2016 at 20:09, Drimades <e.zhupa@gmail.com> wrote:
> I'm doing some tests with operations on numpy matrices in Python. As an example, it takes about 3000 seconds to compute eigenvalues and eigenvectors using scipy.linalg.eig(a) for a matrix 6000x6000. Is it an acceptable time?

I don't know really but you need to understand that numpy delegates
this kind of operation to the underlying BLAS library. It's possible
to have different BLAS libraries depending on how you installed numpy.
For example if you install numpy from
    http://www.lfd.uci.edu/~gohlke/pythonlibs/
then you will have a numpy that is linked with the Intel MKL library
for BLAS which I think is the same as used in e.g. Matlab and many
other things. Alternatively if you installed from the numpy
sourceforge page then you'll have the ATLAS BLAS library. If you're
using e.g. Ubuntu and installed numpy from the Ubuntu repos it's
possible that you're using numpy's vendored unoptimised BLAS library.

Each of these different BLAS libraries has different characteristics
in terms of accuracy and speed so it's worth knowing which one you're
actually using.
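
A minimal sketch for finding out which one is in use: `numpy.show_config()` prints the build configuration, including the BLAS and LAPACK libraries numpy was linked against (eig itself is a LAPACK routine sitting on top of BLAS). The wrapper function `blas_lapack_report` is a hypothetical helper that only captures the printed text; the layout of the output varies between numpy versions.

```python
import io
from contextlib import redirect_stdout

import numpy as np


def blas_lapack_report():
    """Capture the text printed by numpy.show_config() and return it as a string."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        np.show_config()
    return buf.getvalue()


report = blas_lapack_report()
print(report)  # look for names such as mkl, openblas or atlas in the output
```

`scipy.show_config()` gives the corresponding information for scipy, which may be linked differently from numpy.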

> Any suggestions to improve? Does C++ perform better with matrices?

If you were working in C++ you would still want to link to a BLAS
library to do this so I don't see why it would make any difference
except that it would require you to work out how to compile and use
BLAS directly and then link to it from your C++ code.

> Another thing to consider is that matrices I'm processing are heavily sparse.

Then you should definitely use something that is targeted at sparse
matrices (as suggested by Fabien). This can give a massive boost in
performance.

> Do they implement any parallelism? While my code is running, one of my cores is 100% busy, the other one 30% busy.

It sounds like the particular BLAS library you are using is not using
several cores for this workload. Different BLAS libraries have
different capabilities. Again you need to figure out which one you've
got and how it's compiled. It's possible that e.g. MKL has a parallel
eig function but that it is compiled with that behaviour disabled in
your setup.
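
One common way to probe this, sketched under the assumption that the BLAS build honours the usual thread-count environment variables (`OMP_NUM_THREADS`, `OPENBLAS_NUM_THREADS`, `MKL_NUM_THREADS`); which of them matters depends on the library in use. The helper `request_threads` is hypothetical, and the variables only take effect if they are set before numpy is first imported in the process.

```python
import os

# Variables read by OpenMP-based builds, OpenBLAS and MKL respectively.
THREAD_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def request_threads(n_threads):
    """Ask the BLAS runtime for n_threads; must run before numpy is imported."""
    for var in THREAD_VARS:
        os.environ[var] = str(n_threads)


n_threads = os.cpu_count() or 1
request_threads(n_threads)

import numpy as np  # imported only after the environment is set

for var in THREAD_VARS:
    print(var, "=", os.environ[var])
```

Rerunning the eig call with different values and watching the core usage shows whether the installed library responds at all.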

--
Oscar
