From: Oscar Benjamin
Newsgroups: comp.lang.python
Subject: Re: Improving performance in matrix operations
Date: Mon, 14 Mar 2016 18:35:05 +0000

On 9 March 2016 at 20:09, Drimades wrote:
> I'm doing some tests with operations on numpy matrices in Python. As an example, it takes about 3000 seconds to compute eigenvalues and eigenvectors using scipy.linalg.eig(a) for a matrix 6000x6000. Is it an acceptable time?

I don't know really, but you need to understand that numpy delegates this kind of operation to the underlying BLAS/LAPACK library. It's possible to have different BLAS libraries depending on how you installed numpy. For example, if you install numpy from http://www.lfd.uci.edu/~gohlke/pythonlibs/ then you will have a numpy that is linked with the Intel MKL library for BLAS, which I think is the same as used in e.g. Matlab and many other things. Alternatively, if you installed from the numpy sourceforge page then you'll have the ATLAS BLAS library. If you're using e.g. Ubuntu and installed numpy from the Ubuntu repos, it's possible that you're using numpy's vendored unoptimised BLAS library. Each of these BLAS libraries has different characteristics in terms of accuracy and speed, so it's worth knowing which one you're actually using.

> Any suggestions to improve? Does C++ perform better with matrices?

If you were working in C++ you would still want to link to a BLAS library to do this, so I don't see why it would make any difference, except that it would require you to work out how to compile BLAS and then link to it from your C++ code.

> Another thing to consider is that matrices I'm processing are heavily sparse.

Then you should definitely use something that is targeted at sparse matrices (as suggested by Fabien). This can give a massive boost in performance.

> Do they implement any parallelism? While my code is running, one of my cores is 100% busy, the other one 30% busy.

It sounds like the particular BLAS library you are using is not using several cores for this workload. Different BLAS libraries have different capabilities. Again, you need to figure out which one you've got and how it's compiled. It's possible that e.g. MKL has a parallel eig function but that it is compiled with that behaviour disabled in your setup.

--
Oscar
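
As a rough illustration of two of the points above (finding out which BLAS your numpy is linked against, and using a solver targeted at sparse matrices), here is a minimal sketch. It assumes scipy is installed and that you only need a few eigenvalues of largest magnitude, since the ARPACK-based scipy.sparse.linalg.eigs cannot return the full spectrum the way scipy.linalg.eig does. The helper names report_blas and largest_eigenvalues are made up for this example.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigs


def report_blas():
    """Print the BLAS/LAPACK build information for this numpy install."""
    # The output format differs between numpy versions, but it names the
    # libraries numpy was linked against (e.g. MKL, ATLAS, OpenBLAS).
    np.show_config()


def largest_eigenvalues(a, k=6):
    """Return the k eigenvalues of largest magnitude of a sparse square matrix.

    Hypothetical helper: assumes a few eigenvalues are enough. eigs requires
    k < n - 1 for an n x n matrix.
    """
    # eigs only needs matrix-vector products, so the matrix stays in its
    # sparse format and is never converted to a dense array.
    return eigs(a, k=k, which="LM", return_eigenvectors=False)
```

For example, with a matrix stored in CSR format, largest_eigenvalues(a, k=6) returns a complex array of six eigenvalues. If the matrix is known to be symmetric, scipy.sparse.linalg.eigsh is the usual choice instead. Whether this is faster than the dense routine depends on how sparse the matrix is and how many eigenvalues you really need.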