Memory error

Started by	Jamie Mitchell <jamiemitchell1604@gmail.com>
First post	2014-03-24 04:32 -0700
Last post	2014-03-25 08:53 +0100
Articles	4 — 3 participants

  Memory error Jamie Mitchell <jamiemitchell1604@gmail.com> - 2014-03-24 04:32 -0700
    Re: Memory error Jamie Mitchell <jamiemitchell1604@gmail.com> - 2014-03-24 04:39 -0700
    Re: Memory error Gary Herron <gary.herron@islandtraining.com> - 2014-03-24 13:37 -0700
    Re: Memory error dieter <dieter@handshake.de> - 2014-03-25 08:53 +0100

#68870 — Memory error

From	Jamie Mitchell <jamiemitchell1604@gmail.com>
Date	2014-03-24 04:32 -0700
Subject	Memory error
Message-ID	<d4489c2b-6650-45b6-bcb0-7a1b648192d2@googlegroups.com>

Hello all,

I'm afraid I am new to all this so bear with me...

I am looking to find the statistical significance between two large netCDF data sets.

Firstly I've loaded the two files into Python:

import netCDF4

swh=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/controlperiod/averages/swh_control_concat.nc', 'r')

swh_2050s=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/2050s/averages/swh_2050s_concat.nc', 'r')

I have then isolated the variables I want to perform the Pearson correlation on:

hs=swh.variables['hs']

hs_2050s=swh_2050s.variables['hs']

Here is the metadata for those files:

print hs
<type 'netCDF4.Variable'>
int16 hs(time, latitude, longitude)
    standard_name: significant_height_of_wind_and_swell_waves
    long_name: significant_wave_height
    units: m
    add_offset: 0.0
    scale_factor: 0.002
    _FillValue: -32767
    missing_value: -32767
unlimited dimensions: time
current shape = (86400, 350, 227)

print hs_2050s
<type 'netCDF4.Variable'>
int16 hs(time, latitude, longitude)
    standard_name: significant_height_of_wind_and_swell_waves
    long_name: significant_wave_height
    units: m
    add_offset: 0.0
    scale_factor: 0.002
    _FillValue: -32767
    missing_value: -32767
unlimited dimensions: time
current shape = (86400, 350, 227)


Then to perform the Pearson correlation:

from scipy.stats.stats import pearsonr

pearsonr(hs,hs_2050s)

I then get a memory error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/sci/lib/python2.7/site-packages/scipy/stats/stats.py", line 2409, in pearsonr
    x = np.asarray(x)
  File "/usr/local/sci/lib/python2.7/site-packages/numpy/core/numeric.py", line 321, in asarray
    return array(a, dtype, copy=False, order=order)
MemoryError

This also happens when I try to create numpy arrays from the data.

Does anyone know how I can alleviate these memory errors?

Cheers,

Jamie

#68871

From	Jamie Mitchell <jamiemitchell1604@gmail.com>
Date	2014-03-24 04:39 -0700
Message-ID	<81019e01-c101-4025-af2a-bea0c193230d@googlegroups.com>
In reply to	#68870

Just realised that Pearson correlation obviously requires two 1D arrays and mine are 3D. Silly mistake!
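
The fix can be sketched as follows. This is only a minimal illustration under assumptions, not the code actually run on the files: small random NumPy arrays (`control` and `future`, both hypothetical names) stand in for the two `hs` variables, and the grid index `[2, 3]` is arbitrary. `scipy.stats.pearsonr` expects two 1D sequences of equal length, so a 3D `(time, latitude, longitude)` array has to be reduced first, either by taking the time series at one grid point or by flattening everything.

```python
import numpy as np
from scipy.stats import pearsonr

# Small synthetic stand-ins for the two (time, latitude, longitude) variables.
rng = np.random.RandomState(0)
control = rng.rand(100, 4, 5)
future = control + 0.1 * rng.rand(100, 4, 5)

# Option 1: correlate the time series at a single grid point (both are 1D).
r_point, p_point = pearsonr(control[:, 2, 3], future[:, 2, 3])

# Option 2: flatten both arrays and correlate every value pairwise.
r_all, p_all = pearsonr(control.ravel(), future.ravel())

print("single grid point: r=%.3f" % r_point)
print("all values:        r=%.3f" % r_all)
```

With the real files, indexing a netCDF4 variable as `hs[:, i, j]` reads only that one time series (86400 values here) rather than the whole array, so option 1 should need very little memory. Option 2 on the full data would still require everything in memory at once.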

#68904

From	Gary Herron <gary.herron@islandtraining.com>
Date	2014-03-24 13:37 -0700
Message-ID	<mailman.8466.1395693999.18130.python-list@python.org>
In reply to	#68870

On 03/24/2014 04:32 AM, Jamie Mitchell wrote:
> Hello all,
>
> I'm afraid I am new to all this so bear with me...
>
> I am looking to find the statistical significance between two large netCDF data sets.
>
> Firstly I've loaded the two files into python:
>
> swh=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/controlperiod/averages/swh_control_concat.nc', 'r')
>
> swh_2050s=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/2050s/averages/swh_2050s_concat.nc', 'r')
>
> I have then isolated the variables I want to perform the pearson correlation on:
>
> hs=swh.variables['hs']
>
> hs_2050s=swh_2050s.variables['hs']

This is not really a Python question.  It's a question about netCDF 
(whatever that may be), or perhaps its Python interface, python-netCDF4.

You may get an answer here, but you are far more likely to get one 
quickly and accurately from a forum dedicated to netCDF or python-netCDF4.

Good luck.

Gary Herron

#68995

From	dieter <dieter@handshake.de>
Date	2014-03-25 08:53 +0100
Message-ID	<mailman.8506.1395734111.18130.python-list@python.org>
In reply to	#68870

Jamie Mitchell <jamiemitchell1604@gmail.com> writes:
> ...
> I then get a memory error:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/local/sci/lib/python2.7/site-packages/scipy/stats/stats.py", line 2409, in pearsonr
>     x = np.asarray(x)
>   File "/usr/local/sci/lib/python2.7/site-packages/numpy/core/numeric.py", line 321, in asarray
>     return array(a, dtype, copy=False, order=order)
> MemoryError

"MemoryError" means that Python cannot get sufficent memory
from the operating system.
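
To put rough numbers on this, the shape reported in the original post, (86400, 350, 227), can be turned into a memory estimate. The sketch below assumes 2 bytes per value for the stored int16 data and 8 bytes per value once it is converted to float64. The float64 figure is an assumption about what happens when `scale_factor` is applied on reading; the actual dtype may differ.

```python
# Shape taken from the "current shape" line in the original post.
shape = (86400, 350, 227)

n_values = 1
for dim in shape:
    n_values *= dim

GIB = 1024.0 ** 3
raw_gib = n_values * 2 / GIB    # int16 as stored on disk: 2 bytes per value
float_gib = n_values * 8 / GIB  # assumed float64 after scaling: 8 bytes per value

print("values per variable: %d" % n_values)
print("as int16:   %.1f GiB" % raw_gib)
print("as float64: %.1f GiB" % float_gib)
```

Under these assumptions a single variable holds about 6.9 billion values, roughly 13 GiB as int16 and about 51 GiB as float64, and there are two such variables. Converting either one to a full NumPy array is therefore likely to exceed the memory of most machines.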

You have already found out one mistake. Should you continue to
get "MemoryError" after this is fixed, then your system does not
provide enough resources (memory) to solve the problem at hand.
You would need to find a way to provide more resources.
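
An alternative to adding memory is to avoid loading everything at once. The following is a sketch of one possible approach, not something proposed in the thread: it accumulates running sums over blocks of time steps and computes Pearson's r from them at the end. The function name `chunked_pearson` and the `chunk` parameter are hypothetical. It assumes both inputs support slicing along the first axis (as NumPy arrays and netCDF4 variables do) and contain no missing values. It does not handle `_FillValue` entries and does not compute a p-value.

```python
import numpy as np

def chunked_pearson(a, b, chunk=1000):
    """Pearson r over all values of a and b, reading `chunk` time steps at a time."""
    n = 0
    sx = sy = sxx = syy = sxy = 0.0
    for start in range(0, a.shape[0], chunk):
        # Only this block of time steps is held in memory.
        x = np.asarray(a[start:start + chunk], dtype=np.float64).ravel()
        y = np.asarray(b[start:start + chunk], dtype=np.float64).ravel()
        n += x.size
        sx += x.sum()
        sy += y.sum()
        sxx += np.dot(x, x)
        syy += np.dot(y, y)
        sxy += np.dot(x, y)
    # Sums of squares and cross products about the means, from the running sums.
    cov = sxy - sx * sy / n
    var_x = sxx - sx * sx / n
    var_y = syy - sy * sy / n
    return cov / np.sqrt(var_x * var_y)

# Check against NumPy on small synthetic data that fits in memory.
rng = np.random.RandomState(1)
a = rng.rand(50, 3, 4)
b = 0.5 * a + rng.rand(50, 3, 4)
r_chunked = chunked_pearson(a, b, chunk=7)
r_direct = np.corrcoef(a.ravel(), b.ravel())[0, 1]
print("chunked: %.6f  direct: %.6f" % (r_chunked, r_direct))
```

The running-sum formula is simple but can lose precision when the number of values is very large or the mean is large relative to the spread, so results on the full data set should be treated with some caution.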
