| Started by | Jamie Mitchell <jamiemitchell1604@gmail.com> |
|---|---|
| First post | 2014-03-24 04:32 -0700 |
| Last post | 2014-03-25 08:53 +0100 |
| Articles | 4 — 3 participants |
Memory error Jamie Mitchell <jamiemitchell1604@gmail.com> - 2014-03-24 04:32 -0700
Re: Memory error Jamie Mitchell <jamiemitchell1604@gmail.com> - 2014-03-24 04:39 -0700
Re: Memory error Gary Herron <gary.herron@islandtraining.com> - 2014-03-24 13:37 -0700
Re: Memory error dieter <dieter@handshake.de> - 2014-03-25 08:53 +0100
| From | Jamie Mitchell <jamiemitchell1604@gmail.com> |
|---|---|
| Date | 2014-03-24 04:32 -0700 |
| Subject | Memory error |
| Message-ID | <d4489c2b-6650-45b6-bcb0-7a1b648192d2@googlegroups.com> |
Hello all,
I'm afraid I am new to all this so bear with me...
I am looking to find the statistical significance between two large netCDF data sets.
Firstly I've loaded the two files into python:
swh=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/controlperiod/averages/swh_control_concat.nc', 'r')
swh_2050s=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/2050s/averages/swh_2050s_concat.nc', 'r')
I have then isolated the variables I want to perform the pearson correlation on:
hs=swh.variables['hs']
hs_2050s=swh_2050s.variables['hs']
Here is the metadata for those files:
print hs
<type 'netCDF4.Variable'>
int16 hs(time, latitude, longitude)
standard_name: significant_height_of_wind_and_swell_waves
long_name: significant_wave_height
units: m
add_offset: 0.0
scale_factor: 0.002
_FillValue: -32767
missing_value: -32767
unlimited dimensions: time
current shape = (86400, 350, 227)
print hs_2050s
<type 'netCDF4.Variable'>
int16 hs(time, latitude, longitude)
standard_name: significant_height_of_wind_and_swell_waves
long_name: significant_wave_height
units: m
add_offset: 0.0
scale_factor: 0.002
_FillValue: -32767
missing_value: -32767
unlimited dimensions: time
current shape = (86400, 350, 227)
Then to perform the pearsons correlation:
from scipy.stats.stats import pearsonr
pearsonr(hs,hs_2050s)
I then get a memory error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/sci/lib/python2.7/site-packages/scipy/stats/stats.py", line 2409, in pearsonr
x = np.asarray(x)
File "/usr/local/sci/lib/python2.7/site-packages/numpy/core/numeric.py", line 321, in asarray
return array(a, dtype, copy=False, order=order)
MemoryError
This also happens when I try to create numpy arrays from the data.
Does anyone know how I can alleviate these memory errors?
Cheers,
Jamie
| From | Jamie Mitchell <jamiemitchell1604@gmail.com> |
|---|---|
| Date | 2014-03-24 04:39 -0700 |
| Message-ID | <81019e01-c101-4025-af2a-bea0c193230d@googlegroups.com> |
| In reply to | #68870 |
Just realised that obviously pearson correlation requires two 1D arrays and mine are 3D, silly mistake!
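Flattening both variables to 1D would still require loading them whole, so the memory problem remains. A minimal sketch of one possible workaround, assuming the goal is a single correlation coefficient over all elements: read both variables in slabs along the time axis and accumulate running sums. The function name `pearson_chunked` and the `chunk` parameter are hypothetical, not part of scipy or netCDF4. The sketch ignores `_FillValue`/masking and returns no p-value.

```python
import numpy as np

def pearson_chunked(x, y, chunk=100):
    """Pearson r over all elements of x and y, read in slabs along axis 0.

    x and y only need a .shape attribute and slice indexing, so NumPy
    arrays work, and netCDF4 variables should too (assumption).
    """
    n = 0
    sx = sy = sxx = syy = sxy = 0.0
    for start in range(0, x.shape[0], chunk):
        # Only one slab of each variable is held in memory at a time.
        a = np.asarray(x[start:start + chunk], dtype=np.float64).ravel()
        b = np.asarray(y[start:start + chunk], dtype=np.float64).ravel()
        n += a.size
        sx += a.sum()
        sy += b.sum()
        sxx += np.dot(a, a)
        syy += np.dot(b, b)
        sxy += np.dot(a, b)
    # Textbook sum-of-products formula; it can lose precision when the
    # mean is large relative to the spread of the data.
    cov = sxy - sx * sy / n
    var_x = sxx - sx * sx / n
    var_y = syy - sy * sy / n
    return cov / np.sqrt(var_x * var_y)
```

With the variables from the original post this would be called as `pearson_chunked(hs, hs_2050s, chunk=100)`. A slab of 100 time steps is 100 x 350 x 227 float64 values, roughly 64 MB per variable.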
| From | Gary Herron <gary.herron@islandtraining.com> |
|---|---|
| Date | 2014-03-24 13:37 -0700 |
| Message-ID | <mailman.8466.1395693999.18130.python-list@python.org> |
| In reply to | #68870 |
On 03/24/2014 04:32 AM, Jamie Mitchell wrote:
> Hello all,
>
> I'm afraid I am new to all this so bear with me...
>
> I am looking to find the statistical significance between two large netCDF data sets.
>
> Firstly I've loaded the two files into python:
>
> swh=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/controlperiod/averages/swh_control_concat.nc', 'r')
>
> swh_2050s=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/2050s/averages/swh_2050s_concat.nc', 'r')
>
> I have then isolated the variables I want to perform the pearson correlation on:
>
> hs=swh.variables['hs']
>
> hs_2050s=swh_2050s.variables['hs']
This is not really a Python question. It's a question about netCDF
(whatever that may be), or perhaps its interface to Python, python-netCDF4.
You may get an answer here, but you are far more likely to get one
quickly and accurately from a forum dedicated to netCDF or python-netCDF4.
Good luck.
Gary Herron
| From | dieter <dieter@handshake.de> |
|---|---|
| Date | 2014-03-25 08:53 +0100 |
| Message-ID | <mailman.8506.1395734111.18130.python-list@python.org> |
| In reply to | #68870 |
Jamie Mitchell <jamiemitchell1604@gmail.com> writes:
> ...
> I then get a memory error:
>
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File "/usr/local/sci/lib/python2.7/site-packages/scipy/stats/stats.py", line 2409, in pearsonr
> x = np.asarray(x)
> File "/usr/local/sci/lib/python2.7/site-packages/numpy/core/numeric.py", line 321, in asarray
> return array(a, dtype, copy=False, order=order)
> MemoryError
"MemoryError" means that Python cannot get sufficient memory from the operating system.
You have already found out one mistake. Should you continue to get "MemoryError" after this is fixed, then your system does not provide enough resources (memory) to solve the problem at hand. You would need to find a way to provide more resources.
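For a rough sense of how much memory is involved, the size of one fully loaded variable can be estimated from the shape reported in the original post. This is a back-of-the-envelope sketch, not a measurement. It assumes the data is held either as the packed int16 values or as 64-bit floats after `scale_factor` is applied (the float64 case is an assumption about how the values get unpacked).

```python
import numpy as np

shape = (86400, 350, 227)  # "current shape" from the posted metadata
n_elements = int(np.prod(shape, dtype=np.int64))

# Bytes needed to hold every element of one variable in memory.
bytes_int16 = n_elements * np.dtype(np.int16).itemsize
bytes_float64 = n_elements * np.dtype(np.float64).itemsize

gib = 1024.0 ** 3
print("elements:   %d" % n_elements)
print("as int16:   %.1f GiB" % (bytes_int16 / gib))
print("as float64: %.1f GiB" % (bytes_float64 / gib))
```

That is about 6.9 billion elements per variable: close to 13 GiB as int16 and around 51 GiB as float64, and `pearsonr` needs both variables at once.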