

Groups > comp.lang.python > #73057 > unrolled thread

lists vs. NumPy arrays for sets of dates and strings

Started by: beliavsky@aol.com
First post: 2014-06-09 12:48 -0700
Last post: 2014-06-10 09:18 +0200
Articles: 3 (3 participants)



Contents

  lists vs. NumPy arrays for sets of dates and strings beliavsky@aol.com - 2014-06-09 12:48 -0700
    Re: lists vs. NumPy arrays for sets of dates and strings Denis McMahon <denismfmcmahon@gmail.com> - 2014-06-09 20:50 +0000
    Re: lists vs. NumPy arrays for sets of dates and strings Peter Otten <__peter__@web.de> - 2014-06-10 09:18 +0200

#73057 — lists vs. NumPy arrays for sets of dates and strings

From: beliavsky@aol.com
Date: 2014-06-09 12:48 -0700
Subject: lists vs. NumPy arrays for sets of dates and strings
Message-ID: <9b60eed1-1924-43d8-a92c-43b792118ebb@googlegroups.com>

I am going to read a multivariate time series from a CSV file that looks like

Date,A,B
2014-01-01,10.0,20.0
2014-01-02,10.1,19.9
...

The numerical data I will store in a NumPy array, since they are more convenient to work with than lists of lists. What are the advantages and disadvantages of storing the symbols [A,B] and dates [2014-01-01,2014-01-02] as lists vs. NumPy arrays?
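
To make the two options concrete, here is a minimal sketch (not part of the original post) that holds the labels both ways next to the numerical array. The in-memory `CSV_TEXT` string and all variable names are assumptions made for illustration; a real file would be opened with `open()` instead.

```python
import csv
import io

import numpy as np

# Sample text mirroring the CSV layout above (stands in for a real file).
CSV_TEXT = """Date,A,B
2014-01-01,10.0,20.0
2014-01-02,10.1,19.9
"""

reader = csv.reader(io.StringIO(CSV_TEXT))
header = next(reader)
rows = list(reader)

# Option 1: plain Python lists for the labels.
symbols_list = header[1:]
dates_list = [row[0] for row in rows]

# Option 2: NumPy arrays for the same labels.
symbols_arr = np.array(symbols_list)
dates_arr = np.array(dates_list, dtype="datetime64[D]")

# Numerical data as a 2-D float array, one row per date.
data = np.array([[float(v) for v in row[1:]] for row in rows])

# Lists: look up a column position by label.
col = symbols_list.index("B")
print(data[:, col])

# Arrays: vectorized comparisons give boolean masks that index the data.
mask = dates_arr >= np.datetime64("2014-01-02")
print(data[mask])
print(data[:, symbols_arr == "A"])
```

Roughly, lists are easy to append to and search with `.index()`, while arrays allow comparisons over all labels at once and give real date arithmetic through `datetime64`.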



#73058

From: Denis McMahon <denismfmcmahon@gmail.com>
Date: 2014-06-09 20:50 +0000
Message-ID: <ln56lr$93m$1@dont-email.me>
In reply to: #73057

On Mon, 09 Jun 2014 12:48:12 -0700, beliavsky wrote:

> I am going to read a multivariate time series from a CSV file that looks
> like
> 
> Date,A,B
> 2014-01-01,10.0,20.0
> 2014-01-02,10.1,19.9
> ...
> 
> The numerical data I will store in a NumPy array, since they are more
> convenient to work with than lists of lists. What are the advantages and
> disadvantages of storing the symbols [A,B] and dates
> [2014-01-01,2014-01-02] as lists vs. NumPy arrays?

You could also use a dictionary of either lists or tuples or even NumPy 
arrays keyed on the date.

$ python
Python 2.7.3 (default, Feb 27 2014, 19:58:35) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> x = {}
>>> y = numpy.array( [0,1] )
>>> x['2014-06-05'] = y
>>> x['2014-06-05']
array([0, 1])
>>> x
{'2014-06-05': array([0, 1])}
>>> x['2014-06-05'][0]
0
>>> x['2014-06-05'][1]
1
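
The same idea applied to the CSV layout from the question might look like the sketch below. It is only an illustration under assumptions: the `CSV_TEXT` string stands in for the real file, and the names `by_date` and `symbols` are made up here.

```python
import csv
import io

import numpy as np

# Sample text mirroring the CSV layout from the question (stands in for a file).
CSV_TEXT = """Date,A,B
2014-01-01,10.0,20.0
2014-01-02,10.1,19.9
"""

reader = csv.reader(io.StringIO(CSV_TEXT))
symbols = next(reader)[1:]  # column labels without the leading "Date"

# Map each date string to a NumPy array holding that row's values.
by_date = {}
for row in reader:
    by_date[row[0]] = np.array([float(v) for v in row[1:]])

# Look up one value by date and symbol.
print(by_date["2014-01-02"][symbols.index("A")])
```

A dictionary gives direct lookup by date, but column-wise operations across all dates would need the rows to be stacked back into a single array first.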

-- 
Denis McMahon, denismfmcmahon@gmail.com



#73074

From: Peter Otten <__peter__@web.de>
Date: 2014-06-10 09:18 +0200
Message-ID: <mailman.10943.1402384746.18130.python-list@python.org>
In reply to: #73057

beliavsky@aol.com.dmarc.invalid wrote:

> I am going to read a multivariate time series from a CSV file that looks
> like
> 
> Date,A,B
> 2014-01-01,10.0,20.0
> 2014-01-02,10.1,19.9
> ...
> 
> The numerical data I will store in a NumPy array, since they are more
> convenient to work with than lists of lists. What are the advantages and
> disadvantages of storing the symbols [A,B] and dates
> [2014-01-01,2014-01-02] as lists vs. NumPy arrays?

If you don't mind the NumPy dependency, I can't see any disadvantages.
You might also have a look at pandas:

>>> import io
>>> import pandas
>>> import pylab
>>> ts = pandas.read_csv(io.StringIO("""\
... Date,A,B
... 2014-01-01,10.0,20.0
... 2014-01-02,10.1,19.9
... """), parse_dates=[0])
>>> ts
                 Date     A     B
0 2014-01-01 00:00:00  10.0  20.0
1 2014-01-02 00:00:00  10.1  19.9
>>> ts["A"]
0    10.0
1    10.1
Name: A, dtype: float64
>>> ts["Date"]
0   2014-01-01 00:00:00
1   2014-01-02 00:00:00
Name: Date, dtype: datetime64[ns]
>>> ts["Date"][0]
Timestamp('2014-01-01 00:00:00', tz=None)
>>> pylab.show(ts.plot(x="Date", y=["A", "B"]))
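
As a hedged follow-up sketch (not from the session above), the dates can also be used as the index, and the labels and values pulled back out as separate objects. This assumes a reasonably recent pandas in which `DataFrame.to_numpy()` is available; `CSV_TEXT` and the variable names are illustrative only.

```python
import io

import pandas as pd

# Sample text mirroring the CSV layout from the question (stands in for a file).
CSV_TEXT = """Date,A,B
2014-01-01,10.0,20.0
2014-01-02,10.1,19.9
"""

# Use the Date column as the index and parse it into timestamps.
ts = pd.read_csv(io.StringIO(CSV_TEXT), index_col="Date", parse_dates=True)

symbols = list(ts.columns)  # column labels as a plain list
dates = ts.index            # DatetimeIndex holding the parsed dates
values = ts.to_numpy()      # 2-D float array of the numerical data

# Label-based lookup by date and symbol.
print(ts.loc[pd.Timestamp("2014-01-02"), "A"])
```

With this layout pandas keeps the labels and the numbers aligned, so there is no separate list or array of dates to keep in sync by hand.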


