| Started by | beliavsky@aol.com |
|---|---|
| First post | 2014-06-09 12:48 -0700 |
| Last post | 2014-06-10 09:18 +0200 |
| Articles | 3 — 3 participants |
lists vs. NumPy arrays for sets of dates and strings beliavsky@aol.com - 2014-06-09 12:48 -0700
Re: lists vs. NumPy arrays for sets of dates and strings Denis McMahon <denismfmcmahon@gmail.com> - 2014-06-09 20:50 +0000
Re: lists vs. NumPy arrays for sets of dates and strings Peter Otten <__peter__@web.de> - 2014-06-10 09:18 +0200
| From | beliavsky@aol.com |
|---|---|
| Date | 2014-06-09 12:48 -0700 |
| Subject | lists vs. NumPy arrays for sets of dates and strings |
| Message-ID | <9b60eed1-1924-43d8-a92c-43b792118ebb@googlegroups.com> |
I am going to read a multivariate time series from a CSV file that looks like

Date,A,B
2014-01-01,10.0,20.0
2014-01-02,10.1,19.9
...

The numerical data I will store in a NumPy array, since they are more convenient to work with than lists of lists. What are the advantages and disadvantages of storing the symbols [A,B] and dates [2014-01-01,2014-01-02] as lists vs. NumPy arrays?
| From | Denis McMahon <denismfmcmahon@gmail.com> |
|---|---|
| Date | 2014-06-09 20:50 +0000 |
| Message-ID | <ln56lr$93m$1@dont-email.me> |
| In reply to | #73057 |
On Mon, 09 Jun 2014 12:48:12 -0700, beliavsky wrote:
> I am going to read a multivariate time series from a CSV file that looks
> like
>
> Date,A,B
> 2014-01-01,10.0,20.0
> 2014-01-02,10.1,19.9
> ...
>
> The numerical data I will store in a NumPy array, since they are more
> convenient to work with than lists of lists. What are the advantages and
> disadvantages of storing the symbols [A,B] and dates
> [2014-01-01,2014-01-02] as lists vs. NumPy arrays?
You could also use a dictionary of either lists or tuples or even NumPy
arrays keyed on the date.
$ python
Python 2.7.3 (default, Feb 27 2014, 19:58:35)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> x = {}
>>> y = numpy.array( [0,1] )
>>> x['2014-06-05'] = y
>>> x['2014-06-05']
array([0, 1])
>>> x
{'2014-06-05': array([0, 1])}
>>> x['2014-06-05'][0]
0
>>> x['2014-06-05'][1]
1
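The dictionary idea above can be extended to the CSV layout from the question. What follows is only a minimal sketch, assuming Python 3 (the session above is Python 2.7) and the two sample rows from the original post; the helper name `read_rows_by_date` is hypothetical and not part of any library.

```python
import csv
import io

import numpy as np

# Sample data copied from the original question.
CSV_TEXT = """\
Date,A,B
2014-01-01,10.0,20.0
2014-01-02,10.1,19.9
"""


def read_rows_by_date(text):
    """Return (symbols, rows); rows maps a date string to a float array.

    Hypothetical helper written for this illustration only.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    symbols = header[1:]  # column names after "Date", kept as a plain list
    rows = {}
    for record in reader:
        if not record:  # skip blank lines
            continue
        # Key on the date string; store that day's values as a NumPy array.
        rows[record[0]] = np.array(record[1:], dtype=float)
    return symbols, rows


symbols, rows = read_rows_by_date(CSV_TEXT)
print(symbols)
print(rows["2014-01-02"][symbols.index("B")])
```

A layout like this makes lookup by date cheap, but column-wise operations across all dates would need the per-date arrays to be stacked back into one 2-D array first.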
--
Denis McMahon, denismfmcmahon@gmail.com
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2014-06-10 09:18 +0200 |
| Message-ID | <mailman.10943.1402384746.18130.python-list@python.org> |
| In reply to | #73057 |
beliavsky@aol.com.dmarc.invalid wrote:
> I am going to read a multivariate time series from a CSV file that looks
> like
>
> Date,A,B
> 2014-01-01,10.0,20.0
> 2014-01-02,10.1,19.9
> ...
>
> The numerical data I will store in a NumPy array, since they are more
> convenient to work with than lists of lists. What are the advantages and
> disadvantages of storing the symbols [A,B] and dates
> [2014-01-01,2014-01-02] as lists vs. NumPy arrays?
If you don't mind the numpy dependency, I can't see any disadvantages.
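As a rough illustration of what storing the labels in NumPy arrays could look like, here is a small sketch using the two sample rows from the question. The variable names are assumptions made for this example, and the `datetime64[D]` dtype is one possible choice rather than something the thread prescribes.

```python
import numpy as np

# Labels and values from the question's sample rows.
symbols = np.array(["A", "B"])
dates = np.array(["2014-01-01", "2014-01-02"], dtype="datetime64[D]")
values = np.array([[10.0, 20.0],
                   [10.1, 19.9]])

# Boolean mask over the dates selects matching rows of the value array.
mask = dates >= np.datetime64("2014-01-02")
late = values[mask]

# Comparing the symbol array with a label selects the matching column.
col_b = values[:, symbols == "B"].ravel()

# Date arithmetic works element-wise on datetime64 arrays.
gaps = np.diff(dates)

print(late)
print(col_b)
print(gaps)
```

With plain lists the same selections would go through `symbols.index("B")` or a list comprehension over the dates. One thing to keep in mind is that NumPy string arrays have a fixed item width, so a list may be simpler if symbol names are changed after the array is created.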
You might also have a look at pandas:
>>> import io, pandas, pylab
>>> ts = pandas.read_csv(io.StringIO("""\
... Date,A,B
... 2014-01-01,10.0,20.0
... 2014-01-02,10.1,19.9
... """), parse_dates=[0])
>>> ts
                  Date     A     B
0  2014-01-01 00:00:00  10.0  20.0
1  2014-01-02 00:00:00  10.1  19.9
>>> ts["A"]
0    10.0
1    10.1
Name: A, dtype: float64
>>> ts["Date"]
0   2014-01-01 00:00:00
1   2014-01-02 00:00:00
Name: Date, dtype: datetime64[ns]
>>> ts["Date"][0]
Timestamp('2014-01-01 00:00:00', tz=None)
>>> pylab.show(ts.plot(x="Date", y=["A", "B"]))
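A possible variation on the session above, not taken from the thread: using the `Date` column as the index keeps the dates and symbols attached to the numbers as row and column labels. This is a sketch assuming a reasonably recent pandas and Python 3; the name `CSV_TEXT` is introduced here for illustration.

```python
import io

import pandas as pd

# Sample data copied from the original question.
CSV_TEXT = """\
Date,A,B
2014-01-01,10.0,20.0
2014-01-02,10.1,19.9
"""

# Parse the Date column and use it as the row index.
ts = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["Date"], index_col="Date")

# Symbols are the column labels, dates are the row labels.
symbols = list(ts.columns)
first_date = ts.index[0]

# Look up one value by date and symbol.
b_on_second = ts.loc[pd.Timestamp("2014-01-02"), "B"]

# The underlying numbers are still available as a plain NumPy array.
values = ts.to_numpy()

print(symbols, first_date, b_on_second)
print(values)
```

Here the labels never need to be stored separately, which sidesteps the list-versus-array choice for them at the cost of the extra pandas dependency.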