python financial data cleaning

Started by: Sebastian M Cheung <minscheung@googlemail.com>
First post: 2015-06-15 03:12 -0700
Last post: 2015-06-15 06:42 -0700
Articles 7 — 3 participants



Contents

  python financial data cleaning Sebastian M Cheung <minscheung@googlemail.com> - 2015-06-15 03:12 -0700
    Re: python financial data cleaning Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-06-15 11:19 +0100
      Re: python financial data cleaning Sebastian M Cheung <minscheung@googlemail.com> - 2015-06-15 14:01 -0700
        Re: python financial data cleaning Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-06-16 20:40 +0100
    Re: python financial data cleaning Sebastian M Cheung <minscheung@googlemail.com> - 2015-06-15 03:59 -0700
    Re: python financial data cleaning Laura Creighton <lac@openend.se> - 2015-06-15 13:34 +0200
      Re: python financial data cleaning Sebastian M Cheung <minscheung@googlemail.com> - 2015-06-15 06:42 -0700

#92623 — python financial data cleaning

From: Sebastian M Cheung <minscheung@googlemail.com>
Date: 2015-06-15 03:12 -0700
Subject: python financial data cleaning
Message-ID: <b0cbc75c-cc0b-4f27-a8d6-e4c20c356e6d@googlegroups.com>
How do I clean financial data? Say I have a list of 1000 financial series data points in myList = Open, High, Low and Close. For missing Close price data, what is the best practice for cleaning the data in Python?
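
One way to picture the question is a forward fill over a plain list. The sketch below is only an illustration under stated assumptions: the row layout (Open, High, Low, Close), the use of None for a missing Close, the sample prices, and the helper name forward_fill_close are all hypothetical and not taken from the post. Forward filling is one common choice, not necessarily the best practice for every data set.

```python
# Hypothetical rows of (Open, High, Low, Close); None marks a missing Close.
my_list = [
    (10.0, 10.6, 9.9, 10.4),
    (10.5, 10.9, 10.1, None),
    (10.2, 10.7, 10.0, 10.6),
    (10.8, 11.0, 10.5, None),
]


def forward_fill_close(rows):
    """Return a copy of rows with each missing Close replaced by the last known Close."""
    filled = []
    last_close = None
    for open_, high, low, close in rows:
        if close is None:
            close = last_close  # stays None if no earlier Close exists
        else:
            last_close = close
        filled.append((open_, high, low, close))
    return filled


cleaned = forward_fill_close(my_list)
```

A missing Close in the very first row has nothing to copy from, so it is left as None and would need a separate decision (drop the row, or fill backwards).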



#92624

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2015-06-15 11:19 +0100
Message-ID: <mailman.480.1434363576.13271.python-list@python.org>
In reply to: #92623
http://pandas.pydata.org/

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



#92643

From: Sebastian M Cheung <minscheung@googlemail.com>
Date: 2015-06-15 14:01 -0700
Message-ID: <55a0ccda-572c-4324-9169-08028d4fb619@googlegroups.com>
In reply to: #92624
Hi Mark,

Below I read DirtyData (financial data) in from Excel and then count the missing (NaN) values in each column, including the Close price:

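import pandas as pd

# Parse the 'Dirty Data' sheet, treating 'NA' cells as missing
xls = pd.ExcelFile('DirtyData.xlsm')
df = xls.parse('Dirty Data', index_col=None, na_values=['NA'])
print(df.isnull().sum())  # number of missing values per column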

So if I were to clean missing Open price data, I could copy the previous row's Close price, but how would I implement that? Thanks
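
A minimal pandas sketch of that idea, assuming the sheet has columns named "Open" and "Close" (the thread never shows the real column names) and using made-up prices in place of DirtyData.xlsm. It relies on Series.shift and Series.fillna, which align on the row index.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the 'Dirty Data' sheet.
df = pd.DataFrame(
    {
        "Open": [10.0, np.nan, 10.7, np.nan],
        "Close": [10.4, 10.6, 10.9, 11.1],
    }
)

# shift(1) moves each Close down one row, so row i sees row i-1's Close.
previous_close = df["Close"].shift(1)

# Fill only the missing Open values; existing Open values are left alone.
df["Open"] = df["Open"].fillna(previous_close)
```

The first row has no previous row, so a missing Open there would stay NaN. The same applies when the previous Close is itself missing.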



#92696

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2015-06-16 20:40 +0100
Message-ID: <mailman.525.1434483634.13271.python-list@python.org>
In reply to: #92643
I'm sorry, but my knowledge of pandas is limited; I just know it's pretty
much best of breed. Try Stack Overflow or
https://groups.google.com/forum/#!forum/pydata, which is gated to
gmane.comp.python.pydata.

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



#92625

From: Sebastian M Cheung <minscheung@googlemail.com>
Date: 2015-06-15 03:59 -0700
Message-ID: <18f75f78-2a6b-4e3a-b3b8-3811b47ff3e4@googlegroups.com>
In reply to: #92623
Thanks Mark, I'm just looking into pandas now.



Seb



#92627

From: Laura Creighton <lac@openend.se>
Date: 2015-06-15 13:34 +0200
Message-ID: <mailman.482.1434368106.13271.python-list@python.org>
In reply to: #92623
I don't know anything about this program, and in particular how
complete it is, but it may be worth a look:
https://github.com/benjaminmgross/clean-fin-data

Laura



#92632

From: Sebastian M Cheung <minscheung@googlemail.com>
Date: 2015-06-15 06:42 -0700
Message-ID: <907d123b-6eb0-4268-b82b-6388a0c2e7b5@googlegroups.com>
In reply to: #92627
Thanks Laura, I will check it out. Basically, the goal is to clean financial series data, as sometimes closing prices are missing, there is misalignment of some sort, etc.
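
The post does not say what kind of misalignment is meant. If it means trading days that are absent from the series altogether, one possible check is to reindex against an expected calendar. The sketch below assumes a date-indexed Close series with made-up values and uses plain business days, which ignores exchange holidays.

```python
import pandas as pd

# Hypothetical Close series with one trading day (2015-06-03) absent.
close = pd.Series(
    [10.4, 10.6, 10.9],
    index=pd.to_datetime(["2015-06-01", "2015-06-02", "2015-06-04"]),
    name="Close",
)

# Build the expected calendar of business days (ignores exchange holidays).
expected_days = pd.bdate_range(close.index.min(), close.index.max())

# Reindexing inserts NaN rows for days that the data skipped.
aligned = close.reindex(expected_days)
missing_days = aligned[aligned.isna()].index
```

Once the gaps show up as NaN rows, they can be counted or filled in the same way as any other missing Close value.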


