| Started by | Grzegorz Staniak <gstaniak@gmail.com> |
|---|---|
| First post | 2012-03-23 16:43 +0000 |
| Last post | 2012-03-24 05:21 +0000 |
| Articles | 3 — 2 participants |
Data mining/pattern recognition software in Python? Grzegorz Staniak <gstaniak@gmail.com> - 2012-03-23 16:43 +0000
Re: Data mining/pattern recognition software in Python? Jon Clements <joncle@googlemail.com> - 2012-03-23 20:10 -0700
Re: Data mining/pattern recognition software in Python? Grzegorz Staniak <gstaniak@gmail.com> - 2012-03-24 05:21 +0000
| From | Grzegorz Staniak <gstaniak@gmail.com> |
|---|---|
| Date | 2012-03-23 16:43 +0000 |
| Subject | Data mining/pattern recognition software in Python? |
| Message-ID | <jki97s$66e$1@mx1.internetia.pl> |
Hello,

I've been asked by a colleague for help with a small educational project that involves recognizing patterns in a live feed of data points (readings from a measuring appliance), followed by a more general search for patterns in archival data. The language of preference is Python, since the lab already uses software written in Python.

I can see there are packages such as OpenCV, scikit-learn and Orange that could perhaps be of use for the mining phase -- and even if they are slanted towards image pattern recognition, I think I'd be able to find an appropriate package for the time-series analyses.

But I'm wondering about the "live" phase -- what approach would you suggest? I wouldn't want to reinvent the wheel; perhaps there are already packages/modules that could be used to read data in a loop, e.g. every 10 seconds, maintain a buffer of 15 readings, and ring a bell when the data in the buffer form a specific pattern (a spike, a trough, whatever)?

I'll be grateful for a push in the right direction. Thanks,

GS
--
Grzegorz Staniak <gstaniak _at_ gmail [dot] com>
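The "buffer of 15 readings" idea from the post can be sketched with the standard library alone. This is only an illustration under stated assumptions, not a package recommendation: the names `is_spike` and `watch`, the default threshold of 3 standard deviations, and the definition of a "spike" as the newest reading sitting far from the mean of the earlier ones are all hypothetical choices. A live loop would pull one reading every 10 seconds; here the readings come from any iterable so the logic can be tried offline.

```python
from collections import deque
from statistics import mean, stdev

BUFFER_SIZE = 15  # number of readings kept, as in the post


def is_spike(window, threshold=3.0):
    """Return True if the newest reading deviates from the mean of the
    earlier readings by more than `threshold` standard deviations.
    (Assumed definition of a spike, chosen for illustration only.)"""
    if len(window) < BUFFER_SIZE:
        return False  # buffer not full yet, so no verdict
    *history, latest = window
    spread = stdev(history)
    if spread == 0:
        # perfectly flat history: any change at all counts as a spike
        return latest != history[0]
    return abs(latest - mean(history)) > threshold * spread


def watch(readings, threshold=3.0):
    """Feed readings through a fixed-size buffer and yield
    (index, value) for every reading flagged as a spike."""
    window = deque(maxlen=BUFFER_SIZE)  # oldest reading drops out automatically
    for i, value in enumerate(readings):
        window.append(value)
        if is_spike(window, threshold):
            yield i, value
```

A `deque` with `maxlen` discards the oldest item on each append, which is what makes it a convenient ring buffer here. A trough falls under the same test, since the comparison uses the absolute deviation.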
| From | Jon Clements <joncle@googlemail.com> |
|---|---|
| Date | 2012-03-23 20:10 -0700 |
| Message-ID | <30719938.516.1332558628851.JavaMail.geo-discussion-forums@vbut24> |
| In reply to | #22086 |
On Friday, 23 March 2012 16:43:40 UTC, Grzegorz Staniak wrote:
> Hello,
>
> I've been asked by a colleague for help in a small educational
> project, which would involve the recognition of patterns in a live
> feed of data points (readings from a measuring appliance), and then
> a more general search for patterns on archival data. The language
> of preference is Python, since the lab uses software written in
> Python already. I can see there are packages like Open CV,
> scikit-learn, Orange that could perhaps be of use for the mining
> phase -- and even if they are slanted towards image pattern
> recognition, I think I'd be able to find an appropriate package
> for the timeseries analyses. But I'm wondering about the "live"
> phase -- what approach would you suggest? I wouldn't want to
> force an open door, perhaps there are already packages/modules that
> could be used to read data in a loop i.e. every 10 seconds,
> maintain a buffer of 15 readings and ring a bell when the data
> in buffer form a specific pattern (a spike, a trough, whatever)?
>
> I'll be grateful for a push in the right direction. Thanks,
>
> GS
> --
> Grzegorz Staniak <gstaniak _at_ gmail [dot] com>

It might also be worth checking out pandas[1] and scikits.statsmodels[2].

In terms of reading data in a loop, I would probably go for a producer-consumer model (possibly using a Queue[3]). Have the producer constantly try to get another reading and notify the consumer, which can then determine whether it has enough data to calculate a peak/trough. This article is also a fairly good read[4].

Those are some pointers anyway,

hth,

Jon.

[1] http://pandas.pydata.org/
[2] http://statsmodels.sourceforge.net/
[3] http://docs.python.org/library/queue.html
[4] http://www.laurentluce.com/posts/python-threads-synchronization-locks-rlocks-semaphores-conditions-events-and-queues/
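The producer-consumer suggestion might look roughly like the sketch below, using `queue.Queue` (the module was spelled `Queue` in the Python 2 of the time) and one consumer thread. Everything beyond the queue itself is an assumption made for illustration: the function names, the `_DONE` sentinel used to stop the consumer, and the crude peak rule (a full buffer and a reading that exceeds the buffer's maximum by more than `jump`).

```python
import queue
import threading
from collections import deque

_DONE = object()  # hypothetical sentinel telling the consumer to stop


def producer(source, q):
    """Push each reading from `source` (any iterable) onto the queue."""
    for reading in source:
        q.put(reading)
        # a live feed would wait between readings here, e.g. time.sleep(10)
    q.put(_DONE)


def consumer(q, alerts, size=15, jump=5.0):
    """Keep the last `size` readings and record any reading that exceeds
    the current buffer maximum by more than `jump` (assumed peak rule)."""
    window = deque(maxlen=size)
    while True:
        reading = q.get()  # blocks until the producer delivers something
        if reading is _DONE:
            break
        if len(window) == size and reading - max(window) > jump:
            alerts.append(reading)
        window.append(reading)


def run(source):
    """Wire one producer to one consumer thread and return the alerts."""
    q = queue.Queue()
    alerts = []
    worker = threading.Thread(target=consumer, args=(q, alerts))
    worker.start()
    producer(source, q)
    worker.join()  # consumer exits once it sees the sentinel
    return alerts
```

The blocking `q.get()` is what does the "notify": the consumer sleeps until a reading arrives, so no separate lock or condition variable is needed for this simple case.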
| From | Grzegorz Staniak <gstaniak@gmail.com> |
|---|---|
| Date | 2012-03-24 05:21 +0000 |
| Message-ID | <jkjlk5$jfa$1@mx1.internetia.pl> |
| In reply to | #22109 |
On 24.03.2012, Jon Clements <joncle@googlemail.com> wrote:
> It might also be worth checking out pandas[1] and scikits.statsmodels[2].
>
> In terms of reading data in a loop I would probably go for a producer-consumer model (possibly using a Queue[3]). Have the producer constantly try to get another reading, and notify the consumer which can then determine if it's got enough data to calculate a peak/trough. This article is also a fairly good read[4].
>
> That's some pointers anyway,
>
> hth,
>
> Jon.
>
> [1] http://pandas.pydata.org/
> [2] http://statsmodels.sourceforge.net/
> [3] http://docs.python.org/library/queue.html
> [4] http://www.laurentluce.com/posts/python-threads-synchronization-locks-rlocks-semaphores-conditions-events-and-queues/

Thanks for the suggestions.

GS
--
Grzegorz Staniak <gstaniak _at_ gmail [dot] com>