| Path | csiph.com!fu-berlin.de!uni-berlin.de!not-for-mail |
|---|---|
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
| Newsgroups | comp.lang.python |
| Subject | Re: Python Data Analysis Recommendations |
| Date | Fri, 1 Jan 2016 21:24:27 +0000 |
| Lines | 63 |
| Message-ID | <mailman.151.1451683499.11925.python-list@python.org> (permalink) |
| References | <n63nrs$5c0$1@dont-email.me> |
| Mime-Version | 1.0 |
| Content-Type | text/plain; charset=windows-1252; format=flowed |
| Content-Transfer-Encoding | 7bit |
| In-Reply-To | <n63nrs$5c0$1@dont-email.me> |
On 31/12/2015 17:15, Rob Gaddi wrote:
> I'm looking for some advice on handling data collection/analysis in
> Python. I do a lot of big, time consuming experiments in which I run a
> long data collection (a day or a weekend) in which I sweep a bunch of
> variables, then come back offline and try to cut the data into something
> that makes sense.
>
> For example, my last data collection looked (neglecting all the actual
> equipment control code in each loop) like:
>
> for t in temperatures:
>     for r in voltage_ranges:
>         for v in test_voltages[r]:
>             for c in channels:
>                 for n in range(100):
>                     record_data()
>
> I've been using Sqlite (through peewee) as the data backend, setting up
> a couple tables with a basically hierarchical relationship, and then
> handling analysis with a rough cut of SQL queries against the
> original data, Numpy/Scipy for further refinement, and Matplotlib
> to actually do the visualization. For example, one graph was "How does
> the slope of straight line fit between measured and applied voltage vary
> as a function of temperature on each channel?"
>
> The whole process feels a bit grindy; like I keep having to do a lot of
> ad-hoc stitching things together. And I keep hearing about pandas,
> PyTables, and HDF5. Would that be making my life notably easier? If
> so, does anyone have any references on it that they've found
> particularly useful? The tutorials I've seen so far seem to not give
> much detail on what the point of what they're doing is; it's all "how
> you write the code" rather than "why you write the code". Paying money
> for books is acceptable; this is all on the company's time/dime.
>
> Thanks,
> Rob
>

I'd start with pandas http://pandas.pydata.org/ and see how you get on.
If and only if pandas isn't adequate, and I think that highly unlikely,
try PyTables. Quoting from http://www.pytables.org/ "PyTables is a
package for managing hierarchical datasets and designed to efficiently
and easily cope with extremely large amounts of data." and "PyTables is
built on top of the HDF5 library". I've no idea what the definition of
"extremely large" is in this case. How much data are you dealing with?

I don't understand your comment about tutorials. Once they've given you
an introduction to the tool, isn't it your responsibility to manipulate
your data in the way that suits you? If you can't do that, either you're
doing something wrong, or the tool is inadequate for the task. For the
latter I believe you've two options: find another tool or write your own.

I would not buy books, on the simple grounds that they go out of date
far faster than the online docs :)

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence
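For readers wondering what "start with pandas" could look like for the question quoted above (slope of a straight-line fit between measured and applied voltage, per temperature and channel), here is a minimal sketch. It is not code from either poster. The `readings` table, its column names, the synthetic gain values, and the `fit_slope` helper are all assumptions made up for illustration; the real peewee schema is not shown in the thread.

```python
import sqlite3

import numpy as np
import pandas as pd

# Hypothetical schema: one row per reading. The table and column names are
# illustrative assumptions, not the original poster's actual layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings "
    "(temperature REAL, channel INTEGER, applied REAL, measured REAL)"
)

# Synthetic data standing in for a real sweep: measured = gain * applied,
# where the made-up gain drifts slightly with temperature and channel.
rows = []
for t in (0.0, 25.0, 50.0):
    for c in (0, 1):
        gain = 1.0 + 0.001 * t + 0.01 * c
        for v in (0.0, 1.0, 2.0, 3.0):
            rows.append((t, c, v, gain * v))
conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Pull the whole table into a DataFrame with a single query.
df = pd.read_sql_query("SELECT * FROM readings", conn)
conn.close()


def fit_slope(group):
    """Least-squares slope of measured vs. applied voltage for one group."""
    slope, _intercept = np.polyfit(group["applied"], group["measured"], 1)
    return slope


# One slope per (temperature, channel) pair, then pivot so that each
# channel becomes a column indexed by temperature.
slopes = df.groupby(["temperature", "channel"])[["applied", "measured"]].apply(fit_slope)
slope_table = slopes.unstack("channel")
print(slope_table)
```

The grouping and pivoting replace the hand-written stitching between SQL and NumPy, and `slope_table.plot()` would hand the result to Matplotlib with one line per channel. Whether this scales to a weekend-long collection depends on the data volume, which the thread leaves open.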
Python Data Analysis Recommendations Rob Gaddi <rgaddi@highlandtechnology.invalid> - 2015-12-31 17:15 +0000
Re: Python Data Analysis Recommendations Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-01-01 21:24 +0000
Re: Python Data Analysis Recommendations Ravi Narasimhan <backscatter@rettacs.org> - 2016-01-01 14:16 -0800
Re: Python Data Analysis Recommendations Sameer Grover <sameer.grover.1@gmail.com> - 2016-01-01 23:25 +0530