Re: Data cleaning workouts

To python-list@python.org
From Mark Lawrence <breamoreboy@yahoo.co.uk>
Subject Re: Data cleaning workouts
Date Fri, 24 Aug 2012 09:16:30 +0100
Newsgroups comp.lang.python
Message-ID <mailman.3745.1345796137.4697.python-list@python.org>


Elevated Python types don't get their hands dirty top-posting, but I'm
certain that they would when talking data, or there wouldn't be so many
debates on which data type to use :)

On 24/08/2012 07:48, Fg Nu wrote:
>
>
> Thanks. I will try the SciPy list. It was a bit of a Hail Mary anyway. Pretty sure elevated Python types don't actually get their hands dirty with data. ;)
>
>
>
> ----- Original Message -----
> From: rusi <rustompmody@gmail.com>
> To: python-list@python.org
> Cc:
> Sent: Thursday, August 23, 2012 11:01 PM
> Subject: Re: Data cleaning workouts
>
> On Aug 23, 12:52 pm, Fg Nu <fgn...@yahoo.com> wrote:
>> List folk,
>>
>> I am a newbie trying to get used to Python. I was wondering if anyone knows of web resources that teach good practices in data cleaning and management for statistics/analytics/machine learning, particularly using Python.
>>
>> Ideally, these would be exercises of the form: here is some horrible raw data --> here is what it should look like after it has been cleaned. Guidelines about steps that should always be taken, practices that should be avoided; basically, workflow of data analysis in Python with special emphasis on the cleaning part.
>
> Since no one has answered, I suggest you narrow your searching from
> 'python' to 'scipy' (or 'numpy').
> Also perhaps ipython.
> And then perhaps try those specific mailing lists/fora.
>
> Since I don't know this area well, I won't say more.
>


-- 
Cheers.

Mark Lawrence.

Thread

Data cleaning workouts Fg Nu <fgnu32@yahoo.com> - 2012-08-23 00:52 -0700
  Re: Data cleaning workouts rusi <rustompmody@gmail.com> - 2012-08-23 23:01 -0700
    Re: Data cleaning workouts Fg Nu <fgnu32@yahoo.com> - 2012-08-23 23:48 -0700
    Re: Data cleaning workouts Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-08-24 09:16 +0100