From: Dave Angel
Newsgroups: comp.lang.python
Subject: Re: Help splitting CVS data
Date: Sun, 20 Jan 2013 17:21:33 -0500
References: <3e1e8567-b9f4-446a-8a59-75f45367d2ac@googlegroups.com>

On 01/20/2013 05:04 PM, Garry wrote:
> I'm trying to manipulate family tree data using Python.
> I'm using linux and Python 2.7.3 and have data files saved as Linux formatted cvs files
> The data appears in this format:
>
> Marriage,Husband,Wife,Date,Place,Source,Note0x0a
> Note: the Source field or the Note field can contain quoted data (same as the Place field)
>
> Actual data:
> [F0244],[I0690],[I0354],1916-06-08,"Neely's Landing, Cape Gir. Co, MO",,0x0a
> [F0245],[I0692],[I0355],1919-09-04,"Cape Girardeau Co, MO",,0x0a
>
> code snippet follows:
>
> import os
> import re
> #I'm using the following regex in an attempt to decode the data:
> RegExp2 = "^(\[[A-Z]\d{1,}\])\,(\[[A-Z]\d{1,}\])\,(\[[A-Z]\d{1,}\])\,(\d{,4}\-\d{,2}\-\d{,2})\,(.*|\".*\")\,(.*|\".*\")\,(.*|\".*\")"
>

Well, you lost me about there. For a csv file, why not use the csv module:

import csv
ifile = open('test.csv', "rb")
reader = csv.reader(ifile)

For reference, see http://docs.python.org/2/library/csv.html

and for sample use and discussion, see http://www.linuxjournal.com/content/handling-csv-files-python

-- 
DaveA
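
A minimal sketch of the csv-module approach suggested above, applied to the two sample records from the quoted post. It is written for Python 3 and reads from an in-memory string so that it runs on its own; on Python 2.7 you would keep the open('test.csv', "rb") form shown earlier. The names SAMPLE, FIELDS and read_marriages are made up for illustration, and the column names are simply taken from the format line in the quoted post.

```python
import csv
import io

# Two records copied from the quoted post. The "0x0a" shown there is
# assumed to stand for the newline that ends each record.
SAMPLE = (
    '[F0244],[I0690],[I0354],1916-06-08,"Neely\'s Landing, Cape Gir. Co, MO",,\n'
    '[F0245],[I0692],[I0355],1919-09-04,"Cape Girardeau Co, MO",,\n'
)

# Column names from the format line in the quoted post.
FIELDS = ["Marriage", "Husband", "Wife", "Date", "Place", "Source", "Note"]


def read_marriages(stream):
    """Return one dict per record, keyed by the column names in FIELDS."""
    reader = csv.reader(stream)
    return [dict(zip(FIELDS, row)) for row in reader]


records = read_marriages(io.StringIO(SAMPLE))
for rec in records:
    # The quoted Place field keeps its embedded commas as one value.
    print(rec["Marriage"], rec["Date"], rec["Place"])
```

The point of the sketch is that csv.reader already handles the quoted Place field with commas inside it, which is the part the regex was struggling with. To read a real file in Python 3, passing open('test.csv', newline='') in place of the StringIO object should behave the same way.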