From: Tim Chase
To: python-list@python.org
Newsgroups: comp.lang.python
Subject: Dealing with \r in CSV fields in Python2.4
Date: Wed, 4 Sep 2013 10:04:03 -0500

I've got some old 2.4 code (requires an external lib that hasn't been
upgraded) that needs to process a CSV file where some of the values
contain \r characters. It appears that in more recent versions (just
tested in 2.7; docs suggest this was changed in 2.5), Python does the
Right Thing™ and just creates values in the row containing that \r.
However, in 2.4, the csv module chokes on it with

  _csv.Error: newline inside string

as demoed by the example code at the bottom of this email. What's the
best way to deal with this? At the moment, I'm just using something
like

  def unCR(f):
    for line in f:
      yield line.replace('\r', '')

  f = file('input.csv', 'rb')
  for row in csv.reader(unCR(f)):
    code_to_process(row)

but this throws away data that I'd really prefer to keep if possible.

I know 2.4 isn't exactly popular, and in an ideal world, I'd just
upgrade to a later 2.x version that does what I need. Any old-time
2.4 pythonistas have sage advice for me?

-tkc

  from cStringIO import StringIO
  import csv

  f = file('out.txt', 'wb')
  w = csv.writer(f)
  w.writerow(["One", "Two"])
  w.writerow(["First\rSecond", "Third"])
  f.close()

  f = file('out.txt', 'rb')
  r = csv.reader(f)
  for i, row in enumerate(r):  # works in 2.7, fails in 2.4
    print repr(row)
  f.close()
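
One possible variation on the unCR() idea that keeps the data, offered only as a rough sketch and not tried on an actual 2.4 install: swap each embedded \r for a placeholder before the csv module sees the line, then swap it back in every parsed field. The names MARKER, hide_cr() and read_rows() are made up for illustration, and the sketch assumes the marker byte never occurs in the real data and that no quoted field contains a \r\n pair or a bare \n.

```python
import csv

# Hypothetical placeholder; assumed never to occur in the real data.
MARKER = '\x1f'

def hide_cr(lines):
    """Yield each line with its embedded CRs replaced by MARKER."""
    for line in lines:
        # Strip the record terminator (\r\n or \n) first, so that only
        # the CRs inside field values are left to be masked.
        if line.endswith('\n'):
            line = line[:-1]
        if line.endswith('\r'):
            line = line[:-1]
        yield line.replace('\r', MARKER) + '\n'

def read_rows(lines):
    """Parse CSV rows and put back the CRs that hide_cr() masked."""
    for row in csv.reader(hide_cr(lines)):
        yield [field.replace(MARKER, '\r') for field in row]
```

With this in place, the loop above would become "for row in read_rows(f): code_to_process(row)". A value that ends in \r immediately before the line break and is not quoted would still lose that final \r, since it is indistinguishable from the record terminator.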