


Re: Finding Blank Columns in CSV

From MRAB <python@mrabarnett.plus.com>
Date Tue, 6 Oct 2015 19:25:12 +0100
Newsgroups comp.lang.python
Message-ID <mailman.433.1444155917.28679.python-list@python.org>



On 2015-10-06 18:23, Jaydip Chakrabarty wrote:
> On Tue, 06 Oct 2015 14:33:51 +0200, Peter Otten wrote:
>
[snip]
>
> I downloaded gmail contacts in google csv format. There are so many
> columns. So I was trying to create another csv with the required columns.
> Now when I tried to open the gmail csv file with csv DictReader, it said
> the file contained NULL characters.

Why would there be nulls in a CSV file?

> So first I did -
>
> data = open(fn, 'rb').read()
> fout = open(ofn, 'wb')
> fout.write(data.replace('\x00', ''))
> fout.close()
> shutil.move(ofn, fn)
>
> Then I found, there were some special characters in the file. So, once
> again I opened the file and did -
>
> data = open(fn, 'rb').read()
> fout = open(ofn, 'wb')
> fout.write(data.replace('\xff\xfe', ''))
> fout.close()
> shutil.move(ofn, fn)
>
b'\xff\xfe' looks like a BOM.

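If it's at the start of the file, it indicates that the file is encoded
in UTF-16-LE.

So, apparently, the original file was CSV encoded in UTF-16-LE. That
would also explain the nulls: in UTF-16, every ASCII character is stored
with a zero byte next to it.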
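
The BOM check can be sketched roughly like this. It assumes Python 3; sniff_utf16 is a made-up helper name, and the sample bytes are built by hand rather than taken from the real contacts file:

```python
import codecs

def sniff_utf16(raw):
    """Return 'utf-16-le', 'utf-16-be', or None, judging by the leading BOM.

    Caveat: the UTF-32-LE BOM also begins with b'\\xff\\xfe', so treat the
    result as a strong hint rather than proof.
    """
    if raw.startswith(codecs.BOM_UTF16_LE):   # b'\xff\xfe'
        return "utf-16-le"
    if raw.startswith(codecs.BOM_UTF16_BE):   # b'\xfe\xff'
        return "utf-16-be"
    return None

# Hand-built sample: a BOM followed by one header line in UTF-16-LE.
text = "Name,E-mail\r\n"
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

encoding = sniff_utf16(raw)
# Each ASCII character contributes one zero byte in UTF-16.
null_count = raw.count(b"\x00")
```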

You _do_ still have the original file, don't you?
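
If you do, there should be no need to strip bytes by hand. A minimal sketch, assuming Python 3 and a file that starts with a UTF-16 BOM; copy_columns and its parameter names are invented for illustration, not anything from the csv module:

```python
import csv

def copy_columns(src, dst, wanted, encoding="utf-16"):
    """Copy only the `wanted` columns from src to dst; return the row count."""
    # The 'utf-16' codec reads the BOM to pick the byte order and then drops
    # it, so neither the BOM nor the zero bytes ever reach the csv module.
    with open(src, newline="", encoding=encoding) as fin, \
         open(dst, "w", newline="", encoding="utf-8") as fout:
        reader = csv.DictReader(fin)
        # extrasaction="ignore" silently skips the columns we did not ask for.
        writer = csv.DictWriter(fout, fieldnames=wanted, extrasaction="ignore")
        writer.writeheader()
        count = 0
        for row in reader:
            writer.writerow(row)
            count += 1
    return count
```

Called with the original download, an output path, and a list of the headers you want to keep, this writes a UTF-8 CSV containing just those columns. The header names have to match whatever the export actually uses.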

[snip]
