


Re: Odd csv column-name truncation with only one column

Started by: Peter Otten <__peter__@web.de>
First post: 2012-07-19 13:49 +0200
Last post: 2012-07-19 13:49 +0200
Articles: 1 (1 participant)


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.



#25619 — Re: Odd csv column-name truncation with only one column

From: Peter Otten <__peter__@web.de>
Date: 2012-07-19 13:49 +0200
Subject: Re: Odd csv column-name truncation with only one column
Message-ID: <mailman.2298.1342698598.4697.python-list@python.org>

Tim Chase wrote:

> tim@laptop:~/tmp$ python
> Python 2.6.6 (r266:84292, Dec 26 2010, 22:31:48)
> [GCC 4.4.5] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import csv
> >>> from cStringIO import StringIO
> >>> s = StringIO('Email\nfoo@example.com\nbar@example.org\n')
> >>> s.seek(0)
> >>> d = csv.Sniffer().sniff(s.read())
> >>> s.seek(0)
> >>> r = csv.DictReader(s, dialect=d)
> >>> r.fieldnames
> ['Emai', '']
> 
> I get the same results using Python 3.1.3 (also readily available on
> Debian Stable), as well as working directly on a file rather than a
> StringIO.
> 
> Any reason I'm getting ['Emai', ''] (note the missing ell) instead
> of ['Email'] as my resulting fieldnames?  Did I miss something in
> the docs?

Judging from 

>>> import csv
>>> sniffer = csv.Sniffer()
>>> sniffer.sniff("abc").delimiter
'c'
>>> sniffer.sniff("abc\naba").delimiter
'b'
>>> sniffer.sniff("abc\naba\nxyz").delimiter
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/csv.py", line 184, in sniff
    raise Error, "Could not determine delimiter"
_csv.Error: Could not determine delimiter
>>> sniffer.sniff("abc\n"*10 + "xyz").delimiter
'c'
>>> sniffer.sniff("abc\n"*9 + "xyz").delimiter
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/csv.py", line 184, in sniff
    raise Error, "Could not determine delimiter"
_csv.Error: Could not determine delimiter

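the Sniffer heuristic seems to pick a character that occurs on every one of
the first 10 lines as the delimiter. In your sample, 'l' occurs exactly once
on each of the three lines, which would explain why 'Email' is split into
'Emai' and ''. There are of course examples where that doesn't make sense to
a human observer...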
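
One possible way around this, sketched here for Python 3 (io.StringIO rather than cStringIO), is to restrict the sniffer to a set of plausible delimiters via the `delimiters` argument of `sniff()`, and to fall back to the default `csv.excel` dialect when none of them fits. The helper name `sniff_or_default` and the candidate set `",;\t|"` are assumptions made for illustration, not something from the thread:

```python
import csv
import io


def sniff_or_default(sample, candidates=",;\t|"):
    """Sniff a dialect, considering only the given candidate delimiters.

    Hypothetical helper: falls back to csv.excel when the sniffer cannot
    settle on any of the candidates (e.g. for single-column data).
    """
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates)
    except csv.Error:
        # None of the candidates looks like a delimiter, so treat the
        # input as plain comma-separated data with a single column.
        return csv.excel


data = "Email\nfoo@example.com\nbar@example.org\n"
reader = csv.DictReader(io.StringIO(data), dialect=sniff_or_default(data))
print(reader.fieldnames)
```

Since none of the candidate characters occurs in the sample, ordinary letters such as 'l' are never considered and the header should stay in one piece. The trade-off is that a file using a delimiter outside the candidate set will be read as a single column.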
