From: Adam Funk
Newsgroups: comp.lang.python
Subject: Re: getting fileinput to do errors='ignore' or 'replace'?
Date: Thu, 03 Dec 2015 19:17:51 +0000
Organization: $CABAL

On 2015-12-03, Laura Creighton wrote:

> In a message of Thu, 03 Dec 2015 15:12:15 +0000, Adam Funk writes:
>>I'm having trouble with some input files that are almost all proper
>>UTF-8 but with a couple of troublesome characters mixed in, which I'd
>>like to ignore instead of throwing ValueError.  I've found the
>>openhook for the encoding
>>
>>for line in fileinput.input(options.files, openhook=fileinput.hook_encoded("utf-8")):
>>    do_stuff(line)
>>
>>which the documentation describes as "a hook which opens each file
>>with codecs.open(), using the given encoding to read the file", but
>>I'd like codecs.open() to also have the errors='ignore' or
>>errors='replace' effect.  Is it possible to do this?
>>
>>Thanks.
>
> This should be both easy to add, and useful, and I happen to know that
> fileinput is being hacked on by Serhiy Storchaka right now, who agrees
> that this would be easy.  So, with his approval, I stuck this into the
> tracker.  http://bugs.python.org/issue25788
>
> Future Pythons may not have the problem.

Good to know, thanks.

-- 
You cannot really appreciate Dilbert unless you've read it in the
original Klingon.           --- Klingon Programmer's Guide
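
Where hook_encoded() accepts only an encoding, one possible workaround for the question above is a custom openhook that passes an errors argument to open() itself. This is a minimal sketch, assuming Python 3 and that fileinput calls the hook with a file name and a mode. The names hook_encoded_with_errors and read_lines are hypothetical and not part of the fileinput module.

```python
import fileinput


def hook_encoded_with_errors(encoding, errors="replace"):
    """Build an openhook that decodes with the given error handler.

    Hypothetical helper; it is not provided by the fileinput module.
    """
    def openhook(filename, mode):
        # fileinput calls the hook with the file name and the open mode.
        return open(filename, mode, encoding=encoding, errors=errors)
    return openhook


def read_lines(files, errors="replace"):
    """Collect every decoded line from the given files (illustrative only)."""
    hook = hook_encoded_with_errors("utf-8", errors)
    with fileinput.input(files, openhook=hook) as stream:
        return list(stream)
```

With errors="replace" each undecodable byte becomes U+FFFD, and with errors="ignore" it is dropped, so the loop over fileinput.input() no longer raises on the stray bytes. In the original loop this would amount to passing openhook=hook_encoded_with_errors("utf-8", "ignore") in place of fileinput.hook_encoded("utf-8").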