Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #99967

Re: getting fileinput to do errors='ignore' or 'replace'?

From MRAB <python@mrabarnett.plus.com>
Newsgroups comp.lang.python
Subject Re: getting fileinput to do errors='ignore' or 'replace'?
Date 2015-12-03 16:12 +0000
Message-ID <mailman.175.1449159185.14615.python-list@python.org> (permalink)
References <fn26jcxltl.ln2@news.ducksburg.com>

Show all headers | View raw


On 2015-12-03 15:12, Adam Funk wrote:
> I'm having trouble with some input files that are almost all proper
> UTF-8 but with a couple of troublesome characters mixed in, which I'd
> like to ignore instead of throwing ValueError.  I've found the
> openhook for the encoding
>
> for line in fileinput.input(options.files, openhook=fileinput.hook_encoded("utf-8")):
>      do_stuff(line)
>
> which the documentation describes as "a hook which opens each file
> with codecs.open(), using the given encoding to read the file", but
> I'd like codecs.open() to also have the errors='ignore' or
> errors='replace' effect.  Is it possible to do this?
>
It looks like it's not possible with the standard "hook_encoded", but
you could write your own alternative:

import codecs

def my_hook_encoded(encoding, errors):

     def opener(path, mode):
         return codecs.open(path, mode, encoding=encoding, errors=errors)

     return opener

for line in fileinput.input(options.files, 
openhook=fileinput.my_hook_encoded("utf-8", "ignore")):
     do_stuff(line)

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

getting fileinput to do errors='ignore' or 'replace'? Adam Funk <a24061@ducksburg.com> - 2015-12-03 15:12 +0000
  Re: getting fileinput to do errors='ignore' or 'replace'? Adam Funk <a24061@ducksburg.com> - 2015-12-03 15:18 +0000
    Re: getting fileinput to do errors='ignore' or 'replace'? Peter Otten <__peter__@web.de> - 2015-12-03 17:11 +0100
      Re: getting fileinput to do errors='ignore' or 'replace'? Adam Funk <a24061@ducksburg.com> - 2015-12-03 19:17 +0000
    Re: getting fileinput to do errors='ignore' or 'replace'? Terry Reedy <tjreedy@udel.edu> - 2015-12-03 11:48 -0500
      Re: getting fileinput to do errors='ignore' or 'replace'? Adam Funk <a24061@ducksburg.com> - 2015-12-03 19:21 +0000
    Re: getting fileinput to do errors='ignore' or 'replace'? Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2015-12-03 22:26 +0000
    Re: getting fileinput to do errors='ignore' or 'replace'? Serhiy Storchaka <storchaka@gmail.com> - 2015-12-04 10:34 +0200
    Re: getting fileinput to do errors='ignore' or 'replace'? Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2015-12-04 09:00 +0000
      Re: getting fileinput to do errors='ignore' or 'replace'? Adam Funk <a24061@ducksburg.com> - 2015-12-07 14:46 +0000
  Re: getting fileinput to do errors='ignore' or 'replace'? MRAB <python@mrabarnett.plus.com> - 2015-12-03 16:12 +0000
  Re: getting fileinput to do errors='ignore' or 'replace'? Laura Creighton <lac@openend.se> - 2015-12-03 17:46 +0100
    Re: getting fileinput to do errors='ignore' or 'replace'? Adam Funk <a24061@ducksburg.com> - 2015-12-03 19:17 +0000
      Re: getting fileinput to do errors='ignore' or 'replace'? Laura Creighton <lac@openend.se> - 2015-12-03 21:40 +0100

csiph-web