Re: Stripping unencodable characters from a string

Newsgroups comp.lang.python
Date 2015-05-05 12:24 -0700
References <24ef6c6d-a47a-4d8c-8651-c581e25161cb@googlegroups.com> <mailman.137.1430852451.12865.python-list@python.org>
Message-ID <a75300bf-7a30-464a-84cb-eb3cde4ca40f@googlegroups.com>
Subject Re: Stripping unencodable characters from a string
From Paul Moore <p.f.moore@gmail.com>


On Tuesday, 5 May 2015 20:01:04 UTC+1, Dave Angel wrote:
> On 05/05/2015 02:19 PM, Paul Moore wrote:
> 
> You need to specify that you're using Python 3.4 (or whichever) when 
> starting a new thread.

Sorry. 2.6, 2.7, and 3.3+. It's for use in a cross-version library.

> If you're going to take charge of the encoding of the file, why not just 
> open the file in binary, and do it all with
>      file.write(data.encode(myencoding, errors='replace'))

I don't have control of the encoding of the file. It's typically sys.stdout, which is already open. I can't replace sys.stdout (because the main program which calls my library code wouldn't like me messing with global state behind its back). And sys.stdout isn't open in binary mode.

> I can't see the benefit of two encodes and a decode just to write a 
> string to the file.

Nor can I - that's my point. But if all I have is an open text-mode file with the "strict" error mode, I have to incur one encode, and I have to make sure that no characters are passed to that encode which can't be encoded.
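For reference, the encode/decode dance under discussion looks roughly like the sketch below. The helper name `make_writable` and the ASCII fallback for a stream that reports no encoding are my own assumptions for illustration, not part of any existing library. The error handler is passed positionally because `encode()` only accepts keyword arguments from 2.7 onward.

```python
def make_writable(text, stream, errors="replace"):
    """Return a copy of `text` that `stream` can encode in strict mode.

    `text` must be a unicode string (unicode on 2.x, str on 3.x).
    `errors` is any lossy handler, e.g. "replace" or "ignore".
    """
    # Assumption: fall back to ASCII when the stream reports no encoding
    # (for example, a piped sys.stdout on Python 2).
    encoding = getattr(stream, "encoding", None) or "ascii"
    # First encode drops or replaces whatever the codec cannot represent;
    # the decode turns the result back into text for the text-mode stream.
    return text.encode(encoding, errors).decode(encoding)
```

The caller then does `stream.write(make_writable(text, stream))`, and the stream performs the one unavoidable strict encode on text that is already known to be safe.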

If there were a codec method to identify unencodable characters, that might be an alternative (although it's quite possible that the encode/decode dance would be faster anyway, as it runs mostly in C - not that performance is key here).
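I don't know of a codec method that does this directly, but the positions can be recovered from the `start` and `end` attributes of `UnicodeEncodeError`. The function below is only a sketch of that idea; `find_unencodable` is a hypothetical name, and the repeated slicing makes it quadratic in the worst case, so it is not a replacement for the encode/decode approach.

```python
def find_unencodable(text, encoding):
    """Return the indices of characters in `text` that `encoding` rejects."""
    bad = []
    pos = 0
    while pos < len(text):
        try:
            # Strict encode of the remainder; success means nothing is left.
            text[pos:].encode(encoding)
            break
        except UnicodeEncodeError as exc:
            # exc.start and exc.end are offsets into the slice we encoded,
            # with exc.end exclusive, so shift them by `pos`.
            bad.extend(range(pos + exc.start, pos + exc.end))
            pos += exc.end
    return bad
```

For example, `find_unencodable(u"a\u00e9b\u20acc", "ascii")` reports the positions of the two non-ASCII characters.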

> Alternatively, there's probably a way to open the file using 
> codecs.open(), and reassign it to sys.stdout.

As I said, I have to work with the file (sys.stdout or whatever) that I'm given. I can't reopen or replace it.

Paul



Thread

Stripping unencodable characters from a string Paul Moore <p.f.moore@gmail.com> - 2015-05-05 11:19 -0700
  Re: Stripping unencodable characters from a string Dave Angel <davea@davea.name> - 2015-05-05 15:00 -0400
    Re: Stripping unencodable characters from a string Paul Moore <p.f.moore@gmail.com> - 2015-05-05 12:24 -0700
      Re: Stripping unencodable characters from a string Marko Rauhamaa <marko@pacujo.net> - 2015-05-05 22:55 +0300
  Re: Stripping unencodable characters from a string Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-05-05 19:33 +0000
  Re: Stripping unencodable characters from a string Chris Angelico <rosuav@gmail.com> - 2015-05-06 10:02 +1000
  Re: Stripping unencodable characters from a string Serhiy Storchaka <storchaka@gmail.com> - 2015-05-08 15:28 +0300
