Re: right adjusted strings containing umlauts

From Dave Angel <davea@davea.name>
Subject Re: right adjusted strings containing umlauts
Date 2013-08-08 17:47 +0000
References <mailman.352.1375972418.1251.python-list@python.org> <9781df99-f9c8-4217-aa67-7a714b7f2ebe@googlegroups.com> <5203B841.4060304@gmail.com> <ku0eo0$9v9$1@ger.gmane.org> <5203C6DA.6060108@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.364.1375984053.1251.python-list@python.org>



Kurt Mueller wrote:

> Now I have this small example:
> ----------------------------------------------------------
> #!/usr/bin/env python
> # vim: set fileencoding=utf-8 :
>
> from __future__ import print_function
> import sys, shlex
>
> print( repr( sys.stdin.encoding ) )
>
> strg_form = u'{0:>3} {1:>3} {2:>3} {3:>3} {4:>3}'
> for inpt_line in sys.stdin:
>     proc_line = shlex.split( inpt_line, False, True, )
>     encoding = "utf-8"
>     proc_line = [ strg.decode( encoding ) for strg in proc_line ]
>     print( strg_form.format( *proc_line ) )
> ----------------------------------------------------------
>
> $ echo -e "a b c d e\na ö u 1 2" | file -
> /dev/stdin: UTF-8 Unicode text
> $ echo -e "a b c d e\na ö u 1 2" | ./align_compact.py
> None
>   a   b   c   d   e
>   a   ö   u   1   2
> $ echo -e "a b c d e\na ö u 1 2" | recode utf8..latin9 | file -
> /dev/stdin: ISO-8859 text
> $ echo -e "a b c d e\na ö u 1 2" | recode utf8..latin9 | ./align_compact.py
> None
>   a   b   c   d   e
> Traceback (most recent call last):
>   File "./align_compact.py", line 13, in <module>
>     proc_line = [ strg.decode( encoding ) for strg in proc_line ]
>   File "/usr/lib64/python2.7/encodings/utf_8.py", line 16, in decode
>     return codecs.utf_8_decode(input, errors, True)
> UnicodeDecodeError: 'utf8' codec can't decode byte 0xf6 in position 0: invalid start byte
> muk@mcp20:/sw/prog/scripts/text_manip>
>
> How do I handle these two inputs?
>

Once you're using pipes, stdin is no longer attached to the terminal, so
you've given up any hope that it will report a useful encoding. I'm not
surprised you're getting None for sys.stdin.encoding (it's an attribute,
not a method).

So you can either do as others have suggested, and guess, or you can get
the information explicitly, say from argv.  In any case you'll need a
different way to assign encoding than the hard-coded "utf-8".
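
Here is a rough sketch of both options, in case it helps. The names
pick_encoding and decode_guessing are made up for illustration, and the
candidate list (utf-8 first, then latin9) is only an assumption based on
the two inputs shown above. Guessing is a heuristic: latin9 accepts any
byte sequence, so it has to come last, and it will happily "succeed" on
data that was really in some other 8-bit encoding.

```python
import sys


def pick_encoding(argv, default="utf-8"):
    """Explicit route: take the encoding from the first command-line
    argument, e.g. ./align_compact.py latin9, else use the default."""
    return argv[1] if len(argv) > 1 else default


def decode_guessing(raw, candidates=("utf-8", "latin9")):
    """Guessing route: try each candidate in order and return a
    (text, encoding_used) pair for the first one that decodes cleanly."""
    for name in candidates:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings fit the input")


if __name__ == "__main__":
    # Show what the command line asks for, then guess on two sample bytes.
    print(pick_encoding(sys.argv))
    print(decode_guessing(b"\xc3\xb6")[1])  # the UTF-8 bytes for o-umlaut
    print(decode_guessing(b"\xf6")[1])      # the latin9 byte for o-umlaut
```

In the original script, the line  encoding = "utf-8"  would then become
something like  encoding = pick_encoding(sys.argv) , or the list
comprehension would call decode_guessing on each field instead.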


-- 
DaveA


