
Re: Problem writing some strings (UnicodeEncodeError)

From Peter Otten <__peter__@web.de>
Subject Re: Problem writing some strings (UnicodeEncodeError)
Date Mon, 13 Jan 2014 18:29:28 +0100
Newsgroups comp.lang.python
Message-ID <mailman.5416.1389634177.18130.python-list@python.org>


Paulo da Silva wrote:

> On 13-01-2014 08:58, Peter Otten wrote:
>> Peter Otten wrote:
>> 
>>> Paulo da Silva wrote:
>>>
>>>> On 12-01-2014 20:29, Peter Otten wrote:
>>>>> Paulo da Silva wrote:
>>>>>
>>>>>>> but I have not tried it myself. Also, some bytes may need to be
>>>>>>> escaped, either to be understood by the shell, or to address
>>>>>>> security concerns:
>>>>>>>
>>>>>>
>>>>>> Since I am putting the file names between "", the only char that needs
>>>>>> to be escaped is the " itself.
>>>>>
>>>>> What about the escape char?
>>>>>
>>>> Just this fn=fn.replace('"','\\"')
>>>>
>>>> So far I didn't find any problem, but the script is still running.
>>>
>>> To be a bit more explicit:
>>>
>>>>>> for filename in os.listdir():
>>> ...     print(template.replace("<fn>", filename.replace('"', '\\"')))
>>> ...
>>> ls "\\"; rm whatever; ls \"
>> 
>> The complete session:
>> 
>>>>> import os
>>>>> template = 'ls "<fn>"'
>>>>> with open('\\"; rm whatever; ls \\', "w") as f: pass
>> ...
>>>>> for filename in os.listdir():
>> ...     print(template.replace("<fn>", filename.replace('"', '\\"')))
>> ...
>> ls "\\"; rm whatever; ls \"
>> 
>> 
>> Shell variable substitution is another problem. c.l.py is probably not
>> the best place to get the complete list of possibilities.
> I see what you mean.
> This is a tedious problem. I don't know if there is a simple solution in
> Python for this. I have to think about it ...
> For a more general and serious application I would not produce a bash
> script. I would do all the work in Python.
> 
> That's not the case here, however. This is a script for a very special
> purpose that will be run only a few times. The only problem was the
> occurrence of some Portuguese characters in old filenames that use an
> encoding other than UTF-8. A very few also include the " character.
> 
> The worst that could happen is that the bash script aborts. Then it
> would be easy to fix it using a simple editor.

I looked around in the stdlib and found shlex.quote(). It uses ' instead of 
" which simplifies things, and special-cases only ':

>>> import shlex
>>> print(shlex.quote("alpha'beta"))
'alpha'"'"'beta'

So the answer is simpler than I had expected.
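
As a rough sketch of how that could be applied to the template from the earlier session: the render() helper and the template without the surrounding double quotes are my assumptions for illustration, not something from the original script. shlex.quote() supplies its own single quotes, so the template must not add any.

```python
import shlex

# Hypothetical template: "<fn>" marks where the file name goes.
# No double quotes around it, because shlex.quote() adds single quotes itself.
template = "ls <fn>"


def render(template, filename):
    """Return one shell command line with the file name safely quoted."""
    return template.replace("<fn>", shlex.quote(filename))


# The hostile file name from the session above: \"; rm whatever; ls \
hostile = '\\"; rm whatever; ls \\'

line = render(template, hostile)
print(line)

# Round trip: a POSIX-style split should give back exactly one argument
# that is identical to the original file name.
print(shlex.split(line) == ["ls", hostile])
```

Inside single quotes the shell treats backslash, double quote, $ and ` as ordinary characters, so this should also take care of the variable substitution problem mentioned above. Note that shlex.quote() is documented for POSIX shells only.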
