Groups > comp.lang.python > #10396 > unrolled thread

shlex parsing

Started by: Karim <karim.liateni@free.fr>
First post: 2011-07-27 21:30 +0200
Last post: 2011-07-29 16:34 +0200
Articles: 10 (3 participants)


Contents

  shlex parsing Karim <karim.liateni@free.fr> - 2011-07-27 21:30 +0200
    Re: shlex parsing Web Dreamer <webdreamer@nospam.fr> - 2011-07-28 17:48 +0200
      Re: shlex parsing Karim <karim.liateni@free.fr> - 2011-07-28 18:21 +0200
      Re: shlex parsing Nobody <nobody@nowhere.com> - 2011-07-28 17:37 +0100
        Re: shlex parsing Karim <karim.liateni@free.fr> - 2011-07-28 19:51 +0200
          Re: shlex parsing Web Dreamer <webdreamer@nospam.fr> - 2011-07-29 09:30 +0200
        Re: shlex parsing Web Dreamer <webdreamer@nospam.fr> - 2011-07-29 09:24 +0200
          Re: shlex parsing Karim <karim.liateni@free.fr> - 2011-07-29 11:37 +0200
            Re: shlex parsing Web Dreamer <webdreamer@nospam.fr> - 2011-07-29 15:42 +0200
              Re: shlex parsing Karim <karim.liateni@free.fr> - 2011-07-29 16:34 +0200

#10396 — shlex parsing

From: Karim <karim.liateni@free.fr>
Date: 2011-07-27 21:30 +0200
Subject: shlex parsing
Message-ID: <mailman.1538.1311795072.1164.python-list@python.org>

Hello All,

I would like to parse this TCL command line with shlex:

'-option1 [get_rule A1 B2] -option2 $VAR -option3 TAG'

And I want to get this split list:

['-option1', '[get_rule A1 B2]', '-option2',  '$VAR', '-option3',  'TAG']

Then I will group the arguments two by two into tuples.

I tried the shlex attributes 'quotes', 'whitespace', etc.

But I drew a blank.

If anybody has in-depth experience with this module, I am all ears.

Cheers
Karim


#10460

From: Web Dreamer <webdreamer@nospam.fr>
Date: 2011-07-28 17:48 +0200
Message-ID: <4e3184d3$0$29530$426a74cc@news.free.fr>
In reply to: #10396

Karim wrote on Wednesday 27 July 2011 at 21:30 in <mailman.1538.1311795072.1164.python-list@python.org>:

> 
> Hello All,
> 
> I would like to parse this TCL command line with shlex:
> 
> '-option1 [get_rule A1 B2] -option2 $VAR -option3 TAG'
> 
> And I want to get this split list:
> 
> ['-option1', '[get_rule A1 B2]', '-option2',  '$VAR', '-option3',  'TAG']
> 
> Then I will group the arguments two by two into tuples.


Do this:

>>> s = '-option1 [get_rule A1 B2] -option2 $VAR -option3 TAG'

Now, if you don't enclose [get_rule A1 B2] in quotes, you will pull your hair out!
So:
>>> s = s.replace('[','"[')
>>> s = s.replace(']',']"')

Now:
>>> s
'-option1 "[get_rule A1 B2]" -option2 $VAR -option3 TAG'

Let's continue:
>>> import shlex
>>> optionlist = shlex.split(s)
>>> optionlist
['-option1', '[get_rule A1 B2]', '-option2', '$VAR', '-option3', 'TAG']

Now, to get your tuple of (option, value) pairs:
>>> argtuple = tuple([(option, value) for option,value in zip(optionlist[0::2],optionlist[1::2])])
>>> argtuple
(('-option1', '[get_rule A1 B2]'), ('-option2', '$VAR'), ('-option3', 'TAG'))


Whole code:
>>> import shlex
>>> s = '-option1 [get_rule A1 B2] -option2 $VAR -option3 TAG'
>>> s = s.replace('[','"[')
>>> s = s.replace(']',']"')
>>> optionlist = shlex.split(s)
>>> argtuple = tuple([(option, value) for option,value in zip(optionlist[0::2],optionlist[1::2])])
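
The same steps can be wrapped in a small function. This is only a sketch: the name parse_options and the check for an odd number of words are assumptions added for illustration, not something from the original question.

```python
import shlex


def parse_options(line):
    """Split a flat '-opt value' line into (option, value) pairs.

    Hypothetical helper: it only handles brackets that are not nested.
    """
    # Wrap each [...] group in double quotes so shlex keeps it as one word.
    quoted = line.replace('[', '"[').replace(']', ']"')
    words = shlex.split(quoted)
    # zip() would silently drop a trailing option without a value,
    # so reject an odd number of words explicitly.
    if len(words) % 2:
        raise ValueError('expected option/value pairs, got %d words' % len(words))
    return tuple(zip(words[0::2], words[1::2]))


pairs = parse_options('-option1 [get_rule A1 B2] -option2 $VAR -option3 TAG')
print(pairs)
```

The length check is there because slicing with [0::2] and [1::2] hides a missing value instead of reporting it.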

-- 
Web Dreamer



#10462

From: Karim <karim.liateni@free.fr>
Date: 2011-07-28 18:21 +0200
Message-ID: <mailman.1577.1311870117.1164.python-list@python.org>
In reply to: #10460

Hello, you have your feet firmly on the ground, Web Dreamer!

Very clever!
Beautiful hack!

Many thanks

Karim




#10463

From: Nobody <nobody@nowhere.com>
Date: 2011-07-28 17:37 +0100
Message-ID: <pan.2011.07.28.16.37.35.287000@nowhere.com>
In reply to: #10460

On Thu, 28 Jul 2011 17:48:34 +0200, Web Dreamer wrote:

>> I would like to parse this TCL command line with shlex:
>> 
>> '-option1 [get_rule A1 B2] -option2 $VAR -option3 TAG'

>>>> s = s.replace('[','"[')
>>>> s = s.replace(']',']"')

Note that this approach won't work if you have nested brackets or braces.
That would require a real parser.
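
As a rough illustration of what handling nesting involves, here is a minimal depth-tracking splitter. It is a sketch under stated assumptions, not a Tcl parser: the function name split_tcl_words is made up, and it ignores Tcl quoting, backslash escapes and comments.

```python
def split_tcl_words(line):
    """Split on whitespace, keeping [...] and {...} groups (nested or not) intact.

    Hypothetical helper for illustration only.
    """
    closers = {'[': ']', '{': '}'}
    words, current, stack = [], [], []
    for ch in line:
        if ch in closers:
            # Remember which closing character this group is waiting for.
            stack.append(closers[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
        if ch.isspace() and not stack:
            # Whitespace outside any group ends the current word.
            if current:
                words.append(''.join(current))
                current = []
        else:
            current.append(ch)
    if stack:
        raise ValueError('unbalanced bracket or brace')
    if current:
        words.append(''.join(current))
    return words


words = split_tcl_words('-option1 [get_rule [lindex $L 0] B2] -option2 $VAR')
print(words)
```

The stack of expected closing characters is what the replace() trick lacks: it lets the splitter tell an inner ] from the one that ends the outer group.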



#10468

From: Karim <karim.liateni@free.fr>
Date: 2011-07-28 19:51 +0200
Message-ID: <mailman.1583.1311875478.1164.python-list@python.org>
In reply to: #10463

Just a little modification:

>>> tuple([(option, value) for option, value in zip(optionlist[0::2], optionlist[1::2])]) == tuple(zip(optionlist[0::2], optionlist[1::2]))
True

Indeed,

tuple(zip(optionlist[0::2], optionlist[1::2]))

is shorter than:

tuple([(option, value) for option, value in zip(optionlist[0::2], optionlist[1::2])])


Karim

PS: I am from Grenoble; which part of France are you from?




#10503

From: Web Dreamer <webdreamer@nospam.fr>
Date: 2011-07-29 09:30 +0200
Message-ID: <4e326181$0$23657$426a74cc@news.free.fr>
In reply to: #10468

Karim wrote on Thursday 28 July 2011 at 19:51 in
<mailman.1583.1311875478.1164.python-list@python.org>:

> 
> Just a little modification:
> 
>  >>> tuple([(option, value) for option,value in
> zip(optionlist[0::2],optionlist[1::2])]) ==
> tuple(zip(optionlist[0::2],optionlist[1::2]))
> True
> 
> Indeed:
> 
> tuple(zip(optionlist[0::2],optionlist[1::2]))
> 
> shorter than:
> 
> tuple([(option, value) for option,value in
> zip(optionlist[0::2],optionlist[1::2])])

Nice, thanks :-)

> 
> PS: I am from Grenoble; which part of France are you from?
> 

Paris.

Cheers :-)

-- 
Web Dreamer



#10504

From: Web Dreamer <webdreamer@nospam.fr>
Date: 2011-07-29 09:24 +0200
Message-ID: <4e32618d$0$23657$426a74cc@news.free.fr>
In reply to: #10463

Nobody wrote on Thursday 28 July 2011 at 18:37 in
<pan.2011.07.28.16.37.35.287000@nowhere.com>:

> On Thu, 28 Jul 2011 17:48:34 +0200, Web Dreamer wrote:
> 
>>> I would like to parse this TCL command line with shlex:
>>> 
>>> '-option1 [get_rule A1 B2] -option2 $VAR -option3 TAG'
> 
>>>>> s = s.replace('[','"[')
>>>>> s = s.replace(']',']"')
> 
> Note that this approach won't work if you have nested brackets or braces.
> That would require a real parser.

True.
I tried the shlex class, but adding '[]' to a shlex object's quotes
attribute doesn't work.
shlex expects the opening and closing quote characters to be the same,
and [ differs from ], so shlex expects an opening [ to be closed by
another [ rather than by ].

My solution indeed only works as long as there are no nested brackets.
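
A short sketch of that failure, assuming a sample line with a single bracket pair (the variable names are illustrative):

```python
import shlex

lex = shlex.shlex('-option1 [get_rule A1 B2] -option2 $VAR', posix=True)
lex.whitespace_split = True
lex.quotes += '[]'  # treat both brackets as quote characters

try:
    tokens = list(lex)
    error = None
except ValueError as exc:
    # The opening [ starts a quoted section that only another [ could close,
    # so the lexer reaches the end of the input while still inside it.
    tokens = None
    error = str(exc)

print(tokens, error)
```

With this input the lexer raises ValueError, because the ] is read as ordinary quoted text and no second [ ever arrives.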

-- 
Web Dreamer



#10506

From: Karim <karim.liateni@free.fr>
Date: 2011-07-29 11:37 +0200
Message-ID: <mailman.1607.1311932260.1164.python-list@python.org>
In reply to: #10504

On 07/29/2011 09:24 AM, Web Dreamer wrote:
> Nobody wrote on Thursday 28 July 2011 at 18:37 in
> <pan.2011.07.28.16.37.35.287000@nowhere.com>:
>
>> On Thu, 28 Jul 2011 17:48:34 +0200, Web Dreamer wrote:
>>
>>>> I would like to parse this TCL command line with shlex:
>>>>
>>>> '-option1 [get_rule A1 B2] -option2 $VAR -option3 TAG'
>>>>>> s = s.replace('[','"[')
>>>>>> s = s.replace(']',']"')
>> Note that this approach won't work if you have nested brackets or braces.
>> That would require a real parser.
> True,
> I tried with the shlex class, but adding '[]' to a shlexobject.quotes
> doesn't work.
> Indeed, shlex expects an opening and closing quote to be the same, and [ is
> different from ] so shlex therefore expects an opening [ to be closed with [
> and not ]
>
> My solution indeed only works as long as there are not any nested brackets.
>
Yeah, I saw that the behavior of a shlex object is slightly different
from that of the shlex.split() function. By default the minus character
is returned as an individual token, whereas in split() it is not. The
get_token() method returns '-' alone, and you have to set the object
attribute wordchars += '-' to get the same behavior as shlex.split().
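
A quick way to see this, as a sketch with a shortened sample line:

```python
import shlex

line = '-option1 TAG'

# Default shlex object: '-' is not a word character, so it is its own token.
default_tokens = list(shlex.shlex(line))

# After adding '-' to wordchars, the option name stays in one piece.
lex = shlex.shlex(line)
lex.wordchars += '-'
dash_tokens = list(lex)

print(default_tokens)  # ['-', 'option1', 'TAG']
print(dash_tokens)     # ['-option1', 'TAG']
```

Other punctuation such as $ or [ would still be split off, so this only covers the minus sign.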

Cheers
Karim



#10514

From: Web Dreamer <webdreamer@nospam.fr>
Date: 2011-07-29 15:42 +0200
Message-ID: <4e32b8d0$0$14982$426a74cc@news.free.fr>
In reply to: #10506

Karim wrote on Friday 29 July 2011 at 11:37 in
<mailman.1607.1311932260.1164.python-list@python.org>:

> Yeah, I saw that the behavior of a shlex object is slightly different
> from that of the shlex.split() function. By default the minus character
> is returned as an individual token, whereas in split() it is not. The
> get_token() method returns '-' alone, and you have to set the object
> attribute wordchars += '-' to get the same behavior as shlex.split().

To get "almost" the same behaviour as shlex.split(), you only need to set
the shlex object's whitespace_split attribute to True,
but that doesn't solve the bracket problem.

Using shlex object:
>>> import shlex
>>> s = '-option1 [get_rule A1 B2] -option2 $VAR -option3 TAG'
>>> s = s.replace('[','"[')
>>> s = s.replace(']',']"')
>>> optshlex = shlex.shlex(s)
>>> optshlex.whitespace_split = True
>>> options = [option for option in optshlex]
>>> options
['-option1', '"[get_rule A1 B2]"', '-option2', '$VAR', '-option3', 'TAG']

However, there is still a difference from shlex.split():
>>> shlex.split(s)
['-option1', '[get_rule A1 B2]', '-option2', '$VAR', '-option3', 'TAG']

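The difference is that '[get_rule A1 B2]' keeps its quotes: shlex.shlex()
defaults to non-POSIX mode, which leaves the quote characters in the token,
whereas shlex.split() uses POSIX mode and strips them.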
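
A sketch of closing that gap by creating the shlex object in POSIX mode (the variable names are illustrative):

```python
import shlex

s = '-option1 "[get_rule A1 B2]" -option2 $VAR -option3 TAG'

lex = shlex.shlex(s, posix=True)  # POSIX mode strips the quotes
lex.whitespace_split = True       # split on whitespace only
lex.commenters = ''               # shlex.split() also disables comments by default
tokens = list(lex)

print(tokens)
print(tokens == shlex.split(s))
```

For this input the object-based lexer and shlex.split() produce the same list, with '[get_rule A1 B2]' unquoted.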

-- 
Web Dreamer



#10517

From: Karim <karim.liateni@free.fr>
Date: 2011-07-29 16:34 +0200
Message-ID: <mailman.1615.1311950066.1164.python-list@python.org>
In reply to: #10514

On 07/29/2011 03:42 PM, Web Dreamer wrote:
> whitespace_split = True
Thanks for the tip!

Cheers
Karim


