Path: csiph.com!usenet.pasdenom.info!gegeweb.org!de-l.enfer-du-nord.net!feeder2.enfer-du-nord.net!newsfeed.eweka.nl!eweka.nl!feeder3.eweka.nl!newsfeed.xs4all.nl!newsfeed3.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail
To: python-list@python.org
From: Frank Millman <frank@chagford.com>
Subject: Re: Question about ast.literal_eval
Date: Tue, 21 May 2013 08:30:03 +0200
References: <knci07$jol$1@ger.gmane.org> <BLU176-W37EB377C73CD6D7306247DD7AF0@phx.gbl> <knckj0$cft$1@ger.gmane.org> <CAPTjJmpb073XFH5ksNhzRyAJNSJF=6WNq5pGzed4PXqBrJBN=A@mail.gmail.com> <knclk6$lva$1@ger.gmane.org> <mailman.1888.1369056365.3114.python-list@python.org> <519a4b6a$0$29997$c3e8da3$5496439d@news.astraweb.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
User-Agent: Mozilla/5.0 (Windows NT 5.2; rv:17.0) Gecko/20130509 Thunderbird/17.0.6
In-Reply-To: <519a4b6a$0$29997$c3e8da3$5496439d@news.astraweb.com>
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.1905.1369117820.3114.python-list@python.org>
Lines: 64
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:45645

On 20/05/2013 18:12, Steven D'Aprano wrote:
> On Mon, 20 May 2013 15:26:02 +0200, Frank Millman wrote:
>
>> Can anyone see anything wrong with the following approach. I have not
>> definitely decided to do it this way, but I have been experimenting and
>> it seems to work.
>>
[...]
>
> It seems safe to me too, but then any fool can come up with a system
> which they themselves cannot break :-)
>

Thanks for the detailed response.

> I think the real worry is validating the column name. That will be
> critical.

I would not pass the actual column name to eval(), I would use it to 
retrieve a value from a data object and pass that to eval(). However, 
then your point becomes 'validating the value retrieved'. I had not 
thought about that. I will investigate further.

> Personally, I would strongly suggest writing your own mini-
> evaluator that walks the list and evaluates it by hand. It isn't as
> convenient as just calling eval, but *definitely* safer.
>

I am not sure I can wrap my mind around mixed 'and's, 'or's, and brackets.

[Thinking aloud]
Maybe I can manually reduce each internal test to a True or False, 
substitute them in the list, and then call eval() on the result.

eval('(True and False) or (False or True)')

I will experiment with that.

> If you do call eval, make sure you supply the globals and locals
> arguments. The usual way is:
>
> eval(expression, {'__builtins__': None}, {})
>
> which gives you an empty locals() and a minimal, (mostly) safe globals.
>

Thanks - I did not know about that.

> Finally, as a "belt-and-braces" approach, I wouldn't even call eval
> directly, but call a thin wrapper that raises an exception if the
> expression contains an underscore. Underscores are usually the key to
> breaking eval, so refusing to evaluate anything with an underscore raises
> the barrier very high.
>
> And even with all those defences, I wouldn't allow untrusted data from
> the Internet anywhere near this. Just because I can't break it, doesn't
> mean it's safe.
>

All good advice - much appreciated.

Frank