Yet another attempt at a safe eval() call

Started by	Grant Edwards <invalid@invalid.invalid>
First post	2013-01-03 23:25 +0000
Last post	2013-01-04 18:13 +0000
Articles	20 on this page of 27 — 10 participants

  Yet another attempt at a safe eval() call Grant Edwards <invalid@invalid.invalid> - 2013-01-03 23:25 +0000
    Re: Yet another attempt at a safe eval() call Tim Chase <python.list@tim.thechases.com> - 2013-01-03 19:11 -0600
      Re: Yet another attempt at a safe eval() call Grant Edwards <invalid@invalid.invalid> - 2013-01-04 02:34 +0000
    Re: Yet another attempt at a safe eval() call Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-01-04 07:47 +0000
      Re: Yet another attempt at a safe eval() call Grant Edwards <invalid@invalid.invalid> - 2013-01-04 15:53 +0000
        Re: Yet another attempt at a safe eval() call Michael Torrie <torriem@gmail.com> - 2013-01-04 09:05 -0700
          Re: Yet another attempt at a safe eval() call Grant Edwards <invalid@invalid.invalid> - 2013-01-04 16:16 +0000
        Re: Yet another attempt at a safe eval() call Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-01-05 15:56 +0000
          Re: Yet another attempt at a safe eval() call Grant Edwards <invalid@invalid.invalid> - 2013-01-06 15:12 +0000
            Re: Yet another attempt at a safe eval() call Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-01-07 00:08 +0000
        Re: Yet another attempt at a safe eval() call Chris Angelico <rosuav@gmail.com> - 2013-01-06 03:01 +1100
        Re: Yet another attempt at a safe eval() call Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-01-05 16:17 +0000
          Re: Yet another attempt at a safe eval() call matt.newville@gmail.com - 2013-01-05 08:40 -0800
      Re: Yet another attempt at a safe eval() call Grant Edwards <invalid@invalid.invalid> - 2013-01-04 16:38 +0000
        Re: Yet another attempt at a safe eval() call Chris Angelico <rosuav@gmail.com> - 2013-01-05 03:51 +1100
          Re: Yet another attempt at a safe eval() call Grant Edwards <invalid@invalid.invalid> - 2013-01-04 17:14 +0000
            Re: Yet another attempt at a safe eval() call Chris Angelico <rosuav@gmail.com> - 2013-01-05 04:21 +1100
              Re: Yet another attempt at a safe eval() call Grant Edwards <invalid@invalid.invalid> - 2013-01-04 18:09 +0000
                Re: Yet another attempt at a safe eval() call Chris Angelico <rosuav@gmail.com> - 2013-01-05 05:23 +1100
                  Re: Yet another attempt at a safe eval() call Grant Edwards <invalid@invalid.invalid> - 2013-01-04 18:43 +0000
                    Re: Yet another attempt at a safe eval() call Chris Angelico <rosuav@gmail.com> - 2013-01-05 06:02 +1100
    Re: Yet another attempt at a safe eval() call Chris Rebert <clp2@rebertia.com> - 2013-01-03 23:50 -0800
    Re: Yet another attempt at a safe eval() call Terry Reedy <tjreedy@udel.edu> - 2013-01-04 07:24 -0500
      Re: Yet another attempt at a safe eval() call Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-01-04 13:33 +0000
        Re: Yet another attempt at a safe eval() call Grant Edwards <invalid@invalid.invalid> - 2013-01-04 15:59 +0000
        Re: Yet another attempt at a safe eval() call Alister <alister.ware@ntlworld.com> - 2013-01-04 18:13 +0000

#36088 — Yet another attempt at a safe eval() call

From	Grant Edwards <invalid@invalid.invalid>
Date	2013-01-03 23:25 +0000
Subject	Yet another attempt at a safe eval() call
Message-ID	<kc541v$3e4$1@reader1.panix.com>

I've written a small assembler in Python 2.[67], and it needs to
evaluate integer-valued arithmetic expressions in the context of a
symbol table that defines integer values for a set of names.  The
"right" thing is probably an expression parser/evaluator using ast,
but it looked like that would take more code than the rest of the
assembler combined, and I've got other higher-priority tasks to get
back to.

How badly am I deluding myself with the code below?

def lessDangerousEval(expr):
    global symbolTable
    if 'import' in expr:
        raise ParseError("operand expressions are not allowed to contain the string 'import'")
    globals = {'__builtins__': None}
    locals  = symbolTable
    return eval(expr, globals, locals)

I can guarantee that symbolTable is a dict that maps a set of string
symbol names to integer values.

-- 
Grant Edwards               grant.b.edwards        Yow! -- I have seen the
                                  at               FUN --
                              gmail.com

#36097

From	Tim Chase <python.list@tim.thechases.com>
Date	2013-01-03 19:11 -0600
Message-ID	<mailman.69.1357265052.2939.python-list@python.org>
In reply to	#36088

On 01/03/13 17:25, Grant Edwards wrote:
> def lessDangerousEval(expr):
>      global symbolTable
>      if 'import' in expr:
>          raise ParseError("operand expressions are not allowed to contain the string 'import'")
>      globals = {'__builtins__': None}
>      locals  = symbolTable
>      return eval(expr, globals, locals)
>
> I can guarantee that symbolTable is a dict that maps a set of string
> symbol names to integer values.

For what definition of "safe"?  Are CPython segfaults a problem? 
Blowing the stack?  Do you aim to prevent exploitable things like 
system calls or network/file access?

-tkc

#36098

From	Grant Edwards <invalid@invalid.invalid>
Date	2013-01-04 02:34 +0000
Message-ID	<kc5f36$qao$1@reader1.panix.com>
In reply to	#36097

On 2013-01-04, Tim Chase <python.list@tim.thechases.com> wrote:
> On 01/03/13 17:25, Grant Edwards wrote:
>> def lessDangerousEval(expr):
>>      global symbolTable
>>      if 'import' in expr:
>>          raise ParseError("operand expressions are not allowed to contain the string 'import'")
>>      globals = {'__builtins__': None}
>>      locals  = symbolTable
>>      return eval(expr, globals, locals)
>>
>> I can guarantee that symbolTable is a dict that maps a set of string
>> symbol names to integer values.
>
> For what definition of "safe"?  Are CPython segfaults a problem?

Not by themselves, no.

> Blowing the stack?

Not a problem either.  I don't care if the program crashes.  It's a
pretty dumb assembler, and it gives up and exits after the first error
anyway.

> Do you aim to prevent exploitable things like system calls or
> network/file access?

Yes, that's mainly what I was wondering about.

-- 
Grant

#36101

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2013-01-04 07:47 +0000
Message-ID	<50e6891c$0$30003$c3e8da3$5496439d@news.astraweb.com>
In reply to	#36088

On Thu, 03 Jan 2013 23:25:51 +0000, Grant Edwards wrote:

> I've written a small assembler in Python 2.[67], and it needs to
> evaluate integer-valued arithmetic expressions in the context of a
> symbol table that defines integer values for a set of names.  The
> "right" thing is probably an expression parser/evaluator using ast, but
> it looked like that would take more code than the rest of the assembler
> combined, and I've got other higher-priority tasks to get back to.
> 
> How badly am I deluding myself with the code below?

Pretty badly, sorry. See trivial *cough* exploit below.

> def lessDangerousEval(expr):
>     global symbolTable
>     if 'import' in expr:
>         raise ParseError("operand expressions are not allowed to contain the string 'import'")
>     globals = {'__builtins__': None}
>     locals  = symbolTable
>     return eval(expr, globals, locals)
> 
> I can guarantee that symbolTable is a dict that maps a set of string
> symbol names to integer values.

Here's one exploit. I make no promises that it is the simplest such one.

# get access to __import__
s = ("[x for x in (1).__class__.__base__.__subclasses__() "
     "if x.__name__ == 'catch_warnings'][0]()._module"
     ".__builtins__['__imp' + 'ort__']")
# use it to get access to any module we like
t = s + "('os')"
# and then do bad things
urscrewed = t + ".system('echo u r pwned!')"

lessDangerousEval(urscrewed)

At a minimum, I would recommend:

* Do not allow any underscores in the expression being evaluated. Unless 
you absolutely need to support them for names, they can only lead to 
trouble.

* If you must allow underscores, don't allow double underscores. Every 
restriction you apply makes it harder to exploit.

* Since you're evaluating mathematical expressions, there's probably no 
need to allow quotation marks either. They too can only lead to trouble.

* Likewise for dots, since this is *integer* maths.

* Set as short a limit on the length of the string as you can bear; the 
shorter the limit, the shorter any exploit must be, and it is harder to 
write a short exploit than a long exploit.

* But frankly, you should avoid eval, and write your own mini-integer 
arithmetic evaluator which avoids even the most remote possibility of 
exploit.

So, here's my probably-not-safe-either "safe eval":

def probably_not_safe_eval(expr):
    if 'import' in expr.lower():
        raise ParseError("'import' prohibited")
    for c in '_"\'.':
        if c in expr:
            raise ParseError('prohibited char %r' % c)
    if len(expr) > 120:
        raise ParseError('expression too long')
    globals = {'__builtins__': None}
    locals  = symbolTable
    return eval(expr, globals, locals)  # fingers crossed!

I can't think of any way to break out of these restrictions, but that may 
just mean I'm not smart enough.
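
As a point of comparison with the filtered eval() above, the "mini-integer
arithmetic evaluator" from the last bullet might be sketched with the ast
module roughly as follows. This is an illustration, not code from the
thread: the names eval_int_expr, _BIN_OPS and _UNARY_OPS are made up, the
ParseError class is a stand-in for the assembler's own, and the set of
allowed operators is an assumption. It targets Python 3.8+ (ast.Constant);
on the 2.6/2.7 interpreters discussed here the literal nodes are ast.Num
instead.

```python
import ast
import operator

# Whitelisted operators; any node type not listed here is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
    ast.LShift: operator.lshift,
    ast.RShift: operator.rshift,
    ast.BitAnd: operator.and_,
    ast.BitOr: operator.or_,
    ast.BitXor: operator.xor,
}
_UNARY_OPS = {
    ast.USub: operator.neg,
    ast.UAdd: operator.pos,
    ast.Invert: operator.invert,
}


class ParseError(Exception):
    """Stand-in for the assembler's own error type (hypothetical)."""


def eval_int_expr(expr, symbols):
    """Evaluate an integer expression against a dict of name -> int."""
    try:
        tree = ast.parse(expr, mode="eval")
    except (SyntaxError, ValueError) as exc:
        raise ParseError("bad expression: %s" % exc)

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):
            # bool is a subclass of int, so compare the exact type.
            if type(node.value) is int:
                return node.value
            raise ParseError("only integer literals are allowed")
        if isinstance(node, ast.Name):
            if node.id in symbols:
                return symbols[node.id]
            raise ParseError("undefined symbol %r" % node.id)
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        # Calls, attribute access, subscripts, comprehensions, etc.
        raise ParseError("unsupported syntax: %s" % type(node).__name__)

    return walk(tree)
```

With this sketch, eval_int_expr("2 + a*5", {"a": 4}) returns 22, while
anything containing attribute access or a call (such as the exploit string
above) raises ParseError before any of it is executed. It does not guard
against resource exhaustion: a huge shift count or a deeply nested
expression can still make the interpreter run out of memory or stack.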

-- 
Steven

#36113

From	Grant Edwards <invalid@invalid.invalid>
Date	2013-01-04 15:53 +0000
Message-ID	<kc6tu3$s34$1@reader1.panix.com>
In reply to	#36101

On 2013-01-04, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> On Thu, 03 Jan 2013 23:25:51 +0000, Grant Edwards wrote:
>
>> I've written a small assembler in Python 2.[67], and it needs to
>> evaluate integer-valued arithmetic expressions in the context of a
>> symbol table that defines integer values for a set of names.  The
>> "right" thing is probably an expression parser/evaluator using ast, but
>> it looked like that would take more code than the rest of the assembler
>> combined, and I've got other higher-priority tasks to get back to.
>> 
>> How badly am I deluding myself with the code below?
>
> Pretty badly, sorry.

I suspected that was the case.

> See trivial *cough* exploit below.
>
>
>> def lessDangerousEval(expr):
>>     global symbolTable
>>     if 'import' in expr:
>>         raise ParseError("operand expressions are not allowed to contain the string 'import'")
>>     globals = {'__builtins__': None}
>>     locals  = symbolTable
>>     return eval(expr, globals, locals)
>> 
>> I can guarantee that symbolTable is a dict that maps a set of string
>> symbol names to integer values.
>
>
> Here's one exploit. I make no promises that it is the simplest such one.
>
> # get access to __import__
> s = ("[x for x in (1).__class__.__base__.__subclasses__() "
>      "if x.__name__ == 'catch_warnings'][0]()._module"
>      ".__builtins__['__imp' + 'ort__']")
> # use it to get access to any module we like
> t = s + "('os')"
> # and then do bad things
> urscrewed = t + ".system('echo u r pwned!')"
>
> lessDangerousEval(urscrewed)
>
>
> At a minimum, I would recommend:
>
> * Do not allow any underscores in the expression being evaluated.
>   Unless you absolutely need to support them for names, they can only
>   lead to trouble.

I can disallow underscores in names.

> [...]

> * Since you're evaluating mathematical expressions, there's probably
>   no need to allow quotation marks either. They too can only lead to
>   trouble.
>
> * Likewise for dots, since this is *integer* maths.

OK, quotes and dots are out as well.

> * Set as short a limit on the length of the string as you can bear;
>   the shorter the limit, the shorter any exploit must be, and it is
>   harder to write a short exploit than a long exploit.
>
> * But frankly, you should avoid eval, and write your own mini-integer
>   arithmetic evaluator which avoids even the most remote possibility
>   of exploit.

That's obviously the "right" thing to do.  I suppose I should figure
out how to use the ast module.  

> So, here's my probably-not-safe-either "safe eval":
>
>
> def probably_not_safe_eval(expr):
>     if 'import' in expr.lower():
>         raise ParseError("'import' prohibited")
>     for c in '_"\'.':
>         if c in expr:
>             raise ParseError('prohibited char %r' % c)
>     if len(expr) > 120:
>         raise ParseError('expression too long')
>     globals = {'__builtins__': None}
>     locals  = symbolTable
>     return eval(expr, globals, locals)  # fingers crossed!
>
> I can't think of any way to break out of these restrictions, but that may 
> just mean I'm not smart enough.

Thanks!  It's definitely an improvement.

-- 
Grant Edwards               grant.b.edwards        Yow! -- I have seen the
                                  at               FUN --
                              gmail.com

#36115

From	Michael Torrie <torriem@gmail.com>
Date	2013-01-04 09:05 -0700
Message-ID	<mailman.87.1357315539.2939.python-list@python.org>
In reply to	#36113

On 01/04/2013 08:53 AM, Grant Edwards wrote:
> That's obviously the "right" thing to do.  I suppose I should figure
> out how to use the ast module.  

Or PyParsing.

As for your program being "secure" I don't see that there's much to
exploit.  You're not running as a service, and you're not running your
assembler as root, called from a normal user.  The user has your code
and can "exploit" it anytime he wants.

#36117

From	Grant Edwards <invalid@invalid.invalid>
Date	2013-01-04 16:16 +0000
Message-ID	<kc6v89$sp5$1@reader1.panix.com>
In reply to	#36115

On 2013-01-04, Michael Torrie <torriem@gmail.com> wrote:
> On 01/04/2013 08:53 AM, Grant Edwards wrote:
>> That's obviously the "right" thing to do.  I suppose I should figure
>> out how to use the ast module.  
>
> Or PyParsing.
>
> As for your program being "secure" I don't see that there's much to
> exploit.

There isn't.

> You're not running as a service, and you're not running your
> assembler as root, called from a normal user.  The user has your code
> and can "exploit" it anytime he wants.

I'm just trying to prevent surprises for people who are running the
assembler.  We have to assume that they trust the assembler code to
not cause damage intentionally.  But, one would not expect them to
have to worry that assembly language input fed to the assembler code
might cause some sort of collateral damage.

Sure, I can change the source code for gcc so that it wreaks havoc
when I invoke it.  But, using the stock gcc compiler there shouldn't
be any source file I can feed it that will cause it to mail my bank
account info to somebody in Eastern Europe, install a keylogger, and
then remove all my files.

-- 
Grant Edwards               grant.b.edwards        Yow! I have a TINY BOWL in
                                  at               my HEAD
                              gmail.com

#36188

From	Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date	2013-01-05 15:56 +0000
Message-ID	<mailman.126.1357401393.2939.python-list@python.org>
In reply to	#36113

On 4 January 2013 15:53, Grant Edwards <invalid@invalid.invalid> wrote:
> On 2013-01-04, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>> On Thu, 03 Jan 2013 23:25:51 +0000, Grant Edwards wrote:
>>
>> * But frankly, you should avoid eval, and write your own mini-integer
>>   arithmetic evaluator which avoids even the most remote possibility
>>   of exploit.
>
> That's obviously the "right" thing to do.  I suppose I should figure
> out how to use the ast module.

Someone has already created a module that does this called numexpr. Is
there some reason why you don't want to use that?

>>> import numexpr
>>> numexpr.evaluate('2+4*5')
array(22, dtype=int32)
>>> numexpr.evaluate('2+a*5', {'a':4})
array(22L)


Oscar

#36261

From	Grant Edwards <invalid@invalid.invalid>
Date	2013-01-06 15:12 +0000
Message-ID	<kcc49e$aii$1@reader1.panix.com>
In reply to	#36188

On 2013-01-05, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
> On 4 January 2013 15:53, Grant Edwards <invalid@invalid.invalid> wrote:
>> On 2013-01-04, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>>> On Thu, 03 Jan 2013 23:25:51 +0000, Grant Edwards wrote:
>>>
>>> * But frankly, you should avoid eval, and write your own mini-integer
>>>   arithmetic evaluator which avoids even the most remote possibility
>>>   of exploit.
>>
>> That's obviously the "right" thing to do.  I suppose I should figure
>> out how to use the ast module.
>
> Someone has already created a module that does this called numexpr. Is
> there some reason why you don't want to use that?

1) I didn't know about it, and my Googling didn't find it.

2) It's not part of the standard library, and my program needs to be
   distributed as a single source file.
   
-- 
Grant

#36309

From	Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date	2013-01-07 00:08 +0000
Message-ID	<mailman.200.1357517295.2939.python-list@python.org>
In reply to	#36261

On 6 January 2013 15:12, Grant Edwards <invalid@invalid.invalid> wrote:
> On 2013-01-05, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
>> On 4 January 2013 15:53, Grant Edwards <invalid@invalid.invalid> wrote:
>>> On 2013-01-04, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>>>> On Thu, 03 Jan 2013 23:25:51 +0000, Grant Edwards wrote:
>>>>
>>>> * But frankly, you should avoid eval, and write your own mini-integer
>>>>   arithmetic evaluator which avoids even the most remote possibility
>>>>   of exploit.
>>>
>>> That's obviously the "right" thing to do.  I suppose I should figure
>>> out how to use the ast module.
>>
>> Someone has already created a module that does this called numexpr. Is
>> there some reason why you don't want to use that?
>
> 1) I didn't know about it, and my Googling didn't find it.
>
> 2) It's not part of the standard library, and my program needs to be
>    distributed as a single source file.

That's an unfortunate restriction. It also won't be possible to reuse
the code from numexpr (for technical rather than legal reasons).
Perhaps asteval will be more helpful in that sense.

Otherwise presumably the shunting-yard algorithm comes out a little
nicer in Python than in C (it would be useful if something like this
were available on PyPI as a pure Python module):
http://en.wikipedia.org/wiki/Shunting_yard_algorithm#C_example
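
To give a rough idea of how the shunting-yard approach might look in pure
Python, here is a minimal sketch restricted to integer arithmetic. It is
not taken from numexpr, asteval or the Wikipedia example: the function
names (to_rpn, eval_rpn, evaluate), the operator set and the choice to
treat "/" as floor division are all assumptions made for illustration.

```python
import operator
import re

# One token per match: a decimal number, a symbol name, or any other char.
_TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z]\w*)|(.))")
# "u-" is unary minus; it can never collide with a symbol name.
_PREC = {"+": 1, "-": 1, "*": 2, "/": 2, "%": 2, "u-": 3}
_BINARY = {"+": operator.add, "-": operator.sub, "*": operator.mul,
           "/": operator.floordiv, "%": operator.mod}


def to_rpn(expr):
    """Convert an infix expression to a list of RPN tokens."""
    output, stack = [], []
    prev_is_operand = False  # distinguishes unary from binary minus
    for number, name, other in _TOKEN.findall(expr.strip()):
        if number or name:
            output.append(int(number) if number else name)
            prev_is_operand = True
        elif other == "(":
            stack.append(other)
            prev_is_operand = False
        elif other == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced ')'")
            stack.pop()  # discard the matching "("
            prev_is_operand = True
        elif other == "-" and not prev_is_operand:
            stack.append("u-")
        elif other in _PREC:
            # Pop operators that bind at least as tightly (left-associative).
            while (stack and stack[-1] != "("
                   and _PREC[stack[-1]] >= _PREC[other]):
                output.append(stack.pop())
            stack.append(other)
            prev_is_operand = False
        else:
            raise ValueError("unexpected character %r" % other)
    while stack:
        op = stack.pop()
        if op == "(":
            raise ValueError("unbalanced '('")
        output.append(op)
    return output


def eval_rpn(rpn, symbols):
    """Evaluate RPN tokens against a dict of name -> int."""
    stack = []
    try:
        for tok in rpn:
            if isinstance(tok, int):
                stack.append(tok)
            elif tok == "u-":
                stack.append(-stack.pop())
            elif tok in _BINARY:
                right = stack.pop()
                left = stack.pop()
                stack.append(_BINARY[tok](left, right))
            elif tok in symbols:
                stack.append(symbols[tok])
            else:
                raise ValueError("undefined symbol %r" % tok)
    except IndexError:
        raise ValueError("malformed expression")
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


def evaluate(expr, symbols):
    return eval_rpn(to_rpn(expr), symbols)
```

Because nothing here ever reaches eval(), the only things an expression can
do are look up a symbol and apply one of the listed operators. Call-like
input such as "1+2(3+4)" is rejected as malformed rather than executed.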


Oscar

#36191

From	Chris Angelico <rosuav@gmail.com>
Date	2013-01-06 03:01 +1100
Message-ID	<mailman.129.1357401711.2939.python-list@python.org>
In reply to	#36113

On Sun, Jan 6, 2013 at 2:56 AM, Oscar Benjamin
<oscar.j.benjamin@gmail.com> wrote:
> On 4 January 2013 15:53, Grant Edwards <invalid@invalid.invalid> wrote:
>> On 2013-01-04, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>>> On Thu, 03 Jan 2013 23:25:51 +0000, Grant Edwards wrote:
>>>
>>> * But frankly, you should avoid eval, and write your own mini-integer
>>>   arithmetic evaluator which avoids even the most remote possibility
>>>   of exploit.
>>
>> That's obviously the "right" thing to do.  I suppose I should figure
>> out how to use the ast module.
>
> Someone has already created a module that does this called numexpr. Is
> there some reason why you don't want to use that?
>
>>>> import numexpr
>>>> numexpr.evaluate('2+4*5')
> array(22, dtype=int32)
>>>> numexpr.evaluate('2+a*5', {'a':4})
> array(22L)

Is that from PyPI? It's not in my Python 3.3 installation. Obvious
reason not to use it: Unaware of it. :)

ChrisA

#36193

From	Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date	2013-01-05 16:17 +0000
Message-ID	<mailman.131.1357402645.2939.python-list@python.org>
In reply to	#36113

On 5 January 2013 16:01, Chris Angelico <rosuav@gmail.com> wrote:
> On Sun, Jan 6, 2013 at 2:56 AM, Oscar Benjamin
> <oscar.j.benjamin@gmail.com> wrote:
>> On 4 January 2013 15:53, Grant Edwards <invalid@invalid.invalid> wrote:
>>> On 2013-01-04, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>>>> On Thu, 03 Jan 2013 23:25:51 +0000, Grant Edwards wrote:
>>>>
>>>> * But frankly, you should avoid eval, and write your own mini-integer
>>>>   arithmetic evaluator which avoids even the most remote possibility
>>>>   of exploit.
>>>
>>> That's obviously the "right" thing to do.  I suppose I should figure
>>> out how to use the ast module.
>>
>> Someone has already created a module that does this called numexpr. Is
>> there some reason why you don't want to use that?
>>
>>>>> import numexpr
>>>>> numexpr.evaluate('2+4*5')
>> array(22, dtype=int32)
>>>>> numexpr.evaluate('2+a*5', {'a':4})
>> array(22L)
>
> Is that from PyPI? It's not in my Python 3.3 installation. Obvious
> reason not to use it: Unaware of it. :)

My apologies. I should have at least provided a link:
http://code.google.com/p/numexpr/

I installed it from the ubuntu repo under the name python-numexpr. It
is also on PyPI:
http://pypi.python.org/pypi/numexpr

numexpr is a well established project intended primarily for memory
and cache efficient computations over large arrays of data. Possibly
as a side effect, it can also be used to evaluate simple algebraic
expressions involving ordinary scalar variables.

Oscar

#36194

From	matt.newville@gmail.com
Date	2013-01-05 08:40 -0800
Message-ID	<074c2d8b-3734-43f8-b96f-9186506e412f@googlegroups.com>
In reply to	#36193

On Saturday, January 5, 2013 8:17:16 AM UTC-8, Oscar Benjamin wrote:
> On 5 January 2013 16:01, Chris Angelico <rosuav@gmail.com> wrote:
> > On Sun, Jan 6, 2013 at 2:56 AM, Oscar Benjamin
> > <oscar.j.benjamin@gmail.com> wrote:
> >> On 4 January 2013 15:53, Grant Edwards <invalid@invalid.invalid> wrote:
> >>> On 2013-01-04, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> >>>> On Thu, 03 Jan 2013 23:25:51 +0000, Grant Edwards wrote:
> >>>>
> >>>> * But frankly, you should avoid eval, and write your own mini-integer
> >>>>   arithmetic evaluator which avoids even the most remote possibility
> >>>>   of exploit.
> >>>
> >>> That's obviously the "right" thing to do.  I suppose I should figure
> >>> out how to use the ast module.
> >>
> >> Someone has already created a module that does this called numexpr. Is
> >> there some reason why you don't want to use that?
> >>
> >>>>> import numexpr
> >>>>> numexpr.evaluate('2+4*5')
> >> array(22, dtype=int32)
> >>>>> numexpr.evaluate('2+a*5', {'a':4})
> >> array(22L)
> >
> > Is that from PyPI? It's not in my Python 3.3 installation. Obvious
> > reason not to use it: Unaware of it. :)
>
> My apologies. I should have at least provided a link:
> http://code.google.com/p/numexpr/
>
> I installed it from the ubuntu repo under the name python-numexpr. It
> is also on PyPI:
> http://pypi.python.org/pypi/numexpr
>
> numexpr is a well established project intended primarily for memory
> and cache efficient computations over large arrays of data. Possibly
> as a side effect, it can also be used to evaluate simple algebraic
> expressions involving ordinary scalar variables.
>
> Oscar

The asteval module (http://pypi.python.org/pypi/asteval/0.9 and
http://newville.github.com/asteval/) might be another alternative.  It's not as fast as numexpr, but a bit more general. It uses the ast module to "compile" an expression into the AST, then walks through that, intercepting Name nodes and using a flat namespace of variables.  It disallows imports and does not support all Python constructs, but it is fairly complete in its support of Python syntax.

It makes no claim to actually being safe from malicious attack, but it should be safer than a straight eval(), and should prevent accidental problems when evaluating user input as code.  If anyone can find exploits within it, I'd be happy to try to fix them.
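
The "intercepting Name nodes" idea can be illustrated with the standard
library alone. The sketch below is not asteval's code or API; the helper
name undefined_names is hypothetical. It only shows how the names an
expression refers to can be collected from the AST and checked against a
flat namespace before anything is evaluated.

```python
import ast


def undefined_names(expr, namespace):
    """Return the sorted names used in expr that namespace does not define."""
    tree = ast.parse(expr, mode="eval")
    # Every variable reference in the parsed expression is an ast.Name node.
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(used - set(namespace))
```

For example, undefined_names("start + size*2 - pad", {"start": 0, "size": 4})
returns ['pad'], which an assembler could report as an undefined symbol.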

--Matt

#36119

From	Grant Edwards <invalid@invalid.invalid>
Date	2013-01-04 16:38 +0000
Message-ID	<kc70hb$p5$1@reader1.panix.com>
In reply to	#36101

On 2013-01-04, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> On Thu, 03 Jan 2013 23:25:51 +0000, Grant Edwards wrote:
>
>> I've written a small assembler in Python 2.[67], and it needs to
>> evaluate integer-valued arithmetic expressions in the context of a
>> symbol table that defines integer values for a set of names.

[...]

[ my attempt at a safer eval() ]

> So, here's my probably-not-safe-either "safe eval":
>
>
> def probably_not_safe_eval(expr):
>     if 'import' in expr.lower():
>         raise ParseError("'import' prohibited")
>     for c in '_"\'.':
>         if c in expr:
>             raise ParseError('prohibited char %r' % c)
>     if len(expr) > 120:
>         raise ParseError('expression too long')
>     globals = {'__builtins__': None}
>     locals  = symbolTable
>     return eval(expr, globals, locals)  # fingers crossed!
>
> I can't think of any way to break out of these restrictions, but that may 
> just mean I'm not smart enough.

I've added equals, backslash, commas, square/curly brackets, colons
and semicolons to the prohibited character list. I also reduced the
maximum length to 60 characters.  It's unfortunate that parentheses
are overloaded for both expression grouping and for function
calling...

def lessDangerousEval(expr):
    if 'import' in expr.lower():
        raise ParseError("'import' prohibited in expression")
    for c in '_"\'.,;:[]{}=\\':
        if c in expr:
            raise ParseError("prohibited char %r in expression" % c)
    if len(expr) > 60:
        raise ParseError('expression too long')
    globals = {'__builtins__': None}
    locals  = symbolTable
    return eval(expr, globals, locals)  # fingers crossed!

Exploits anyone?    

-- 
Grant Edwards               grant.b.edwards        Yow! I'm ZIPPY the PINHEAD
                                  at               and I'm totally committed
                              gmail.com            to the festive mode.

#36120

From	Chris Angelico <rosuav@gmail.com>
Date	2013-01-05 03:51 +1100
Message-ID	<mailman.89.1357318292.2939.python-list@python.org>
In reply to	#36119

On Sat, Jan 5, 2013 at 3:38 AM, Grant Edwards <invalid@invalid.invalid> wrote:
> I've added equals, backslash, commas, square/curly brackets, colons and semicolons to the
> prohibited character list. I also reduced the maximum length to 60
> characters.  It's unfortunate that parentheses are overloaded for both
> expression grouping and for function calling...

I have to say that an expression evaluator that can't handle parens
for grouping is badly flawed. Can you demand that open parenthesis be
preceded by an operator (or beginning of line)? For instance:

(1+2)*3+4 # Valid
1+2*(3+4) # Valid
1+2(3+4) # Invalid, this will attempt to call 2

You could explain it as a protection against mistaken use of algebraic
notation (in which the last two expressions have the same meaning and
evaluate to 15). Or, alternatively, you could simply insert the
asterisk yourself, though that could potentially be VERY confusing.
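
A crude version of that rule can be sketched with a regular expression.
This is only an illustration of the check described above, under the
assumption that the operators are the usual arithmetic and bitwise ones;
the names _CALL_LIKE and has_call_syntax are made up. It flags any open
parenthesis that directly follows something other than an operator or
another open parenthesis.

```python
import re

# An open paren is acceptable at the start of the expression, after another
# "(" or after an operator. Anything else before it (a digit, a name, a
# closing paren) would make Python treat the paren as a call.
_CALL_LIKE = re.compile(r"[^\s(+\-*/%&|^~<>]\s*\(")


def has_call_syntax(expr):
    """True if some '(' in expr would be parsed as a function call."""
    return _CALL_LIKE.search(expr) is not None
```

With this sketch, has_call_syntax("1+2*(3+4)") is False and
has_call_syntax("1+2(3+4)") is True. It says nothing about the other
dangerous constructs (attribute access, subscripts and so on), so it would
complement the prohibited-character list rather than replace it.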

Without parentheses, your users will be forced to store intermediate
results in variables, which gets tiresome fast.

discriminant = b*b-4*a*c
denominator = 2*a
# Okay, this expression demands a square rooting, but let's pretend that's done.
sol1 = -b+discriminant
sol2 = -b-discriminant
sol1 = sol1/denominator
sol2 /= denominator # if they know about augmented assignment

You can probably recognize the formula I'm working with there, but
it's far less obvious and involves six separate statements rather than
two. And this is a fairly simple formula. It'll get a lot worse in
production.

ChrisA

#36122

From	Grant Edwards <invalid@invalid.invalid>
Date	2013-01-04 17:14 +0000
Message-ID	<kc72ls$3m7$1@reader1.panix.com>
In reply to	#36120

On 2013-01-04, Chris Angelico <rosuav@gmail.com> wrote:
> On Sat, Jan 5, 2013 at 3:38 AM, Grant Edwards <invalid@invalid.invalid> wrote:

>> I've added equals, backslash, commas, square/curly brackets, colons
>> and semicolons to the prohibited character list. I also reduced the
>> maximum length to 60 characters.  It's unfortunate that parentheses
>> are overloaded for both expression grouping and for function
>> calling...
>
> I have to say that an expression evaluator that can't handle parens
> for grouping is badly flawed.

Indeed.  That's why I didn't disallow parens.

What I was implying was that since you have to allow parens for
grouping, there's no simple way to disallow function calls.

> Can you demand that open parenthesis be preceded by an operator (or
> beginning of line)?

Yes, but once you've parsed the expression to the point where you can
enforce rules like that, you're probably most of the way to doing the
"right" thing and evaluating the expression using ast or pyparsing or
similar.
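
To illustrate the point: once the expression has been parsed, spotting a function call takes only a walk over the tree. A rough Python 3 sketch using the standard `ast` module; the helper name `contains_call` is hypothetical.

```python
import ast

def contains_call(expr):
    """Return True if the expression contains a call anywhere in it."""
    tree = ast.parse(expr, mode="eval")  # raises SyntaxError on bad input
    return any(isinstance(node, ast.Call) for node in ast.walk(tree))

for text in ("1+2*(3+4)", "1+2(3+4)", "abs(1)"):
    print(text, contains_call(text))
```

Having walked the tree this far, evaluating the permitted nodes directly is only a small further step.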

> You can probably recognize the formula I'm working with there, but
> it's far less obvious and involves six separate statements rather than
> two. And this is a fairly simple formula. It'll get a lot worse in
> production.

In the general case, yes.  For this assembler I could _probably_ get
by with expressions of the form <symbol> <op> <literal> where op is
'+' or '-'.  But, whenever I try to come up with a minimal solution
like that, it tends to get "enhanced" over the years until it's a
complete mess, doesn't work quite right, and took more total man-hours
than a general and "permanent" solution would have.
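
The minimal form mentioned above, <symbol> <op> <literal> with op being '+' or '-', could look roughly like this. It is a sketch under assumptions: the name `eval_simple`, passing the symbol table as a dict, and accepting only decimal and 0x-prefixed literals are all choices made for the example.

```python
import re

# <symbol> <op> <literal>: an identifier, '+' or '-', then a decimal or
# 0x-prefixed hexadecimal integer literal.
_SIMPLE = re.compile(
    r"^\s*([A-Za-z_]\w*)\s*([+-])\s*(0[xX][0-9A-Fa-f]+|0|[1-9]\d*)\s*$")

def eval_simple(expr, symbols):
    """Evaluate '<symbol> <op> <literal>' against a dict of symbol values."""
    match = _SIMPLE.match(expr)
    if match is None:
        raise ValueError("expected <symbol> <op> <literal>: %r" % expr)
    name, op, literal = match.groups()
    if name not in symbols:
        raise ValueError("name %r undefined" % name)
    value = int(literal, 0)  # base 0 accepts both "16" and "0x10"
    return symbols[name] + value if op == "+" else symbols[name] - value

print(hex(eval_simple("start + 0x10", {"start": 0x100})))
```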

Some might argue that repeated tweaking of and adding limitations to
a "safe eval" is just heading down that same road in a different car.
They'd probably be right: in the end, it will probably have been less
work to just do it with ast.  But it's still interesting to try. :)

-- 
Grant Edwards               grant.b.edwards        Yow! Are you the
                                  at               self-frying president?
                              gmail.com

#36123

From	Chris Angelico <rosuav@gmail.com>
Date	2013-01-05 04:21 +1100
Message-ID	<mailman.91.1357320101.2939.python-list@python.org>
In reply to	#36122

On Sat, Jan 5, 2013 at 4:14 AM, Grant Edwards <invalid@invalid.invalid> wrote:
> On 2013-01-04, Chris Angelico <rosuav@gmail.com> wrote:
>> On Sat, Jan 5, 2013 at 3:38 AM, Grant Edwards <invalid@invalid.invalid> wrote:
>
>>> I've added equals, backslash, commas, square/curly brackets, colons
>>> and semicolons to the prohibited character list. I also reduced the
>>> maximum length to 60 characters.  It's unfortunate that parentheses
>>> are overloaded for both expression grouping and for function
>>> calling...
>>
>> I have to say that an expression evaluator that can't handle parens
>> for grouping is badly flawed.
>
> Indeed.  That's why I didn't disallow parens.
>
> What I was implying was that since you have to allow parens for
> grouping, there's no simple way to disallow function calls.

Yeah, and a safe evaluator that allows function calls is highly vulnerable.

>> Can you demand that open parenthesis be preceded by an operator (or
>> beginning of line)?
>
> Yes, but once you've parsed the expression to the point where you can
> enforce rules like that, you're probably most of the way to doing the
> "right" thing and evaluating the expression using ast or pyparsing or
> similar.
>
> Some might argue that repeated tweaking of and adding limitations to
> a "safe eval" is just heading down that same road in a different car.
> They'd probably be right: in the end, it will probably have been less
> work to just do it with ast.  But it's still interesting to try. :)

Yep, have fun with it. As mentioned earlier, though, security isn't
all that critical; so in this case, chances are you can just leave
parens permitted and let function calls potentially happen.

ChrisA

#36126

From	Grant Edwards <invalid@invalid.invalid>
Date	2013-01-04 18:09 +0000
Message-ID	<kc75su$7es$1@reader1.panix.com>
In reply to	#36123

On 2013-01-04, Chris Angelico <rosuav@gmail.com> wrote:
> On Sat, Jan 5, 2013 at 4:14 AM, Grant Edwards <invalid@invalid.invalid> wrote:
>> On 2013-01-04, Chris Angelico <rosuav@gmail.com> wrote:
>>> On Sat, Jan 5, 2013 at 3:38 AM, Grant Edwards <invalid@invalid.invalid> wrote:
>>
>>>> I've added equals, backslash, commas, square/curly brackets, colons
>>>> and semicolons to the prohibited character list. I also reduced the
>>>> maximum length to 60 characters.  It's unfortunate that parentheses
>>>> are overloaded for both expression grouping and for function
>>>> calling...
>>>
>>> I have to say that an expression evaluator that can't handle parens
>>> for grouping is badly flawed.
>>
>> Indeed.  That's why I didn't disallow parens.
>>
>> What I was implying was that since you have to allow parens for
>> grouping, there's no simple way to disallow function calls.
>
> Yeah, and a safe evaluator that allows function calls is highly vulnerable.
>
>>> Can you demand that open parenthesis be preceded by an operator (or
>>> beginning of line)?
>>
>> Yes, but once you've parsed the expression to the point where you can
>> enforce rules like that, you're probably most of the way to doing the
>> "right" thing and evaluating the expression using ast or pyparsing or
>> similar.
>>
>> Some might argue that repeated tweaking of and adding limitations to
>> a "safe eval" is just heading down that same road in a different car.
>> They'd probably be right: in the end, it will probably have been less
>> work to just do it with ast.  But it's still interesting to try. :)
>
> Yep, have fun with it. As mentioned earlier, though, security isn't
> all that critical; so in this case, chances are you can just leave
> parens permitted and let function calls potentially happen.

An ast-based evaluator wasn't as complicated as I first thought: the
examples I'd been looking at implemented far more features than I
needed.  This morning I found a simpler example at

  http://stackoverflow.com/questions/2371436/evaluating-a-mathematical-expression-in-a-string

The error messages are still pretty cryptic, so improving
that will add a few more lines.  One nice thing about the ast code is
that it's simple to add code to allow C-like character constants such
that ('A' === 0x41).  Here's the first pass at ast-based code:

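# Written for Python 2: operator.idiv does not exist in Python 3.
# symbolTable (a dict of names to values) and ParseError are not defined
# here; presumably they come from the rest of the assembler.
import ast, operator

# Map each permitted ast operator node type to the function implementing it.
operators = \
    {
    ast.Add:    operator.iadd,
    ast.Sub:    operator.isub,
    ast.Mult:   operator.imul,
    ast.Div:    operator.idiv,
    ast.BitXor: operator.ixor,
    ast.BitAnd: operator.iand,
    ast.BitOr:  operator.ior,
    ast.LShift: operator.lshift,
    ast.RShift: operator.rshift,
    ast.Invert: operator.invert,
    ast.USub:   operator.neg,
    ast.UAdd:   operator.pos,
    }

def _eval_expr(node):
    global symbolTable
    if isinstance(node, ast.Name):
        # Symbol reference: look it up in the symbol table.
        if node.id not in symbolTable:
            raise ParseError("name '%s' undefined" % node.id)
        return symbolTable[node.id]
    elif isinstance(node, ast.Num):
        # Numeric literal.
        return node.n
    elif isinstance(node, ast.operator) or isinstance(node, ast.unaryop):
        # Operator node: return its function. An operator that is not in
        # the table (ast.Pow, for example) raises KeyError here.
        return operators[type(node)]
    elif isinstance(node, ast.BinOp):
        return _eval_expr(node.op)(_eval_expr(node.left), _eval_expr(node.right))
    elif isinstance(node, ast.UnaryOp):
        return _eval_expr(node.op)(_eval_expr(node.operand))
    else:
        # Anything else (calls, attribute access, strings, ...) is rejected.
        raise ParseError("error parsing expression at node %s" % node)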

def eval_expr(expr):
    return _eval_expr(ast.parse(expr).body[0].value)
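
The same idea can be sketched for Python 3, including the C-like character constants mentioned above. This is an illustrative port and not the code that went into the assembler: the `ParseError` class is a stand-in, the symbol table is passed in as a dict, and mapping `/` to floor division is an assumption that matches what Python 2 did for ints.

```python
import ast
import operator

class ParseError(Exception):
    """Stand-in for the assembler's own exception (an assumption)."""

_BINARY_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.floordiv,  # assumption: integer division
    ast.BitXor: operator.xor, ast.BitAnd: operator.and_,
    ast.BitOr: operator.or_,
    ast.LShift: operator.lshift, ast.RShift: operator.rshift,
}
_UNARY_OPS = {
    ast.Invert: operator.invert, ast.USub: operator.neg,
    ast.UAdd: operator.pos,
}

def _eval(node, symbols):
    if isinstance(node, ast.Name):
        if node.id not in symbols:
            raise ParseError("name '%s' undefined" % node.id)
        return symbols[node.id]
    if isinstance(node, ast.Constant):
        value = node.value
        if type(value) is int:  # excludes True/False
            return value
        if isinstance(value, str) and len(value) == 1:
            return ord(value)  # character constant: 'A' is 0x41
        raise ParseError("unsupported constant %r" % (value,))
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        func = _BINARY_OPS[type(node.op)]
        return func(_eval(node.left, symbols), _eval(node.right, symbols))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval(node.operand, symbols))
    # Calls, attribute access, ** and everything else end up here.
    raise ParseError("unsupported syntax: %s" % type(node).__name__)

def eval_expr(expr, symbols):
    """Evaluate an integer expression against a dict of symbol values."""
    try:
        tree = ast.parse(expr, mode="eval")
    except SyntaxError as exc:
        raise ParseError("syntax error in %r: %s" % (expr, exc.msg))
    return _eval(tree.body, symbols)

print(hex(eval_expr("base + 'A'", {"base": 0x100})))
```

Only node types listed explicitly are evaluated, so function calls are rejected without any character blacklist. Nothing here limits the size of intermediate results (a large left shift, say).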
    

-- 
Grant Edwards               grant.b.edwards        Yow! A can of ASPARAGUS,
                                  at               73 pigeons, some LIVE ammo,
                              gmail.com            and a FROZEN DAQUIRI!!

#36129

From	Chris Angelico <rosuav@gmail.com>
Date	2013-01-05 05:23 +1100
Message-ID	<mailman.94.1357323816.2939.python-list@python.org>
In reply to	#36126

On Sat, Jan 5, 2013 at 5:09 AM, Grant Edwards <invalid@invalid.invalid> wrote:
> The error messages are still pretty cryptic, so improving
> that will add a few more lines.  One nice thing about the ast code is
> that it's simple to add code to allow C-like character constants such
> that ('A' === 0x41).  Here's the first pass at ast-based code:

Looks cool, and fairly neat! Now I wonder, is it possible to use that
to create new operators, such as the letter d? Binary operator, takes
two integers...

ChrisA
