


Untrusted code execution

Started by: Jon Ribbens <jon+usenet@unequivocal.co.uk>
First post: 2016-04-03 21:12 +0000
Last post: 2016-04-08 10:10 +0200
Articles: 20 on this page of 25 (8 participants)



Contents

  Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-03 21:12 +0000
    Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 13:46 +0000
      Re: Untrusted code execution Rustom Mody <rustompmody@gmail.com> - 2016-04-05 07:17 -0700
        Re: Untrusted code execution Ian Kelly <ian.g.kelly@gmail.com> - 2016-04-05 08:50 -0600
        Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 17:26 +0000
          Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 18:50 +0000
            Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 19:14 +0000
          Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 19:13 +0000
          Re: Untrusted code execution Steven D'Aprano <steve@pearwood.info> - 2016-04-06 11:43 +1000
            Re: Untrusted code execution Random832 <random832@fastmail.com> - 2016-04-06 09:14 -0400
              Re: Untrusted code execution Steven D'Aprano <steve@pearwood.info> - 2016-04-07 11:45 +1000
                Re: Untrusted code execution Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-04-07 14:48 +1000
                Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-07 15:18 +0000
                  Re: Untrusted code execution Steven D'Aprano <steve@pearwood.info> - 2016-04-08 15:28 +1000
            Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-07 12:13 +0000
              Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-07 14:25 +0000
                Re: Untrusted code execution Steven D'Aprano <steve@pearwood.info> - 2016-04-08 15:26 +1000
              Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-07 17:20 +0000
                Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-07 17:35 +0000
                Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-10 17:06 +0000
        Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 17:40 +0000
      Re: Untrusted code execution Paul Rubin <no.email@nospam.invalid> - 2016-04-05 13:39 -0700
        Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 21:13 +0000
          Re: Untrusted code execution Paul Rubin <no.email@nospam.invalid> - 2016-04-07 00:08 -0700
            Re: Untrusted code execution Lele Gaifax <lele@metapensiero.it> - 2016-04-08 10:10 +0200



#106410 — Untrusted code execution

From: Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date: 2016-04-03 21:12 +0000
Subject: Untrusted code execution
Message-ID: <slrnng31v9.19u.jon+usenet@wintry.unequivocal.co.uk>

I'd just like to say up front that this is more of a thought experiment
than anything else; I don't have any plans to use this idea on any
genuinely untrusted code. Apart from anything else, there's the
denial-of-service issue.

That said, is there any way that the following Python 3.4 code could
result in an arbitrary code execution security hole?

    tree = compile(untrusted_code, "<script>", "eval", ast.PyCF_ONLY_AST)
    for node in ast.walk(tree):
        if (isinstance(node, ast.Name) and node.id.startswith("_") or
            isinstance(node, ast.Attribute) and node.attr.startswith("_")):
                raise ValueError("Access to private values is not allowed.")
    namespace = {"__builtins__": {"int": int, "str": str, "len": len}}
    print(eval(compile(tree, "<script>", "eval"), namespace))
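To experiment with the snippet, it can be wrapped in a function. This is only a sketch: the name `restricted_eval` and the `ALLOWED_BUILTINS` constant are hypothetical, introduced here for illustration, and the logic is otherwise the same seven lines.

```python
import ast

# Hypothetical name: the three builtins the original post exposes.
ALLOWED_BUILTINS = {"int": int, "str": str, "len": len}

def restricted_eval(untrusted_code):
    """Parse, reject underscore-prefixed names/attributes, then evaluate."""
    tree = compile(untrusted_code, "<script>", "eval", ast.PyCF_ONLY_AST)
    for node in ast.walk(tree):
        if (isinstance(node, ast.Name) and node.id.startswith("_") or
                isinstance(node, ast.Attribute) and node.attr.startswith("_")):
            raise ValueError("Access to private values is not allowed.")
    # A fresh dict per call, so one evaluation cannot alter the next one's builtins.
    namespace = {"__builtins__": dict(ALLOWED_BUILTINS)}
    return eval(compile(tree, "<script>", "eval"), namespace)

print(restricted_eval("len(str(2 ** 10))"))
```

With this wrapper, `restricted_eval("''.__class__")` raises `ValueError`, and `restricted_eval("open('x')")` fails with `NameError` because `open` is not in the namespace.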



#106519

From: Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date: 2016-04-05 13:46 +0000
Message-ID: <slrnng7gj4.19u.jon+usenet@wintry.unequivocal.co.uk>
In reply to: #106410

On 2016-04-03, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:
> I'd just like to say up front that this is more of a thought experiment
> than anything else, I don't have any plans to use this idea on any
> genuinely untrusted code. Apart from anything else, there's the
> denial-of-service issue.
>
> That said, is there any way that the following Python 3.4 code could
> result in a arbitrary code execution security hole?
>
>     tree = compile(untrusted_code, "<script>", "eval", ast.PyCF_ONLY_AST)
>     for node in ast.walk(tree):
>         if (isinstance(node, ast.Name) and node.id.startswith("_") or
>             isinstance(node, ast.Attribute) and node.attr.startswith("_")):
>                 raise ValueError("Access to private values is not allowed.")
>     namespace = {"__builtins__": {"int": int, "str": str, "len": len}}
>     print(eval(compile(tree, "<script>", "eval"), namespace))

Nobody has any thoughts on this at all?



#106521

From: Rustom Mody <rustompmody@gmail.com>
Date: 2016-04-05 07:17 -0700
Message-ID: <58de161a-ecdb-4ada-aab5-871876ea1574@googlegroups.com>
In reply to: #106519

On Tuesday, April 5, 2016 at 7:19:39 PM UTC+5:30, Jon Ribbens wrote:
> On 2016-04-03, Jon Ribbens wrote:
> > I'd just like to say up front that this is more of a thought experiment
> > than anything else, I don't have any plans to use this idea on any
> > genuinely untrusted code. Apart from anything else, there's the
> > denial-of-service issue.
> >
> > That said, is there any way that the following Python 3.4 code could
> > result in a arbitrary code execution security hole?
> >
> >     tree = compile(untrusted_code, "<script>", "eval", ast.PyCF_ONLY_AST)
> >     for node in ast.walk(tree):
> >         if (isinstance(node, ast.Name) and node.id.startswith("_") or
> >             isinstance(node, ast.Attribute) and node.attr.startswith("_")):
> >                 raise ValueError("Access to private values is not allowed.")
> >     namespace = {"__builtins__": {"int": int, "str": str, "len": len}}
> >     print(eval(compile(tree, "<script>", "eval"), namespace))
> 
> Nobody has any thoughts on this at all?

I actually did...

But I don't know enough of the AST API to figure out what you are trying to do, trying to avoid, etc.



#106522

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2016-04-05 08:50 -0600
Message-ID: <mailman.71.1459867858.32530.python-list@python.org>
In reply to: #106521

On Tue, Apr 5, 2016 at 8:17 AM, Rustom Mody <rustompmody@gmail.com> wrote:
> On Tuesday, April 5, 2016 at 7:19:39 PM UTC+5:30, Jon Ribbens wrote:
>> On 2016-04-03, Jon Ribbens wrote:
>> > I'd just like to say up front that this is more of a thought experiment
>> > than anything else, I don't have any plans to use this idea on any
>> > genuinely untrusted code. Apart from anything else, there's the
>> > denial-of-service issue.
>> >
>> > That said, is there any way that the following Python 3.4 code could
>> > result in a arbitrary code execution security hole?
>> >
>> >     tree = compile(untrusted_code, "<script>", "eval", ast.PyCF_ONLY_AST)
>> >     for node in ast.walk(tree):
>> >         if (isinstance(node, ast.Name) and node.id.startswith("_") or
>> >             isinstance(node, ast.Attribute) and node.attr.startswith("_")):
>> >                 raise ValueError("Access to private values is not allowed.")
>> >     namespace = {"__builtins__": {"int": int, "str": str, "len": len}}
>> >     print(eval(compile(tree, "<script>", "eval"), namespace))
>>
>> Nobody has any thoughts on this at all?
>
> i actually did...
>
> But dont know enough of the AST API to figure out what you are trying/avoiding etc

Same here, although it looks to me like this approach could work. Or
I'm just not clever enough to see how it could be exploited.



#106534

From: Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date: 2016-04-05 17:26 +0000
Message-ID: <slrnng7tgu.19u.jon+usenet@wintry.unequivocal.co.uk>
In reply to: #106521

On 2016-04-05, Rustom Mody <rustompmody@gmail.com> wrote:
> On Tuesday, April 5, 2016 at 7:19:39 PM UTC+5:30, Jon Ribbens wrote:
>> On 2016-04-03, Jon Ribbens wrote:
>> > I'd just like to say up front that this is more of a thought experiment
>> > than anything else, I don't have any plans to use this idea on any
>> > genuinely untrusted code. Apart from anything else, there's the
>> > denial-of-service issue.
>> >
>> > That said, is there any way that the following Python 3.4 code could
>> > result in a arbitrary code execution security hole?
>> >
>> >     tree = compile(untrusted_code, "<script>", "eval", ast.PyCF_ONLY_AST)
>> >     for node in ast.walk(tree):
>> >         if (isinstance(node, ast.Name) and node.id.startswith("_") or
>> >             isinstance(node, ast.Attribute) and node.attr.startswith("_")):
>> >                 raise ValueError("Access to private values is not allowed.")
>> >     namespace = {"__builtins__": {"int": int, "str": str, "len": len}}
>> >     print(eval(compile(tree, "<script>", "eval"), namespace))
>> 
>> Nobody has any thoughts on this at all?
>
> i actually did...
>
> But dont know enough of the AST API to figure out what you are
> trying/avoiding etc

The idea is that it should prevent you from accessing any names or
attributes that start with an underscore.

Python 2 used to have an 'rexec' module that was supposed to allow
restricted execution of code. This was deprecated because it was
insecure and in Python 3 it was completely removed.

Generally speaking, the way that people broke out of restricted
environments was to try to get access to the unrestricted "builtins"
so they could use __import__(), open(), the code type, etc., which then
means they can do almost anything, or at least mess with the file system.

The way that you went looking for "builtins" was to wander about the
object tree by doing things like:
  "".__class__.__base__.__subclasses__()
etc and, e.g. finding a function and doing
  func.__globals__["__builtins__"]
or perhaps creating a raw code object via:
  (lambda a: 1).__code__.__class__(...arbitrary code...)()

The point is that as far as I can see basically all such techniques
are completely prevented if you cannot access "__" attributes (let
alone "_" ones). 
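As a rough illustration, the lookups listed above can be run through the same underscore test to confirm that each is rejected at the AST stage. The helper name `touches_private` is hypothetical; only the standard `ast` module is used, and nothing is evaluated.

```python
import ast

def touches_private(source):
    """Return True if any Name or Attribute in the expression starts with "_"."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id.startswith("_"):
            return True
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            return True
    return False

# The three escape routes described in the post (parsed only, never executed).
ESCAPE_ATTEMPTS = [
    '"".__class__.__base__.__subclasses__()',
    'func.__globals__["__builtins__"]',
    '(lambda a: 1).__code__.__class__',
]

for attempt in ESCAPE_ATTEMPTS:
    print(touches_private(attempt), attempt)

# An ordinary public method call is not flagged.
print(touches_private('"abc".upper()'))
```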

The received wisdom is that restricted code execution in Python is
an insolubly hard problem, but it looks a bit like my 7-line example
above disproves this theory, provided you choose carefully what you
provide in your restricted __builtins__ - but people who know more
than me about Python seem to have thought about this problem for
longer than I have and come up with the opposite conclusion so I'm
curious what I'm missing.

(Again this all comes with the caveat that DoS is always easy -
just something trivial like '"*"*10**10**10' is going to break things.
But DoS is a far cry from remote code execution.)



#106538

From: Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date: 2016-04-05 18:50 +0000
Message-ID: <slrnng82eo.19u.jon+usenet@wintry.unequivocal.co.uk>
In reply to: #106534

On 2016-04-05, Chris Angelico <rosuav@gmail.com> wrote:
> On Wed, Apr 6, 2016 at 3:26 AM, Jon Ribbens
> <jon+usenet@unequivocal.co.uk> wrote:
>> The received wisdom is that restricted code execution in Python is
>> an insolubly hard problem, but it looks a bit like my 7-line example
>> above disproves this theory, provided you choose carefully what you
>> provide in your restricted __builtins__ - but people who knows more
>> than me about Python seem to have thought about this problem for
>> longer than I have and come up with the opposite conclusion so I'm
>> curious what I'm missing.
>
> No, it doesn't disprove anything. All you've shown is "here's a piece
> of code that hasn't yet been compromised". :)

Yes, obviously. I wasn't asking for pedantry.

> Your code is a *lot* safer for using 'eval' rather than 'exec'.
> Otherwise, you'd be easily exploited using exceptions, which carry a
> ton of info.

... but all in attributes that don't start with "_", as far as I can see.

I think a very similar approach would work with 'exec' too, just you
would obviously have to disallow ast.Import and ast.ImportFrom.

> But even so, I would not bet money (much less the security of my
> systems) on this being safe.

I wasn't planning on betting any money ;-)



#106540

From: Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date: 2016-04-05 19:14 +0000
Message-ID: <slrnng83pt.19u.jon+usenet@wintry.unequivocal.co.uk>
In reply to: #106538

On 2016-04-05, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:
> On 2016-04-05, Chris Angelico <rosuav@gmail.com> wrote:
>> Your code is a *lot* safer for using 'eval' rather than 'exec'.
>> Otherwise, you'd be easily exploited using exceptions, which carry a
>> ton of info.
>
> ... but all in attributes that don't start with "_", as far as I can see.

Sorry, obviously I meant "that *do* start with '_'".
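To make the exception route concrete, here is a small sketch in ordinary, unrestricted Python. The function name is hypothetical. It shows that a caught exception leads to a frame's globals through `__traceback__`, and that of the three attributes in the chain only the first begins with an underscore, so the filter stands or falls on blocking that first hop.

```python
import ast

def frame_globals_via_exception():
    try:
        1 / 0
    except ZeroDivisionError as exc:
        # exception -> traceback object -> frame -> that frame's global namespace
        return exc.__traceback__.tb_frame.f_globals

leaked = frame_globals_via_exception()
print("frame_globals_via_exception" in leaked)

# Which attribute names in the chain would the underscore rule reject?
chain = ast.parse("exc.__traceback__.tb_frame.f_globals", mode="eval")
attrs = [n.attr for n in ast.walk(chain) if isinstance(n, ast.Attribute)]
blocked = [name for name in attrs if name.startswith("_")]
print(blocked)
```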



#106539

From: Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date: 2016-04-05 19:13 +0000
Message-ID: <slrnng83o7.19u.jon+usenet@wintry.unequivocal.co.uk>
In reply to: #106534

On 2016-04-05, Chris Angelico <rosuav@gmail.com> wrote:
> You can also create objects of various types using literal/display
> syntax, and that might let you craft some weird construct that
> effectively access those attributes without actually having an
> attribute that starts with an underscore. (Think of "getattr(x,
> '\x5f_class__')", although obviously it'll take more work than that,
> since getattr itself isn't available.)

Indeed. Although I think it would be safe to add a "proxy" getattr()
to the namespace's __builtins__ that just checked if the first
character of "name" was "_" and if so raised an AttributeError or
somesuch, and otherwise passed straight through to the real getattr(),
e.g.:

    def proxy_getattr(obj, name, *args):
        if type(name) is str and not name.startswith("_"):
            return getattr(obj, name, *args)
        raise AttributeError("Not allowed to access private attributes")
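A short usage sketch, assuming the proxy is exposed to the sandbox under the name `getattr` (that wiring is an assumption, not something the post spells out). The `type(name) is str` test matters because a `str` subclass could override `startswith`.

```python
def proxy_getattr(obj, name, *args):
    # Only plain str names that do not start with "_" reach the real getattr().
    if type(name) is str and not name.startswith("_"):
        return getattr(obj, name, *args)
    raise AttributeError("Not allowed to access private attributes")

namespace = {"__builtins__": {"int": int, "str": str, "len": len,
                              "getattr": proxy_getattr}}

# Public attribute: passes straight through.
print(eval('getattr("abc", "upper")()', namespace))

# The optional default argument still works.
print(eval('getattr("abc", "missing", "fallback")', namespace))

# The escaped spelling of "__class__" is caught at call time, not by the AST check.
try:
    eval('getattr("abc", "\\x5f_class__")', namespace)
except AttributeError as exc:
    print("blocked:", exc)
```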



#106554

From: Steven D'Aprano <steve@pearwood.info>
Date: 2016-04-06 11:43 +1000
Message-ID: <570469d7$0$1598$c3e8da3$5496439d@news.astraweb.com>
In reply to: #106534

On Wed, 6 Apr 2016 03:48 am, Chris Angelico wrote:

> On Wed, Apr 6, 2016 at 3:26 AM, Jon Ribbens
> <jon+usenet@unequivocal.co.uk> wrote:
>> The received wisdom is that restricted code execution in Python is
>> an insolubly hard problem, but it looks a bit like my 7-line example
>> above disproves this theory, 

Jon's 7-line example doesn't come even close to providing restricted code
execution in Python. What it provides is a restricted subset of expression
evaluation, which is *much* easier. It's barely more powerful than the
ast.safe_eval function. That doesn't make it useless, but it does mean that
it's not solving the problem of "restricted code execution".


[Jon again]
>> provided you choose carefully what you 
>> provide in your restricted __builtins__ - but people who knows more
>> than me about Python seem to have thought about this problem for
>> longer than I have and come up with the opposite conclusion so I'm
>> curious what I'm missing.

You're missing that they're trying to allow enough Python functionality to
run useful scripts (not just evaluate a few arithmetic expressions), but
without allowing the script to break out of the restricted environment and
do things which aren't permitted.

For example, check out Tav's admirable work some years ago on trying to
allow Python code to read but not write files:

http://tav.espians.com/a-challenge-to-break-python-security.html

and followups to that post here:

http://tav.espians.com/paving-the-way-to-securing-the-python-interpreter.html
http://tav.espians.com/update-on-securing-the-python-interpreter.html


You should also read Guido's comments on capabilities:

http://neopythonic.blogspot.com.au/2009/03/capabilities-for-python.html

As Zooko says, Guido's "best argument is that reducing usability (in terms
of forbidding language features, especially module import) and reducing the
usefulness of extant library code" would make the resulting interpreter too
feeble to be useful.

Look at what you've done: you've restricted the entire world of Python down
to, effectively, a calculator and a few string methods. That's not to say
that a calculator and a few string methods won't be useful to someone, but
the next Javascript it is not...


[Chris]
> No, it doesn't disprove anything. All you've shown is "here's a piece
> of code that hasn't yet been compromised". :) What you're missing is a
> demonstrated exploit against your code. I can't provide one, but it's
> entirely possible that one will be found.
> 
> Your code is a *lot* safer for using 'eval' rather than 'exec'.
> Otherwise, you'd be easily exploited using exceptions, which carry a
> ton of info. But even so, I would not bet money (much less the
> security of my systems) on this being safe.

I think Jon is on the right approach here for restricting evaluation of
expressions, which is a nicely constrained and small subset of Python. He's
not allowing unrestricted arbitrary code execution: he has a very
restricted (too restricted -- what the hell can you do with just int, str
and len?) whitelist of functions that are allowed, and the significant
restriction that dunder names aren't allowed.

This makes his function a tiny DSL for calculators and equivalent. I think
that, if it checks out, it would make a good addition to the standard
library.


All the obvious, and even not-so-obvious, attack tools are gone: eval, exec,
getattr, type, __import__. Because you're not supporting Python 2, the
various func.func_* attack surfaces are all gone. Since you can't access
dunders directly, and the obvious indirect methods like eval and getattr
aren't available, I don't think that any of the usual attacks will work.

Keep in mind that Jon's burden is easier: he doesn't need to worry about the
caller's environment, only his own environment. So long as the attacker
can't inject code into his "safe eval" code, the attacker can monkey-patch
their own built-ins and it simply doesn't matter.

(If the attacker can monkey-patch Jon's code, they can do anything they
like.)


I think this approach is promising enough that Jon should take it to a few
other places for comments, to try to get more eyeballs. Stackoverflow and
Reddit's /r/python, perhaps. 

Please do followup here with any results, positive or negative.




-- 
Steven



#106572

From: Random832 <random832@fastmail.com>
Date: 2016-04-06 09:14 -0400
Message-ID: <mailman.129.1459948499.32530.python-list@python.org>
In reply to: #106554

On Tue, Apr 5, 2016, at 21:43, Steven D'Aprano wrote:
> As Zooko says, Guido's "best argument is that reducing usability (in
> terms
> of forbidding language features, especially module import) and reducing
> the
> usefulness of extant library code" would make the resulting interpreter
> too
> feeble to be useful.

You don't have to forbid module import. The sandbox could control what
modules can be loaded, and what happens when you try to load a module.

import sys
module = type(sys)
fm = {}

def fimp(name, *etc):
    # In a real implementation, this could also load whitelisted modules
    try:
        return fm[name]
    except KeyError:
        raise ImportError("Tried to load restricted module " + name)

fm['builtins'] = fb = module('builtins')
fb.int = int
fb.str = str
fb.len = len
fb.print = print
fb.__import__ = fimp
fm['sys'] = fsys = module('sys')
fsys.modules = fm

exec("""
import sys
print(sys.modules.keys())
""", {'__builtins__': fb})



#106609

From: Steven D'Aprano <steve@pearwood.info>
Date: 2016-04-07 11:45 +1000
Message-ID: <5705bba5$0$1620$c3e8da3$5496439d@news.astraweb.com>
In reply to: #106572

On Wed, 6 Apr 2016 11:14 pm, Random832 wrote:

> On Tue, Apr 5, 2016, at 21:43, Steven D'Aprano wrote:
>> As Zooko says, Guido's "best argument is that reducing usability (in
>> terms
>> of forbidding language features, especially module import) and reducing
>> the
>> usefulness of extant library code" would make the resulting interpreter
>> too
>> feeble to be useful.
> 
> You don't have to forbid module import. The sandbox could control what
> modules can be loaded, and what happens when you try to load a module.


Sure, but you do have to forbid import of *arbitrary* modules. One could
include a white list of allowed modules, but it would probably be quite
small.

And you would have to do something about the unfortunate matter that modules
have a reference to the unrestricted __builtins__:

py> os.__builtins__['eval']
<built-in function eval>


And because modules are singletons, it's not just a matter of replacing the
__builtins__ with a more restrictive one, as that would affect trusted
modules outside the sandbox too.



-- 
Steven



#106611

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2016-04-07 14:48 +1000
Message-ID: <5705e69a$0$2821$c3e8da3$76491128@news.astraweb.com>
In reply to: #106609

On Thursday 07 April 2016 13:40, Random832 wrote:

> On Wed, Apr 6, 2016, at 21:45, Steven D'Aprano wrote:
>> And you would have to do something about the unfortunate matter that
>> modules
>> have a reference to the unrestricted __builtins__:
>> 
>> py> os.__builtins__['eval']
>> <built-in function eval>
> 
> Well, I thought that the solution being discussed uses AST to generally
> forbid accessing attributes beginning with _ (you could also implement a
> whitelist there)


Sure, but I'm just demonstrating that the unrestricted builtins are just one 
attribute lookup away. And as Chris points out, if you have (say) the os 
module, then:

magic = os.sys.modules[
    ''.join(chr(i-1) for i in (99,118,106,109,117,106,111,116))
    ].eval


and that's Game Over.


It's not that this necessarily can't be done, but that it's sufficiently 
hard that very few people are willing to tackle it, and of those who are, even 
fewer have come close to a working restricted Python. PyPy has a 
sandbox, but I believe that relies on OS-level features like jails, and I 
don't really know how well it works.

The problem with sandboxing Python is that it's a game of cat and mouse: you 
eliminate one hole, and then wait for somebody to publish the next hole, 
then eliminate that, and so on. Will this process converge on a useful 
subset of Python? Perhaps. Will it happen soon? Even more doubtful.

Nobody says that it's impossible. Only that it's hard, real hard, and probably 
much harder than anyone thinks.



-- 
Steve



#106631

From: Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date: 2016-04-07 15:18 +0000
Message-ID: <slrnngcuoa.19u.jon+usenet@wintry.unequivocal.co.uk>
In reply to: #106609

On 2016-04-07, Chris Angelico <rosuav@gmail.com> wrote:
> On Thu, Apr 7, 2016 at 11:45 AM, Steven D'Aprano <steve@pearwood.info> wrote:
>> And you would have to do something about the unfortunate matter that modules
>> have a reference to the unrestricted __builtins__:
>>
>> py> os.__builtins__['eval']
>> <built-in function eval>
>
> This *in itself* is blocked by the rule against leading-underscore
> attribute lookup. However, if you can get the sys module, the world's
> your oyster; and any other module that imports sys will give it to
> you:
>
> >>> import os, codecs
> >>> os.sys
> <module 'sys' (built-in)>
> >>> codecs.sys
> <module 'sys' (built-in)>
>
> Can't monkey-patch that away, and codecs.sys.modules["builtins"] will
> give you access to the original builtins. And you can go to any number
> of levels, tracing a chain from a white-listed module to the
> unrestricted sys.modules. The only modules that would be safe to
> whitelist are those that either don't import anything significant (I'm
> pretty sure 'math' is safe), or import everything with underscores
> ("import sys as _sys").

No, actually absolutely no modules at all are safe to import directly.
This is because the untrusted code might alter them, and then the
altered code would be used by the trusted main application. Trivial
examples might include altering hashlib to always return the same
hash, 're' to always or never match, etc. If you import something
then it needs to be an individual copy of the module, with each name
referring either to an immutable object or to an individual proxy for
the real object.
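A minimal sketch of the problem, using `math` as a stand-in for any shared module. The helper names `run_untrusted` and `shallow_module_copy` are hypothetical. Note that the shallow copy only protects against rebinding of names; any mutable object reachable from the copy is still shared, which is why each name would need to refer to an immutable object or a proxy.

```python
import math
import types

def run_untrusted(namespace):
    # Stand-in for sandboxed code. Rebinding a module attribute needs no
    # underscore names, so the underscore check would not stop it.
    exec("math.floor = lambda x: 0", namespace)

def shallow_module_copy(module):
    # Hypothetical helper: a new module object holding the public names only.
    clone = types.ModuleType(module.__name__)
    for name, value in vars(module).items():
        if not name.startswith("_"):
            setattr(clone, name, value)
    return clone

original_floor = math.floor
try:
    # 1. Sharing the real module: trusted code sees the patched function too.
    run_untrusted({"__builtins__": {}, "math": math})
    shared_result = math.floor(7.9)
finally:
    math.floor = original_floor  # restore the real function

# 2. Handing over a copy: only the copy is rebound.
math_copy = shallow_module_copy(math)
run_untrusted({"__builtins__": {}, "math": math_copy})
isolated_result = math.floor(7.9)

print(shared_result, isolated_result)
```

Here `shared_result` is 0 because the trusted side called the replaced function, while `isolated_result` is 7 because the real module was never touched.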



#106647

From: Steven D'Aprano <steve@pearwood.info>
Date: 2016-04-08 15:28 +1000
Message-ID: <57074180$0$1615$c3e8da3$5496439d@news.astraweb.com>
In reply to: #106631

On Fri, 8 Apr 2016 01:18 am, Jon Ribbens wrote:

> No, actually absolutely no modules at all are safe to import directly.
> This is because the untrusted code might alter them


Good thinking! I never even thought of that.



-- 
Steven



#106627

From: Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date: 2016-04-07 12:13 +0000
Message-ID: <slrnngcjsj.19u.jon+usenet@wintry.unequivocal.co.uk>
In reply to: #106554

On 2016-04-06, Steven D'Aprano <steve@pearwood.info> wrote:
> On Wed, 6 Apr 2016 03:48 am, Chris Angelico wrote:
>> On Wed, Apr 6, 2016 at 3:26 AM, Jon Ribbens
>> <jon+usenet@unequivocal.co.uk> wrote:
>>> The received wisdom is that restricted code execution in Python is
>>> an insolubly hard problem, but it looks a bit like my 7-line example
>>> above disproves this theory, 
>
> Jon's 7-line example doesn't come even close to providing restricted code
> execution in Python. What it provides is a restricted subset of expression
> evaluation, which is *much* easier.

It's true that I was using eval(), but I don't think that actually
fundamentally changes the game. Almost exactly the same sanitisation
method can be used to make exec() code safe. ("import" for example
does not work because there is no "__import__" in the provided
builtins, but even if it did work it could be trivially disallowed by
searching for ast.Import and ast.ImportFrom nodes. "with" must be
disallowed because otherwise __exit__ can be used to get a frame
object.)
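A sketch of what the exec() variant of the check might look like, under the assumptions stated in the paragraph above. The names `FORBIDDEN_NODES` and `check_statements` are hypothetical, `ast.AsyncWith` is included as an extra guess for Python 3.5+, and the list is not claimed to be exhaustive.

```python
import ast

# Statement types rejected outright (assumed list, not exhaustive).
FORBIDDEN_NODES = (ast.Import, ast.ImportFrom, ast.With, ast.AsyncWith)

def check_statements(source):
    """Parse in exec mode and apply the same underscore rule plus the node bans."""
    tree = ast.parse(source, mode="exec")
    for node in ast.walk(tree):
        if isinstance(node, FORBIDDEN_NODES):
            raise ValueError("%s statements are not allowed." % type(node).__name__)
        if (isinstance(node, ast.Name) and node.id.startswith("_") or
                isinstance(node, ast.Attribute) and node.attr.startswith("_")):
            raise ValueError("Access to private values is not allowed.")
    return tree

namespace = {"__builtins__": {"int": int, "str": str, "len": len}}
tree = check_statements("total = len(str(12345))")
exec(compile(tree, "<script>", "exec"), namespace)
print(namespace["total"])
```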

> It's barely more powerful than the ast.safe_eval function.

I think you mean ast.literal_eval(), and you're misremembering.
That function isn't even a calculator, it won't even work out
"2*2" for you. It (almost) literally just parses literals ;-)

> [Jon again]
>>> provided you choose carefully what you 
>>> provide in your restricted __builtins__ - but people who knows more
>>> than me about Python seem to have thought about this problem for
>>> longer than I have and come up with the opposite conclusion so I'm
>>> curious what I'm missing.
>
> You're missing that they're trying to allow enough Python functionality to
> run useful scripts (not just evaluate a few arithmetic expressions), but
> without allowing the script to break out of the restricted environment and
> do things which aren't permitted.

Hmm, I'm not missing that, I even explicitly mentioned it previously.
I think you're also missing that eval() can do a very great deal more
than just "arithmetic expressions".

> For example, check out Tav's admirable work some years ago on trying to
> allow Python code to read but not write files:
>
> http://tav.espians.com/a-challenge-to-break-python-security.html

Indeed, I have read that and the follow-ups. He was again making it
hard for himself by trying to allow execution of completely arbitrary
code, and still almost every way to escape relied on "_" attributes
(or him missing the obvious point that you can't check a string is
safe by doing "if foo == 'blah'" if "foo" might be a subtype of
str with a malicious __eq__ method).
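The `__eq__` point can be shown in a few lines. The class, the function names, and the file paths below are hypothetical and exist only for the demonstration.

```python
class AlwaysEqual(str):
    """A str subtype that claims to be equal to anything."""
    def __eq__(self, other):
        return True
    __hash__ = str.__hash__  # defining __eq__ would otherwise remove hashing

def naive_is_allowed(path):
    # Trusts whatever __eq__ the argument supplies.
    return path == "/tmp/allowed.txt"

def strict_is_allowed(path):
    # Insists on an exact str first, so only str.__eq__ is ever consulted.
    return type(path) is str and path == "/tmp/allowed.txt"

sneaky = AlwaysEqual("/etc/passwd")
print(naive_is_allowed(sneaky), strict_is_allowed(sneaky))
```

The naive check accepts the disguised path; the strict one rejects it while still accepting the genuine string.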

> You should also read Guido's comments on capabilities:
>
> http://neopythonic.blogspot.com.au/2009/03/capabilities-for-python.html

Thanks, that's interesting.

> As Zooko says, Guido's "best argument is that reducing usability (in terms
> of forbidding language features, especially module import) and reducing the
> usefulness of extant library code" would make the resulting interpreter too
> feeble to be useful.

Well, no. It makes it too feeble to be used as a generic programming
language. But there is a whole other class of uses for which it would
still be very useful - making very configurable or dynamic systems,
for example. I don't know, imagine github allowed you to upload
restricted-Python code that could be used as a server-side commit
hook, to take a completely random example, or you could upload code
that would generate reports or data for graphing.

> Look at what you've done: you've restricted the entire world of
> Python down to, effectively, a calculator and a few string methods.

Again, no not really. You've tuples, sets, lists, dictionaries,
lambdas, generator and list expressions, etc. And although I made my
example __builtins__ very restricted indeed, that was just because
I'm asking about the basic principle of the idea. If the idea is
ok then the builtins can be gone through one by one and added if
they're safe.
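For a sense of how much a single expression can do with only `int`, `str` and `len` available, here is a small sketch. The `evaluate` helper is hypothetical and skips the underscore check, since the point is expressiveness rather than safety.

```python
namespace = {"__builtins__": {"int": int, "str": str, "len": len}}

def evaluate(expression):
    # Copy the namespace so one expression cannot leave names behind for the next.
    return eval(expression, dict(namespace))

# List comprehension calling the whitelisted builtins.
lengths = evaluate("[len(str(n)) for n in (1, 22, 333)]")

# Dictionary comprehension plus a lookup.
lookup = evaluate("{k: v for k, v in (('a', 1), ('b', 2))}['b']")

# Recursion without def: a lambda that receives itself as an argument.
factorial = evaluate(
    "(lambda f: f(f, 5))(lambda f, n: 1 if n == 0 else n * f(f, n - 1))")

# Set comprehension.
distinct = evaluate("len({c for c in 'mississippi'})")

print(lengths, lookup, factorial, distinct)
```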

> All the obvious, and even not-so-obvious, attack tools are gone:
> eval, exec, getattr, type, __import__.

Indeed. The fundamental point is that we must not allow the attacker
to have access to any of those things, or to gain access by using any
of the tools which we have provided. I think this is not an impossible
problem.

> I think this approach is promising enough that Jon should take it to a few
> other places for comments, to try to get more eyeballs. Stackoverflow and
> Reddit's /r/python, perhaps. 

I'll post some example code on github in a bit and see what people
think.



#106629

From: Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date: 2016-04-07 14:25 +0000
Message-ID: <slrnngcrkv.19u.jon+usenet@wintry.unequivocal.co.uk>
In reply to: #106627

On 2016-04-07, Chris Angelico <rosuav@gmail.com> wrote:
> Options 1 and 2 are nastily restricted. Option 3 is likely broken, as
> exception objects carry tracebacks and such.

Everything you're saying here is assuming that we must not let the
attacker see any exception objects, but I don't understand why you're
assuming that. As far as I can see, the information that exceptions
hold that we need to prevent access to is all in "__" attributes that
we're already blocking.



#106646

From: Steven D'Aprano <steve@pearwood.info>
Date: 2016-04-08 15:26 +1000
Message-ID: <5707410c$0$1615$c3e8da3$5496439d@news.astraweb.com>
In reply to: #106629

On Fri, 8 Apr 2016 12:25 am, Jon Ribbens wrote:

> On 2016-04-07, Chris Angelico <rosuav@gmail.com> wrote:
>> Options 1 and 2 are nastily restricted. Option 3 is likely broken, as
>> exception objects carry tracebacks and such.
> 
> Everything you're saying here is assuming that we must not let the
> attacker see any exception objects, but I don't understand why you're
> assuming that. As far as I can see, the information that exceptions
> hold that we need to prevent access to is all in "__" attributes that
> we're already blocking.

You might be right, but you're putting a lot of trust in one security
mechanism. If an attacker finds a way around that, you're screwed. "Defence
in depth" and "default deny" is, in my opinion, better: prevent the
untrusted user from seeing everything except those things which are proven
to be safe.
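One way to read "default deny" at the AST level is to list the node types that are permitted and reject everything else, rather than listing what is forbidden. The sketch below assumes Python 3.8 or later (where literals parse as `ast.Constant`); the whitelist contents and the name `check_default_deny` are hypothetical and deliberately tiny.

```python
import ast

# Everything not named here is rejected (assumed, intentionally small list).
ALLOWED_NODES = (
    ast.Expression, ast.Constant, ast.BinOp, ast.UnaryOp,
    ast.Add, ast.Sub, ast.Mult, ast.USub,
)

def check_default_deny(source):
    """Reject any expression containing a node type that is not whitelisted."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("%s is not on the whitelist." % type(node).__name__)
    return tree

tree = check_default_deny("1 + 2 * -3")
print(eval(compile(tree, "<script>", "eval"), {"__builtins__": {}}))
```

With this list, attribute access, function calls and even `**` are refused until someone adds them on purpose, which is the opposite default from the underscore blacklist.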



-- 
Steven



#106633

From: Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date: 2016-04-07 17:20 +0000
Message-ID: <slrnngd5tc.19u.jon+usenet@wintry.unequivocal.co.uk>
In reply to: #106627

On 2016-04-07, Random832 <random832@fastmail.com> wrote:
> On Thu, Apr 7, 2016, at 08:13, Jon Ribbens wrote:
>> > All the obvious, and even not-so-obvious, attack tools are gone:
>> > eval, exec, getattr, type, __import__.
>
> We don't even need to take these away, per se.
>
> eval and exec could be replaced with functions that perform the
> evaluation with the same rules in the same sandbox.

Ah, that's a good point.
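
Roughly what that suggestion could look like, as a sketch rather than a
vetted sandbox. The names check_source, make_sandbox and sandboxed_eval are
invented for this example, the underscore rule is a simplified stand-in for
"the same rules", and the one-argument signature is an assumption.

```python
import ast


def check_source(source, mode):
    """Parse source and reject any name or attribute starting with "_"."""
    tree = ast.parse(source, mode=mode)
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id.startswith("_"):
            raise SyntaxError("name %r is not allowed" % node.id)
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            raise SyntaxError("attribute %r is not allowed" % node.attr)
    return tree


def make_sandbox():
    """Build a namespace whose own eval re-applies the same check."""
    namespace = {"__builtins__": {"len": len, "sum": sum}}

    def sandboxed_eval(source):
        tree = check_source(source, "eval")
        return eval(compile(tree, "<sandbox>", "eval"), namespace)

    # Sandboxed code that calls eval() gets this wrapper, not the builtin.
    namespace["__builtins__"]["eval"] = sandboxed_eval
    return sandboxed_eval


run = make_sandbox()
nested = run("eval('sum([1, 2])')")
```

The inner string is checked when the nested eval runs, so hiding a forbidden
attribute inside a string literal passed to eval gains nothing in this
sketch.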

I've put an example script here:

  https://github.com/jribbens/unsafe/blob/master/unsafe.py

When run as a script, it will execute whatever Python code you pass it
on stdin.

If anyone can break it (by which I mean escape from the sandbox,
not make it use up all the memory or go into an infinite loop,
both of which are trivial) then I would be very interested.



#106634

From: Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date: 2016-04-07 17:35 +0000
Message-ID: <slrnngd6pv.19u.jon+usenet@wintry.unequivocal.co.uk>
In reply to: #106633
On 2016-04-07, Chris Angelico <rosuav@gmail.com> wrote:
> On Fri, Apr 8, 2016 at 3:20 AM, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:
>> On 2016-04-07, Random832 <random832@fastmail.com> wrote:
>>> On Thu, Apr 7, 2016, at 08:13, Jon Ribbens wrote:
>>>> > All the obvious, and even not-so-obvious, attack tools are gone:
>>>> > eval, exec, getattr, type, __import__.
>>>
>>> We don't even need to take these away, per se.
>>>
>>> eval and exec could be replaced with functions that perform the
>>> evaluation with the same rules in the same sandbox.
>>
>> Ah, that's a good point.
>>
>> I've put an example script here:
>>
>>   https://github.com/jribbens/unsafe/blob/master/unsafe.py
>>
>> When run as a script, it will execute whatever Python code you pass it
>> on stdin.
>
> Now we're getting to something rather interesting. Going back to your
> previous post, though...
>
> On Wed, Apr 6, 2016 at 3:26 AM, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:
>> The received wisdom is that restricted code execution in Python is
>> an insolubly hard problem, but it looks a bit like my 7-line example
>> above disproves this theory
>
> ... the thing you were missing in your original example was a LOT of
> sophistication :)
>
> I don't currently have any exploits against your new code, but at this
> point, it's grown beyond the "hey, if this was insolubly hard, how
> come seven lines of code can do it?" question. This is the kind of
> effort it takes to sandbox Python inside Python.

Well, it entirely depends on how much you're trying to allow the
sandboxed code to do. I don't think most of the stuff in that script
(e.g. _copy_module and the safe versions of get/set/delattr, exec, and
eval) is really necessary for most sensible applications of such an
idea. I've just added it for completeness and to see whether it
introduces any security holes that weren't there originally.

I could slim down the code again by simply removing all that extra
cruft and the principle would still be the same - it's only the
safe_compile() function that's adding anything interesting that
I haven't seen done before (and half of the lines in that function
are docstring or stuff to make nicer error messages).
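
For readers who haven't opened the script, here is a minimal sketch of the
general idea: check the parsed source before compiling it. This is not the
safe_compile() from unsafe.py. The names checked_compile and UnsafeCodeError
are invented, and the single rule (no name or attribute beginning with "_")
is a simplification that ignores things like definitions, import aliases
and keyword arguments.

```python
import ast


class UnsafeCodeError(Exception):
    """Raised when source uses a name or attribute the check forbids."""


def checked_compile(source, filename="<untrusted>"):
    """Compile source only if no name or attribute starts with "_"."""
    tree = ast.parse(source, filename=filename, mode="exec")
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            name = node.id
        elif isinstance(node, ast.Attribute):
            name = node.attr
        else:
            continue
        if name.startswith("_"):
            # Report the line so the error message points at the culprit.
            raise UnsafeCodeError(
                "%s:%d: %r is not allowed" % (filename, node.lineno, name))
    return compile(tree, filename, "exec")


# Run accepted code with a deliberately tiny set of builtins.
scope = {"__builtins__": {"sum": sum, "range": range}}
exec(checked_compile("total = sum(range(4))"), scope)
```

Because the check runs on the syntax tree before anything executes, code
that is rejected never runs at all, and the error can name the offending
line.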



#106791

From: Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date: 2016-04-10 17:06 +0000
Message-ID: <slrnngl27g.19u.jon+usenet@wintry.unequivocal.co.uk>
In reply to: #106633
On 2016-04-07, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:
> I've put an example script here:
>
>   https://github.com/jribbens/unsafe/blob/master/unsafe.py
>
> When run as a script, it will execute whatever Python code you pass it
> on stdin.
>
> If anyone can break it (by which I mean escape from the sandbox,
> not make it use up all the memory or go into an infinite loop,
> both of which are trivial) then I would be very interested.

I've updated the script a bit, to fix a couple of bugs, to add back in
'with' and 'import' (of white-listed modules) and to add a REPL mode
which makes experimenting inside the sandbox easier. I'm still
interested to see if anyone can find a way out of it ;-)
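
As an illustration of what white-listed 'import' can mean, here is a sketch
that swaps in a restricted __import__ through the builtins given to exec.
The names safe_import and ALLOWED_MODULES and the module list are
assumptions for this example, and the modules the script actually permits
may differ. The sketch also hands over the real module object, with no
copying step of the kind the _copy_module name mentioned earlier suggests.

```python
import importlib

# Hypothetical allowlist; the modules the real script permits may differ.
ALLOWED_MODULES = frozenset({"math", "string"})


def safe_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Stand-in for __import__ that only loads allowlisted modules."""
    if level != 0:
        raise ImportError("relative imports are not allowed")
    if name not in ALLOWED_MODULES:
        raise ImportError("module %r is not allowed" % name)
    return importlib.import_module(name)


# The import statement looks up __import__ in the builtins it is given.
namespace = {"__builtins__": {"__import__": safe_import}}
exec("import math\nroot = math.sqrt(16)", namespace)
```

An `import os` executed in the same namespace raises ImportError, because
"os" is not in the allowlist.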


