| Started by | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| First post | 2016-04-03 21:12 +0000 |
| Last post | 2016-04-08 10:10 +0200 |
| Articles | 20 on this page of 25 — 8 participants |
Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-03 21:12 +0000
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 13:46 +0000
Re: Untrusted code execution Rustom Mody <rustompmody@gmail.com> - 2016-04-05 07:17 -0700
Re: Untrusted code execution Ian Kelly <ian.g.kelly@gmail.com> - 2016-04-05 08:50 -0600
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 17:26 +0000
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 18:50 +0000
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 19:14 +0000
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 19:13 +0000
Re: Untrusted code execution Steven D'Aprano <steve@pearwood.info> - 2016-04-06 11:43 +1000
Re: Untrusted code execution Random832 <random832@fastmail.com> - 2016-04-06 09:14 -0400
Re: Untrusted code execution Steven D'Aprano <steve@pearwood.info> - 2016-04-07 11:45 +1000
Re: Untrusted code execution Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-04-07 14:48 +1000
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-07 15:18 +0000
Re: Untrusted code execution Steven D'Aprano <steve@pearwood.info> - 2016-04-08 15:28 +1000
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-07 12:13 +0000
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-07 14:25 +0000
Re: Untrusted code execution Steven D'Aprano <steve@pearwood.info> - 2016-04-08 15:26 +1000
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-07 17:20 +0000
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-07 17:35 +0000
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-10 17:06 +0000
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 17:40 +0000
Re: Untrusted code execution Paul Rubin <no.email@nospam.invalid> - 2016-04-05 13:39 -0700
Re: Untrusted code execution Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2016-04-05 21:13 +0000
Re: Untrusted code execution Paul Rubin <no.email@nospam.invalid> - 2016-04-07 00:08 -0700
Re: Untrusted code execution Lele Gaifax <lele@metapensiero.it> - 2016-04-08 10:10 +0200
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2016-04-03 21:12 +0000 |
| Subject | Untrusted code execution |
| Message-ID | <slrnng31v9.19u.jon+usenet@wintry.unequivocal.co.uk> |
I'd just like to say up front that this is more of a thought experiment
than anything else, I don't have any plans to use this idea on any
genuinely untrusted code. Apart from anything else, there's the
denial-of-service issue.
That said, is there any way that the following Python 3.4 code could
result in an arbitrary code execution security hole?
  tree = compile(untrusted_code, "<script>", "eval", ast.PyCF_ONLY_AST)
  for node in ast.walk(tree):
      if (isinstance(node, ast.Name) and node.id.startswith("_") or
          isinstance(node, ast.Attribute) and node.attr.startswith("_")):
          raise ValueError("Access to private values is not allowed.")
  namespace = {"__builtins__": {"int": int, "str": str, "len": len}}
  print(eval(compile(tree, "<script>", "eval"), namespace))
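For anyone who wants to experiment with the snippet, here is a minimal self-contained sketch of the same check wrapped in a function. The names `restricted_eval`, `SAFE_BUILTINS` and `is_rejected` are assumptions made for illustration and do not come from the original post.

```python
import ast

# Same three builtins as in the post; anything else is unavailable by name.
SAFE_BUILTINS = {"int": int, "str": str, "len": len}

def restricted_eval(untrusted_code):
    """Evaluate an expression after rejecting names/attributes starting with "_"."""
    tree = compile(untrusted_code, "<script>", "eval", ast.PyCF_ONLY_AST)
    for node in ast.walk(tree):
        if (isinstance(node, ast.Name) and node.id.startswith("_") or
                isinstance(node, ast.Attribute) and node.attr.startswith("_")):
            raise ValueError("Access to private values is not allowed.")
    namespace = {"__builtins__": dict(SAFE_BUILTINS)}
    return eval(compile(tree, "<script>", "eval"), namespace)

def is_rejected(source):
    """Return True if the underscore check refuses the expression."""
    try:
        restricted_eval(source)
    except ValueError:
        return True
    return False

print(restricted_eval("len(str(int('12') + 1))"))
print(is_rejected("''.__class__"))
```

This only shows that the check runs and refuses the obvious cases; it says nothing about whether the approach is actually safe, which is the question the thread is asking.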
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2016-04-05 13:46 +0000 |
| Message-ID | <slrnng7gj4.19u.jon+usenet@wintry.unequivocal.co.uk> |
| In reply to | #106410 |
On 2016-04-03, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:
> I'd just like to say up front that this is more of a thought experiment
> than anything else, I don't have any plans to use this idea on any
> genuinely untrusted code. Apart from anything else, there's the
> denial-of-service issue.
>
> That said, is there any way that the following Python 3.4 code could
> result in an arbitrary code execution security hole?
>
>   tree = compile(untrusted_code, "<script>", "eval", ast.PyCF_ONLY_AST)
>   for node in ast.walk(tree):
>       if (isinstance(node, ast.Name) and node.id.startswith("_") or
>           isinstance(node, ast.Attribute) and node.attr.startswith("_")):
>           raise ValueError("Access to private values is not allowed.")
>   namespace = {"__builtins__": {"int": int, "str": str, "len": len}}
>   print(eval(compile(tree, "<script>", "eval"), namespace))
Nobody has any thoughts on this at all?
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2016-04-05 07:17 -0700 |
| Message-ID | <58de161a-ecdb-4ada-aab5-871876ea1574@googlegroups.com> |
| In reply to | #106519 |
On Tuesday, April 5, 2016 at 7:19:39 PM UTC+5:30, Jon Ribbens wrote:
> On 2016-04-03, Jon Ribbens wrote:
> > I'd just like to say up front that this is more of a thought experiment
> > than anything else, I don't have any plans to use this idea on any
> > genuinely untrusted code. Apart from anything else, there's the
> > denial-of-service issue.
> >
> > That said, is there any way that the following Python 3.4 code could
> > result in an arbitrary code execution security hole?
> >
> >   tree = compile(untrusted_code, "<script>", "eval", ast.PyCF_ONLY_AST)
> >   for node in ast.walk(tree):
> >       if (isinstance(node, ast.Name) and node.id.startswith("_") or
> >           isinstance(node, ast.Attribute) and node.attr.startswith("_")):
> >           raise ValueError("Access to private values is not allowed.")
> >   namespace = {"__builtins__": {"int": int, "str": str, "len": len}}
> >   print(eval(compile(tree, "<script>", "eval"), namespace))
>
> Nobody has any thoughts on this at all?
I actually did...
But I don't know enough of the AST API to figure out what you are trying/avoiding etc
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2016-04-05 08:50 -0600 |
| Message-ID | <mailman.71.1459867858.32530.python-list@python.org> |
| In reply to | #106521 |
On Tue, Apr 5, 2016 at 8:17 AM, Rustom Mody <rustompmody@gmail.com> wrote:
> On Tuesday, April 5, 2016 at 7:19:39 PM UTC+5:30, Jon Ribbens wrote:
>> On 2016-04-03, Jon Ribbens wrote:
>> > I'd just like to say up front that this is more of a thought experiment
>> > than anything else, I don't have any plans to use this idea on any
>> > genuinely untrusted code. Apart from anything else, there's the
>> > denial-of-service issue.
>> >
>> > That said, is there any way that the following Python 3.4 code could
>> > result in an arbitrary code execution security hole?
>> >
>> >   tree = compile(untrusted_code, "<script>", "eval", ast.PyCF_ONLY_AST)
>> >   for node in ast.walk(tree):
>> >       if (isinstance(node, ast.Name) and node.id.startswith("_") or
>> >           isinstance(node, ast.Attribute) and node.attr.startswith("_")):
>> >           raise ValueError("Access to private values is not allowed.")
>> >   namespace = {"__builtins__": {"int": int, "str": str, "len": len}}
>> >   print(eval(compile(tree, "<script>", "eval"), namespace))
>>
>> Nobody has any thoughts on this at all?
>
> I actually did...
>
> But I don't know enough of the AST API to figure out what you are trying/avoiding etc
Same here, although it looks to me like this approach could work. Or
I'm just not clever enough to see how it could be exploited.
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2016-04-05 17:26 +0000 |
| Message-ID | <slrnng7tgu.19u.jon+usenet@wintry.unequivocal.co.uk> |
| In reply to | #106521 |
On 2016-04-05, Rustom Mody <rustompmody@gmail.com> wrote:
> On Tuesday, April 5, 2016 at 7:19:39 PM UTC+5:30, Jon Ribbens wrote:
>> On 2016-04-03, Jon Ribbens wrote:
>> > I'd just like to say up front that this is more of a thought experiment
>> > than anything else, I don't have any plans to use this idea on any
>> > genuinely untrusted code. Apart from anything else, there's the
>> > denial-of-service issue.
>> >
>> > That said, is there any way that the following Python 3.4 code could
>> > result in an arbitrary code execution security hole?
>> >
>> >   tree = compile(untrusted_code, "<script>", "eval", ast.PyCF_ONLY_AST)
>> >   for node in ast.walk(tree):
>> >       if (isinstance(node, ast.Name) and node.id.startswith("_") or
>> >           isinstance(node, ast.Attribute) and node.attr.startswith("_")):
>> >           raise ValueError("Access to private values is not allowed.")
>> >   namespace = {"__builtins__": {"int": int, "str": str, "len": len}}
>> >   print(eval(compile(tree, "<script>", "eval"), namespace))
>>
>> Nobody has any thoughts on this at all?
>
> I actually did...
>
> But I don't know enough of the AST API to figure out what you are
> trying/avoiding etc
The idea is that it should prevent you from accessing any names or
attributes that start with an underscore.
Python 2 used to have an 'rexec' module that was supposed to allow
restricted execution of code. This was deprecated because it was
insecure and in Python 3 it was completely removed.
Generally speaking the way that people broke out of restricted
environments was to try and get access to the unrestricted "builtins"
so you could use __import__(), open(), the code type, etc., which then
means you can do almost anything or at least mess with the file system.
The way that you went looking for "builtins" was to wander about the
object tree by doing things like:
  "".__class__.__base__.__subclasses__()
etc and, e.g. finding a function and doing
  func.__globals__["__builtins__"]
or perhaps creating a raw code object via:
  (lambda a: 1).__code__.__class__(...arbitrary code...)()
The point is that as far as I can see basically all such techniques
are completely prevented if you cannot access "__" attributes (let
alone "_" ones).
The received wisdom is that restricted code execution in Python is
an insolubly hard problem, but it looks a bit like my 7-line example
above disproves this theory, provided you choose carefully what you
provide in your restricted __builtins__ - but people who know more
than me about Python seem to have thought about this problem for
longer than I have and come up with the opposite conclusion so I'm
curious what I'm missing.
(Again this all comes with the caveat that DoS is always easy -
just something trivial like '"*"*10**10**10' is going to break things.
But DoS is a far cry from remote code execution.)
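To make the "wander about the object tree" step concrete, here is a small sketch (an illustration, not code from the thread) showing that an empty `__builtins__` on its own does not stop the walk. It assumes plain CPython 3 and applies no AST check at all.

```python
# An empty __builtins__ removes open(), __import__() and friends by name...
namespace = {"__builtins__": {}}

# ...but plain attribute access still walks from a string literal to object,
# and from there to every class that subclasses object directly.
classes = eval('"".__class__.__base__.__subclasses__()', namespace)

class_names = {cls.__name__ for cls in classes}
print(len(classes), "classes reachable without any builtins")
print("type" in class_names, "int" in class_names)
```

Every step in that expression goes through a double-underscore attribute, which is why the underscore check in the original snippet is aimed at exactly this kind of traversal.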
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2016-04-05 18:50 +0000 |
| Message-ID | <slrnng82eo.19u.jon+usenet@wintry.unequivocal.co.uk> |
| In reply to | #106534 |
On 2016-04-05, Chris Angelico <rosuav@gmail.com> wrote:
> On Wed, Apr 6, 2016 at 3:26 AM, Jon Ribbens
> <jon+usenet@unequivocal.co.uk> wrote:
>> The received wisdom is that restricted code execution in Python is
>> an insolubly hard problem, but it looks a bit like my 7-line example
>> above disproves this theory, provided you choose carefully what you
>> provide in your restricted __builtins__ - but people who know more
>> than me about Python seem to have thought about this problem for
>> longer than I have and come up with the opposite conclusion so I'm
>> curious what I'm missing.
>
> No, it doesn't disprove anything. All you've shown is "here's a piece
> of code that hasn't yet been compromised". :)
Yes, obviously. I wasn't asking for pedantry.
> Your code is a *lot* safer for using 'eval' rather than 'exec'.
> Otherwise, you'd be easily exploited using exceptions, which carry a
> ton of info.
... but all in attributes that don't start with "_", as far as I can
see. I think a very similar approach would work with 'exec' too, just
you would obviously have to disallow ast.Import and ast.ImportFrom.
> But even so, I would not bet money (much less the security of my
> systems) on this being safe.
I wasn't planning on betting any money ;-)
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2016-04-05 19:14 +0000 |
| Message-ID | <slrnng83pt.19u.jon+usenet@wintry.unequivocal.co.uk> |
| In reply to | #106538 |
On 2016-04-05, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:
> On 2016-04-05, Chris Angelico <rosuav@gmail.com> wrote:
>> Your code is a *lot* safer for using 'eval' rather than 'exec'.
>> Otherwise, you'd be easily exploited using exceptions, which carry a
>> ton of info.
>
> ... but all in attributes that don't start with "_", as far as I can see.
Sorry, obviously I meant "that *do* start with '_'".
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2016-04-05 19:13 +0000 |
| Message-ID | <slrnng83o7.19u.jon+usenet@wintry.unequivocal.co.uk> |
| In reply to | #106534 |
On 2016-04-05, Chris Angelico <rosuav@gmail.com> wrote:
> You can also create objects of various types using literal/display
> syntax, and that might let you craft some weird construct that
> effectively access those attributes without actually having an
> attribute that starts with an underscore. (Think of "getattr(x,
> '\x5f_class__')", although obviously it'll take more work than that,
> since getattr itself isn't available.)
Indeed. Although I think it would be safe to add a "proxy" getattr()
to the namespace's __builtins__ that just checked if the first
character of "name" was "_" and if so raised an AttributeError or
somesuch, and otherwise passed straight through to the real getattr(),
e.g.:
  def proxy_getattr(obj, name, *args):
      if type(name) is str and not name.startswith("_"):
          return getattr(obj, name, *args)
      raise AttributeError("Not allowed to access private attributes")
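As a rough sketch of how such a proxy might be wired into the restricted namespace, the block below combines it with the underscore check from the first post. The helper names `evaluate` and `blocked` are hypothetical, and the choice of builtins is only for the demonstration.

```python
import ast

def proxy_getattr(obj, name, *args):
    """getattr() that refuses names starting with an underscore."""
    # type(name) is str (not isinstance) so that a str subclass with an
    # overridden startswith() cannot lie about its first character.
    if type(name) is str and not name.startswith("_"):
        return getattr(obj, name, *args)
    raise AttributeError("Not allowed to access private attributes")

def evaluate(source):
    """Hypothetical helper: the thread's underscore check plus the proxy."""
    tree = ast.parse(source, "<script>", "eval")
    for node in ast.walk(tree):
        if (isinstance(node, ast.Name) and node.id.startswith("_") or
                isinstance(node, ast.Attribute) and node.attr.startswith("_")):
            raise ValueError("Access to private values is not allowed.")
    namespace = {"__builtins__": {"getattr": proxy_getattr, "str": str}}
    return eval(compile(tree, "<script>", "eval"), namespace)

def blocked(source):
    """Return True if the proxy refuses the attribute lookup at run time."""
    try:
        evaluate(source)
    except AttributeError:
        return True
    return False

print(evaluate("getattr('abc', 'upper')()"))
print(blocked(r"getattr('abc', '\x5f_class__')"))
```

The second call is the escaped-underscore trick mentioned in the quoted text: the AST check sees only a string constant, so the refusal has to happen inside the proxy at call time.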
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-04-06 11:43 +1000 |
| Message-ID | <570469d7$0$1598$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #106534 |
On Wed, 6 Apr 2016 03:48 am, Chris Angelico wrote:
> On Wed, Apr 6, 2016 at 3:26 AM, Jon Ribbens
> <jon+usenet@unequivocal.co.uk> wrote:
>> The received wisdom is that restricted code execution in Python is
>> an insolubly hard problem, but it looks a bit like my 7-line example
>> above disproves this theory,
Jon's 7-line example doesn't come even close to providing restricted code
execution in Python. What it provides is a restricted subset of expression
evaluation, which is *much* easier. It's barely more powerful than the
ast.safe_eval function. That doesn't make it useless, but it does mean that
it's not solving the problem of "restricted code execution".
[Jon again]
>> provided you choose carefully what you
>> provide in your restricted __builtins__ - but people who know more
>> than me about Python seem to have thought about this problem for
>> longer than I have and come up with the opposite conclusion so I'm
>> curious what I'm missing.
You're missing that they're trying to allow enough Python functionality to
run useful scripts (not just evaluate a few arithmetic expressions), but
without allowing the script to break out of the restricted environment and
do things which aren't permitted.
For example, check out Tav's admirable work some years ago on trying to
allow Python code to read but not write files:
http://tav.espians.com/a-challenge-to-break-python-security.html
and followups to that post here:
http://tav.espians.com/paving-the-way-to-securing-the-python-interpreter.html
http://tav.espians.com/update-on-securing-the-python-interpreter.html
You should also read Guido's comments on capabilities:
http://neopythonic.blogspot.com.au/2009/03/capabilities-for-python.html
As Zooko says, Guido's "best argument is that reducing usability (in terms
of forbidding language features, especially module import) and reducing the
usefulness of extant library code" would make the resulting interpreter too
feeble to be useful.
Look at what you've done: you've restricted the entire world of Python down
to, effectively, a calculator and a few string methods. That's not to say
that a calculator and a few string methods won't be useful to someone, but
the next Javascript it is not...
[Chris]
> No, it doesn't disprove anything. All you've shown is "here's a piece
> of code that hasn't yet been compromised". :) What you're missing is a
> demonstrated exploit against your code. I can't provide one, but it's
> entirely possible that one will be found.
>
> Your code is a *lot* safer for using 'eval' rather than 'exec'.
> Otherwise, you'd be easily exploited using exceptions, which carry a
> ton of info. But even so, I would not bet money (much less the
> security of my systems) on this being safe.
I think Jon is on the right approach here for restricting evaluation of
evaluation, which is a nicely constrained and small subset of Python. He's
not allowing unrestricted arbitrary code execution: he has a very restricted
(too restricted -- what the hell can you do with just int, str and len?)
whitelist of functions that are allowed, and the significant restriction
that dunder names aren't allowed.
This makes his function a tiny DSL for calculators and equivalent. I think
that, if it checks out, it would make a good addition to the standard
library.
All the obvious, and even not-so-obvious, attack tools are gone: eval, exec,
getattr, type, __import__. Because you're not supporting Python 2, the
various func.func_* attack surfaces are all gone. Since you can't access
dunders directly, and the obvious indirect methods like eval and getattr
aren't available, I don't think that any of the usual attacks will work.
Keep in mind that Jon's burden is easier: he doesn't need to worry about the
caller's environment, only his own environment. So long as the attacker
can't inject code into his "safe eval" code, the attacker can monkey-patch
their own built-ins and it simply doesn't matter. (If the attacker can
monkey-patch Jon's code, they can do anything they like.)
I think this approach is promising enough that Jon should take it to a few
other places for comments, to try to get more eyeballs. Stackoverflow and
Reddit's /r/python, perhaps. Please do followup here with any results,
positive or negative.
--
Steven
| From | Random832 <random832@fastmail.com> |
|---|---|
| Date | 2016-04-06 09:14 -0400 |
| Message-ID | <mailman.129.1459948499.32530.python-list@python.org> |
| In reply to | #106554 |
On Tue, Apr 5, 2016, at 21:43, Steven D'Aprano wrote:
> As Zooko says, Guido's "best argument is that reducing usability (in
> terms of forbidding language features, especially module import) and
> reducing the usefulness of extant library code" would make the
> resulting interpreter too feeble to be useful.
You don't have to forbid module import. The sandbox could control what
modules can be loaded, and what happens when you try to load a module.
  import sys
  module = type(sys)
  fm = {}
  def fimp(name, *etc):
      # In a real implementation, this could also load whitelisted modules
      try:
          return fm[name]
      except KeyError:
          raise ImportError("Tried to load restricted module " + name)
  fm['builtins'] = fb = module('builtins')
  fb.int = int
  fb.str = str
  fb.len = len
  fb.print = print
  fb.__import__ = fimp
  fm['sys'] = fsys = module('sys')
  fsys.modules = fm
  exec("""
  import sys
  print(sys.modules.keys())
  """, {'__builtins__': fb})
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-04-07 11:45 +1000 |
| Message-ID | <5705bba5$0$1620$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #106572 |
On Wed, 6 Apr 2016 11:14 pm, Random832 wrote:
> On Tue, Apr 5, 2016, at 21:43, Steven D'Aprano wrote:
>> As Zooko says, Guido's "best argument is that reducing usability (in
>> terms of forbidding language features, especially module import) and
>> reducing the usefulness of extant library code" would make the
>> resulting interpreter too feeble to be useful.
>
> You don't have to forbid module import. The sandbox could control what
> modules can be loaded, and what happens when you try to load a module.
Sure, but you do have to forbid import of *arbitrary* modules. One could
include a white list of allowed modules, but it would probably be quite
small. And you would have to do something about the unfortunate matter that
modules have a reference to the unrestricted __builtins__:
  py> os.__builtins__['eval']
  <built-in function eval>
And because modules are singletons, it's not just a matter of replacing the
__builtins__ with a more restrictive one, as that would affect trusted
modules outside the sandbox too.
--
Steven
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2016-04-07 14:48 +1000 |
| Message-ID | <5705e69a$0$2821$c3e8da3$76491128@news.astraweb.com> |
| In reply to | #106609 |
On Thursday 07 April 2016 13:40, Random832 wrote:
> On Wed, Apr 6, 2016, at 21:45, Steven D'Aprano wrote:
>> And you would have to do something about the unfortunate matter that
>> modules have a reference to the unrestricted __builtins__:
>>
>> py> os.__builtins__['eval']
>> <built-in function eval>
>
> Well, I thought that the solution being discussed uses AST to generally
> forbid accessing attributes beginning with _ (you could also implement a
> whitelist there)
Sure, but I'm just demonstrating that the unrestricted builtins are just one
attribute lookup away. And as Chris points out, if you have (say) the os
module, then:
  magic = os.sys.modules[
      ''.join(chr(i-1) for i in (96,96,99,118,106,109,117,106,111,116,96,96))
  ][''.join(chr(i+17) for i in (84,101,80,91))]
and that's Game Over.
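A runnable variant of the same idea is sketched below, under two assumptions that are not in the post: the sandbox has exposed the real os module and chr(), and the interpreter is Python 3, where the key in sys.modules is "builtins" and the function is reached as an ordinary attribute. The helper name `passes_underscore_filter` is made up for the illustration.

```python
import ast
import os

def passes_underscore_filter(source):
    """Return True if no Name or Attribute node starts with "_"."""
    tree = ast.parse(source, "<script>", "eval")
    return not any(
        isinstance(node, ast.Name) and node.id.startswith("_") or
        isinstance(node, ast.Attribute) and node.attr.startswith("_")
        for node in ast.walk(tree)
    )

# The key "builtins" is spelled out with chr() purely to show that string
# filters do not help; nothing in the expression starts with an underscore.
attack = ("os.sys.modules[''.join(chr(i - 1) for i in "
          "(99, 118, 106, 109, 117, 106, 111, 116))].eval")

# Assumption: the sandbox has (unwisely) exposed the real os module.
namespace = {"__builtins__": {"chr": chr}, "os": os}
leaked = eval(attack, namespace)

print(passes_underscore_filter(attack))
print(leaked)
```

The expression passes the underscore check and still hands back the real eval, which is the point being made about exposing any module that can reach sys.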
It's not that this necessarily can't be done, but that it's sufficiently
hard that very few people are willing to tackle it, and of those who are,
even fewer have come close to a working restricted Python. PyPy has a
sandbox, but I believe that relies on OS-level features like jails, and I
don't really know how well it works.
The problem with sandboxing Python is that it's a game of cat and mouse: you
eliminate one hole, and then wait for somebody to publish the next hole,
then eliminate that, and so on. Will this process converge on a useful
subset of Python? Perhaps. Will it happen soon? Even more doubtful.
Nobody says that it's impossible. Only that it's hard, real hard, and probably
much harder than anyone thinks.
--
Steve
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2016-04-07 15:18 +0000 |
| Message-ID | <slrnngcuoa.19u.jon+usenet@wintry.unequivocal.co.uk> |
| In reply to | #106609 |
On 2016-04-07, Chris Angelico <rosuav@gmail.com> wrote:
> On Thu, Apr 7, 2016 at 11:45 AM, Steven D'Aprano <steve@pearwood.info> wrote:
>> And you would have to do something about the unfortunate matter that modules
>> have a reference to the unrestricted __builtins__:
>>
>> py> os.__builtins__['eval']
>> <built-in function eval>
>
> This *in itself* is blocked by the rule against leading-underscore
> attribute lookup. However, if you can get the sys module, the world's
> your oyster; and any other module that imports sys will give it to
> you:
>
> >>> import os
> >>> os.sys
> <module 'sys' (built-in)>
> >>> codecs.sys
> <module 'sys' (built-in)>
>
> Can't monkey-patch that away, and codecs.sys.modules["builtins"] will
> give you access to the original builtins. And you can go to any number
> of levels, tracing a chain from a white-listed module to the
> unrestricted sys.modules. The only modules that would be safe to
> whitelist are those that either don't import anything significant (I'm
> pretty sure 'math' is safe), or import everything with underscores
> ("import sys as _sys").
No, actually absolutely no modules at all are safe to import directly.
This is because the untrusted code might alter them, and then the
altered code would be used by the trusted main application. Trivial
examples might include altering hashlib to always return the same
hash, 're' to always or never match, etc. If you import something
then it needs to be an individual copy of the module, with each name
referring either to an immutable object or to an individual proxy for
the real object.
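A very rough sketch of the "individual copy" idea follows. The helper `copy_module` is hypothetical and is not the `_copy_module` from any real script; it only covers the simplest case of rebinding names. Mutable objects and functions are still shared with the original module, so the proxying described above would be needed on top of this.

```python
import math
import types

def copy_module(module):
    """Hypothetical helper: shallow copy exposing only public names."""
    clone = types.ModuleType(module.__name__)
    for name, value in vars(module).items():
        if not name.startswith("_"):
            setattr(clone, name, value)
    return clone

sandbox_math = copy_module(math)

# The sandboxed code can rebind names on its private copy...
sandbox_math.pi = 3

# ...while the module used by the trusted application is untouched.
print(sandbox_math.pi, math.pi)
```

math is a convenient example because everything it exposes is either an immutable float or a built-in function; a module such as hashlib or re would need much more care.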
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-04-08 15:28 +1000 |
| Message-ID | <57074180$0$1615$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #106631 |
On Fri, 8 Apr 2016 01:18 am, Jon Ribbens wrote:
> No, actually absolutely no modules at all are safe to import directly.
> This is because the untrusted code might alter them
Good thinking! I never even thought of that.
--
Steven
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2016-04-07 12:13 +0000 |
| Message-ID | <slrnngcjsj.19u.jon+usenet@wintry.unequivocal.co.uk> |
| In reply to | #106554 |
On 2016-04-06, Steven D'Aprano <steve@pearwood.info> wrote:
> On Wed, 6 Apr 2016 03:48 am, Chris Angelico wrote:
>> On Wed, Apr 6, 2016 at 3:26 AM, Jon Ribbens
>> <jon+usenet@unequivocal.co.uk> wrote:
>>> The received wisdom is that restricted code execution in Python is
>>> an insolubly hard problem, but it looks a bit like my 7-line example
>>> above disproves this theory,
>
> Jon's 7-line example doesn't come even close to providing restricted code
> execution in Python. What it provides is a restricted subset of expression
> evaluation, which is *much* easier.
It's true that I was using eval(), but I don't think that actually
fundamentally changes the game. Almost exactly the same sanitisation
method can be used to make exec() code safe. ("import" for example
does not work because there is no "__import__" in the provided
builtins, but even if it did work it could be trivially disallowed by
searching for ast.Import and ast.ImportFrom nodes. "with" must be
disallowed because otherwise __exit__ can be used to get a frame
object.)
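The statement-level restrictions mentioned here could be sketched roughly as follows. This is an illustration under assumptions, not the checker the thread eventually produced: the names `FORBIDDEN_STATEMENTS`, `check_exec_source` and `is_rejected` are invented, `ast.AsyncWith` assumes Python 3.5 or later, and identifiers stored outside Name and Attribute nodes (for example a function definition's own name) are not examined.

```python
import ast

# Statements refused outright when checking code intended for exec().
FORBIDDEN_STATEMENTS = (ast.Import, ast.ImportFrom, ast.With, ast.AsyncWith)

def check_exec_source(source):
    """Raise ValueError if the source uses a banned statement or "_" name."""
    tree = ast.parse(source, "<script>", "exec")
    for node in ast.walk(tree):
        if isinstance(node, FORBIDDEN_STATEMENTS):
            raise ValueError(type(node).__name__ + " statements are not allowed.")
        if (isinstance(node, ast.Name) and node.id.startswith("_") or
                isinstance(node, ast.Attribute) and node.attr.startswith("_")):
            raise ValueError("Access to private values is not allowed.")
    return tree

def is_rejected(source):
    """Return True if the checker refuses the source."""
    try:
        check_exec_source(source)
    except ValueError:
        return True
    return False

print(is_rejected("import os"))
print(is_rejected("x = [i * 2 for i in range(3)]"))
```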
> It's barely more powerful than the ast.safe_eval function.
I think you mean ast.literal_eval(), and you're misremembering.
That function isn't even a calculator, it won't even work out
"2*2" for you. It (almost) literally just parses literals ;-)
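The difference is easy to check. The short example below only uses `ast.literal_eval` from the standard library; the sample strings are arbitrary.

```python
import ast

# Literals and containers of literals are accepted.
parsed = ast.literal_eval("[1, 'two', (3.0, None)]")
print(parsed)

# An arithmetic expression is not a literal, so it is refused.
try:
    ast.literal_eval("2*2")
    refused = False
except ValueError as exc:
    refused = True
    print("rejected:", exc)
```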
> [Jon again]
>>> provided you choose carefully what you
>>> provide in your restricted __builtins__ - but people who know more
>>> than me about Python seem to have thought about this problem for
>>> longer than I have and come up with the opposite conclusion so I'm
>>> curious what I'm missing.
>
> You're missing that they're trying to allow enough Python functionality to
> run useful scripts (not just evaluate a few arithmetic expressions), but
> without allowing the script to break out of the restricted environment and
> do things which aren't permitted.
Hmm, I'm not missing that, I even explicitly mentioned it previously.
I think you're also missing that eval() can do a very great deal more
than just "arithmetic expressions".
> For example, check out Tav's admirable work some years ago on trying to
> allow Python code to read but not write files:
>
> http://tav.espians.com/a-challenge-to-break-python-security.html
Indeed, I have read that and the follow-ups. He was again making it
hard for himself by trying to allow execution of completely arbitrary
code, and still almost every way to escape relied on "_" attributes
(or him missing the obvious point that you can't check a string is
safe by doing "if foo == 'blah'" if "foo" might be a subtype of
str with a malicious __eq__ method).
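The `__eq__` point can be shown in a few lines. The class and function names below are made up for the demonstration and are not taken from Tav's code.

```python
class AlwaysEqual(str):
    """A str subclass whose comparisons always claim equality."""
    def __eq__(self, other):
        return True
    # Defining __eq__ clears the inherited __hash__, so restore it.
    __hash__ = str.__hash__

def naive_is_allowed(name):
    # Trusts whatever __eq__ the object brings with it.
    return name == "blah"

def strict_is_allowed(name):
    # Exact type check first, so only a real str reaches the comparison.
    return type(name) is str and name == "blah"

hostile = AlwaysEqual("__import__")
print(naive_is_allowed(hostile))
print(strict_is_allowed(hostile))
```

The naive check is fooled because the comparison is delegated to the attacker's own method; checking the exact type first takes that decision away from the object being inspected.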
> You should also read Guido's comments on capabilities:
>
> http://neopythonic.blogspot.com.au/2009/03/capabilities-for-python.html
Thanks, that's interesting.
> As Zooko says, Guido's "best argument is that reducing usability (in terms
> of forbidding language features, especially module import) and reducing the
> usefulness of extant library code" would make the resulting interpreter too
> feeble to be useful.
Well, no. It makes it too feeble to be used as a generic programming
language. But there is a whole other class of uses for which it would
still be very useful - making very configurable or dynamic systems,
for example. I don't know, imagine github allowed you to upload
restricted-Python code that could be used as a server-side commit
hook, to take a completely random example, or you could upload code
that would generate reports or data for graphing.
> Look at what you've done: you've restricted the entire world of
> Python down to, effectively, a calculator and a few string methods.
Again, no not really. You've tuples, sets, lists, dictionaries,
lambdas, generator and list expressions, etc. And although I made my
example __builtins__ very restricted indeed, that was just because
I'm asking about the basic principle of the idea. If the idea is
ok then the builtins can be gone through one by one and added if
they're safe.
> All the obvious, and even not-so-obvious, attack tools are gone:
> eval, exec, getattr, type, __import__.
Indeed. The fundamental point is that we must not allow the attacker
to have access to any of those things, or to gain access by using any
of the tools which we have provided. I think this is not an impossible
problem.
> I think this approach is promising enough that Jon should take it to a few
> other places for comments, to try to get more eyeballs. Stackoverflow and
> Reddit's /r/python, perhaps.
I'll post some example code on github in a bit and see what people
think.
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2016-04-07 14:25 +0000 |
| Message-ID | <slrnngcrkv.19u.jon+usenet@wintry.unequivocal.co.uk> |
| In reply to | #106627 |
On 2016-04-07, Chris Angelico <rosuav@gmail.com> wrote:
> Options 1 and 2 are nastily restricted. Option 3 is likely broken, as
> exception objects carry tracebacks and such.
Everything you're saying here is assuming that we must not let the
attacker see any exception objects, but I don't understand why you're
assuming that. As far as I can see, the information that exceptions
hold that we need to prevent access to is all in "__" attributes that
we're already blocking.
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-04-08 15:26 +1000 |
| Message-ID | <5707410c$0$1615$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #106629 |
On Fri, 8 Apr 2016 12:25 am, Jon Ribbens wrote:
> On 2016-04-07, Chris Angelico <rosuav@gmail.com> wrote:
>> Options 1 and 2 are nastily restricted. Option 3 is likely broken, as
>> exception objects carry tracebacks and such.
>
> Everything you're saying here is assuming that we must not let the
> attacker see any exception objects, but I don't understand why you're
> assuming that. As far as I can see, the information that exceptions
> hold that we need to prevent access to is all in "__" attributes that
> we're already blocking.
You might be right, but you're putting a lot of trust in one security
mechanism. If an attacker finds a way around that, you're screwed. "Defence
in depth" and "default deny" is, in my opinion, better: prevent the
untrusted user from seeing everything except those things which are proven
to be safe.
--
Steven
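One way to read "default deny" in AST terms is an allowlist of node types instead of a blocklist of names. The sketch below is only an illustration of that reading: the chosen node types, `ALLOWED_NAMES` and the helper names are all assumptions, and it assumes Python 3.8 or later, where every literal parses to `ast.Constant`.

```python
import ast

# Anything not listed here is refused, whatever it is called.
ALLOWED_NODES = (
    ast.Expression, ast.Constant, ast.BinOp, ast.UnaryOp,
    ast.operator, ast.unaryop,      # Add, Mult, USub, ... via base classes
    ast.Name, ast.Load, ast.Call,
)
ALLOWED_NAMES = {"int", "str", "len"}

def check_allowlisted(source):
    """Raise ValueError unless every node and name is explicitly allowed."""
    tree = ast.parse(source, "<script>", "eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(type(node).__name__ + " is not allowed.")
        if isinstance(node, ast.Name) and node.id not in ALLOWED_NAMES:
            raise ValueError("Name " + node.id + " is not allowed.")
    return tree

def is_rejected(source):
    """Return True if the allowlist refuses the expression."""
    try:
        check_allowlisted(source)
    except ValueError:
        return True
    return False

print(is_rejected("len(str(2 ** 10)) + 1"))
print(is_rejected("''.upper()"))
```

Attribute access is refused entirely here, which is far more restrictive than the underscore rule; that is the usual price of default deny.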
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2016-04-07 17:20 +0000 |
| Message-ID | <slrnngd5tc.19u.jon+usenet@wintry.unequivocal.co.uk> |
| In reply to | #106627 |
On 2016-04-07, Random832 <random832@fastmail.com> wrote:
> On Thu, Apr 7, 2016, at 08:13, Jon Ribbens wrote:
>> > All the obvious, and even not-so-obvious, attack tools are gone:
>> > eval, exec, getattr, type, __import__.
>
> We don't even need to take these away, per se.
>
> eval and exec could be replaced with functions that perform the
> evaluation with the same rules in the same sandbox.
Ah, that's a good point.
I've put an example script here:
https://github.com/jribbens/unsafe/blob/master/unsafe.py
When run as a script, it will execute whatever Python code you pass it
on stdin. If anyone can break it (by which I mean escape from the
sandbox, not make it use up all the memory or go into an infinite
loop, both of which are trivial) then I would be very interested.
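As a rough idea of what "the same rules in the same sandbox" could look like for eval alone, here is a minimal sketch. It is not the linked script; `sandboxed_eval` and `is_rejected` are hypothetical names, and the check is just the underscore rule from the first post.

```python
import ast

def sandboxed_eval(source):
    """Hypothetical sketch: evaluate an expression under the thread's
    underscore rule, offering this same function to the code as eval()."""
    if type(source) is not str:
        raise TypeError("source must be a plain str")
    tree = ast.parse(source, "<script>", "eval")
    for node in ast.walk(tree):
        if (isinstance(node, ast.Name) and node.id.startswith("_") or
                isinstance(node, ast.Attribute) and node.attr.startswith("_")):
            raise ValueError("Access to private values is not allowed.")
    namespace = {"__builtins__": {
        "int": int, "str": str, "len": len,
        "eval": sandboxed_eval,   # nested calls re-enter the same checks
    }}
    return eval(compile(tree, "<script>", "eval"), namespace)

def is_rejected(source):
    """Return True if the check refuses the expression at any nesting level."""
    try:
        sandboxed_eval(source)
    except ValueError:
        return True
    return False

print(sandboxed_eval("eval('len(str(12345))')"))
print(is_rejected("eval('().__class__')"))
```

The nested string is only checked when the inner eval runs, so the refusal in the second call comes from the inner invocation, not from the outer AST walk.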
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2016-04-07 17:35 +0000 |
| Message-ID | <slrnngd6pv.19u.jon+usenet@wintry.unequivocal.co.uk> |
| In reply to | #106633 |
On 2016-04-07, Chris Angelico <rosuav@gmail.com> wrote:

> On Fri, Apr 8, 2016 at 3:20 AM, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:
>> On 2016-04-07, Random832 <random832@fastmail.com> wrote:
>>> On Thu, Apr 7, 2016, at 08:13, Jon Ribbens wrote:
>>>> > All the obvious, and even not-so-obvious, attack tools are gone:
>>>> > eval, exec, getattr, type, __import__.
>>>
>>> We don't even need to take these away, per se.
>>>
>>> eval and exec could be replaced with functions that perform the
>>> evaluation with the same rules in the same sandbox.
>>
>> Ah, that's a good point.
>>
>> I've put an example script here:
>>
>> https://github.com/jribbens/unsafe/blob/master/unsafe.py
>>
>> When run as a script, it will execute whatever Python code you pass it
>> on stdin.
>
> Now we're getting to something rather interesting. Going back to your
> previous post, though...
>
> On Wed, Apr 6, 2016 at 3:26 AM, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:
>> The received wisdom is that restricted code execution in Python is
>> an insolubly hard problem, but it looks a bit like my 7-line example
>> above disproves this theory
>
> ... the thing you were missing in your original example was a LOT of
> sophistication :)
>
> I don't currently have any exploits against your new code, but at this
> point, it's grown beyond the "hey, if this was insolubly hard, how
> come seven lines of code can do it?" question. This is the kind of
> effort it takes to sandbox Python inside Python.

Well, it entirely depends on how much you're trying to allow the sandboxed code to do. Most of the stuff in that script (e.g. _copy_module and safe versions of get/set/delattr, exec, and eval) I don't think is really necessary for most sensible applications of such an idea, I've just added it for completeness and to see if it introduces any security holes that weren't there originally.

I could slim down the code again by simply removing all that extra cruft and the principle would still be the same - it's only the safe_compile() function that's adding anything interesting that I haven't seen done before (and half of the lines in that function are docstring or stuff to make nicer error messages).
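The thread names `safe_compile()` but does not show it, so the following is not that function. It is a minimal sketch, assuming Python 3, of the general technique of checking the syntax tree before compiling, with a hypothetical helper `check_no_underscore_names` that rejects any name or attribute beginning with an underscore (which covers the "__" attributes discussed above).

```python
import ast


def check_no_underscore_names(source, mode="exec"):
    """Parse `source` and reject names or attributes that start with '_'.

    Hypothetical helper; returns a code object if the source passes.
    """
    tree = ast.parse(source, "<sandbox>", mode)
    for node in ast.walk(tree):
        # Bare identifiers such as __builtins__ or _private.
        if isinstance(node, ast.Name) and node.id.startswith("_"):
            raise SyntaxError("name %r is not allowed" % node.id)
        # Dotted access such as obj.__class__ or exc.__traceback__.
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            raise SyntaxError("attribute %r is not allowed" % node.attr)
    return compile(tree, "<sandbox>", mode)
```

Source like `x = [1, 2, 1].count(1)` compiles normally, while `().__class__` or `e.__traceback__` raises `SyntaxError` before anything runs. The sketch only inspects `Name` and `Attribute` nodes; it does nothing about attribute names passed as strings (for example to `getattr`), which is why those builtins have to be removed or wrapped separately, as the posts above discuss.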
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2016-04-10 17:06 +0000 |
| Message-ID | <slrnngl27g.19u.jon+usenet@wintry.unequivocal.co.uk> |
| In reply to | #106633 |
On 2016-04-07, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:

> I've put an example script here:
>
> https://github.com/jribbens/unsafe/blob/master/unsafe.py
>
> When run as a script, it will execute whatever Python code you pass it
> on stdin.
>
> If anyone can break it (by which I mean escape from the sandbox,
> not make it use up all the memory or go into an infinite loop,
> both of which are trivial) then I would be very interested.

I've updated the script a bit, to fix a couple of bugs, to add back in 'with' and 'import' (of white-listed modules) and to add a REPL mode which makes experimenting inside the sandbox easier.

I'm still interested to see if anyone can find a way out of it ;-)
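One way 'import' of white-listed modules could work is to replace `__import__` in the sandbox's builtins, since that is the hook the `import` statement calls. The sketch below assumes Python 3 and is not taken from the updated script: `ALLOWED_MODULES`, `safe_import` and `run` are hypothetical names, and the module list is invented.

```python
import importlib

# Hypothetical allowlist; the modules the real script permits are not stated here.
ALLOWED_MODULES = frozenset({"math", "string"})


def safe_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Replacement for __import__ that only loads allowlisted top-level modules."""
    if level != 0:
        raise ImportError("relative imports are not allowed")
    if name not in ALLOWED_MODULES:
        raise ImportError("module %r is not allowed" % name)
    return importlib.import_module(name)


def run(source):
    """Execute `source` with safe_import as the only builtin."""
    namespace = {"__builtins__": {"__import__": safe_import}}
    exec(source, namespace)
    return namespace
```

With this, `import math` and `from math import sqrt` work inside `run()`, while `import os` raises `ImportError`. Note that the sketch hands over the real module object, so the sandboxed code can reach anything that module itself exposes; an allowlist is only as safe as the modules on it.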