Groups > comp.lang.python > #91220 > unrolled thread

Creating a reliable sandboxed Python environment

Started by	davidfstr@gmail.com
First post	2015-05-25 19:24 -0700
Last post	2015-05-29 11:56 +0200
Articles	20 on this page of 37 — 12 participants

Back to article view | Back to comp.lang.python

  Creating a reliable sandboxed Python environment davidfstr@gmail.com - 2015-05-25 19:24 -0700
    Re: Creating a reliable sandboxed Python environment Chris Angelico <rosuav@gmail.com> - 2015-05-26 12:44 +1000
    Re: Creating a reliable sandboxed Python environment Paul Rubin <no.email@nospam.invalid> - 2015-05-25 23:17 -0700
    Re: Creating a reliable sandboxed Python environment Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-05-26 17:10 +1000
      Re: Creating a reliable sandboxed Python environment Laura Creighton <lac@openend.se> - 2015-05-26 09:53 +0200
      Re: Creating a reliable sandboxed Python environment Laura Creighton <lac@openend.se> - 2015-05-26 10:02 +0200
    Re: Creating a reliable sandboxed Python environment Ned Batchelder <ned@nedbatchelder.com> - 2015-05-26 03:21 -0700
    Re: Creating a reliable sandboxed Python environment marco.nawijn@colosso.nl - 2015-05-26 05:01 -0700
    Re: Creating a reliable sandboxed Python environment davidfstr@gmail.com - 2015-05-28 09:34 -0700
      Re: Creating a reliable sandboxed Python environment Paul Rubin <no.email@nospam.invalid> - 2015-05-30 20:13 -0700
    Re: Creating a reliable sandboxed Python environment Stefan Behnel <stefan_ml@behnel.de> - 2015-05-28 20:41 +0200
    Re: Creating a reliable sandboxed Python environment Chris Angelico <rosuav@gmail.com> - 2015-05-29 04:51 +1000
      Re: Creating a reliable sandboxed Python environment Paul Rubin <no.email@nospam.invalid> - 2015-05-29 11:30 -0700
        Re: Creating a reliable sandboxed Python environment Marko Rauhamaa <marko@pacujo.net> - 2015-05-29 22:12 +0300
          Re: Creating a reliable sandboxed Python environment Paul Rubin <no.email@nospam.invalid> - 2015-05-29 13:15 -0700
    Re: Creating a reliable sandboxed Python environment Stefan Behnel <stefan_ml@behnel.de> - 2015-05-29 08:18 +0200
    Re: Creating a reliable sandboxed Python environment Chris Angelico <rosuav@gmail.com> - 2015-05-29 17:41 +1000
      Re: Creating a reliable sandboxed Python environment Paul Rubin <no.email@nospam.invalid> - 2015-05-29 11:33 -0700
        Re: Creating a reliable sandboxed Python environment Chris Angelico <rosuav@gmail.com> - 2015-05-30 08:49 +1000
          Re: Creating a reliable sandboxed Python environment Paul Rubin <no.email@nospam.invalid> - 2015-05-29 18:28 -0700
            Re: Creating a reliable sandboxed Python environment Chris Angelico <rosuav@gmail.com> - 2015-05-30 12:42 +1000
              Re: Creating a reliable sandboxed Python environment Paul Rubin <no.email@nospam.invalid> - 2015-05-29 21:48 -0700
                Re: Creating a reliable sandboxed Python environment Steven D'Aprano <steve@pearwood.info> - 2015-05-30 19:00 +1000
                  Re: Creating a reliable sandboxed Python environment Laura Creighton <lac@openend.se> - 2015-05-30 13:24 +0200
                    Re: Creating a reliable sandboxed Python environment Steven D'Aprano <steve@pearwood.info> - 2015-05-31 09:52 +1000
                      Re: Creating a reliable sandboxed Python environment Modulok <modulok@gmail.com> - 2015-05-30 19:08 -0600
                      Re: Creating a reliable sandboxed Python environment Laura Creighton <lac@openend.se> - 2015-05-31 08:14 +0200
                  Re: Creating a reliable sandboxed Python environment Stefan Behnel <stefan_ml@behnel.de> - 2015-05-30 20:42 +0200
                  Re: Creating a reliable sandboxed Python environment Paul Rubin <no.email@nospam.invalid> - 2015-05-30 13:00 -0700
                    Re: Creating a reliable sandboxed Python environment Chris Angelico <rosuav@gmail.com> - 2015-05-31 08:20 +1000
                      Re: Creating a reliable sandboxed Python environment Paul Rubin <no.email@nospam.invalid> - 2015-05-30 15:36 -0700
                  Re: Creating a reliable sandboxed Python environment Laura Creighton <lac@openend.se> - 2015-05-30 22:54 +0200
          Re: Creating a reliable sandboxed Python environment BartC <bc@freeuk.com> - 2015-05-30 13:06 +0100
            Re: Creating a reliable sandboxed Python environment Chris Angelico <rosuav@gmail.com> - 2015-05-30 22:37 +1000
    Re: Creating a reliable sandboxed Python environment Stefan Behnel <stefan_ml@behnel.de> - 2015-05-29 11:23 +0200
    Re: Creating a reliable sandboxed Python environment Chris Angelico <rosuav@gmail.com> - 2015-05-29 19:38 +1000
    Re: Creating a reliable sandboxed Python environment Laura Creighton <lac@openend.se> - 2015-05-29 11:56 +0200

Page 1 of 2 [1] 2 Next page →

#91220 — Creating a reliable sandboxed Python environment

From	davidfstr@gmail.com
Date	2015-05-25 19:24 -0700
Subject	Creating a reliable sandboxed Python environment
Message-ID	<60b424a2-2273-42b2-b60c-92656af0afa5@googlegroups.com>

I am writing a web service that accepts Python programs as input, runs the provided program with some profiling hooks, and returns various information about the program's runtime behavior. To do this in a safe manner, I need to be able to create a sandbox that restricts what the submitted Python program can do on the web server.

Almost all discussion about Python sandboxes I have seen on the internet involves selectively blacklisting functionality that gives access to system resources, such as trying to hide the "open" builtin to restrict access to file I/O. All such approaches are doomed to fail because you can always find a way around a blacklist.

For my particular sandbox, I wish to allow *only* the following kinds of actions (in a whitelist):
* reading from stdin & writing to stdout;
* reading from files, within a set of whitelisted directories;
* pure Python computation.

In particular all other operations available through system calls are banned. This includes, but is not limited to:
* writing to files;
* manipulating network sockets;
* communicating with other processes.

I believe it is not possible to limit such operations at the Python level. The best you could do is try replacing all the standard library modules, but that is again just a blacklist - it won't prevent a determined attacker from doing things like constructing their own 'code' object and executing it.

It might be necessary to isolate the Python process at the operating system level.
* A chroot jail on Linux & OS X can limit access to the filesystem. Again this is just a blacklist.
* No obvious way to block socket creation. Again this would be just a blacklist.
* No obvious way to detect unapproved system calls and block them.

In the limit, I could dynamically spin up a virtual machine and execute the Python program in the machine. However that's extremely expensive in computational time.

Has anyone on this list attempted to sandbox Python programs in a serious fashion? I'd be interested to hear your approach.

- David

[toc] | [next] | [standalone]

#91222

From	Chris Angelico <rosuav@gmail.com>
Date	2015-05-26 12:44 +1000
Message-ID	<mailman.46.1432608283.5151.python-list@python.org>
In reply to	#91220

On Tue, May 26, 2015 at 12:24 PM,  <davidfstr@gmail.com> wrote:
> I believe it is not possible to limit such operations at the Python level. The best you could do is try replacing all the standard library modules, but that is again just a blacklist - it won't prevent a determined attacker from doing things like constructing their own 'code' object and executing it.
>
> It might be necessary to isolate the Python process at the operating system level.
> * A chroot jail on Linux & OS X can limit access to the filesystem. Again this is just a blacklist.
> * No obvious way to block socket creation. Again this would be just a blacklist.
> * No obvious way to detect unapproved system calls and block them.
>
> In the limit, I could dynamically spin up a virtual machine and execute the Python program in the machine. However that's extremely expensive in computational time.
>
> Has anyone on this list attempted to sandbox Python programs in a serious fashion? I'd be interested to hear your approach.

Yes, I had a project along similar lines to yours, a few years back.
We wanted to let our end users customize our service using a Python
script. Our conclusions were:

1) As you say, it is fundamentally not possible to make this work at
the Python level.
2) It's extremely difficult to do at any other level, too.
3) Python is a great language, despite my then-boss's dislike of it.
4) Lua isn't as great a language, but it's much easier to sandbox.
5) Unicode is important, even if my then-boss took a lot of convincing
on that one. (Was a big point in Python's favour, and against Lua.)
6) Efficient transfer of complex structured data across a process
boundary is difficult.
7) Letting end users script your system safely is a fundamentally hard problem.

We ended up abandoning Python altogether and using ECMAScript (with
Google's V8 interpreter) as our scripting language, and even then, we
had to do all sorts of things to make it safe. (And I wouldn't bet my
life on it being safe even now. Not even sure I'd bet my data or
uptime on it being safe, either.)

My recommendation to you: If you absolutely have to run untrusted
Python code, don't concern yourself with *anything* that the Python
code can and can't do. You'll end up making gross and ugly hacks that
stop people from doing legitimate things, in an attempt to prevent
abuses. Instead, *just* guard yourself at the OS level - a chroot jail
to protect what matters, iptables rules to prevent anything going to
the outside world, run as a non-significant user with minimal
permissions, ulimit everything so they can't hurt you. Whatever it
takes, make it so that you could protect C code, because trust me,
it'll be less headaches than trying to sandbox anything at the Python
level. Or, worse, you won't get headaches, you'll just have a flawed
security model that eventually gets exploited.

There are a couple of alternatives. You could go for a really extreme
protection system and actually spin up a virtual machine, where
they're welcome to do whatever they like, and it'll run inside X
amount of memory and Y amount of CPU. Pretty costly (the overhead of a
full OS for every client), but it'll work. Or you could go to the
other extreme, and instead of actually permitting arbitrary Python
code, you instead allow a "Python-like syntax" wherein people can
manipulate the input. You'd need to then create some special hacks to
allow file I/O, so this probably wouldn't work for your scenario, but
imagine writing a sed-like program that accepts Python code. You could
do something like this:

for line in input:
    print(evaluate_user_code(line), file=output)

where evaluate_user_code() is a protected evaluator, like
ast.literal_eval() but additionally allowing access to one name
"line", which obviously would be the line in question.

But for your case, I think that'd require too many hacks to be useful.

ChrisA

From	Paul Rubin <no.email@nospam.invalid>
Date	2015-05-25 23:17 -0700
Message-ID	<87twuzyi9p.fsf@jester.gateway.sonic.net>
In reply to	#91220

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2015-05-26 17:10 +1000
Message-ID	<55641c67$0$12894$c3e8da3$5496439d@news.astraweb.com>
In reply to	#91220

From	Laura Creighton <lac@openend.se>
Date	2015-05-26 09:53 +0200
Message-ID	<mailman.51.1432626858.5151.python-list@python.org>
In reply to	#91235

From	Ned Batchelder <ned@nedbatchelder.com>
Date	2015-05-26 03:21 -0700
Message-ID	<62c19f45-5b25-4049-83ab-895af36c55c8@googlegroups.com>
In reply to	#91220

From	marco.nawijn@colosso.nl
Date	2015-05-26 05:01 -0700
Message-ID	<66e5667e-c368-4ce3-86e1-0987365af5eb@googlegroups.com>
In reply to	#91220

From	Stefan Behnel <stefan_ml@behnel.de>
Date	2015-05-28 20:41 +0200
Message-ID	<mailman.138.1432838475.5151.python-list@python.org>
In reply to	#91220

From	Marko Rauhamaa <marko@pacujo.net>
Date	2015-05-29 22:12 +0300
Message-ID	<87siafb3l5.fsf@elektro.pacujo.net>
In reply to	#91493

Creating a reliable sandboxed Python environment

Contents

#91220 — Creating a reliable sandboxed Python environment

#91222

#91233

#91235

#91238

#91240

#91241

#91242

#91377

#91558

#91381

#91384

#91493

#91496

#91499

#91431

#91433

#91495

#91507

#91511