From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: Creating a reliable sandboxed Python environment
Date: Tue, 26 May 2015 12:44:34 +1000

On Tue, May 26, 2015 at 12:24 PM, wrote:
> I believe it is not possible to limit such operations at the Python level. The best you could do is try replacing all the standard library modules, but that is again just a blacklist - it won't prevent a determined attacker from doing things like constructing their own 'code' object and executing it.
>
> It might be necessary to isolate the Python process at the operating system level.
> * A chroot jail on Linux & OS X can limit access to the filesystem. Again this is just a blacklist.
> * No obvious way to block socket creation. Again this would be just a blacklist.
> * No obvious way to detect unapproved system calls and block them.
>
> In the limit, I could dynamically spin up a virtual machine and execute the Python program in the machine. However that's extremely expensive in computational time.
>
> Has anyone on this list attempted to sandbox Python programs in a serious fashion? I'd be interested to hear your approach.

Yes, I had a project along similar lines to yours, a few years back. We wanted to let our end users customize our service using a Python script. Our conclusions were:

1) As you say, it is fundamentally not possible to make this work at the Python level.
2) It's extremely difficult to do at any other level, too.
3) Python is a great language, despite my then-boss's dislike of it.
4) Lua isn't as great a language, but it's much easier to sandbox.
5) Unicode is important, even if my then-boss took a lot of convincing on that one. (Was a big point in Python's favour, and against Lua.)
6) Efficient transfer of complex structured data across a process boundary is difficult.
7) Letting end users script your system safely is a fundamentally hard problem.

We ended up abandoning Python altogether and using ECMAScript (with Google's V8 interpreter) as our scripting language, and even then, we had to do all sorts of things to make it safe. (And I wouldn't bet my life on it being safe even now. Not even sure I'd bet my data or uptime on it being safe, either.)

My recommendation to you: If you absolutely have to run untrusted Python code, don't concern yourself with *anything* that the Python code can and can't do. You'll end up making gross and ugly hacks that stop people from doing legitimate things, in an attempt to prevent abuses. Instead, *just* guard yourself at the OS level - a chroot jail to protect what matters, iptables rules to prevent anything going to the outside world, run as a non-significant user with minimal permissions, ulimit everything so they can't hurt you. Whatever it takes, make it so that you could protect C code, because trust me, it'll be fewer headaches than trying to sandbox anything at the Python level. Or, worse, you won't get headaches, you'll just have a flawed security model that eventually gets exploited.

There are a couple of alternatives. You could go for a really extreme protection system and actually spin up a virtual machine, where they're welcome to do whatever they like, and it'll run inside X amount of memory and Y amount of CPU. Pretty costly (the overhead of a full OS for every client), but it'll work.
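
Short of a full VM, the "X amount of memory and Y amount of CPU" idea can be approximated per process with the ulimit approach mentioned earlier. The following is a minimal sketch, assuming Linux and Python 3.7+. The name `run_limited` and the limit values are hypothetical, chosen only for illustration, and it covers just the resource limits, not the chroot, iptables, or unprivileged-user parts.

```python
import resource
import subprocess
import sys

# Hypothetical limits, picked for illustration only.
CPU_SECONDS = 2                      # like `ulimit -t 2`
MEMORY_BYTES = 512 * 1024 * 1024     # like `ulimit -v` (address space)
MAX_OPEN_FILES = 64                  # like `ulimit -n 64`


def _apply_limits():
    """Runs in the child process just before exec, so only the child is limited."""
    resource.setrlimit(resource.RLIMIT_CPU, (CPU_SECONDS, CPU_SECONDS))
    resource.setrlimit(resource.RLIMIT_AS, (MEMORY_BYTES, MEMORY_BYTES))
    resource.setrlimit(resource.RLIMIT_NOFILE, (MAX_OPEN_FILES, MAX_OPEN_FILES))


def run_limited(script_path, timeout=10):
    """Run an untrusted script in a separate interpreter under resource limits."""
    return subprocess.run(
        [sys.executable, "-I", script_path],  # -I: isolated mode, ignores user site and env vars
        preexec_fn=_apply_limits,
        capture_output=True,
        text=True,
        timeout=timeout,                      # wall-clock backstop on top of the CPU limit
    )
```

A script that exceeds the address-space limit would be expected to fail with MemoryError inside the child, and one that burns CPU to be killed by the kernel. Note that `preexec_fn` is documented as unsafe in multi-threaded parents, so a real deployment might prefer a small wrapper program or the shell's own `ulimit`.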

Or you could go to the other extreme, and instead of actually permitting arbitrary Python code, you instead allow a "Python-like syntax" wherein people can manipulate the input. You'd need to then create some special hacks to allow file I/O, so this probably wouldn't work for your scenario, but imagine writing a sed-like program that accepts Python code. You could do something like this:

    for line in input:
        print(evaluate_user_code(line), file=output)

where evaluate_user_code() is a protected evaluator, like ast.literal_eval() but additionally allowing access to one name "line", which obviously would be the line in question. But for your case, I think that'd require too many hacks to be useful.

ChrisA
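
A minimal sketch of what such a protected evaluator might look like, assuming Python 3.9+ (where `ast.Subscript.slice` holds the index expression directly). The two-argument signature, the helper `sed_like`, and the exact set of permitted syntax (string and integer literals, the name `line`, `+`, unary minus, indexing and slicing) are assumptions made for illustration; the post does not specify them.

```python
import ast


def evaluate_user_code(source, line):
    """Evaluate a tiny expression language over one input line.

    Walks the parsed AST by hand instead of calling eval(), and raises
    ValueError for any syntax outside the small whitelist below.
    """
    tree = ast.parse(source, mode="eval")

    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (str, int)):
            return node.value
        if isinstance(node, ast.Name) and node.id == "line":
            return line  # the only name user code may read
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            return ev(node.left) + ev(node.right)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        if isinstance(node, ast.Subscript):
            return ev(node.value)[ev(node.slice)]
        if isinstance(node, ast.Slice):
            parts = (node.lower, node.upper, node.step)
            return slice(*(None if p is None else ev(p) for p in parts))
        # Calls, attribute access, other names, etc. all land here.
        raise ValueError("disallowed syntax: " + type(node).__name__)

    return ev(tree)


def sed_like(expression, lines, output):
    """Hypothetical driver: apply one user expression to every input line."""
    for line in lines:
        print(evaluate_user_code(expression, line.rstrip("\n")), file=output)
```

For example, `evaluate_user_code("line[::-1] + '!'", "abc")` gives `'cba!'`, while `evaluate_user_code("__import__('os')", "x")` raises ValueError because calls are not on the whitelist. Even a whitelist like this still parses attacker-supplied text, so the OS-level guards remain worth having underneath it.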