From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: Undefined behaviour in C [was Re: The Cost of Dynamism]
Date: Sat, 26 Mar 2016 08:23:41 +1100
References: <8737rvxs89.fsf@elektro.pacujo.net>
 <56e7483d$0$1608$c3e8da3$5496439d@news.astraweb.com>
 <56ef9787$0$1516$c3e8da3$5496439d@news.astraweb.com>
 <56f02196$0$1588$c3e8da3$5496439d@news.astraweb.com>
 <56f42d9f$0$1618$c3e8da3$5496439d@news.astraweb.com>
 <56f55e2e$0$1619$c3e8da3$5496439d@news.astraweb.com>

On Sat, Mar 26, 2016 at 2:50 AM, Steven D'Aprano wrote:
> On Fri, 25 Mar 2016 06:54 am, BartC wrote:
>
>>> In the case of C, the line is limited to working with some specific type
>>> (say, an int32). Even there, if the addition might overflow, the
>>> behaviour is undefined and the compiler can do anything it wants
>>> (including time travel,
>>
>> I'm pretty sure it can't do time travel...
>>
>> or erasing your hard disk).
>
>
> You are absolutely, and potentially catastrophically, mistaken on that.
>
> Undefined behaviour does not mean "implementation specific behaviour". Nor
> does it mean "something sensible will happen but we don't promise what it
> will be". It means "the compiler can do anything", including ignoring the
> code you actually wrote and substituting its own faster code, which may or
> may not do what your original code did.

C's undefined behaviour is basically "you're not allowed to do this".
The compiler is free to assume that you will never do this, ergo it is
free to write out machine code that is correct only so long as you do
not do this. Let me draw a Python analogy:

1) A well-behaved iterator will return values until it raises
StopIteration, and then raise StopIteration thereafter.
2) An iterator which raises and then returns is thus badly written and
should not exist.
3) A consumer of an iterator is allowed to misbehave in the event that
the iterator returns after raising.
4) Therefore an optimizer is free to *cause* the consumer to misbehave
when given an ill-behaved iterator.

Consider this code:

def zip_forever(*iters, fillvalue=None):
    """Like zip_longest, but without the silly rule that it should
    terminate once they all finish"""
    while True:
        yield tuple(next(it, fillvalue) for it in iters)

If all its iterators are well-behaved, this is identical (modulo
monkey-patching of names like "tuple" and "next") to:

from itertools import zip_longest

def zip_forever(*iters, fillvalue=None):
    yield from zip_longest(*iters, fillvalue=fillvalue)
    # Every iterator is exhausted now, so only the fill value remains.
    tup = (fillvalue,) * len(iters)
    while True:
        yield tup

Is PyPy/FAT Python/some other optimizer permitted to make this change?
The only difference between C's "undefined behaviour" and Python's way
of doing the spec is the answer to that one question. C says yes,
you're totally allowed to assume that; the end result in all cases
should be indistinguishable; any time you can detect that it optimized
this away, it's because of a bug somewhere else.

Is your code allowed to misbehave when other code has bugs? That,
ultimately, is the question.

Imagine a tightened-up subset of the Python language that restricts
certain unusual behaviours. (This has the same purpose as PyPy's
RPython, and for all I know, RPython might do exactly this.) It's a
narrowly used, single-purpose language, and its sole guarantee is that
well-written SubPython code will behave the same way in SubPython as
it does in CPython.
It might then, for instance, not permit rebinding of builtins, nor
replacement of function default arguments, without an explicit
"drop_caches()" call. Code could then be far more aggressively
optimized - but behaviour would become undefined in the event that you
break one of its rules.

That's really all that C's done, and you honestly don't have to get so
panicky at the word "undefined". It's simply "don't do this". And
let's face it... there's a *lot* of undefined behaviour in CPython
once you play with ctypes. Do we have any language guarantees as to
what will happen if you change the value of the integer 7 to now be 8?
Or will you just say "don't do that"?

ChrisA
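
A minimal sketch of the zip_forever analogy, to show where the two versions can be told apart. Everything named here is hypothetical and for illustration only: `Flaky` is a made-up ill-behaved iterator that raises StopIteration once and then resumes returning values, the two generators are renamed `zip_forever_plain` and `zip_forever_optimized` so they can coexist, `take` is a small helper, and the second version is assumed to pad with `fillvalue`.

```python
from itertools import islice, zip_longest


class Flaky:
    """Hypothetical ill-behaved iterator: raises StopIteration on the
    second call to __next__, then goes back to returning values."""

    def __init__(self):
        self._calls = 0

    def __iter__(self):
        return self

    def __next__(self):
        self._calls += 1
        if self._calls == 2:
            raise StopIteration
        return self._calls


def zip_forever_plain(*iters, fillvalue=None):
    # Keeps calling next() on every iterator forever.
    while True:
        yield tuple(next(it, fillvalue) for it in iters)


def zip_forever_optimized(*iters, fillvalue=None):
    # Assumes an iterator that has raised StopIteration stays exhausted.
    yield from zip_longest(*iters, fillvalue=fillvalue)
    tup = (fillvalue,) * len(iters)
    while True:
        yield tup


def take(n, iterable):
    """Return the first n items of iterable as a list."""
    return list(islice(iterable, n))


# Well-behaved iterators: the two versions agree.
print(take(4, zip_forever_plain(iter([1, 2]), iter("a"))))
print(take(4, zip_forever_optimized(iter([1, 2]), iter("a"))))

# Ill-behaved iterator: the "optimization" becomes observable.
print(take(4, zip_forever_plain(Flaky())))      # [(1,), (None,), (3,), (4,)]
print(take(4, zip_forever_optimized(Flaky())))  # [(1,), (None,), (None,), (None,)]
```

The only input that distinguishes the two is one that already breaks rule 1, which is the sense in which detecting the optimization means there is a bug somewhere else.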