From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: Undefined behaviour in C [was Re: The Cost of Dynamism]
Date: Sun, 27 Mar 2016 01:30:31 +1100

On Sun, Mar 27, 2016 at 1:09 AM, BartC wrote:
> I'm surprised that both C and Python allow statements that apparently do
> nothing. In both, an example is:
>
>     x
>
> on a line by itself. This expression is evaluated, but then any result
> discarded. If there was a genuine use for this (for example, reporting any
> error with the evaluation), then it would be simple enough to require a
> keyword in front.

Tell me, which of these is a statement that "does nothing"?

    foo
    foo.bar
    foo["bar"]
    foo.__call__
    foo()
    int(foo)

All of them are expressions to be evaluated and the result discarded.
I'm sure you'll recognize "foo()" as useful code, but to the
interpreter, they're all the same.
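
As a rough illustration of that last point, here is a minimal sketch that parses each of the six expressions above and reports what kind of statement the parser sees. It assumes Python 3's standard `ast` module; the names `SNIPPETS`, `statement_kind` and `kinds` are made up for the example, and nothing is executed, so `foo` never needs to exist.

```python
import ast

# The six one-line statements from the list above. They are only parsed,
# never run, so it does not matter that `foo` is undefined.
SNIPPETS = ["foo", "foo.bar", 'foo["bar"]', "foo.__call__", "foo()", "int(foo)"]


def statement_kind(source):
    """Return the AST node type name of the single statement in `source`."""
    tree = ast.parse(source)
    return type(tree.body[0]).__name__


kinds = {src: statement_kind(src) for src in SNIPPETS}
for src, kind in kinds.items():
    print(f"{src:14} -> {kind}")
```

Every one of them comes back as the same node type, an expression statement; the parser has no notion of which ones are "useful".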
And any one of them could raise an exception rather than emit a value;
for instance, consider these code blocks:

    # Personally, I prefer doing it the other way, but
    # if you have a big Py2 codebase, this will help
    # port it to Py3.
    try:
        raw_input
    except NameError:
        raw_input = input

    import sys

    try:
        int(sys.argv[1])
    except IndexError:
        print("No argument given")
    except ValueError:
        print("Not an integer")

In each case, the "dummy evaluation" of an expression is used as a way
of asking "Will this throw?". That's why this has to be squarely in
the hands of linters, not the main interpreter; there's nothing that
can't in some way be useful.

>> The main reason the C int has undefined behaviour is that it's
>> somewhere between "fixed size two's complement signed integer" and
>> "integer with plenty of room". A C compiler is generally free to use a
>> larger integer than you're expecting, which will cause numeric
>> overflow to not happen. That's (part of[1]) why overflow of signed
>> integers is undefined - it'd be too costly to emulate a smaller
>> integer. So tell me... what happens in CPython if you incref an object
>> more times than the native integer will permit? Are you bothered by
>> this possibility, or do you simply assume that nobody will ever do
>> that?
>
>
> (On a ref-counted scheme I use, with 32-bit counts (I don't think it matters
> if they are signed or not), each reference implies a 16-byte object
> elsewhere. For the count to wrap around back to zero, that would mean 64GB
> of RAM being needed. On a 32-bit system, something else will go wrong first.
>
> Even on 64-bits, it's a possibility I suppose although you might notice
> memory problems sooner.)

C code can claim references to Python objects without having actual
pointers anywhere.
A naive object traversal algorithm could claim temporary
references on all the objects it moves past (to ensure they don't get
garbage collected in the middle), and then get stuck in an infinite
loop traversing a reference loop, thus infinitely increasing reference
counts. Yeah, it can happen... I've had bugs like that in my code...

Point is, CPython can generally assume that bug-free code will never
get anywhere *near* the limit of a signed integer. Consequently, C's
undefined behaviour isn't a problem; it does NOT mean we need to be
scared of signed integers.

ChrisA
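
To put rough numbers on the reference-count discussion above, here is a minimal sketch, assuming a standard CPython build (`sys.getrefcount` is CPython-specific). The list of 1000 references is only a stand-in for code that keeps claiming references, and the 32-bit count and 16 bytes per reference are the figures from BartC's quoted scheme, not CPython's own layout.

```python
import sys

# A fresh object, so nothing else in the interpreter is holding on to it.
target = object()
before = sys.getrefcount(target)

# Stand-in for code that keeps claiming references: each slot in this
# list holds one more reference to the same object.
claimed = [target] * 1000
after = sys.getrefcount(target)
print("extra references:", after - before)

# The quoted figures: a 32-bit count, and 16 bytes needed per reference.
# This is how much memory would be in use before such a count wraps.
bytes_to_wrap = 2**32 * 16
print(bytes_to_wrap // 2**30, "GiB before a 32-bit count wraps")
```

Holding references legitimately costs memory for each one, which is why the count runs out of RAM long before it runs out of bits; only code that claims references without storing them anywhere can get around that.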