From: Mateusz Loskot
Date: Wed, 11 Jan 2012 13:50:37 +0000
Subject: How do I tell "incomplete input" from "invalid input"?
To: python-list@python.org
Newsgroups: comp.lang.python

Hi,

I have been trying to figure out a reliable way to detect incomplete Python script input using the Python C API.

(Apologies if this is off-topic here; I'm not sure where my post belongs, perhaps on the cplusplus-sig list.)

Apparently, most pointers lead to the Python FAQ [1] question:

How do I tell "incomplete input" from "invalid input"?

Unfortunately, this FAQ entry is either outdated or incomplete, and thus incorrect.

First, the proposed testcomplete() function uses internal symbols which are not available to Python C API users. So, "whoever wrote that FAQ should be given 20 lashes with a short piece of string" [2].

The second solution is incomplete or incorrect. It does not correctly handle multi-line input that is longer than two lines and contains more than one flow control statement. For example:

##########################################################################
>>> n = 10
>>> if n > 0:
...     if n < 100:
  File "<stdin>", line 2
    if n < 100:
              ^
IndentationError: expected an indented block
>>>
##########################################################################

or

##########################################################################
>>> for n in range(0, 5):
...     if n > 2:
  File "<stdin>", line 2
    if n > 2:
            ^
IndentationError: expected an indented block
>>>
##########################################################################

I have attached a slightly modified C++ version of the second program from the FAQ question [1], file faq_incomplete_input.cpp, which is also available from my GitHub repo [3].

In this program, I added several FIX comments with proposed corrections. The idea is to additionally check PyErr_ExceptionMatches (PyExc_IndentationError) and, when strcmp (msg, "expected an indented block") matches and the prompt is sys.ps2, to treat the error as meaning that more code is expected. Errors are then ignored until the user confirms that the input is finished, so the whole input is eventually sent to Py_CompileString; at that point exceptions are no longer ignored but are treated as the real result of compilation.

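The deferral idea described above can be sketched in Python, purely as an illustration of the control flow and not as the attached C++ program. The push_line() helper, its status strings and the list-based buffer are hypothetical names made up for this example. One simplification is assumed and should be noted: the sketch does not inspect error messages (their wording differs between Python versions), so an error on the very first line is also deferred until a blank line is entered.

```python
def push_line(buffer, line):
    """Add one input line to buffer (a list of str) and classify the result.

    Hypothetical helper: returns (status, code), where status is one of
    'empty', 'incomplete', 'invalid' or 'complete'.
    """
    finished = not line.strip()      # a blank line confirms the input is done
    if finished and not buffer:
        return "empty", None         # nothing typed yet, nothing to compile
    at_ps1 = not buffer              # first line of a new statement
    if not finished:
        buffer.append(line)
    source = "\n".join(buffer) + "\n"
    try:
        code = compile(source, "<stdin>", "single")
    except SyntaxError:              # IndentationError is a subclass
        if not finished:
            return "incomplete", None    # error ignored for now
        del buffer[:]
        return "invalid", None       # input finished: the error is real
    if at_ps1 or finished:
        del buffer[:]
        return "complete", code      # ready to be executed
    return "incomplete", None        # compiles, but the block may continue


if __name__ == "__main__":
    buf = []
    for text in ("n = 10", "if n > 0:", "    if n < 100:",
                 "        print(n)", ""):
        status, _ = push_line(buf, text)
        print(repr(text), "->", status)
```

With this policy the nested "if" from the examples above stays at the continuation prompt instead of failing on the second line; the price is that a genuinely invalid block is only reported once the blank line is entered.
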
I simply wanted to achieve semantics similar to codeop._maybe_compile() (called by codeop.compile_command), which performs some sort of dirty hack in the following line:

    if not code1 and repr(err1) == repr(err2):

So, the test in action for multi-line, multi-statement input gives:

##########################################################################
>>> c = codeop.compile_command("for n in range(0, 3):", "test", "single")
err1 SyntaxError('unexpected EOF while parsing', ('test', 1, 22, 'for n in range(0, 3):\n'))
err2 IndentationError('expected an indented block', ('test', 2, 1, '\n'))
comparison.err1 SyntaxError('unexpected EOF while parsing', ('test', 1, 22, 'for n in range(0, 3):\n'))
comparison.err2 IndentationError('expected an indented block', ('test', 2, 1, '\n'))
code None
code1 None
>>> c = codeop.compile_command("for n in range(0, 3):\n\tif n > 0:", "test", "single")
err1 IndentationError('expected an indented block', ('test', 2, 11, '\tif n > 0:\n'))
err2 IndentationError('expected an indented block', ('test', 3, 1, '\n'))
comparison.err1 IndentationError('expected an indented block', ('test', 2, 11, '\tif n > 0:\n'))
comparison.err2 IndentationError('expected an indented block', ('test', 3, 1, '\n'))
code None
code1 None
>>>
##########################################################################

So, I reckon it makes sense to use the same logic when calling Py_CompileString. Does that sound like a reasonable solution?

Basically, there seems to be no canonical solution presented anywhere on how to test for incomplete input in a reliable manner, or on how to parse/compile Python code in successive steps when it is given line by line. The C API used by the Python function compile() is not publicly available. There is the PyRun_InteractiveLoop mechanism, but it is tightly coupled to FILE-based I/O, which is not always available when Python is embedded, so the loop is useless in a number of situations.

Have I overlooked any other obvious solution?

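For reference, the three outcomes that codeop.compile_command distinguishes can be checked from pure Python without any FILE-based I/O. The following is a minimal sketch relying only on its documented behaviour (a code object for complete input, None for incomplete input, an exception for invalid input); the classify() helper and its return strings are hypothetical names introduced for this illustration.

```python
import codeop


def classify(source, filename="<stdin>"):
    """Return 'complete', 'incomplete' or 'invalid' for interactive input.

    classify() is a hypothetical helper; only codeop.compile_command
    comes from the standard library.
    """
    try:
        code = codeop.compile_command(source, filename, "single")
    except (SyntaxError, ValueError, OverflowError):
        return "invalid"        # a real error: report it to the user
    if code is None:
        return "incomplete"     # more lines expected: show the ps2 prompt
    return "complete"           # ready to be executed


if __name__ == "__main__":
    for src in ("n = 10",
                "for n in range(0, 3):",
                "for n in range(0, 3):\n\tif n > 0:",
                "n = = 10"):
        print(repr(src), "->", classify(src))
```

Whether calling the codeop module from an embedding application (for example by importing it through the C API) is acceptable is a design choice this sketch does not settle; it only shows the classification that the C code is trying to reproduce.
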
Finally, it would be helpful if the Python FAQ were kept up to date.

[1] http://docs.python.org/py3k/faq/extending.html#how-do-i-tell-incomplete-input-from-invalid-input
[2] http://mail.python.org/pipermail/python-list/2004-August/887195.html
[3] https://github.com/mloskot/workshop/blob/master/python/

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net

Attachment: faq_incomplete_input.cpp

//
// A quick and dirty C++ version of the C program presented in Python FAQ:
// http://docs.python.org/py3k/faq/extending.html#how-do-i-tell-incomplete-input-from-invalid-input
// Modifications:
// - do not use readline library, but <iostream>
//
// Tested using Visual C++ 2010 (10.0) and Python 3.2 (custom Debug build)
//
// The "incomplete input" solution presented in the FAQ is incomplete and it does
// not allow multi-line scripts with more than 2 lines of flow control statements:
//
//>>> n = 10
//>>> if n > 0:
//...     if n < 100:
//  File "<stdin>", line 2
//    if n < 100:
//              ^
//IndentationError: expected an indented block
//>>>
//
#include <cstdio>
#include <iostream>
#include <string>

#include <Python.h>
#include <object.h>
#include <compile.h>
#include <eval.h>

int main (int argc, char* argv[])
{
    int i, j, done = 0;                          /* lengths of line, code */
    char ps1[] = ">>> ";
    char ps2[] = "... ";
    char *prompt = ps1;
    char *msg, *code = NULL;
    char const* line(NULL);
    PyObject *src, *glb, *loc;
    PyObject *exc, *val, *trb, *obj, *dum;
    std::string input_line;

    Py_Initialize ();
    loc = PyDict_New ();
    glb = PyDict_New ();
    PyDict_SetItemString (glb, "__builtins__", PyEval_GetBuiltins ());

    while (!done)
    {
        std::cout << prompt;
        std::getline(std::cin, input_line);

        if (input_line.empty() || input_line[0] == 4 /*EOT*/) /* CTRL-Z or CTRL-D pressed */
        {
            done = 1;
        }
        else
        {
            line = input_line.c_str();
            i = strlen (line);
            if (NULL == code)                        /* nothing in code yet */
                j = 0;
            else
                j = strlen (code);

            code = (char*)realloc (code, i + j + 2);
            if (NULL == code)                        /* out of memory */
                exit (1);

            if (0 == j)                              /* code was empty, so */
                code[0] = '\0';                      /* keep strncat happy */

            strncat (code, line, i);                 /* append line to code */
            code[i + j] = '\n';                      /* append '\n' to code */
            code[i + j + 1] = '\0';

            src = Py_CompileString (code, "<stdin>", Py_single_input);

            if (NULL != src)                         /* compiled just fine - */
            {
                if (ps1  == prompt ||                /* ">>> " or */
                                                     /* "... " and double '\n' */
                    ('\r' == code[i + j - 1] &&      /* Windows-specific fix */
                     '\n' == code[i + j - 2]))
                {                                    /* so execute it */
                    dum = PyEval_EvalCode (src, glb, loc);
                    Py_XDECREF (dum);
                    Py_XDECREF (src);
                    free (code);
                    code = NULL;
                    if (PyErr_Occurred ())
                        PyErr_Print ();
                    prompt = ps1;
                }
            }                                        /* syntax error or E_EOF? */
            else if (PyErr_ExceptionMatches (PyExc_SyntaxError))
            {
                // mloskot also proposes FIX 1 (above):
                // || PyErr_ExceptionMatches (PyExc_IndentationError)

                // mloskot also proposes FIX 2 (below):
                // ||  strcmp (msg, "expected an indented block") && ps2  == prompt

                PyErr_Fetch (&exc, &val, &trb);        /* clears exception! */

                if (PyArg_ParseTuple (val, "sO", &msg, &obj) && 0 == strcmp (msg, "unexpected EOF while parsing")) /* E_EOF */
                {
                    Py_XDECREF (exc);
                    Py_XDECREF (val);
                    Py_XDECREF (trb);
                    prompt = ps2;
                }
                else                                   /* some other syntax error */
                {
                    PyErr_Restore (exc, val, trb);
                    PyErr_Print ();
                    free (code);
                    code = NULL;
                    prompt = ps1;
                }
            }
            else                                     /* some non-syntax error */
            {
                PyErr_Print ();
                free (code);
                code = NULL;
                prompt = ps1;
            }
        }
    }

    Py_XDECREF(glb);
    Py_XDECREF(loc);
    Py_Finalize();
    exit(0);
}