Date: Thu, 26 Mar 2015 18:47:37 -0500
From: Tim Chase <python.list@tim.thechases.com>
To: python-list@python.org
Subject: Re: A simple single line, triple-quoted comment is giving syntax error. Why?
In-Reply-To: <55149305$0$13009$c3e8da3$5496439d@news.astraweb.com>
References: <f7b84d9d-c559-4711-ad63-1deea5f12c62@googlegroups.com> <med6mn$ag9$6@dont-email.me> <3533816.ZYnZ2OzjCs@PointedEars.de> <mailman.51.1426995416.10327.python-list@python.org> <3971951.908YQu3oQO@PointedEars.de> <mailman.52.1426997516.10327.python-list@python.org> <1504323.jUKfeKbQsP@PointedEars.de> <mailman.181.1427346636.10327.python-list@python.org> <2362875.FIK8ImHTJA@PointedEars.de> <CALwzidmOsagqXs5Nhr5w1w2Ukv9oDp2mxikcRmHd9rKG2x3Low@mail.gmail.com> <mailman.212.1427394775.10327.python-list@python.org> <1582467.9xuzSGXesM@PointedEars.de> <55149305$0$13009$c3e8da3$5496439d@news.astraweb.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.225.1427414131.10327.python-list@python.org>

On 2015-03-27 10:15, Steven D'Aprano wrote:
> If that's all it is, why don't you just run the tokenizer over it
> and see what it says?
> 
> py> from cStringIO import StringIO
> py> code = StringIO('spam = "abcd" "efgh"\n')
> py> import tokenize
> py> for item in tokenize.generate_tokens(code.readline):
> ...     print item
> ...
> (1, 'spam', (1, 0), (1, 4), 'spam = "abcd" "efgh"\n')
> (51, '=', (1, 5), (1, 6), 'spam = "abcd" "efgh"\n')
> (3, '"abcd"', (1, 7), (1, 13), 'spam = "abcd" "efgh"\n')
> (3, '"efgh"', (1, 14), (1, 20), 'spam = "abcd" "efgh"\n')
> (4, '\n', (1, 20), (1, 21), 'spam = "abcd" "efgh"\n')
> (0, '', (2, 0), (2, 0), '')
> 
> 
> Looks to me that the two string literals each get their own token,

Nice.  I haven't played with the tokenize module before, but
resolving arguments on comp.lang.python is one of the best possible
uses.

It was interesting to try other ways of feeding lines to
generate_tokens(), my favorite being an iterator over a list of lines:

>>> import tokenize
>>> i = iter(["spam = 'abc' 'def'"])
>>> for item in tokenize.generate_tokens(lambda: next(i)):
...     print(item)
... 
TokenInfo(type=1 (NAME), string='spam', start=(1, 0), end=(1, 4), line="spam = 'abc' 'def'")
TokenInfo(type=52 (OP), string='=', start=(1, 5), end=(1, 6), line="spam = 'abc' 'def'")
TokenInfo(type=3 (STRING), string="'abc'", start=(1, 7), end=(1, 12), line="spam = 'abc' 'def'")
TokenInfo(type=3 (STRING), string="'def'", start=(1, 13), end=(1, 18), line="spam = 'abc' 'def'")
TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0), end=(2, 0), line='')
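
For anyone wanting to repeat the check without reading the full dump, here
is a minimal sketch (Python 3 assumed) that keeps only the STRING tokens.
The helper name string_tokens() is made up for illustration, and it feeds
generate_tokens() from io.StringIO rather than the iterator trick above:

```python
import io
import tokenize


def string_tokens(source):
    """Return the text of every STRING token found in source.

    Hypothetical helper for illustration; not part of the tokenize module.
    """
    # generate_tokens() wants a readline-style callable that returns str.
    readline = io.StringIO(source).readline
    return [tok.string
            for tok in tokenize.generate_tokens(readline)
            if tok.type == tokenize.STRING]


literals = string_tokens("spam = 'abc' 'def'\n")
print(literals)
```

If adjacent literals were merged by the tokenizer, the list would hold one
entry; two entries means each literal gets its own token.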


It's also nice that Py3 prints the token-type name next to the numeric
token type, so there is no need to look the number up.
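
The same translation can be done by hand through the tok_name dict that
tokenize re-exports from the token module. A rough sketch, again assuming
Python 3; named_tokens() is a made-up helper name. Going by name is also
safer than going by number, since OP shows up as 51 in the quoted Py2
output and 52 in mine:

```python
import io
import tokenize


def named_tokens(source):
    """Return (type_name, text) pairs for each token in source.

    Hypothetical helper for illustration; tok_name maps the numeric
    token type to its symbolic name.
    """
    readline = io.StringIO(source).readline
    # Index by position (tok[0], tok[1]) so plain-tuple tokens work too.
    return [(tokenize.tok_name[tok[0]], tok[1])
            for tok in tokenize.generate_tokens(readline)]


pairs = named_tokens("spam = 'abc' 'def'\n")
for name, text in pairs:
    print(name, repr(text))
```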

-tkc