Groups > comp.lang.python > #60953 > unrolled thread

Why is there no natural syntax for accessing attributes with names not being valid identifiers?

Started by	Piotr Dobrogost <p@google-groups-2013.dobrogost.net>
First post	2013-12-03 09:14 -0800
Last post	2013-12-05 01:50 +0000
Articles	20 on this page of 57 — 20 participants


#61060

From	Ethan Furman <ethan@stoneleaf.us>
Date	2013-12-04 15:09 -0800
Message-ID	<mailman.3593.1386199893.18130.python-list@python.org>
In reply to	#61057

On 12/04/2013 02:13 PM, Piotr Dobrogost wrote:
> On Wednesday, December 4, 2013 10:41:49 PM UTC+1, Neil Cerutti wrote:
>> On 2013-12-04, Piotr Dobrogost <> wrote:
>>
>>> Right. If there's already a way to have attributes with these
>>> "non-standard" names (which is a good thing)
>>
>> At best its a neutral thing. You can use dict for the same
>> purpose with very little effort and no(?) loss of efficiency.
>
> As much as many people in this topic would like to put equal
>  sign between attributes and dictionary's keys they are not the
>  same thing. AFAIK descriptor protocol works only with attributes,
>  right?

Correct.  It is looking very unlikely that you are going to get enough support for this change.  Perhaps you should look 
at different ways of spelling your identifiers?  Why can't you use an underscore instead of a hyphen?

--
~Ethan~
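
The point about the descriptor protocol can be checked with a short sketch. The names Upper and Form are hypothetical and exist only for this illustration. A descriptor placed on a class is invoked by attribute lookup (even under a non-identifier name reached through getattr), while the same object stored as a dict value is simply returned as-is.

```python
class Upper:
    """A tiny non-data descriptor (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Read the backing value from the instance and upper-case it.
        return instance.__dict__[self.name].upper()


class Form:
    # Found on the class during attribute lookup, so __get__ is called.
    title = Upper("_title")

    def __init__(self, title):
        self._title = title


form = Form("hello")
via_attribute = form.title              # triggers Upper.__get__

# A non-identifier name still goes through the descriptor protocol,
# as long as it is reached with getattr().
setattr(Form, "sub-title", Upper("_title"))
via_getattr = getattr(form, "sub-title")

# A plain dict does no such thing: the descriptor object comes back untouched.
plain = {"title": Upper("_title")}
via_key = plain["title"]
```

Here via_attribute and via_getattr are both the computed string, whereas via_key is the Upper instance itself.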


#61062

From	Piotr Dobrogost <p@google-groups-2013.dobrogost.net>
Date	2013-12-04 15:57 -0800
Message-ID	<5d99a76c-35eb-4c1d-bdb5-e4e1f6bea188@googlegroups.com>
In reply to	#61060

On Thursday, December 5, 2013 12:09:52 AM UTC+1, Ethan Furman wrote:
> Perhaps you should look 
> at different ways of spelling your identifiers?  Why can't you use an
> underscore instead of a hyphen?

So that underscore could be left for use inside fields' names?
However I think we could use some unique Unicode character for this instead of a hyphen, as long as Python allows any alphanumeric Unicode character inside identifiers, which I think it does...

Regards,
Piotr


#61064

From	Ethan Furman <ethan@stoneleaf.us>
Date	2013-12-04 16:26 -0800
Message-ID	<mailman.3595.1386204403.18130.python-list@python.org>
In reply to	#61062

On 12/04/2013 03:57 PM, Piotr Dobrogost wrote:
> On Thursday, December 5, 2013 12:09:52 AM UTC+1, Ethan Furman wrote:
>>
>> Perhaps you should look at different ways of spelling your identifiers?
>>  Why can't you use an underscore instead of a hyphen?
>
> So that underscore could be left for use inside fields' names?
> However I think we could use some unique Unicode character for this
>  instead hyphen as long as Python allows any alphanumeric Unicode
>  character inside identifiers which I think it does...

Yes, although I don't remember at which version that became true...

--
~Ethan~
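
For reference, non-ASCII identifiers were introduced in Python 3.0 by PEP 3131. A rough way to check whether a given field name could follow a dot is sketched below; usable_with_dot is a hypothetical helper, not part of any library, and the sample names are made up.

```python
import keyword


def usable_with_dot(name):
    """Return True if `name` could be written after a dot in normal syntax."""
    # str.isidentifier() applies the Unicode identifier rules of Python 3;
    # keywords such as "class" are identifiers lexically but still unusable.
    return name.isidentifier() and not keyword.iskeyword(name)


candidates = ["third_field", "third-field", "pole_żółte", "class", "2nd"]
report = {name: usable_with_dot(name) for name in candidates}
```

The hyphenated name, the keyword and the name starting with a digit are rejected; the underscore and non-ASCII spellings are accepted.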


#61067

From	Ned Batchelder <ned@nedbatchelder.com>
Date	2013-12-04 20:17 -0500
Message-ID	<mailman.3596.1386206281.18130.python-list@python.org>
In reply to	#61062

On 12/4/13 6:57 PM, Piotr Dobrogost wrote:
> On Thursday, December 5, 2013 12:09:52 AM UTC+1, Ethan Furman wrote:
>> Perhaps you should look
>> at different ways of spelling your identifiers?  Why can't you use an
>> underscore instead of a hyphen?
>
> So that underscore could be left for use inside fields' names?
> However I think we could use some unique Unicode character for this instead hyphen as long as Python allows any alphanumeric Unicode character inside identifiers which I think it does...
>
> Regards,
> Piotr
>

You object to typing [''] but you don't mind typing an unusual Unicode 
character?

--Ned.


#61072

From	Terry Reedy <tjreedy@udel.edu>
Date	2013-12-04 21:58 -0500
Message-ID	<mailman.3599.1386212319.18130.python-list@python.org>
In reply to	#61045

On 12/4/2013 3:46 PM, Mark Lawrence wrote:
> On 04/12/2013 20:35, Piotr Dobrogost wrote:
>> On Wednesday, December 4, 2013 2:06:44 AM UTC+1, Tim Chase wrote:
>>>
>>> I think random832 is saying that the designed purpose of setattr()
>>> was to dynamically set attributes by name, so they could later be
>>> accessed the traditional way; not designed from the ground-up to
>>> support non-identifier names.  But because of the getattr/setattr
>>> machinery (dict key/value pairs), it doesn't prevent you from having
>>> non-identifiers as names as long as you use only the getattr/setattr
>>> method of accessing them.
>>
>> Right. If there's already a way to have attributes with these
>> "non-standard" names

Fact.

>> (which is a good thing)

Opinion, not universally shared by developers, or 'good thing only as 
long as kept obscure'.

 >> then for uniformity with dot access to attributes with "standard" names

In a later post (after you wrote this) I explained that standard names 
are not always accessed with a dot, and that uniformity is impossible.

 >> there should be a variant of dot access allowing to access
 >> these "non-standard" named attributes, too.

More opinion. I am sure that I am not the only developer who disagrees.

> The obvious thing to do is to either raise this on python ideas, or if
> you're that confident about it raise an issue on the bug tracker with a
> patch, which would include changes to unit tests and documentation as
> well as code, get it reviewed and approved and Bob's your uncle, job
> done.

I think the latter would be foolish. Syntax changes have a high bar for 
acceptance. They should do more than save a few keystrokes. Use of new 
syntax makes code backward incompatible. New or changed Python modules 
can be backported (as long as they do not use new syntax ;-) either 
privately or publicly (on PyPI).

3.2 had no syntax changes; 3.3 one that I know of ('yield from'), which 
replaced about 15-20 *lines* of very tricky code;  3.4 has none that I 
can remember.

-- 
Terry Jan Reedy
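
As a reminder of what the 3.3 syntax addition mentioned above buys, here is a minimal sketch of 'yield from' (PEP 380). The generator names are hypothetical. The single delegating line forwards sent values to the inner generator and captures its return value, which is the part that was tricky to write by hand.

```python
def averager():
    """Inner generator: consume numbers until None is sent, then return the mean."""
    total = 0
    count = 0
    while True:
        value = yield
        if value is None:
            break
        total += value
        count += 1
    return total / count


def delegator(results):
    """Outer generator: delegate to averager() and record what it returns."""
    result = yield from averager()
    results.append(result)


results = []
gen = delegator(results)
next(gen)            # run to the first yield inside averager()
gen.send(10)
gen.send(20)
try:
    gen.send(None)   # averager() returns, then delegator() finishes
except StopIteration:
    pass
```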


#61073

From	Ethan Furman <ethan@stoneleaf.us>
Date	2013-12-04 19:05 -0800
Message-ID	<mailman.3600.1386212708.18130.python-list@python.org>
In reply to	#61045

On 12/04/2013 06:58 PM, Terry Reedy wrote:
> On 12/4/2013 3:46 PM, Mark Lawrence wrote:
>> On 04/12/2013 20:35, Piotr Dobrogost wrote:
>>>
>>> there should be a variant of dot access allowing to access
>>> these "non-standard" named attributes, too.
>
> More opinion. I am sure that I am not the only developer who disagrees.

+1


>> The obvious thing to do is to either raise this on python ideas, or if
>> you're that confident about it raise an issue on the bug tracker with a
>> patch, which would include changes to unit tests and documentation as
>> well as code, get it reviewed and approved and Bob's your uncle, job
>> done.
>
> I think the latter would be foolish. Syntax changes have a high bar for acceptance. They should do more than save a few
> keystrokes.

+1

--
~Ethan~


#61080

From	Steven D'Aprano <steve@pearwood.info>
Date	2013-12-05 07:56 +0000
Message-ID	<52a031a1$0$11112$c3e8da3@news.astraweb.com>
In reply to	#61045

On Wed, 04 Dec 2013 12:35:14 -0800, Piotr Dobrogost wrote:

> Right. If there's already a way to have attributes with these
> "non-standard" names (which is a good thing)

No it is not a good thing. It is a bad thing, and completely an accident 
of implementation that it works at all.

Python does not support names (variable names, method names, attribute 
names, module names etc.) which are not valid identifiers except by 
accident. The right way to handle non-identifier names is to use keys in 
a dictionary, which works for any legal string.

As you correctly say in another post:

"attribute is quite a different beast then key in a dictionary"

attributes are intended to be variables, not arbitrary keys. In some 
languages, they are even called "instance variables". As they are 
variables, they should be legal identifiers:

spam = 42  # legal identifier name
spam\n-ham\n = 42  # illegal identifier name

Sticking a dot in front of the name doesn't make it any different. 
Variables, and attributes, should be legal identifiers. If I remember 
correctly (and I may not), this issue has been raised with the Python-Dev 
core developers, including Guido, and their decision was:

- allowing non-identifier attribute names is an accident of 
implementation; 

- Python implementations are allowed to optimize __dict__ to prohibit non-
valid identifiers;

- but it's probably not worth doing in CPython.

getattr already enforces that the attribute name is a string rather than 
any arbitrary object.

You've also raised the issue of linking attribute names to descriptors. 
Descriptors are certainly a good reason to use attributes, but they're not a 
good reason for allowing non-identifier names. Instead of writing:

obj.'#$^%\n-\'."'

just use a legal identifier name! The above is an extreme example, but 
the principle applies to less extreme examples. It might be slightly 
annoying to write obj.foo_bar when you actually want obj.'foo.bar' or 
obj.'foo\nbar' or some other variation, but frankly, that's just too bad 
for you.

As far as descriptors go, you can implement descriptor-like functionality 
by overriding __getitem__. Here's a basic example:

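class MyDict(dict):
    def __getitem__(self, key):
        obj = super(MyDict, self).__getitem__(key)
        if hasattr(obj, '__get__'):
            # Treat the stored value as a descriptor bound to this dict.
            obj = obj.__get__(self, type(self))
        return obj  # without this, every lookup would return None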

which ought to be close to (but not identical to) the semantics of 
attribute descriptors.

While I can see that there is some benefit to allowing non-identifier 
attributes, I believe such benefit is very small, and not enough to 
justify the cost of allowing non-identifier attributes. If I wanted to 
program in a language where #$^%\n-\'." was a legal name for a variable, 
I'd program in Forth.

-- 
Steven
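
A self-contained variation on the descriptor-aware dictionary idea, with a usage example, might look like the sketch below. DescriptorDict and Doubled are hypothetical names, and the lookup on type(value) is an assumption chosen to mirror how attribute access finds __get__ on the type; it is not a claim about how the post's author would write it.

```python
class DescriptorDict(dict):
    """Hypothetical dict whose values may be descriptors."""

    def __getitem__(self, key):
        value = super().__getitem__(key)
        # Look for __get__ on the value's type, as attribute lookup does.
        getter = getattr(type(value), "__get__", None)
        if getter is not None:
            value = getter(value, self, type(self))
        return value


class Doubled:
    """Toy descriptor that reads another key from the mapping it lives in."""

    def __init__(self, source_key):
        self.source_key = source_key

    def __get__(self, instance, owner=None):
        return instance[self.source_key] * 2


fields = DescriptorDict()
fields["foo-bar"] = 21                      # any string works as a key
fields["foo.bar doubled"] = Doubled("foo-bar")
computed = fields["foo.bar doubled"]        # goes through Doubled.__get__
```

Only subscript access is covered here; dict.get(), iteration over values() and similar methods bypass __getitem__ and would return the raw descriptor.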


#60990

From	Tim Roberts <timr@probo.com>
Date	2013-12-03 21:45 -0800
Message-ID	<17gt99hg615jfm7bdid26185884d2pfdkf@4ax.com>
In reply to	#60972

Piotr Dobrogost <p@google-groups-2013.dobrogost.net> wrote:
>
>Attribute access syntax being very concise is very often preferred 
>to dict's interface. 

It is not "very concise".  It is slightly more concise.

    x = obj.value1
    x = dct['value1']

You have saved 3 keystrokes.  That is not a significant enough savings to
create new syntax.  Remember the Python philosophy that there ought to be
one way to do it.
-- 
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.


#60991

From	rusi <rustompmody@gmail.com>
Date	2013-12-03 22:31 -0800
Message-ID	<080d6a56-588b-425f-8968-8f77bc330427@googlegroups.com>
In reply to	#60990

On Wednesday, December 4, 2013 11:15:05 AM UTC+5:30, Tim Roberts wrote:
> Piotr Dobrogost  wrote:
> >
> >Attribute access syntax being very concise is very often preferred 
> >to dict's interface. 
>
> It is not "very concise".  It is slightly more concise.
>
>     x = obj.value1
>     x = dct['value1']
>
> You have saved 3 keystrokes.  That is not a significant enough savings to
> create new syntax.  Remember the Python philosophy that there ought to be
> one way to do it.

It's a more fundamental problem than that:
It emerges (from the OP's second post) that he wants '-' in the attributes.
Is that all?

Where does this syntax-enlargement stop? Spaces? Newlines?


#60992

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2013-12-04 01:57 -0700
Message-ID	<mailman.3546.1386147492.18130.python-list@python.org>
In reply to	#60991

On Tue, Dec 3, 2013 at 11:31 PM, rusi <rustompmody@gmail.com> wrote:
> Its a more fundamental problem than that:
> It emerges from the OP's second post) that he wants '-' in the attributes.
> Is that all?
>
> Where does this syntax-enlargement stop? Spaces? Newlines?

At non-strings.

>>> setattr(foo, 21+21, 42)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: attribute name must be string, not 'int'
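
The boundary described above can be seen from both sides in a small sketch (Record is a hypothetical empty class). Any string is accepted as an attribute name by setattr() and ends up as a key in the instance dictionary, while a non-string name is rejected.

```python
class Record:
    """Empty class used only as an attribute container."""


record = Record()

# A string that is not a valid identifier is accepted...
setattr(record, "outer-inner", 42)
value = getattr(record, "outer-inner")
stored_in_dict = vars(record)["outer-inner"]

# ...but a non-string attribute name is not.
error = None
try:
    setattr(record, 21 + 21, 42)
except TypeError as exc:
    error = str(exc)
```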


#61004

From	rusi <rustompmody@gmail.com>
Date	2013-12-04 02:09 -0800
Message-ID	<549180f1-fb98-4b59-b92f-5beceb1a6fb5@googlegroups.com>
In reply to	#60992

On Wednesday, December 4, 2013 2:27:28 PM UTC+5:30, Ian wrote:
> On Tue, Dec 3, 2013 at 11:31 PM, rusi  wrote:
> > Its a more fundamental problem than that:
> > It emerges from the OP's second post) that he wants '-' in the attributes.
> > Is that all?
> >
> > Where does this syntax-enlargement stop? Spaces? Newlines?
>
> At non-strings.
>
> >>> setattr(foo, 21+21, 42)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: attribute name must be string, not 'int'

Not sure what's your point. 

OP wants attribute identifiers like outer_fieldset-inner_fieldset-third_field.
Say I have a python expression: 
obj.outer_fieldset-inner_fieldset-third_field

It can (in the proposed extension) be parsed as above, or as:
obj.outer_fieldset - inner_fieldset-third_field
the first hyphen being minus and the second being part of the identifier.

How do we decide which '-' are valid identifier components -- hyphens
and which minus-signs?

So to state my point differently:
The grammar of python is well-defined
It has a 'sub-grammar' of strings that is completely* free-for-all ie just
about anything can be put into a string literal.
The border between the orderly and the wild world is the quote-marks.
Remove that border and you get complete grammatical chaos.
[Maybe I should have qualified my reference to 'spaces'.
Algol-68 allowed spaces in identifiers (for readability!!)
The result was chaos]

I used the spaces case to indicate the limit of chaos. Other characters (that
already have uses) are just as problematic.

* Oh well there are some restrictions like quotes need to be escaped, no 
  newlines etc etc -- minor enough to be ignored.
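
How the current lexer settles the hyphen question can be inspected with the standard tokenize module. In this sketch token_pairs is a hypothetical helper; it shows that a hyphen outside quotes always becomes an operator token, while the same characters inside quotes stay within one string token.

```python
import io
import tokenize


def token_pairs(source):
    """Return (token type name, text) pairs, skipping layout-only tokens."""
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER}
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type not in skip
    ]


unquoted = token_pairs("obj.outer_fieldset-inner_fieldset\n")
quoted = token_pairs("'outer_fieldset-inner_fieldset'\n")
```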


#61006

From	Antoon Pardon <antoon.pardon@rece.vub.ac.be>
Date	2013-12-04 11:29 +0100
Message-ID	<mailman.3552.1386152954.18130.python-list@python.org>
In reply to	#61004

Op 04-12-13 11:09, rusi schreef:
> On Wednesday, December 4, 2013 2:27:28 PM UTC+5:30, Ian wrote:
>> On Tue, Dec 3, 2013 at 11:31 PM, rusi  wrote:
>>> Its a more fundamental problem than that:
>>> It emerges from the OP's second post) that he wants '-' in the attributes.
>>> Is that all?
>>>
>>> Where does this syntax-enlargement stop? Spaces? Newlines?
>>
>> At non-strings.
>>
>>>>> setattr(foo, 21+21, 42)
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> TypeError: attribute name must be string, not 'int'
> 
> Not sure what's your point. 
> 
> OP wants attribute identifiers like outer_fieldset-inner_fieldset-third_field.
> Say I have a python expression: 
> obj.outer_fieldset-inner_fieldset-third_field
> 
> It can (in the proposed extension) be parsed as above, or as:
> obj.outer_fieldset - inner_fieldset-third_field
> the first hyphen being minus and the second being part of the identifier.
> 
> How do we decide which '-' are valid identifier components -- hyphens
> and which minus-signs?
> 
> So to state my point differently:
> The grammar of python is well-defined
> It has a 'sub-grammar' of strings that is completely* free-for-all ie just
> about anything can be put into a string literal.
> The border between the orderly and the wild world are the quote-marks.
> Remove that border and you get complete grammatical chaos.
> [Maybe I should have qualified my reference to 'spaces'.
> Algol-68 allowed spaces in identifiers (for readability!!)
> The result was chaos]
> 
> I used the spaces case to indicate the limit of chaos. Other characters (that
> already have uses) are just as problematic.

I don't agree with the latter. As it is now python can make the
distinction between

from A import B    and     fromAimportB.

I see no a priori reason why this should be limited to letters. A
language designer might choose to allow a bigger set of characters
in identifiers like '-', '+' and others. In that case a-b would be
an identifier and a - b would be the operation. Just as in python
fromAimportB is an identifier and from A import B is an import
statement.

-- 
Antoon Pardon
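
The distinction between the two spellings can be confirmed with the standard ast module; this is only a sketch of what today's parser does, using the exact strings from the post.

```python
import ast

# With spaces, the keywords are recognised and an import statement results.
statement = ast.parse("from A import B").body[0]

# Without spaces, the whole run of letters is a single identifier.
expression = ast.parse("fromAimportB").body[0]
```

statement is an ast.ImportFrom node, while expression is an expression statement wrapping a single ast.Name.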


#61015

From	rusi <rustompmody@gmail.com>
Date	2013-12-04 04:01 -0800
Message-ID	<68a2d20a-793f-4493-b856-c6c65617eb0d@googlegroups.com>
In reply to	#61006

On Wednesday, December 4, 2013 3:59:06 PM UTC+5:30, Antoon Pardon wrote:
> Op 04-12-13 11:09, rusi schreef:
> > I used the spaces case to indicate the limit of chaos. 
> > Other characters (that
> > already have uses) are just as problematic.
>
> I don't agree with the latter. As it is now python can make the
> distinction between
>
> from A import B    and     fromAimportB.
>
> I see no a priori reason why this should be limited to letters. A
> language designer might choose to allow a bigger set of characters
> in identifiers like '-', '+' and others. In that case a-b would be
> an identifier and a - b would be the operation. Just as in python
> fromAimportB is an identifier and from A import B is an import
> statement.

I'm not sure what you are saying.
Sure a language designer can design a language differently from python.
I mentioned lisp. Cobol is another behaving exactly as you describe.

My point is that when you do (something like) that, you will need to change the
lexical and grammatical structure of the language.  And this will make 
for rather far-reaching changes ALL OVER the language not just in what-follows-dot.

IOW: I don't agree that we have a disagreement :-)


#61016

From	Antoon Pardon <antoon.pardon@rece.vub.ac.be>
Date	2013-12-04 13:32 +0100
Message-ID	<mailman.3560.1386160340.18130.python-list@python.org>
In reply to	#61015

Op 04-12-13 13:01, rusi schreef:
> On Wednesday, December 4, 2013 3:59:06 PM UTC+5:30, Antoon Pardon wrote:
>> Op 04-12-13 11:09, rusi schreef:
>>> I used the spaces case to indicate the limit of chaos. 
>>> Other characters (that
>>> already have uses) are just as problematic.
>>
>> I don't agree with the latter. As it is now python can make the
>> distinction between
>>
>> from A import B    and     fromAimportB.
>>
>> I see no a priori reason why this should be limited to letters. A
>> language designer might choose to allow a bigger set of characters
>> in identifiers like '-', '+' and others. In that case a-b would be
>> an identifier and a - b would be the operation. Just as in python
>> fromAimportB is an identifier and from A import B is an import
>> statement.
> 
> Im not sure what you are saying.
> Sure a language designer can design a language differently from python.
> I mentioned lisp. Cobol is another behaving exactly as you describe.
> 
> My point is that when you do (something like) that, you will need to change the
> lexical and grammatical structure of the language.  And this will make 
> for rather far-reaching changes ALL OVER the language not just in what-follows-dot.

No you don't need to change the lexical and grammatical structure of
the language. Changing the characters allowed in identifiers is not a
change in lexical structure. The only difference in lexical structuring
would be that '-', '>=' and other similar symbols would have to be
treated like keywords such as 'from', 'as' etc. instead of being
recognizable just by being present.

And the grammatical structure of the language wouldn't change at all.
Sure a-b would now be an identifier and not an operation but that is
of no concern for the parser.

People would have to be careful to insert spaces around operators
and that might make the language somewhat error prone but that doesn't
mean the syntactical structure is different.

-- 
Antoon Pardon


#61017

From	rusi <rustompmody@gmail.com>
Date	2013-12-04 05:02 -0800
Message-ID	<a1938295-93bf-4d25-9ab6-9fac211b83eb@googlegroups.com>
In reply to	#61016

On Wednesday, December 4, 2013 6:02:18 PM UTC+5:30, Antoon Pardon wrote:
> Op 04-12-13 13:01, rusi schreef:
> > On Wednesday, December 4, 2013 3:59:06 PM UTC+5:30, Antoon Pardon wrote:
> >> Op 04-12-13 11:09, rusi schreef:
> >>> I used the spaces case to indicate the limit of chaos. 
> >>> Other characters (that
> >>> already have uses) are just as problematic.
> >>
> >> I don't agree with the latter. As it is now python can make the
> >> distinction between
> >>
> >> from A import B    and     fromAimportB.
> >>
> >> I see no a priori reason why this should be limited to letters. A
> >> language designer might choose to allow a bigger set of characters
> >> in identifiers like '-', '+' and others. In that case a-b would be
> >> an identifier and a - b would be the operation. Just as in python
> >> fromAimportB is an identifier and from A import B is an import
> >> statement.
> > 
> > Im not sure what you are saying.
> > Sure a language designer can design a language differently from python.
> > I mentioned lisp. Cobol is another behaving exactly as you describe.
> > 
> > My point is that when you do (something like) that, you will need to change the
> > lexical and grammatical structure of the language.  And this will make 
> > for rather far-reaching changes ALL OVER the language not just in what-follows-dot.
>
> No you don't need to change the lexical and grammatical structure of
> the language. Changing the characters allowed in identifiers, is not a
> change in lexical structure. The only difference in lexical structuring
> would be that '-', '>=' and other similars symbols would have to be
> treated like keyword like 'from', 'as' etc instead of being recognizable
> by just being present.

Well I am mystified…
Consider the string a-b in a program text.
A Cobol or Lisp system sees this as one identifier.
Python, C (and most modern languages) see this as ident, operator, ident.

As I understand it this IS the lexical structure of the language and the lexer
is the part that implements this:
- in cobol/lisp keeping it as one
- in python/C breaking it into 3

Maybe you understand in some other way the phrase "lexical structure"?

> And the grammatical structure of the language wouldn't change at all.
> Sure a-b would now be an identifier and not an operation but that is
> of no concern for the parser.

About grammar maybe what you are saying will hold: presumably if the token-set
is the same, one could keep the same grammar, with the differences being 
entirely inter-lexeme ones.


#61043

From	Antoon Pardon <antoon.pardon@rece.vub.ac.be>
Date	2013-12-04 20:57 +0100
Message-ID	<mailman.3582.1386187102.18130.python-list@python.org>
In reply to	#61017

Op 04-12-13 14:02, rusi schreef:
> On Wednesday, December 4, 2013 6:02:18 PM UTC+5:30, Antoon Pardon wrote:
>> Op 04-12-13 13:01, rusi schreef:
>>> On Wednesday, December 4, 2013 3:59:06 PM UTC+5:30, Antoon Pardon wrote:
>>>> Op 04-12-13 11:09, rusi schreef:
>>>>> I used the spaces case to indicate the limit of chaos. 
>>>>> Other characters (that
>>>>> already have uses) are just as problematic.
>>>>
>>>> I don't agree with the latter. As it is now python can make the
>>>> distinction between
>>>>
>>>> from A import B    and     fromAimportB.
>>>>
>>>> I see no a priori reason why this should be limited to letters. A
>>>> language designer might choose to allow a bigger set of characters
>>>> in identifiers like '-', '+' and others. In that case a-b would be
>>>> an identifier and a - b would be the operation. Just as in python
>>>> fromAimportB is an identifier and from A import B is an import
>>>> statement.
>>>
>>> Im not sure what you are saying.
>>> Sure a language designer can design a language differently from python.
>>> I mentioned lisp. Cobol is another behaving exactly as you describe.
>>>
>>> My point is that when you do (something like) that, you will need to change the
>>> lexical and grammatical structure of the language.  And this will make 
>>> for rather far-reaching changes ALL OVER the language not just in what-follows-dot.
>>
>> No you don't need to change the lexical and grammatical structure of
>> the language. Changing the characters allowed in identifiers, is not a
>> change in lexical structure. The only difference in lexical structuring
>> would be that '-', '>=' and other similars symbols would have to be
>> treated like keyword like 'from', 'as' etc instead of being recognizable
>> by just being present.
> 
> Well I am mystified…
> Consider the string a-b in a program text.
> A Cobol or Lisp system sees this as one identifier.
> Python, C (and most modern languages) see this ident, operator, ident.
> 
> As I understand it this IS the lexical structure of the language and the lexer
> is the part that implements this:
> - in cobol/lisp keeping it as one
> - in python/C breaking it into 3
> 
> Maybe you understand in some other way the phrase "lexical structure"?

Yes I do. The fact that a certain string is lexically evaluated differently
is IMO not enough to conclude the language has a different lexical structure.
It only means that the values allowed within the structure are different. What
I see here is that some languages have another alphabet over which identifiers
are allowed.

>> And the grammatical structure of the language wouldn't change at all.
>> Sure a-b would now be an identifier and not an operation but that is
>> of no concern for the parser.
> 
> About grammar maybe what you are saying will hold: presumably if the token-set
> is the same, one could keep the same grammar, with the differences being 
> entirely inter-lexeme ones.

And the question is: if the token-set is the same, how then is the lexical
structure different, rather than just the possible values associated with the tokens?

-- 
Antoon Pardon
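
The disagreement above can be made concrete with a toy lexer. This is a sketch under stated assumptions, not Python's real tokenizer: make_lexer and both identifier patterns are hypothetical. The token kinds and the lexer code stay the same; only the alphabet allowed inside identifiers differs, and that alone changes how a-b is split.

```python
import re


def make_lexer(ident_pattern):
    """Build a toy lexer from a regex describing one identifier."""
    token_re = re.compile(
        rf"\s*(?:(?P<ident>{ident_pattern})|(?P<op>[-+*/]))"
    )

    def lex(text):
        text = text.rstrip()
        tokens, pos = [], 0
        while pos < len(text):
            match = token_re.match(text, pos)
            if match is None:
                raise SyntaxError(f"cannot tokenize at offset {pos}")
            kind = match.lastgroup          # "ident" or "op"
            tokens.append((kind, match.group(kind)))
            pos = match.end()
        return tokens

    return lex


# Identifiers as in Python or C (ASCII only, for brevity).
python_like = make_lexer(r"[A-Za-z_][A-Za-z0-9_]*")
# Identifiers that may contain interior hyphens, roughly as in Cobol or Lisp.
hyphen_friendly = make_lexer(r"[A-Za-z_](?:-?[A-Za-z0-9_])*")
```

With the second alphabet, spaces around the minus sign become significant, which is the error-proneness mentioned in the thread.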


#61007

From	Chris Angelico <rosuav@gmail.com>
Date	2013-12-04 21:33 +1100
Message-ID	<mailman.3553.1386153202.18130.python-list@python.org>
In reply to	#61004

On Wed, Dec 4, 2013 at 9:09 PM, rusi <rustompmody@gmail.com> wrote:
> OP wants attribute identifiers like outer_fieldset-inner_fieldset-third_field.
> Say I have a python expression:
> obj.outer_fieldset-inner_fieldset-third_field

I don't think so. What the OP asked for was:

my_object.'valid-attribute-name-but-not-valid-identifier'

Or describing it another way: A literal string instead of a token.
This is conceivable, at least, but I don't think it gives any
advantage over a dictionary.

What you could do, though, is create a single object that can be
indexed either with dot notation or as a dictionary. For that to work,
there'd have to be some restrictions (eg no leading underscores - at
very least, __token__ should be special still), but it wouldn't be
hard to do - two magic methods and the job's done, I think; you might
even be able to manage on one. (Code golf challenge, anyone?) Of
course, there's still the question of whether that even is an
advantage.

ChrisA
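
One possible shape for the object described above, indexable either with a dot or as a dictionary, is sketched here. AttrDict is a hypothetical name, and routing names with a leading underscore to ordinary attributes is an assumption taken from the restriction suggested in the post.

```python
class AttrDict(dict):
    """Hypothetical mapping whose public keys are also reachable with a dot."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        if name.startswith("_"):
            # Keep underscore names as real attributes, not keys.
            super().__setattr__(name, value)
        else:
            self[name] = value


config = AttrDict()
config.timeout = 30             # stored as a key
config["retry-count"] = 3       # non-identifier key, bracket access only
```

A known limitation of this design: keys that collide with dict method names (keys, items and so on) are shadowed by the methods on dot access and remain reachable only with brackets.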


#61012

From	rusi <rustompmody@gmail.com>
Date	2013-12-04 03:27 -0800
Message-ID	<41f1543d-edb7-4e8d-bb47-f23d0440f180@googlegroups.com>
In reply to	#61007

On Wednesday, December 4, 2013 4:03:14 PM UTC+5:30, Chris Angelico wrote:
> On Wed, Dec 4, 2013 at 9:09 PM, rusi  wrote:
> > OP wants attribute identifiers like outer_fieldset-inner_fieldset-third_field.
> > Say I have a python expression:
> > obj.outer_fieldset-inner_fieldset-third_field
>
> I don't think so. What the OP asked for was:
>
> my_object.'valid-attribute-name-but-not-valid-identifier'
>
> Or describing it another way: A literal string instead of a token.

This is just pushing the issue one remove away.
Firstly a literal string is very much a token -- lexically.
Now consider the syntax as defined by the grammar.

Let Ident = Set of strings* that are valid python identifiers -- 
something like [a-zA-Z_][a-zA-Z0-9_]*

Let Exp = Set of strings* that are python expressions

* Note that I am using string from the language implementer's pov, not the
language user's, ie the python identifier var is the implementer's string "var"
whereas the python string literal "var" is the implementer's string "\"var\""

Now clearly Ident is a proper subset of Exp.

Now what is the proposal?
You want to extend the syntactically allowable a.b set.
If the b's can be any arbitrary expression we can have
var.fld(1,2) with the grammatical ambiguity that this can be
(var.fld)(1,2)   -- the usual interpretation
Or
var.(fld(1,2)) -- the new interpretation -- ie a computed field name.

OTOH if you say superset of Ident but subset of Exp, then we have to determine
what this new limbo set is to be. ie what is the grammatical category of 
'what-follows-a-dot' ??

Some other-language notes:
1. In C there is one case somewhat like this:
#include "string"
the "string" cannot be an arbitrary expression as the rest of C.  But then this 
is not really C but the C preprocessor

2. In lisp the Ident set is way more permissive than in most languages -- 
allowing operators etc that would be delimiters in most languages.
If one wants to go even beyond that and include say spaces and parentheses -- 
almost the only delimiters that lisp has -- one must write |ident with spaces|
ie for identifiers the bars behave somewhat like strings' quote marks.
Because the semantics of identifiers and strings are different -- the lexical
structures need to reflect that difference -- so you cannot replace the bars
by quotes.
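
The "usual interpretation" of var.fld(1,2) can be confirmed with the standard ast module, and the computed-field-name reading already has an unambiguous spelling through getattr(). The sketch below only inspects syntax trees; var and fld are the placeholder names from the post and are never evaluated.

```python
import ast

# var.fld(1, 2) parses as a call whose callee is the attribute var.fld.
usual = ast.parse("var.fld(1, 2)", mode="eval").body

# The "computed field name" reading has to be written out explicitly today.
computed = ast.parse("getattr(var, fld(1, 2))", mode="eval").body
```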


#61011

From	Tim Chase <python.list@tim.thechases.com>
Date	2013-12-04 05:25 -0600
Message-ID	<mailman.3557.1386156238.18130.python-list@python.org>
In reply to	#61004

On 2013-12-04 21:33, Chris Angelico wrote:
> I don't think so. What the OP asked for was:
> 
> my_object.'valid-attribute-name-but-not-valid-identifier'
> 
> Or describing it another way: A literal string instead of a token.
> This is conceivable, at least, but I don't think it gives any
> advantage over a dictionary.

In both cases (attribute-access-as-dict-functionality and
attribute-access-as-avoiding-setattr), forcing a literal actually
diminishes Python's power.  I like the ability to do

  a[key.strip().lower()] = some_value
  setattr(thing, key.strip().lower(), some_value)

which can't be done (?) with mere literal notation.  What would they
look like?

  a.(key.strip().lower()) = some_value

(note that "key.strip().lower()" is not actually a "literal" that
ast.literal_eval would accept). That's pretty ugly, IMHO :-)

-tkc


#61013

From	Jussi Piitulainen <jpiitula@ling.helsinki.fi>
Date	2013-12-04 13:30 +0200
Message-ID	<qotzjogonup.fsf@ruuvi.it.helsinki.fi>
In reply to	#61004

rusi writes:
> On Wednesday, December 4, 2013 2:27:28 PM UTC+5:30, Ian wrote:
> > On Tue, Dec 3, 2013 at 11:31 PM, rusi  wrote:
> > > Its a more fundamental problem than that:
> > > It emerges from the OP's second post) that he wants '-' in the
> > > attributes.  Is that all?
> > >
> > > Where does this syntax-enlargement stop? Spaces? Newlines?
> >
> > At non-strings.
> >
> > >>> setattr(foo, 21+21, 42)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > TypeError: attribute name must be string, not 'int'
> 
> Not sure what's your point. 
> 
> OP wants attribute identifiers like
> outer_fieldset-inner_fieldset-third_field.
> Say I have a python expression: 
> obj.outer_fieldset-inner_fieldset-third_field
> 
> It can (in the proposed extension) be parsed as above, or as:
> obj.outer_fieldset - inner_fieldset-third_field
> the first hyphen being minus and the second being part of the
> identifier.
> 
> How do we decide which '-' are valid identifier components --
> hyphens and which minus-signs?

I think the OP might be after the JavaScript mechanism where an
attribute name can be any string, the indexing brackets are always
available, and the dot notation is available when the attribute name
looks like a simple identifier. That could be made to work. (I'm not
saying should, or should not. Just that it seems technically simple.)

Hm. Can't specific classes be made to behave this way even now by
implementing suitable underscored methods?
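
A rough answer to the closing question: yes, for a specific class. The sketch below assumes a hypothetical base class, JSObject, whose indexing brackets simply forward to getattr() and setattr(), so every attribute is reachable with brackets and those with identifier-like names are also reachable with a dot.

```python
class JSObject:
    """Hypothetical base class: obj[name] and obj.name reach the same attribute."""

    def __getitem__(self, name):
        try:
            return getattr(self, name)
        except AttributeError:
            # Report a missing name the way a mapping would.
            raise KeyError(name) from None

    def __setitem__(self, name, value):
        setattr(self, name, value)


field = JSObject()
field["outer-inner"] = "hyphenated"    # brackets accept any string
field.label = "plain"                  # dot works for identifier-like names
```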
