| Started by | moa47401@gmail.com |
|---|---|
| First post | 2016-05-02 14:30 -0700 |
| Last post | 2016-05-04 09:40 +1000 |
| Articles | 17 — 8 participants |
Need help understanding list structure moa47401@gmail.com - 2016-05-02 14:30 -0700
Re: Need help understanding list structure Erik <python@lucidity.plus.com> - 2016-05-02 22:48 +0100
Re: Need help understanding list structure moa47401@gmail.com - 2016-05-02 15:33 -0700
Re: Need help understanding list structure Michael Torrie <torriem@gmail.com> - 2016-05-02 17:25 -0600
Re: Need help understanding list structure Ben Finney <ben+python@benfinney.id.au> - 2016-05-03 09:43 +1000
Re: Need help understanding list structure moa47401@gmail.com - 2016-05-03 06:21 -0700
Re: Need help understanding list structure Chris Angelico <rosuav@gmail.com> - 2016-05-03 23:47 +1000
Re: Need help understanding list structure moa47401@gmail.com - 2016-05-03 09:01 -0700
RE: Need help understanding list structure Dan Strohl <D.Strohl@F5.com> - 2016-05-03 16:52 +0000
Re: Need help understanding list structure moa47401@gmail.com - 2016-05-03 10:31 -0700
RE: Need help understanding list structure Dan Strohl <D.Strohl@F5.com> - 2016-05-03 17:54 +0000
Use __repr__ to show the programmer's representation (was: Need help understanding list structure) Ben Finney <ben+python@benfinney.id.au> - 2016-05-04 04:14 +1000
RE: Use __repr__ to show the programmer's representation (was: Need help understanding list structure) Dan Strohl <D.Strohl@F5.com> - 2016-05-03 18:35 +0000
Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure) moa47401@gmail.com - 2016-05-03 12:24 -0700
Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure) Random832 <random832@fastmail.com> - 2016-05-03 15:37 -0400
Re: Need help understanding list structure MRAB <python@mrabarnett.plus.com> - 2016-05-03 20:57 +0100
Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure) Chris Angelico <rosuav@gmail.com> - 2016-05-04 09:40 +1000
| From | moa47401@gmail.com |
|---|---|
| Date | 2016-05-02 14:30 -0700 |
| Subject | Need help understanding list structure |
| Message-ID | <d6b0c524-f020-4726-b91a-517b2e8024ea@googlegroups.com> |
I've been using an old text parsing library and have been able to accomplish most of what I wanted to do. But I don't understand the list structure it uses well enough to build additional methods.

If I print the list, it has thousands of elements within its brackets separated by commas as I would expect. But the elements appear to be memory pointers not the actual text. Here's an example:

<gedcom.Element object at 0x018BDE10>

If I iterate over the list, I do get the actual text of each element and am able to use it.

Also, if I iterate over the list and place each element in a new list using append, then each element in the new list is the text I expect not memory pointers.

But... if I copy the old list to a new list using new = old[:] or new = list(old) the new list is exactly like the original with memory pointers.

Can someone help me understand why or under what circumstances a list shows pointers instead of the text data?
| From | Erik <python@lucidity.plus.com> |
|---|---|
| Date | 2016-05-02 22:48 +0100 |
| Message-ID | <mailman.329.1462225716.32212.python-list@python.org> |
| In reply to | #108031 |
On 02/05/16 22:30, moa47401@gmail.com wrote:
> Can someone help me understand why or under what circumstances a list
> shows pointers instead of the text data?

When Python's "print" statement/function is invoked, it will print the textual representation of the object according to its class's __str__ or __repr__ method. That is, the print function prints out whatever text the class says it should.

For classes which don't implement a __str__ or __repr__ method, then the text "<CLASS object at ADDRESS>" is used - where CLASS is the class name and ADDRESS is the "memory pointer".

> If I iterate over the list, I do get the actual text of each element
> and am able to use it.
>
> Also, if I iterate over the list and place each element in a new list
> using append, then each element in the new list is the text I expect
> not memory pointers.

Look at the __iter__ method of the class of the object you are iterating over. I suspect that it returns string objects, not the objects that are in the list itself.

String objects have a __str__ or __repr__ method that represents them as the text, so that is what 'print' will output.

Hope that helps, E.
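As a rough illustration of the default text described above, here is a minimal sketch. The classes Plain and Labelled are made up for this example and are not part of the gedcom library.

```python
class Plain:
    """Hypothetical class that defines neither __str__ nor __repr__."""


class Labelled:
    """Hypothetical class that supplies its own text via __str__."""

    def __str__(self):
        return "0 HEAD"


plain = Plain()
labelled = Labelled()

default_text = str(plain)     # falls through to the default from object
custom_text = str(labelled)   # uses Labelled.__str__

print(default_text)   # something like <__main__.Plain object at 0x...>
print(custom_text)    # 0 HEAD
```

The exact address differs from run to run, so only the general shape of the first line should be relied on.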
| From | moa47401@gmail.com |
|---|---|
| Date | 2016-05-02 15:33 -0700 |
| Message-ID | <ab5d3884-77ed-45d4-86ab-0333bf67dca1@googlegroups.com> |
| In reply to | #108032 |
> When Python's "print" statement/function is invoked, it will print the
> textual representation of the object according to its class's __str__ or
> __repr__ method. That is, the print function prints out whatever text
> the class says it should.
>
> For classes which don't implement a __str__ or __repr__ method, then
> the text "<CLASS object at ADDRESS>" is used - where CLASS is the class
> name and ADDRESS is the "memory pointer".
>
> > If I iterate over the list, I do get the actual text of each element
> > and am able to use it.
> >
> > Also, if I iterate over the list and place each element in a new list
> > using append, then each element in the new list is the text I expect
> > not memory pointers.
>
> Look at the __iter__ method of the class of the object you are iterating
> over. I suspect that it returns string objects, not the objects that are
> in the list itself.
>
> String objects have a __str__ or __repr__ method that represents them as
> the text, so that is what 'print' will output.
>
> Hope that helps, E.

Yes, that does help. You're right. The author of the library I'm using didn't implement either a __str__ or __repr__ method. Am I correct in assuming that parsing a large text file would be quicker returning pointers instead of strings? I've never run into this before.
| From | Michael Torrie <torriem@gmail.com> |
|---|---|
| Date | 2016-05-02 17:25 -0600 |
| Message-ID | <mailman.331.1462231551.32212.python-list@python.org> |
| In reply to | #108033 |
On 05/02/2016 04:33 PM, moa47401@gmail.com wrote:
> Yes, that does help. You're right. The author of the library I'm
> using didn't implement either a __str__ or __repr__ method. Am I
> correct in assuming that parsing a large text file would be quicker
> returning pointers instead of strings? I've never run into this
> before.

I'm not sure what you mean by "returning pointers." The list isn't returning pointers. It's a list of *objects*. To be specific, a list of gedcom.Element objects, though they could be anything, including numbers or strings. If you refer to the source code where the Element class is defined you can see what these objects contain. I suspect they contain a lot more information than simply text.

Lists of objects are a common idiom in Python. As you've discovered, if you shallow copy a list, the new list will contain the exact same objects. In many cases, this does not matter. Take, for example, a list of numbers, which are immutable or unchangeable objects: it doesn't matter that the instances are shared, since the instances themselves will never change. If the objects are mutable, as they are in your case, a shallow copy may not always be what you want.

As to your question. A list never shows "pointers" as you say. A list always contains objects, and if you simply "print" the list, it will try to show a representation of the list, using the objects' repr dunder methods. Some classes I have used have their repr methods print out what the constructor would look like, if you were to construct the object yourself. This is very useful. If I recall, this is what BeautifulSoup objects do, which is incredibly useful.

In your case, as Erik said, the objects you are dealing with don't provide repr dunder methods, so Python just lets you know they are objects of a certain class, and what their ids are, which is helpful if you're trying to determine if two objects are the same object. These are not "pointers" in the sense you're talking about.

You'll get text if the object prints text for you. This is true of any object you might store in the list. I hope this helps a bit. Exploring from the interactive prompt as you are doing is very useful, once you understand what it's saying to you.
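The shallow-copy behaviour described here can be sketched as follows. The Element class below is a hypothetical stand-in with a single text attribute, not the real gedcom.Element.

```python
class Element:
    """Hypothetical stand-in for gedcom.Element; holds one line of text."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


old = [Element("0 HEAD"), Element("0 TRLR")]
new = old[:]   # shallow copy: a new list holding the very same objects

same_objects = all(a is b for a, b in zip(old, new))
texts = [str(e) for e in old]   # a separate list of plain strings

print(new is old)      # False
print(same_objects)    # True
print(texts)           # ['0 HEAD', '0 TRLR']
```

Printing new shows the same default representations as printing old, because both lists refer to the same Element objects. Only a list built from str(e), like texts, displays as plain text.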
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2016-05-03 09:43 +1000 |
| Message-ID | <mailman.332.1462232638.32212.python-list@python.org> |
| In reply to | #108033 |
moa47401@gmail.com writes:
> Am I correct in assuming that parsing a large text file would be
> quicker returning pointers instead of strings?

What do you mean by “return a pointer”? Python doesn't have pointers.

In the Python language, a container type (such as ‘set’, ‘list’, ‘dict’, etc.) contains the objects directly. There are no “pointers” there; by accessing the items of a container, you access the items directly.

What do you mean by “would be quicker”? I am concerned you are seeking speed of the program at the expense of understandability and clarity of the code.

Instead, you should be writing clear, maintainable code. *Only if* the clear, maintainable code you write then actually ends up being too slow, should you then worry about what parts are quick or slow by *measuring* the specific parts of code to discover what is actually occupying the time.

--
\ “All television is educational television. The question is: |
`\ what is it teaching?” —Nicholas Johnson |
_o__) |
Ben Finney
| From | moa47401@gmail.com |
|---|---|
| Date | 2016-05-03 06:21 -0700 |
| Message-ID | <b93a85c8-042a-4c1f-800e-68a02dac3b78@googlegroups.com> |
| In reply to | #108036 |
Thanks for the replies. I definitely need a better understanding of "<CLASS object at ADDRESS>" when using Python objects. So far no luck with web searches or my Python books. Could someone point (no pun intended) me to a good resource? Not that it matters, but the reason I got off track is there are pointers within my data that point to other pieces of the data and have nothing to do with Python.
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2016-05-03 23:47 +1000 |
| Message-ID | <mailman.346.1462283241.32212.python-list@python.org> |
| In reply to | #108069 |
On Tue, May 3, 2016 at 11:21 PM, <moa47401@gmail.com> wrote:
> Thanks for the replies. I definitely need a better understanding of "<CLASS object at ADDRESS>" when using Python objects. So far no luck with web searches or my Python books. Could someone point (no pun intended) me to a good resource?
>
> Not that it matters, but the reason I got off track is there are pointers within my data that point to other pieces of the data and have nothing to do with Python.

What you're seeing there is the default object representation. It isn't actually quoting an address but an "identity number". The only meaning of that number is that it's unique [1]; so if you see the same number come up twice in the list, you can be confident that the list has two references to the same object. Beyond that, it basically tells you nothing. So you know what kind of object it is, but anything else you'll have to figure out for yourself.

ChrisA

[1] Among concurrently-existing objects; if an object disappears, its ID can be reused.
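A small sketch of the identity idea, using a made-up Thing class. The built-in id() returns the identity number. The last line relies on a CPython implementation detail (the default representation embeds that number in hexadecimal), so treat it as an assumption rather than a language guarantee.

```python
class Thing:
    """Hypothetical class that keeps the default representation."""


a = Thing()
b = Thing()
items = [a, b, a]   # first and last entries refer to the same object

ids = [id(x) for x in items]

print(ids[0] == ids[2])        # True: same object listed twice
print(ids[0] == ids[1])        # False: two distinct live objects
print(items[0] is items[2])    # True

# CPython detail (an assumption, not guaranteed by the language):
# the hex form of id(a) is expected to appear inside repr(a).
print(format(id(a), "x") in repr(a).lower())
```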
| From | moa47401@gmail.com |
|---|---|
| Date | 2016-05-03 09:01 -0700 |
| Message-ID | <db6035ff-7d16-4b21-92c8-c520a9b66230@googlegroups.com> |
| In reply to | #108071 |
At the risk of coming across as a complete dunder-head, I think my confusion has to do with the type of data the library returns in the list. Any kind of text or integer list I manually create, doesn't do this.
See my questions down below at the end.
If I run the following statements on the list returned by the gedcom library:
print(type(myList))
print(len(myList))
print(myList[0])
print(myList[0:29])
print(myList)
for x in myList:
    print(x)
I get this:
<class 'list'>
29
0 HEAD
[<gedcom.Element object at 0x0135BFD0>, <gedcom.Element object at 0x01372030>, <gedcom.Element object at 0x01372050>, <gedcom.Element object at 0x013720B0>, <gedcom.Element object at 0x013720F0>, <gedcom.Element object at 0x01372130>, <gedcom.Element object at 0x01372190>, <gedcom.Element object at 0x013721F0>, <gedcom.Element object at 0x01372270>, <gedcom.Element object at 0x01372230>, <gedcom.Element object at 0x013722F0>, <gedcom.Element object at 0x013722B0>, <gedcom.Element object at 0x01372370>, <gedcom.Element object at 0x01372390>, <gedcom.Element object at 0x01372410>, <gedcom.Element object at 0x01372470>, <gedcom.Element object at 0x01372490>, <gedcom.Element object at 0x013724F0>, <gedcom.Element object at 0x01372530>, <gedcom.Element object at 0x01372590>, <gedcom.Element object at 0x013725F0>, <gedcom.Element object at 0x01372630>, <gedcom.Element object at 0x01372690>, <gedcom.Element object at 0x013726F0>, <gedcom.Element object at 0x01372710>, <gedcom.Element object at 0x01372770>, <gedcom.Element object at 0x01372790>, <gedcom.Element object at 0x013727D0>, <gedcom.Element object at 0x01372830>]
[<gedcom.Element object at 0x0135BFD0>, <gedcom.Element object at 0x01372030>, <gedcom.Element object at 0x01372050>, <gedcom.Element object at 0x013720B0>, <gedcom.Element object at 0x013720F0>, <gedcom.Element object at 0x01372130>, <gedcom.Element object at 0x01372190>, <gedcom.Element object at 0x013721F0>, <gedcom.Element object at 0x01372270>, <gedcom.Element object at 0x01372230>, <gedcom.Element object at 0x013722F0>, <gedcom.Element object at 0x013722B0>, <gedcom.Element object at 0x01372370>, <gedcom.Element object at 0x01372390>, <gedcom.Element object at 0x01372410>, <gedcom.Element object at 0x01372470>, <gedcom.Element object at 0x01372490>, <gedcom.Element object at 0x013724F0>, <gedcom.Element object at 0x01372530>, <gedcom.Element object at 0x01372590>, <gedcom.Element object at 0x013725F0>, <gedcom.Element object at 0x01372630>, <gedcom.Element object at 0x01372690>, <gedcom.Element object at 0x013726F0>, <gedcom.Element object at 0x01372710>, <gedcom.Element object at 0x01372770>, <gedcom.Element object at 0x01372790>, <gedcom.Element object at 0x013727D0>, <gedcom.Element object at 0x01372830>]
0 HEAD
1 SOUR AncestQuest
2 NAME Ancestral Quest
2 VERS 14.00.9
2 CORP Incline Software, LC
3 ADDR PO Box 95543
4 CONT South Jordan, UT 84095
4 CONT USA
1 DATE 3 MAY 2016
2 TIME 10:44:10
1 FILE test_gedcom.ged
1 GEDC
2 VERS 5.5
2 FORM LINEAGE-LINKED
1 CHAR ANSEL
0 @I1@ INDI
1 NAME John /Allen/
1 SEX M
1 BIRT
2 DATE 1750
2 PLAC VA
1 DEAT
2 DATE 1804
2 PLAC KY
1 _UID D6C103E6105D654B85D47DA1B36E474BC7D1
1 CHAN
2 DATE 3 MAY 2016
3 TIME 10:43:35
0 TRLR
Questions:
Why does printing a single item print the actual text of the object?
Why does printing a range print the "representations" of the objects?
Why does iterating over the list print the actual text of the objects?
How can I determine what type of data is in the list?
| From | Dan Strohl <D.Strohl@F5.com> |
|---|---|
| Date | 2016-05-03 16:52 +0000 |
| Message-ID | <mailman.353.1462294382.32212.python-list@python.org> |
| In reply to | #108079 |
Take a look at the docs for:

print() https://docs.python.org/3.5/library/functions.html#print
str() https://docs.python.org/3.5/library/stdtypes.html#str
repr() https://docs.python.org/3.5/library/functions.html#repr

When you do "print(object)", Python will run everything through str() and output it. str() will try to return a string representation of the object; what actually comes back will depend on how the object's author defined it. If object.__str__() has been defined, it will use that; if __str__() is not defined, it will use object.__repr__(). If object.__repr__() has not been defined, it will return something that looks like "<object_name object at xxxxxxx>".

So, as to your specific questions / comments:

> At the risk of coming across as a complete dunder-head, I think my confusion
> has to do with the type of data the library returns in the list. Any kind of text
> or integer list I manually create, doesn't do this.

Actually, they do, but strings and integers have well defined __str__ and __repr__ methods, and behave pretty well.

So... think about what is actually being passed to the print() function in each case:

> print(type(myList))

Passing the type (class) object returned by type(myList)

> print(len(myList))

Passing the integer object returned by len(myList)

> print(myList[0])

Passing the gedcom Element object found at location 0 in the list

> print(myList[0:29])

Passing a new list object created by copying the items from location 0 up to (but not including) location 29 in the original list

> print(myList)

Passing the entire list object

> for x in myList:
>     print(x)

Passing the individual gedcom Element objects from the list, one at a time.

So, in each of these cases, you are passing different types of objects, and each one (type, integer, list, gedcom Element) will behave differently depending on how it is coded.

> Why does printing a single item print the actual text of the object?

If you are printing a single item (print(myList[0])), you are printing the __str__ or __repr__ for the object stored at that location.

> Why does printing a range print the "representations" of the objects?

If you are printing a range of objects, you are printing the __str__ or __repr__ for the LIST object (a slice of a list is itself a list), not for the objects inside it. The list apparently only checks for a __repr__() method in the contained gedcom Element objects, which is not defined (see below).

> Why does iterating over the list print the actual text of the objects?

When iterating over the list, you are printing the specific items again (just like when you did print(myList[0])).

> How can I determine what type of data is in the list?

Try print(type(myList[0])), which will give you the type of data in the first object in the list (though, keep in mind that the object type could be different in each item in the list).

If we look at the gedcom library (assuming you are talking about this one: https://github.com/madprime/python-gedcom/blob/master/gedcom/__init__.py), at the end of the file you can see they defined __str__() in the Element object, but there is no definition for __repr__(), which matches what we surmised above.

If you want to fix it by editing the gedcom library, you could simply add a line at the end of the class like:

    __repr__ = __str__

Hope that helps.

Dan Strohl
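The difference between printing one item and printing the whole list can be reproduced with a small sketch. The Element class here is a hypothetical stand-in that, like the library's class as described above, defines __str__ but no __repr__.

```python
class Element:
    """Hypothetical stand-in: defines __str__ but not __repr__."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


elems = [Element("0 HEAD"), Element("1 SOUR AncestQuest")]

single = str(elems[0])   # the text print(elems[0]) would write
whole = str(elems)       # the text print(elems) would write

print(single)   # 0 HEAD
print(whole)    # [<...Element object at 0x...>, <...Element object at 0x...>]
```

The list builds its text from repr() of each element, so the custom __str__ is never consulted when the whole list is printed.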
| From | moa47401@gmail.com |
|---|---|
| Date | 2016-05-03 10:31 -0700 |
| Message-ID | <221dcc70-39d8-4c9c-8827-8e0bc1ec1fda@googlegroups.com> |
| In reply to | #108085 |
I added a __repr__ method at the end of the gedcom library like so:
    def __repr__(self):
        """ Format this element as its original string """
        result = repr(self.level())
        if self.pointer() != "":
            result += ' ' + self.pointer()
        result += ' ' + self.tag()
        if self.value() != "":
            result += ' ' + self.value()
        return result
and now I can print myList properly.
Erik and Michael also mentioned repr above, but I guess I needed someone to spell it out for me. Thanks for taking the time to put it in terms an old dog could understand.
| From | Dan Strohl <D.Strohl@F5.com> |
|---|---|
| Date | 2016-05-03 17:54 +0000 |
| Message-ID | <mailman.354.1462298084.32212.python-list@python.org> |
| In reply to | #108088 |
> I added a __repr__ method at the end of the gedcom library like so:
>
>     def __repr__(self):
>         """ Format this element as its original string """
>         result = repr(self.level())
>         if self.pointer() != "":
>             result += ' ' + self.pointer()
>         result += ' ' + self.tag()
>         if self.value() != "":
>             result += ' ' + self.value()
>         return result
>
> and now I can print myList properly.
>
> Erik and Michael also mentioned repr above, but I guess I needed someone
> to spell it out for me. Thanks for taking the time to put it in terms an old dog
> could understand.
>
Glad to help! (being an old dog myself, I know the feeling!)
One other point for you, if your "__repr__(self)" code is the same as the "__str__(self)" code (which it looks like it is, at a glance at least), you can instead reference the __str__ method and save having a duplicate code block... some examples:
=========
Option 1: This is the easiest to read (IMHO) and allows for the possibility that str() is doing something here like formatting or whatever (in this case it shouldn't be though). However, calling this actually takes multiple steps (calling object.__repr__(), which calls str(), which calls object.__str__()).
    def __repr__(self):
        return str(self)
=========
Option 2: this isn't hard to read, and just takes two steps (calling object.__repr__(), which calls object.__str__()).
    def __repr__(self):
        return self.__str__()
========
Option 3: it's not that this is hard to read, but since it doesn't follow the standard "def blah(self):" pattern, sometimes I overlook these in the code (even when I put them there). This however is the shortest since it really just tells the object to return object.__str__() if either object.__repr__() OR object.__str__() is called.
__repr__ = __str__
This probably doesn't matter much in this case, since it probably isn't called that much in normal use (though there are always exceptions), and in the end, Python is fast enough that unless you really need to slice off a few milliseconds, you will never notice the difference, but just some food for thought.
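For what it's worth, Option 3 can be tried out in isolation. The sketch below uses a hypothetical Element class with a single text attribute rather than the real library code.

```python
class Element:
    """Hypothetical stand-in with one text attribute."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text

    # Option 3: bind the same function object under both names.
    __repr__ = __str__


elems = [Element("0 HEAD"), Element("0 TRLR")]
listing = str(elems)

print(Element.__repr__ is Element.__str__)   # True
print(listing)                               # [0 HEAD, 0 TRLR]
```

Note that the assignment has to sit inside the class body, after __str__ has been defined.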
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2016-05-04 04:14 +1000 |
| Subject | Use __repr__ to show the programmer's representation (was: Need help understanding list structure) |
| Message-ID | <mailman.355.1462299295.32212.python-list@python.org> |
| In reply to | #108088 |
Dan Strohl via Python-list <python-list@python.org> writes:
> One other point for you, if your "__repr__(self)" code is the same as
> the "__str__(self)" code (which it looks like it is, at a glance at
> least), you can instead reference the __str__ method and save having a
> duplicate code block...
Alternatively, consider: the ‘__repr__’ method is intended to return a
*programmer's* representation of the object. Commonly, this is text
which looks like the Python expression which would create an equal
instance::
    >>> foo = datetime.date.fromtimestamp(13012345678)
    >>> print(repr(foo))
    datetime.date(2382, 5, 7)
So if there is a sensible “here is the expression that could have been
used to create this instance” text, have the ‘__repr__’ method return
that text::
    >>> foo = LoremIpsum(bingle, bongle, bungle)
    >>> print(repr(foo))
    packagename.LoremIpsum("spam", 753, frob=True)
That text is very useful because it can be fed back into the interactive
interpreter to make an equal-valued instance and experiment further.
For some types, there isn't such an expression that would evaluate to an
equal-valued instance of the type. So the conventional non-evaluating
representation is used::
    >>> foo = frobnicate_the_widget(widget)
    >>> print(repr(foo))
    <LoremIpsum instance, foo: "spam" bar: 753>
This gives the crucial information of what the type is, and also gives
other interesting (to the programmer) attributes that characterise the
specific instance.
The fallback “<LoremIpsum instance at 0xDEADBEEF>” is the least helpful;
it gives the type and identity of the instance, but only because that's
the lowest common information ‘object’ can guarantee. Always implement a
more informative representation for your custom type, if you can.
--
\ “Intellectual property is to the 21st century what the slave |
`\ trade was to the 16th.” —David Mertz |
_o__) |
Ben Finney
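To make the "expression that would create an equal instance" convention concrete, here is a minimal sketch with a made-up Point class. The class name, its fields, and the use of eval() on a string we built ourselves are all assumptions for illustration only.

```python
class Point:
    """Hypothetical value type used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        # Looks like the constructor call that would rebuild this value.
        return "Point(%r, %r)" % (self.x, self.y)


p = Point(3, 4)
text = repr(p)
clone = eval(text)   # acceptable for a trusted, self-made string only

print(text)          # Point(3, 4)
print(clone == p)    # True: equal value
print(clone is p)    # False: a different object
```

eval() should never be applied to text from an untrusted source. It appears here only to show that the representation round-trips.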
| From | Dan Strohl <D.Strohl@F5.com> |
|---|---|
| Date | 2016-05-03 18:35 +0000 |
| Subject | RE: Use __repr__ to show the programmer's representation (was: Need help understanding list structure) |
| Message-ID | <mailman.356.1462300553.32212.python-list@python.org> |
| In reply to | #108088 |
> > One other point for you, if your "__repr__(self)" code is the same as
> > the "__str__(self)" code (which it looks like it is, at a glance at
> > least), you can instead reference the __str__ method and save having a
> > duplicate code block...
>
> Alternatively, consider: the ‘__repr__’ method is intended to return a
> *programmer's* representation of the object. Commonly, this is text which
> looks like the Python expression which would create an equal
> instance::

Definitely true per what __repr__ is supposed to do per the Python docs. However, in this case, that might not have solved the problem if the goal was to return the same as __str__. (Though to be fair, I don’t really know what the actual problem was, so I might provide a different approach with a different goal <grin>).

I also have never actually used repr() to create code that could be fed back to the interpreter (not saying it isn’t done, just that I haven’t run into needing it), and there are so many libraries that do not return a usable repr string that I would hesitate to even try it outside of a very narrow use case.

Personally, I normally use __repr__ to give me a useful troubleshooting representation of the object... but that representation might not be exactly the same as what would recreate the object, since I might show information that is dynamic to the current state (like the name of the parent instead of just a pointer or something). I know that isn’t per the rules, but in the end, it makes it easier for me to troubleshoot the code.

Having said that, Ben is totally correct in terms of the "right" way to do it; my earlier suggestion was not "right".

Dan
| From | moa47401@gmail.com |
|---|---|
| Date | 2016-05-03 12:24 -0700 |
| Subject | Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure) |
| Message-ID | <b1cb80ee-1455-4cae-9db3-8655d61f508e@googlegroups.com> |
| In reply to | #108092 |
> (Though to be fair, I don't really know what the actual problem was, so I
> might provide a different approach with a different goal <grin>)

Originally I was trying to understand the exact structure of the list being returned by the gedcom library. It worked as it was, but I wanted to add additional functionality.

I also wanted to understand what character set it was returning. I was giving it a gedcom file with ansel encoding, which is normal. My genealogy program can also export its database to gedcom using UTF-8 and Unicode. But both of those character sets caused the gedcom library to generate an error msg that the file violated GEDCOM format.

Keep in mind the gedcom format established by the Latter-day Saints hasn't been updated in 20+ years.
| From | Random832 <random832@fastmail.com> |
|---|---|
| Date | 2016-05-03 15:37 -0400 |
| Subject | Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure) |
| Message-ID | <mailman.359.1462304247.32212.python-list@python.org> |
| In reply to | #108093 |
On Tue, May 3, 2016, at 15:24, moa47401@gmail.com wrote:
> I also wanted to understand what character set it was returning. I was
> giving it a gedcom file with ansel encoding, which is normal. My
> genealogy program can also export its database to gedcom using UTF-8 and
> Unicode. But both of those character sets caused the gedcom library to
> generate an error msg that the file violated GEDCOM format.

You haven't said where this library can be found, or what the error message was.
| From | MRAB <python@mrabarnett.plus.com> |
|---|---|
| Date | 2016-05-03 20:57 +0100 |
| Message-ID | <mailman.360.1462305440.32212.python-list@python.org> |
| In reply to | #108088 |
On 2016-05-03 18:54, Dan Strohl via Python-list wrote:
>
>> I added a __repr__ method at the end of the gedcom library like so:
>>
>>     def __repr__(self):
>>         """ Format this element as its original string """
>>         result = repr(self.level())
>>         if self.pointer() != "":
>>             result += ' ' + self.pointer()
>>         result += ' ' + self.tag()
>>         if self.value() != "":
>>             result += ' ' + self.value()
>>         return result
>>
>> and now I can print myList properly.
>>
>> Erik and Michael also mentioned repr above, but I guess I needed someone
>> to spell it out for me. Thanks for taking the time to put it in terms an old dog
>> could understand.
>>
>
> Glad to help! (being an old dog myself, I know the feeling!)
>
> One other point for you, if your "__repr__(self)" code is the same as the "__str__(self)" code (which it looks like it is, at a glance at least), you can instead reference the __str__ method and save having a duplicate code block... some examples:
>
> =========
> Option 1: This is the easiest to read (IMHO) and allows for the possibility that str() is doing something here like formatting or whatever (in this case it shouldn't be though). However, calling this actually takes multiple steps (calling object.__repr__(), which calls str(), which calls object.__str__()).
>
>     def __repr__(self):
>         return str(self)
>
> =========
> Option 2: this isn't hard to read, and just takes two steps (calling object.__repr__(), which calls object.__str__()).
>
>     def __repr__(self):
>         return self.__str__()
>
> ========
> Option 3: it's not that this is hard to read, but since it doesn't follow the standard "def blah(self):" pattern, sometimes I overlook these in the code (even when I put them there). This however is the shortest since it really just tells the object to return object.__str__() if either object.__repr__() OR object.__str__() is called.
>
>     __repr__ = __str__
>
> This probably doesn't matter much in this case, since it probably isn't called that much in normal use (though there are always exceptions), and in the end, Python is fast enough that unless you really need to slice off a few milliseconds, you will never notice the difference, but just some food for thought.
>

Option 4: Delete the __str__ method.

If there's no __str__, Python falls back to __repr__.

If there's no __repr__, Python falls back to the <Something object at somewhere> format.
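The fallback in Option 4 can be checked with a short sketch. OnlyRepr is a made-up class that defines __repr__ and nothing else.

```python
class OnlyRepr:
    """Hypothetical class: __repr__ only, no __str__."""

    def __repr__(self):
        return "0 HEAD"


item = OnlyRepr()

as_str = str(item)      # no __str__ defined, so str() ends up using __repr__
as_list = str([item])   # a list displays each element via repr()

print(as_str)    # 0 HEAD
print(as_list)   # [0 HEAD]
```

With a single __repr__ in place, both the individual item and the containing list display the same text.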
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2016-05-04 09:40 +1000 |
| Subject | Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure) |
| Message-ID | <mailman.369.1462318828.32212.python-list@python.org> |
| In reply to | #108088 |
On Wed, May 4, 2016 at 4:35 AM, Dan Strohl via Python-list
<python-list@python.org> wrote:
> I also have never actually used repr() to create code that could be fed back to the interpreter (not saying it isn’t done, just that I haven’t run into needing it), and there are so many of the libraries that do not return a usable repr string that I would hesitate to even try it outside of a very narrow use case.
Here's a repr that I like using with SQLAlchemy:
    def __repr__(self):
        return (self.__class__.__name__ + "(" +
            ", ".join("%s=%r" % (col.name, getattr(self, col.name)) for
                col in self.__table__.columns) +
            ")")
That results in something that *looks* like you could eval it, but you
shouldn't ever actually do that (because it'd create a new object).
It's still an effective way to make the repr readable; imagine a list
that prints out like this:
[Person(id=3, name='Fred'), Person(id=6, name='Barney'), Person(id=8,
name='Joe')]
You can tell exactly where one starts and another ends; you can read
what's going on with these record objects. Making them
"pseudo-evalable" is worth doing, even if you should never *actually*
eval them.
ChrisA
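The same "pseudo-evalable" style can be sketched without SQLAlchemy. In the example below, the Person class and its _fields tuple are hypothetical. _fields simply stands in for the list of column names that the snippet above reads from the table.

```python
class Person:
    """Hypothetical plain class; _fields stands in for the column list."""

    _fields = ("id", "name")

    def __init__(self, id, name):
        self.id = id
        self.name = name

    def __repr__(self):
        # Build "field=value" pairs, using %r so strings keep their quotes.
        args = ", ".join("%s=%r" % (f, getattr(self, f)) for f in self._fields)
        return "%s(%s)" % (self.__class__.__name__, args)


people = [Person(3, "Fred"), Person(6, "Barney")]
shown = repr(people)

print(shown)   # [Person(id=3, name='Fred'), Person(id=6, name='Barney')]
```

Because %r is used for the values, strings appear with the quotes Python itself would choose, which keeps the boundaries between fields easy to see.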