


Need help understanding list structure

Started by: moa47401@gmail.com
First post: 2016-05-02 14:30 -0700
Last post: 2016-05-04 09:40 +1000
Articles 17 — 8 participants



Contents

  Need help understanding list structure moa47401@gmail.com - 2016-05-02 14:30 -0700
    Re: Need help understanding list structure Erik <python@lucidity.plus.com> - 2016-05-02 22:48 +0100
      Re: Need help understanding list structure moa47401@gmail.com - 2016-05-02 15:33 -0700
        Re: Need help understanding list structure Michael Torrie <torriem@gmail.com> - 2016-05-02 17:25 -0600
        Re: Need help understanding list structure Ben Finney <ben+python@benfinney.id.au> - 2016-05-03 09:43 +1000
          Re: Need help understanding list structure moa47401@gmail.com - 2016-05-03 06:21 -0700
            Re: Need help understanding list structure Chris Angelico <rosuav@gmail.com> - 2016-05-03 23:47 +1000
              Re: Need help understanding list structure moa47401@gmail.com - 2016-05-03 09:01 -0700
                RE: Need help understanding list structure Dan Strohl <D.Strohl@F5.com> - 2016-05-03 16:52 +0000
                  Re: Need help understanding list structure moa47401@gmail.com - 2016-05-03 10:31 -0700
                    RE: Need help understanding list structure Dan Strohl <D.Strohl@F5.com> - 2016-05-03 17:54 +0000
                    Use __repr__ to show the programmer's representation (was: Need help understanding list structure) Ben Finney <ben+python@benfinney.id.au> - 2016-05-04 04:14 +1000
                    RE: Use __repr__ to show the programmer's representation (was: Need help understanding list structure) Dan Strohl <D.Strohl@F5.com> - 2016-05-03 18:35 +0000
                      Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure) moa47401@gmail.com - 2016-05-03 12:24 -0700
                        Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure) Random832 <random832@fastmail.com> - 2016-05-03 15:37 -0400
                    Re: Need help understanding list structure MRAB <python@mrabarnett.plus.com> - 2016-05-03 20:57 +0100
                    Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure) Chris Angelico <rosuav@gmail.com> - 2016-05-04 09:40 +1000

#108031 — Need help understanding list structure

From: moa47401@gmail.com
Date: 2016-05-02 14:30 -0700
Subject: Need help understanding list structure
Message-ID: <d6b0c524-f020-4726-b91a-517b2e8024ea@googlegroups.com>

I've been using an old text parsing library and have been able to accomplish most of what I wanted to do. But I don't understand the list structure it uses well enough to build additional methods.

If I print the list, it has thousands of elements within its brackets separated by commas as I would expect. But the elements appear to be memory pointers not the actual text. Here's an example:
<gedcom.Element object at 0x018BDE10>

If I iterate over the list, I do get the actual text of each element and am able to use it.

Also, if I iterate over the list and place each element in a new list using append, then each element in the new list is the text I expect not memory pointers.

But... if I copy the old list to a new list using 

new = old[:] 
or 
new = list(old)

the new list is exactly like the original with memory pointers.

Can someone help me understand why or under what circumstances a list shows pointers instead of the text data?




#108032

From: Erik <python@lucidity.plus.com>
Date: 2016-05-02 22:48 +0100
Message-ID: <mailman.329.1462225716.32212.python-list@python.org>
In reply to: #108031

On 02/05/16 22:30, moa47401@gmail.com wrote:
> Can someone help me understand why or under what circumstances a list
> shows pointers instead of the text data?

When Python's "print" statement/function is invoked, it will print the 
textual representation of the object according to its class's __str__ or
__repr__ method. That is, the print function prints out whatever text
the class says it should.

For classes which don't implement a __str__ or __repr__ method, then
the text "<CLASS object at ADDRESS>" is used - where CLASS is the class
name and ADDRESS is the "memory pointer".
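
As a minimal sketch of that default text (the class names Plain and Described are made up for this illustration and are not part of any library):

```python
class Plain:
    """Hypothetical class with neither __str__ nor __repr__."""
    pass


class Described:
    """Hypothetical class that says how it wants to be shown."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return "Described(%r)" % self.text


plain_text = repr(Plain())
described_text = repr(Described("0 HEAD"))

print(plain_text)       # default form, something like <__main__.Plain object at 0x...>
print(described_text)   # Described('0 HEAD')
```

The exact number after "at" varies from run to run, so it should not be relied on for anything beyond telling objects apart.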

 > If I iterate over the list, I do get the actual text of each element
 > and am able to use it.
 >
 > Also, if I iterate over the list and place each element in a new list
 > using append, then each element in the new list is the text I expect
 > not memory pointers.

Look at the __iter__ method of the class of the object you are iterating 
over. I suspect that it returns string objects, not the objects that are 
in the list itself.

String objects have a __str__ or __repr__ method that represents them as 
the text, so that is what 'print' will output.

Hope that helps, E.



#108033

From: moa47401@gmail.com
Date: 2016-05-02 15:33 -0700
Message-ID: <ab5d3884-77ed-45d4-86ab-0333bf67dca1@googlegroups.com>
In reply to: #108032

> When Python's "print" statement/function is invoked, it will print the 
> textual representation of the object according to its class's __str__ or
> __repr__ method. That is, the print function prints out whatever text
> the class says it should.
> 
> For classes which don't implement a __str__ or __repr__ method, then
> the text "<CLASS object at ADDRESS>" is used - where CLASS is the class
> name and ADDRESS is the "memory pointer".
> 
>  > If I iterate over the list, I do get the actual text of each element
>  > and am able to use it.
>  >
>  > Also, if I iterate over the list and place each element in a new list
>  > using append, then each element in the new list is the text I expect
>  > not memory pointers.
> 
> Look at the __iter__ method of the class of the object you are iterating 
> over. I suspect that it returns string objects, not the objects that are 
> in the list itself.
> 
> String objects have a __str__ or __repr__ method that represents them as 
> the text, so that is what 'print' will output.
> 
> Hope that helps, E.

Yes, that does help. You're right. The author of the library I'm using didn't implement either a __str__ or __repr__ method. Am I correct in assuming that parsing a large text file would be quicker returning pointers instead of strings? I've never run into this before.



#108035

From: Michael Torrie <torriem@gmail.com>
Date: 2016-05-02 17:25 -0600
Message-ID: <mailman.331.1462231551.32212.python-list@python.org>
In reply to: #108033

On 05/02/2016 04:33 PM, moa47401@gmail.com wrote:
> Yes, that does help. You're right. The author of the library I'm
> using didn't implement either a __str__ or __repr__ method. Am I
> correct in assuming that parsing a large text file would be quicker
> returning pointers instead of strings? I've never run into this
> before.

I'm not sure what you mean by "returning pointers." The list isn't
returning pointers. It's a list of *objects*.  To be specific, a list of
gedcom.Element objects, though they could be anything, including numbers
or strings.  If you refer to the source code where the Element class is
defined you can see what these objects contain. I suspect they contain a
lot more information than simply text.

Lists of objects are a common idiom in Python.  As you've discovered, if
you shallow copy a list, the new list will contain the exact same
objects.  In many cases, this does not matter.  For example a list of
numbers, which are immutable or unchangeable objects.  It doesn't matter
that the instances are shared, since the instances themselves will never
change.  If the objects are mutable, as they are in your case, a shallow
copy may not always be what you want.
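
A small sketch of the shallow-copy behaviour described above. The Element class here is a hypothetical stand-in for a mutable object, not the real gedcom.Element:

```python
import copy


class Element:
    """Hypothetical mutable object used only for this illustration."""

    def __init__(self, value):
        self.value = value


old = [Element("HEAD"), Element("TRLR")]
shallow = old[:]             # same effect as list(old): new list, same objects
deep = copy.deepcopy(old)    # new list holding independent copies

old[0].value = "CHANGED"     # mutate an object through the original list

print(shallow[0] is old[0])  # True: both lists hold the very same object
print(shallow[0].value)      # CHANGED
print(deep[0] is old[0])     # False: the deep copy has its own object
print(deep[0].value)         # HEAD
```

Whether a deep copy is worth its cost depends on whether the copies are ever modified; for read-only use the shallow copy is usually enough.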

As to your question.  A list never shows "pointers" as you say.  A list
always contains objects, and if you simply "print" the list, it will try
to show a representation of the list, using the objects' repr dunder
methods.  Some classes I have used have their repr methods print out
what the constructor would look like, if you were to construct the
object yourself.  This is very useful.  If I recall, this is what
BeautifulSoup objects do, which is incredibly useful.

In your case, as Erik said, the objects you are dealing with don't
provide repr dunder methods, so Python just lets you know they are
objects of a certain class, and what their ids are, which is helpful if
you're trying to determine if two objects are the same object.  These
are not "pointers" in the sense you're talking.  You'll get text if the
object prints text for you. This is true of any object you might store
in the list.

I hope this helps a bit.  Exploring from the interactive prompt as you
are doing is very useful, once you understand what it's saying to you.



#108036

From: Ben Finney <ben+python@benfinney.id.au>
Date: 2016-05-03 09:43 +1000
Message-ID: <mailman.332.1462232638.32212.python-list@python.org>
In reply to: #108033

moa47401@gmail.com writes:

> Am I correct in assuming that parsing a large text file would be
> quicker returning pointers instead of strings?

What do you mean by “return a pointer”? Python doesn't have pointers.

In the Python language, a container type (such as ‘set’, ‘list’, ‘dict’,
etc.) contains the objects directly. There are no “pointers” there; by
accessing the items of a container, you access the items directly.


What do you mean by “would be quicker”? I am concerned you are seeking
speed of the program at the expense of understandability and clarity of
the code.

Instead, you should be writing clear, maintainable code.

*Only if* the clear, maintainable code you write then actually ends up
being too slow, should you then worry about what parts are quick or slow
by *measuring* the specific parts of code to discover what is actually
occupying the time.
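
One way to do that kind of measuring is the standard library's timeit module. The following is only a sketch: the sample data and the two functions being compared are invented for the illustration and have nothing to do with the gedcom library's actual code.

```python
import timeit

# Made-up sample data standing in for lines of a parsed file.
lines = ["1 NAME John /Allen/"] * 1000


def split_all():
    """Break each line into level, tag and value with str.split."""
    return [line.split(" ", 2) for line in lines]


def partition_all():
    """Peel off only the level with str.partition."""
    return [line.partition(" ") for line in lines]


# Run each candidate the same number of times and compare total seconds.
split_seconds = timeit.timeit(split_all, number=200)
partition_seconds = timeit.timeit(partition_all, number=200)

print("split:     %.4f s" % split_seconds)
print("partition: %.4f s" % partition_seconds)
```

The numbers differ between machines and runs, so only the comparison on your own data is meaningful.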

-- 
 \         “All television is educational television. The question is: |
  `\                           what is it teaching?” —Nicholas Johnson |
_o__)                                                                  |
Ben Finney



#108069

From: moa47401@gmail.com
Date: 2016-05-03 06:21 -0700
Message-ID: <b93a85c8-042a-4c1f-800e-68a02dac3b78@googlegroups.com>
In reply to: #108036

Thanks for the replies. I definitely need a better understanding of "<CLASS object at ADDRESS>" when using Python objects. So far no luck with web searches or my Python books. Could someone point (no pun intended) me to a good resource? 

Not that it matters, but the reason I got off track is there are pointers within my data that point to other pieces of the data and have nothing to do with Python.



#108071

From: Chris Angelico <rosuav@gmail.com>
Date: 2016-05-03 23:47 +1000
Message-ID: <mailman.346.1462283241.32212.python-list@python.org>
In reply to: #108069

On Tue, May 3, 2016 at 11:21 PM,  <moa47401@gmail.com> wrote:
> Thanks for the replies. I definitely need a better understanding of "<CLASS object at ADDRESS>" when using Python objects. So far no luck with web searches or my Python books. Could someone point (no pun intended) me to a good resource?
>
> Not that it matters, but the reason I got off track is there are pointers within my data that point to other pieces of the data and have nothing to do with Python.

What you're seeing there is the default object representation. It
isn't actually quoting an address but an "identity number". The only
meaning of that number is that it's unique [1]; so if you see the same
number come up twice in the list, you can be confident that the list
has two references to the same object.

Beyond that, it basically tells you nothing. So you know what kind of
object it is, but anything else you'll have to figure out for
yourself.

ChrisA

[1] Among concurrently-existing objects; if an object disappears, its
ID can be reused.
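
A short sketch of using that identity number to spot the same object appearing twice in a list (Node is a hypothetical class that relies on the default representation):

```python
class Node:
    """Hypothetical class with no __str__ or __repr__ of its own."""
    pass


a = Node()
b = Node()
items = [a, b, a]    # the first and last entries are the same object

print(items)                          # first and third entries show the same number
print(items[0] is items[2])           # True
print(items[0] is items[1])           # False
print(id(items[0]) == id(items[2]))   # True
```

The "is" operator is the direct way to ask the question; comparing the printed numbers by eye only works while both objects still exist.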



#108079

From: moa47401@gmail.com
Date: 2016-05-03 09:01 -0700
Message-ID: <db6035ff-7d16-4b21-92c8-c520a9b66230@googlegroups.com>
In reply to: #108071

At the risk of coming across as a complete dunder-head, I think my confusion has to do with the type of data the library returns in the list. Any kind of text or integer list I manually create, doesn't do this.

See my questions down below at the end.

If I run the following statements on the list returned by the gedcom library:

print(type(myList))
print(len(myList))
print(myList[0])
print(myList[0:29])
print(myList)
for x in myList:
    print(x)

I get this:

<class 'list'>
29
0 HEAD
[<gedcom.Element object at 0x0135BFD0>, <gedcom.Element object at 0x01372030>, <gedcom.Element object at 0x01372050>, <gedcom.Element object at 0x013720B0>, <gedcom.Element object at 0x013720F0>, <gedcom.Element object at 0x01372130>, <gedcom.Element object at 0x01372190>, <gedcom.Element object at 0x013721F0>, <gedcom.Element object at 0x01372270>, <gedcom.Element object at 0x01372230>, <gedcom.Element object at 0x013722F0>, <gedcom.Element object at 0x013722B0>, <gedcom.Element object at 0x01372370>, <gedcom.Element object at 0x01372390>, <gedcom.Element object at 0x01372410>, <gedcom.Element object at 0x01372470>, <gedcom.Element object at 0x01372490>, <gedcom.Element object at 0x013724F0>, <gedcom.Element object at 0x01372530>, <gedcom.Element object at 0x01372590>, <gedcom.Element object at 0x013725F0>, <gedcom.Element object at 0x01372630>, <gedcom.Element object at 0x01372690>, <gedcom.Element object at 0x013726F0>, <gedcom.Element object at 0x01372710>, <gedcom.Element object at 0x01372770>, <gedcom.Element object at 0x01372790>, <gedcom.Element object at 0x013727D0>, <gedcom.Element object at 0x01372830>]
[<gedcom.Element object at 0x0135BFD0>, <gedcom.Element object at 0x01372030>, <gedcom.Element object at 0x01372050>, <gedcom.Element object at 0x013720B0>, <gedcom.Element object at 0x013720F0>, <gedcom.Element object at 0x01372130>, <gedcom.Element object at 0x01372190>, <gedcom.Element object at 0x013721F0>, <gedcom.Element object at 0x01372270>, <gedcom.Element object at 0x01372230>, <gedcom.Element object at 0x013722F0>, <gedcom.Element object at 0x013722B0>, <gedcom.Element object at 0x01372370>, <gedcom.Element object at 0x01372390>, <gedcom.Element object at 0x01372410>, <gedcom.Element object at 0x01372470>, <gedcom.Element object at 0x01372490>, <gedcom.Element object at 0x013724F0>, <gedcom.Element object at 0x01372530>, <gedcom.Element object at 0x01372590>, <gedcom.Element object at 0x013725F0>, <gedcom.Element object at 0x01372630>, <gedcom.Element object at 0x01372690>, <gedcom.Element object at 0x013726F0>, <gedcom.Element object at 0x01372710>, <gedcom.Element object at 0x01372770>, <gedcom.Element object at 0x01372790>, <gedcom.Element object at 0x013727D0>, <gedcom.Element object at 0x01372830>]
0 HEAD
1 SOUR AncestQuest
2 NAME Ancestral Quest
2 VERS 14.00.9
2 CORP Incline Software, LC
3 ADDR PO Box 95543
4 CONT South Jordan, UT 84095
4 CONT USA
1 DATE 3 MAY 2016
2 TIME 10:44:10
1 FILE test_gedcom.ged
1 GEDC
2 VERS 5.5
2 FORM LINEAGE-LINKED
1 CHAR ANSEL
0 @I1@ INDI
1 NAME John /Allen/
1 SEX M
1 BIRT
2 DATE 1750
2 PLAC VA
1 DEAT
2 DATE 1804
2 PLAC KY
1 _UID D6C103E6105D654B85D47DA1B36E474BC7D1
1 CHAN
2 DATE 3 MAY 2016
3 TIME 10:43:35
0 TRLR

Questions:

Why does printing a single item print the actual text of the object?
Why does printing a range print the "representations" of the objects?
Why does iterating over the list print the actual text of the objects?
How can I determine what type of data is in the list?



#108085

From: Dan Strohl <D.Strohl@F5.com>
Date: 2016-05-03 16:52 +0000
Message-ID: <mailman.353.1462294382.32212.python-list@python.org>
In reply to: #108079

Take a look at the docs for 
print() https://docs.python.org/3.5/library/functions.html#print 
str() https://docs.python.org/3.5/library/stdtypes.html#str
repr() https://docs.python.org/3.5/library/functions.html#repr 

When you do "print(object)", python will run everything through str() and output it.  

str() will try to return a string representation of the object; what actually comes back will depend on how the object's author defined it.  If object.__str__() has been defined, it will use that; if __str__() is not defined, it will use object.__repr__().  If object.__repr__() has not been defined either, it will return something that looks like "<object_name object at xxxxxxx>".
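
A minimal sketch of that fallback order, using three made-up classes (none of them come from the gedcom library):

```python
class Neither:
    """Defines no __str__ and no __repr__."""
    pass


class ReprOnly:
    """Defines only __repr__."""

    def __repr__(self):
        return "ReprOnly()"


class Both:
    """Defines both methods."""

    def __str__(self):
        return "friendly text"

    def __repr__(self):
        return "Both()"


print(str(Neither()))    # default form, something like <__main__.Neither object at 0x...>
print(str(ReprOnly()))   # ReprOnly()      (str falls back to __repr__)
print(str(Both()))       # friendly text   (str prefers __str__)
print(repr(Both()))      # Both()          (repr never uses __str__)
```

Note that the fallback only goes one way: str() can fall back to __repr__, but repr() does not fall back to __str__.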

So, as to your specific questions / comments:

> At the risk of coming across as a complete dunder-head, I think my confusion
> has to do with the type of data the library returns in the list. Any kind of text
> or integer list I manually create, doesn't do this.
Actually, they do, but strings and integers have well defined __str__ and __repr__ methods, and behave pretty well.

So... think about what is actually being passed to the print() function in each case:

> print(type(myList))
Passing the type object returned by type(myList), which prints as <class 'list'>

> print(len(myList))
Passing the integer object returning from len(myList)

> print(myList[0])
Passing the gedcom ELEMENT object found at location 0 in the list

> print(myList[0:29])
Passing a new list object created by copying the items from location 0 up to (but not including) location 29 in the original list

> print(myList)
Passing the entire list object

> for x in myList:
>     print(x)
Passing the individual gedcom ELEMENT objects from the list.

So, in each of these cases, you are passing different types of objects, and each one (string, integer, list, gedcom)  will behave differently depending on how it is coded.

> Why does printing a single item print the actual text of the object?
If you are printing a single item (print(myList[0])), you are printing the __str__ or __repr__ for the object stored at that location.

> Why does printing a range print the "representations" of the objects?
If you are printing a range of objects, you are printing the __str__ or __repr__ of the LIST object (a slice of a list is a new list), not of the objects inside it.  The list's own representation calls repr() on each contained item, so it only looks for a __repr__() method in the contained gedcom ELEMENT objects, which is not defined (see below).

> Why does iterating over the list print the actual text of the objects?
When iterating over the list, you are printing the specific items again (just like when you did print(myList[0])  ) 

> How can I determine what type of data is in the list?
Try print(type(myList[0])), which will give you the type of data in the first object in the list (though, keep in mind that the object type could be different in each item in the list).
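
The answers above can be reproduced with a small stand-in class. This Element is hypothetical and only mimics the relevant trait of the library's class, namely that it defines __str__ but not __repr__:

```python
class Element:
    """Hypothetical stand-in: has __str__ but no __repr__."""

    def __init__(self, line):
        self.line = line

    def __str__(self):
        return self.line


my_list = [Element("0 HEAD"), Element("0 TRLR")]

print(my_list[0])      # 0 HEAD  (print uses str() on the single object)
print(my_list[0:2])    # the list shows repr() of each item: <... Element object at 0x...>
for x in my_list:
    print(x)           # each item goes through str() again

# Checking what kinds of objects the list holds:
print(type(my_list[0]).__name__)             # Element
print({type(x).__name__ for x in my_list})   # {'Element'}
```

Collecting the type names into a set is a quick way to see whether every item in the list has the same type.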


If we look at the gedcom library (assuming you are talking about this one: https://github.com/madprime/python-gedcom/blob/master/gedcom/__init__.py) at the end of the file you can see they defined __str__() in the ELEMENT object,  but there is no definition for __repr__(), which matches what we surmised above.

If you want to fix it by editing the gedcom library, you could simply add a line at the end of the Element class, after the __str__() definition and indented to match the other methods, like:
    __repr__ = __str__


Hope that helps.

Dan Strohl



#108088

From: moa47401@gmail.com
Date: 2016-05-03 10:31 -0700
Message-ID: <221dcc70-39d8-4c9c-8827-8e0bc1ec1fda@googlegroups.com>
In reply to: #108085

I added a __repr__ method at the end of the gedcom library like so:

def __repr__(self):
        """ Format this element as its original string """
        result = repr(self.level())
        if self.pointer() != "":
            result += ' ' + self.pointer()
        result += ' ' + self.tag()
        if self.value() != "":
            result += ' ' + self.value()
        return result

and now I can print myList properly.

Erik and Michael also mentioned repr above, but I guess I needed someone to spell it out for me. Thanks for taking the time to put it in terms an old dog could understand.



#108090

From: Dan Strohl <D.Strohl@F5.com>
Date: 2016-05-03 17:54 +0000
Message-ID: <mailman.354.1462298084.32212.python-list@python.org>
In reply to: #108088

> I added a __repr__ method at the end of the gedcom library like so:
> 
> def __repr__(self):
>         """ Format this element as its original string """
>         result = repr(self.level())
>         if self.pointer() != "":
>             result += ' ' + self.pointer()
>         result += ' ' + self.tag()
>         if self.value() != "":
>             result += ' ' + self.value()
>         return result
> 
> and now I can print myList properly.
> 
> Erik and Michael also mentioned repr above, but I guess I needed someone
> to spell it out for me. Thanks for taking the time to put it in terms an old dog
> could understand.
> 

Glad to help!  (being an old dog myself, I know the feeling!)

One other point for you, if your "__repr__(self)" code is the same as the "__str__(self)" code (which it looks like it is, at a glance at least), you can instead reference the __str__ method and save having a duplicate code block...  some examples:

=========
Option 1:  This is the easiest to read (IMHO) and allows for the possibility that str() is doing something here like formatting or whatever (in this case it shouldn't be, though).  However, calling this actually takes multiple steps: calling object.__repr__(), which calls str(), which calls object.__str__().

def __repr__(self):
    return str(self)

=========
Option 2: this isn't hard to read, and just takes two steps (calling object.__repr__(), which calls object.__str__()).

def __repr__(self):
    return self.__str__()

========
Option 3:  it's not that this is hard to read, but since it doesn't follow the standard "def blah(self):" pattern, sometimes I overlook these in the code (even when I put them there).  This however is the shortest since it really just tells the object to return object.__str__() if either object.__repr__() OR object.__str__() is called.


__repr__ = __str__


This probably doesn't matter much in this case, since it probably isn't called that much in normal use (though there are always exceptions), and in the end, Python is fast enough that unless you really need to slice off a few milliseconds, you will never notice the difference, but just some food for thought.
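
For what it's worth, here is a sketch of Option 3 in a complete class. This Element is a simplified, hypothetical stand-in (plain attributes instead of the library's accessor methods), just to show where the alias line goes and what printing a list looks like afterwards:

```python
class Element:
    """Hypothetical, simplified stand-in for a parsed GEDCOM line."""

    def __init__(self, level, tag, value=""):
        self.level = level
        self.tag = tag
        self.value = value

    def __str__(self):
        result = str(self.level) + " " + self.tag
        if self.value != "":
            result += " " + self.value
        return result

    # Option 3: bind the same function object under both names.
    __repr__ = __str__


my_list = [Element(0, "HEAD"), Element(1, "SOUR", "AncestQuest")]

print(my_list[0])   # 0 HEAD
print(my_list)      # [0 HEAD, 1 SOUR AncestQuest]
```

One difference worth knowing: with the alias, a subclass that overrides only __str__ keeps the parent's text for repr(), whereas Options 1 and 2 look up __str__ at call time and so pick up the override.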



#108091 — Use __repr__ to show the programmer's representation (was: Need help understanding list structure)

From: Ben Finney <ben+python@benfinney.id.au>
Date: 2016-05-04 04:14 +1000
Subject: Use __repr__ to show the programmer's representation (was: Need help understanding list structure)
Message-ID: <mailman.355.1462299295.32212.python-list@python.org>
In reply to: #108088

Dan Strohl via Python-list <python-list@python.org> writes:

> One other point for you, if your "__repr__(self)" code is the same as
> the "__str__(self)" code (which it looks like it is, at a glance at
> least), you can instead reference the __str__ method and save having a
> duplicate code block...

Alternatively, consider: the ‘__repr__’ method is intended to return a
*programmer's* representation of the object. Commonly, this is text
which looks like the Python expression which would create an equal
instance::

    >>> import datetime
    >>> foo = datetime.date.fromtimestamp(13012345678)
    >>> print(repr(foo))
    datetime.date(2382, 5, 7)

So if there is a sensible “here is the expression that could have been
used to create this instance” text, have the ‘__repr__’ method return
that text::

    >>> foo = LoremIpsum(bingle, bongle, bungle)
    >>> print(repr(foo))
    packagename.LoremIpsum("spam", 753, frob=True)

That text is very useful because it can be fed back into the interactive
interpreter to make an equal-valued instance and experiment further.

For some types, there isn't such an expression that would evaluate to an
equal-valued instance of the type. So the conventional non-evaluating
representation is used::

    >>> foo = frobnicate_the_widget(widget)
    >>> print(repr(foo))
    <LoremIpsum instance, foo: "spam" bar: 753>

This gives the crucial information of what the type is, and also gives
other interesting (to the programmer) attributes that characterise the
specific instance.

The fallback “<LoremIpsum instance at 0xDEADBEEF>” is the least helpful;
it gives the type and identity of the instance, but only because that's
the lowest common information ‘object’ can guarantee. Always implement a
more informative representation for your custom type, if you can.
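
As a sketch of the first convention above, here is a made-up Point class (not from any library) whose repr is an expression that rebuilds an equal instance:

```python
class Point:
    """Hypothetical value type used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # %r applies repr() to each attribute, so strings would get quotes.
        return "Point(%r, %r)" % (self.x, self.y)

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


p = Point(3, 4)
text = repr(p)
print(text)                              # Point(3, 4)

# Feeding the text back in yields an equal, but distinct, instance.
clone = eval(text, {"Point": Point})
print(clone == p)                        # True
print(clone is p)                        # False
```

The eval call is there only to demonstrate the round trip; in practice the text is pasted into the interactive interpreter by hand.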

-- 
 \        “Intellectual property is to the 21st century what the slave |
  `\                              trade was to the 16th.” —David Mertz |
_o__)                                                                  |
Ben Finney



#108092 — RE: Use __repr__ to show the programmer's representation (was: Need help understanding list structure)

From: Dan Strohl <D.Strohl@F5.com>
Date: 2016-05-03 18:35 +0000
Subject: RE: Use __repr__ to show the programmer's representation (was: Need help understanding list structure)
Message-ID: <mailman.356.1462300553.32212.python-list@python.org>
In reply to: #108088

> > One other point for you, if your "__repr__(self)" code is the same as
> > the "__str__(self)" code (which it looks like it is, at a glance at
> > least), you can instead reference the __str__ method and save having a
> > duplicate code block...
> 
> Alternatively, consider: the ‘__repr__’ method is intended to return a
> *programmer's* representation of the object. Commonly, this is text which
> looks like the Python expression which would create an equal
> instance::

Definitely true per what __repr__ is supposed to do according to the Python docs.   However, in this case, that might not have solved the problem if the goal was to return the same as a __str__.  (Though to be fair, I don’t really know what the actual problem was, so I might provide a different approach with a different goal <grin>).

I also have never actually used repr() to create code that could be fed back to the interpreter (not saying it isn’t done, just that I haven’t run into needing it), and there are so many of the libraries that do not return a usable repr string that I would hesitate to even try it outside of a very narrow use case.

Personally, I normally use __repr__ to give me a useful troubleshooting representation of the object... but that representation might not be exactly the same as what would recreate the object since I might show information that is dynamic to the current state (like, the name of the parent instead of just a pointer or something).

I know that isn’t per the rules, but in the end, it makes it easier for me to troubleshoot the code.

Having said that, Ben is totally correct in terms of the "right" way to do it, my earlier suggestion was not "right".

Dan



#108093 — Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure)

From: moa47401@gmail.com
Date: 2016-05-03 12:24 -0700
Subject: Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure)
Message-ID: <b1cb80ee-1455-4cae-9db3-8655d61f508e@googlegroups.com>
In reply to: #108092

quote - (Though to be fair, I don't really know what the actual problem was, so I might provide a different approach with a different goal <grin>)

Originally I was trying to understand the exact structure of the list being returned by the gedcom library. It worked as it was, but I wanted to add additional functionality.

I also wanted to understand what character set it was returning. I was giving it a gedcom file with ansel encoding, which is normal. My genealogy program can also export its database to gedcom using UTF-8 and Unicode. But both of those character sets caused the gedcom library to generate an error msg that the file violated GEDCOM format.

Keep in mind the gedcom format established by the Latter-day Saints hasn't been updated in 20+ years.



#108095 — Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure)

From: Random832 <random832@fastmail.com>
Date: 2016-05-03 15:37 -0400
Subject: Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure)
Message-ID: <mailman.359.1462304247.32212.python-list@python.org>
In reply to: #108093

On Tue, May 3, 2016, at 15:24, moa47401@gmail.com wrote:
> I also wanted to understand what character set it was returning. I was
> giving it a gedcom file with ansel encoding, which is normal. My
> genealogy program can also export its database to gedcom using UTF-8 and
> Unicode. But both of those character sets caused the gedcom library to
> generate an error msg that the file violated GEDCOM format.

You haven't said where this library can be found, or what the error
message was.



#108096

From: MRAB <python@mrabarnett.plus.com>
Date: 2016-05-03 20:57 +0100
Message-ID: <mailman.360.1462305440.32212.python-list@python.org>
In reply to: #108088

On 2016-05-03 18:54, Dan Strohl via Python-list wrote:
>
>> I added a __repr__ method at the end of the gedcom library like so:
>>
>> def __repr__(self):
>>         """ Format this element as its original string """
>>         result = repr(self.level())
>>         if self.pointer() != "":
>>             result += ' ' + self.pointer()
>>         result += ' ' + self.tag()
>>         if self.value() != "":
>>             result += ' ' + self.value()
>>         return result
>>
>> and now I can print myList properly.
>>
>> Erik and Michael also mentioned repr above, but I guess I needed someone
>> to spell it out for me. Thanks for taking the time to put it in terms an old dog
>> could understand.
>>
>
> Glad to help!  (being an old dog myself, I know the feeling!)
>
> One other point for you, if your "__repr__(self)" code is the same as the "__str__(self)" code (which it looks like it is, at a glance at least), you can instead reference the __str__ method and save having a duplicate code block...  some examples:
>
> =========
> Option 1:  This is the easiest to read (IMHO) and allows for the possibility that str() is doing something here like formatting or whatever (in this case it shouldn't be, though).  However, calling this actually takes multiple steps: calling object.__repr__(), which calls str(), which calls object.__str__().
>
> def __repr__(self):
>     return str(self)
>
> =========
> Option 2: this isn't hard to read, and just takes two steps (calling object.__repr__(), which calls object.__str__()).
>
> def __repr__(self):
>     return self.__str__()
>
> ========
> Option 3:  it's not that this is hard to read, but since it doesn't follow the standard "def blah(self):" pattern, sometimes I overlook these in the code (even when I put them there).  This however is the shortest since it really just tells the object to return object.__str__() if either object.__repr__() OR object.__str__() is called.
>
>
> __repr__ = __str__
>
>
> This probably doesn't matter much in this case, since it probably isn't called that much in normal use (though there are always exceptions), and in the end, Python is fast enough that unless you really need to slice off a few milliseconds, you will never notice the difference, but just some food for thought.
>
>
Option 4:

Delete the __str__ method.


If there's no __str__, Python falls back to __repr__.

If there's no __repr__, Python falls back to the <Something object at 
somewhere> format.



#108106 — Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure)

From: Chris Angelico <rosuav@gmail.com>
Date: 2016-05-04 09:40 +1000
Subject: Re: Use __repr__ to show the programmer's representation (was: Need help understanding list structure)
Message-ID: <mailman.369.1462318828.32212.python-list@python.org>
In reply to: #108088

On Wed, May 4, 2016 at 4:35 AM, Dan Strohl via Python-list
<python-list@python.org> wrote:
> I also have never actually used repr() to create code that could be fed back to the interpreter (not saying it isn’t done, just that I haven’t run into needing it), and there are so many of the libraries that do not return a usable repr string that I would hesitate to even try it outside of a very narrow use case.

Here's a repr that I like using with SQLAlchemy:

def __repr__(self):
    return (self.__class__.__name__ + "(" +
        ", ".join("%s=%r" % (col.name, getattr(self, col.name))
                  for col in self.__table__.columns) +
    ")")

That results in something that *looks* like you could eval it, but you
shouldn't ever actually do that (because it'd create a new object).
It's still an effective way to make the repr readable; imagine a list
that prints out like this:

[Person(id=3, name="Fred"), Person(id=6, name="Barney"), Person(id=8,
name="Joe")]

You can tell exactly where one starts and another ends; you can read
what's going on with these record objects. Making them
"pseudo-evalable" is worth doing, even if you should never *actually*
eval them.
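
The same idea can be tried without SQLAlchemy. In this sketch the Record base class and its _fields tuple are made up for the illustration; _fields simply plays the role that the table's column list plays in the method above:

```python
class Record:
    """Hypothetical base class; _fields lists the attributes to show."""

    _fields = ()

    def __repr__(self):
        return (self.__class__.__name__ + "(" +
                ", ".join("%s=%r" % (name, getattr(self, name))
                          for name in self._fields) +
                ")")


class Person(Record):
    _fields = ("id", "name")

    def __init__(self, id, name):
        self.id = id
        self.name = name


people = [Person(3, "Fred"), Person(6, "Barney")]
print(people)   # [Person(id=3, name='Fred'), Person(id=6, name='Barney')]
```

Because %r is used for the values, strings come out with quotes around them (single quotes by default), which is what keeps each record's boundaries unambiguous.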

ChrisA


