

Groups > comp.lang.python > #103568 > unrolled thread

Error in Tree Structure

Started by: subhabangalore@gmail.com
First post: 2016-02-27 01:17 -0800
Last post: 2016-02-29 09:23 -0800
Articles: 4 (3 participants)



Contents

  Error in Tree Structure subhabangalore@gmail.com - 2016-02-27 01:17 -0800
    Re: Error in Tree Structure Steven D'Aprano <steve@pearwood.info> - 2016-02-27 22:29 +1100
    Re: Error in Tree Structure Rustom Mody <rustompmody@gmail.com> - 2016-02-27 08:13 -0800
      Re: Error in Tree Structure subhabangalore@gmail.com - 2016-02-29 09:23 -0800

#103568 — Error in Tree Structure

From: subhabangalore@gmail.com
Date: 2016-02-27 01:17 -0800
Subject: Error in Tree Structure
Message-ID: <33c93316-0ae3-4b8f-b2f6-29f76c8b3f32@googlegroups.com>

I was trying to run the following code:

import nltk
import nltk.tag, nltk.chunk, itertools
from nltk.corpus import ieer  # needed by ieer_chunked_sents() below
def ieertree2conlltags(tree, tag=nltk.tag.pos_tag):
    words, ents = zip(*tree.pos())
    iobs = []
    prev = None
    for ent in ents:
        if ent == tree.node:
            iobs.append('O')
            prev = None
        elif prev == ent:
            iobs.append('I-%s' % ent)
        else:
            iobs.append('B-%s' % ent)
            prev = ent
    words, tags = zip(*tag(words))
    return itertools.izip(words, tags, iobs)

def ieer_chunked_sents(tag=nltk.tag.pos_tag):
    for doc in ieer.parsed_docs():
        tagged = ieertree2conlltags(doc.text, tag)
        yield nltk.chunk.conlltags2tree(tagged)

                
from chunkers import ieer_chunked_sents, ClassifierChunker
from nltk.corpus import treebank_chunk
ieer_chunks = list(ieer_chunked_sents())
chunker = ClassifierChunker(ieer_chunks[:80])
print chunker.parse(treebank_chunk.tagged_sents()[0])
score = chunker.evaluate(ieer_chunks[80:])
print score.accuracy()

It runs fine.
But when I try to rewrite the code as
chunker = ClassifierChunker(list1)
where list1 has the same value as
ieer_chunks[:80]
except that I paste the value in directly as
[Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','), Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION', [(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL', [(u'Thousands', 'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'), (u'students', 'NNS'), (u'and', 'CC'), (u'opposition', 'NN'), (u'politicians', 'NNS'), (u'on', 'IN'), Tree('DATE', [(u'Saturday', 'NNP')]), (u'protested', 'VBD'), (u'tax', 'NN'), (u'hikes', 'NNS'), (u'imposed', 'VBN'), (u'by', 'IN'), (u'their', 'PRP$'), (u'cash-strapped', 'JJ'), (u'government,', 'NN'), (u'which', 'WDT'), (u'they', 'PRP'), (u'accused', 'VBD'), (u'of', 'IN'), (u'failing', 'VBG'), (u'to', 'TO'), (u'provide', 'VB'), (u'basic', 'JJ'), (u'services.', 'NN'),....(u'(cm-kjd)', 'NN')])]
that is, the value of the whole list, I get a syntax error.
I also tried pasting it into the Python IDE outside the code, and it gives a syntax error there too.
If I do not paste the value and instead just assign ieer_chunks[:80] to list1, there is no error.
I may be making some mistake while copying and pasting the value,
but I did not change anything in it.

Is this an error in Python or in NLTK?

If anyone can tell me what I am doing wrong and how I can solve it, I would be grateful.
Thanks in advance.


 



#103576

From: Steven D'Aprano <steve@pearwood.info>
Date: 2016-02-27 22:29 +1100
Message-ID: <56d1888f$0$1586$c3e8da3$5496439d@news.astraweb.com>
In reply to: #103568

On Sat, 27 Feb 2016 08:17 pm, subhabangalore@gmail.com wrote:

> Is it any error in Python part or in NLTK part?

Neither.

Any time you think there is an error in Python, it is 99.9% sure that the
error is in your code, not Python.

If the error is a SyntaxError, that is 99.99999%.

> If any one may guide me what is the error I am doing and how may I solve
> it.

Look at the SyntaxError traceback and read what it says. Does it tell you
what the error is? Does it use a ^ as an arrow to point to the error, or
immediately after the error?
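
To make that concrete, here is a minimal sketch (not from the original post) that compiles a pasted snippet without running it and prints the same information the traceback carries. The helper name `syntax_error_report` and the shortened `pasted` string are made up for illustration; the string only imitates the tail of the list in the question, including the "....".

```python
def syntax_error_report(source):
    """Compile source without running it; return None if it is valid,
    otherwise a short report with the message, line and a ^ marker."""
    try:
        compile(source, '<pasted>', 'exec')
    except SyntaxError as err:
        lines = ['SyntaxError: %s (line %s)' % (err.msg, err.lineno)]
        if err.text and err.offset:
            # err.offset is 1-based, so shift the caret by offset - 1.
            lines.append(err.text.rstrip('\n'))
            lines.append(' ' * (err.offset - 1) + '^')
        return '\n'.join(lines)
    return None


# Hypothetical, shortened stand-in for the pasted list.
pasted = "list1 = [(u'services.', 'NN'),....(u'(cm-kjd)', 'NN')]"
print(syntax_error_report(pasted))
```

The exact message and caret position vary between Python versions, so treat the printed report as a pointer to the neighbourhood of the mistake rather than its exact location.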

Chances are, the error is that you have added or deleted a bracket or
parenthesis somewhere. Dealing with nested lists like this is usually
awful, because they are unreadable. Can you avoid copying and pasting the
nested list?
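
One possible way to avoid the copy and paste altogether is to save the list to a file with the standard `pickle` module and load it back later. This is only a sketch: the function names, the file name and the `sample` data are invented for illustration, and plain tuples stand in for the NLTK trees, so whether the real `ieer_chunks[:80]` round-trips this way is an assumption that would need checking.

```python
import os
import pickle
import tempfile


def save_chunks(chunks, path):
    """Write any picklable object to path."""
    with open(path, 'wb') as handle:
        pickle.dump(chunks, handle)


def load_chunks(path):
    """Read back whatever save_chunks() stored at path."""
    with open(path, 'rb') as handle:
        return pickle.load(handle)


# Stand-in data: plain lists and tuples instead of nltk Tree objects.
sample = [[('NAIROBI', 'NNP'), (',', ','), ('Kenya', 'NNP')]]
path = os.path.join(tempfile.mkdtemp(), 'chunks.pickle')  # hypothetical file

save_chunks(sample, path)
restored = load_chunks(path)
print(restored == sample)
```

Because nothing is ever pasted as source code, there is no literal for the parser to reject.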

If not, try re-formatting it so you can at least read it:

# Unreadable, bad:
thelist = [Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','),
Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION',
[(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL',
[(u'Thousands', 'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'),
(u'students', 'NNS'), (u'and', 'CC'), (u'opposition', 'NN'),
(u'politicians', 'NNS'), (u'on', 'IN'), Tree('DATE',
[(u'Saturday', 'NNP')]), (u'protested', 'VBD'), (u'tax', 'NN'),
(u'hikes', 'NNS'), (u'imposed', 'VBN'), (u'by', 'IN'), (u'their', 'PRP$'),
(u'cash-strapped', 'JJ'), (u'government,', 'NN'), (u'which', 'WDT'),
(u'they', 'PRP'), (u'accused', 'VBD'), (u'of', 'IN'), (u'failing', 'VBG'),
(u'to', 'TO'), (u'provide', 'VB'), (u'basic', 'JJ'), (u'services.', 'NN'),
(u'(cm-kjd)', 'NN')])]


# Slightly more readable, good:
thelist = [
          Tree('S', [
                    Tree(
                        'LOCATION', [
                                    (u'NAIROBI', 'NNP')
                                    ]
                        ), 
                    (u',', ','), 
                    Tree(
                        'LOCATION', [
                                    (u'Kenya', 'NNP')
                                    ]
                        ), 
                    (u'(', '('), 
                    Tree(
                        'ORGANIZATION', [
                                        (u'AP', 'NNP')
                                        ]
                        ),
                    (u')', ')'), 
                    (u'_', 'NNP'), 
                    Tree(
                        'CARDINAL', [
                                    (u'Thousands', 'NNP')
                                    ]
                        ), 
                    (u'of', 'IN'), 
                    (u'laborers,', 'JJ'), 
                    (u'students', 'NNS'), 
                    (u'and', 'CC'), 
                    (u'opposition', 'NN'), 
                    (u'politicians', 'NNS'), 
                    (u'on', 'IN'), 
                    Tree(
                        'DATE', [
                                (u'Saturday', 'NNP')
                                ]
                        ), 
                    (u'protested', 'VBD'), 
                    (u'tax', 'NN'), 
                    (u'hikes', 'NNS'), 
                    (u'imposed', 'VBN'), 
                    (u'by', 'IN'), 
                    (u'their', 'PRP$'), 
                    (u'cash-strapped', 'JJ'), 
                    (u'government,', 'NN'), 
                    (u'which', 'WDT'), 
                    (u'they', 'PRP'), 
                    (u'accused', 'VBD'), 
                    (u'of', 'IN'), 
                    (u'failing', 'VBG'), 
                    (u'to', 'TO'), 
                    (u'provide', 'VB'), 
                    (u'basic', 'JJ'), 
                    (u'services.', 'NN'), 
                    (u'(cm-kjd)', 'NN')
                    ]
              )   
          ]



If you format your giant list so you can see the structure, then you will
likely find where there is a problem. Perhaps you have:

- missing or too many [ ( ) or ]

- a missing or extra comma

- missing or extra quotation marks

- unexpected symbols like .... inside the list
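
The first item in the list above can be checked mechanically. What follows is a rough sketch, not a full Python parser: the hypothetical helper `find_bracket_problem` walks the text once, skips over single-line string literals (so a bracket inside a string such as u'(' is not counted), and reports the first bracket that does not match. It assumes there are no triple-quoted strings or comments in the pasted text, and it will not notice stray symbols like "....".

```python
PAIRS = {')': '(', ']': '[', '}': '{'}
OPENERS = set(PAIRS.values())


def find_bracket_problem(text):
    """Return a message describing the first bracket problem, or None."""
    stack = []     # (bracket, offset) for every bracket still open
    quote = None   # the quote character while inside a string literal
    i = 0
    while i < len(text):
        ch = text[i]
        if quote:
            if ch == '\\':
                i += 1            # skip the escaped character
            elif ch == quote:
                quote = None      # end of the string literal
        elif ch in ('"', "'"):
            quote = ch
        elif ch in OPENERS:
            stack.append((ch, i))
        elif ch in PAIRS:
            if not stack or stack[-1][0] != PAIRS[ch]:
                return 'unexpected %r at offset %d' % (ch, i)
            stack.pop()
        i += 1
    if quote:
        return 'unterminated string literal'
    if stack:
        ch, pos = stack[-1]
        return 'unclosed %r opened at offset %d' % (ch, pos)
    return None


print(find_bracket_problem("[(u'(', '('), (u')', ')')]"))
print(find_bracket_problem("[Tree('S', [(u'a', 'NN')])"))
```

The first call should report no problem, because the brackets inside the strings are ignored; the second is missing the final ] and should be reported as unclosed.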




-- 
Steven



#103589

From: Rustom Mody <rustompmody@gmail.com>
Date: 2016-02-27 08:13 -0800
Message-ID: <573e58fa-36af-4e32-9460-5335d889df26@googlegroups.com>
In reply to: #103568

On Saturday, February 27, 2016 at 2:47:53 PM UTC+5:30, subhaba...@gmail.com wrote:
> only I am pasting the value as 
> [Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','), Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION', [(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL', [(u'Thousands', 'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'), (u'students', 'NNS'), (u'and', 'CC'), (u'opposition', 'NN'), (u'politicians', 'NNS'), (u'on', 'IN'), Tree('DATE', [(u'Saturday', 'NNP')]), (u'protested', 'VBD'), (u'tax', 'NN'), (u'hikes', 'NNS'), (u'imposed', 'VBN'), (u'by', 'IN'), (u'their', 'PRP$'), (u'cash-strapped', 'JJ'), (u'government,', 'NN'), (u'which', 'WDT'), (u'they', 'PRP'), (u'accused', 'VBD'), (u'of', 'IN'), (u'failing', 'VBG'), (u'to', 'TO'), (u'provide', 'VB'), (u'basic', 'JJ'), (u'services.', 'NN'),....(u'(cm-kjd)', 'NN')])]
> the value of whole list directly I am getting syntax error.

Dunno how literally you intend this but there is a "...." near the end 
of the list. Intended?



#103738

From: subhabangalore@gmail.com
Date: 2016-02-29 09:23 -0800
Message-ID: <07d3149e-9867-478a-bc1c-65263ad50659@googlegroups.com>
In reply to: #103589

On Saturday, February 27, 2016 at 9:43:56 PM UTC+5:30, Rustom Mody wrote:
> On Saturday, February 27, 2016 at 2:47:53 PM UTC+5:30, subhaba...@gmail.com wrote:
> > the value of whole list directly I am getting syntax error.
> 
> Dunno how literally you intend this but there is a "...." near the end 
> of the list. Intended?

It is intended, as the actual list was large.
And most likely I could solve the problem
by adding
from nltk.tree import Tree
which I had missed in my code.
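
For anyone reading later, a small sketch of how a missing import shows up may be useful. It is an illustration only: `try_eval` and `fake_tree` are invented names, and `fake_tree` is a stand-in so that the example runs without NLTK installed. As far as I can tell, a pasted list that is syntactically valid compiles without the import, and the missing name appears as a NameError only when the code is evaluated.

```python
def try_eval(expr, namespace):
    """Evaluate expr with the given names; describe a NameError if raised."""
    try:
        return eval(expr, dict(namespace))
    except NameError as err:
        return 'NameError: %s' % err


# Shortened, syntactically valid stand-in for the pasted list.
pasted = "[Tree('S', [(u'NAIROBI', 'NNP')])]"

# Compiling succeeds: the text is valid Python even though Tree is unknown.
compile(pasted, '<pasted>', 'eval')

# Without any Tree in scope, evaluation fails with a NameError.
print(try_eval(pasted, {}))


def fake_tree(label, children):
    """Hypothetical stand-in for nltk.tree.Tree, used only in this sketch."""
    return (label, children)


# With a name called Tree available, the same text evaluates.
print(try_eval(pasted, {'Tree': fake_tree}))
```

In real code the second namespace would come from `from nltk.tree import Tree` rather than from a stand-in.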

Thank you for your kind time and discussion.

Regards,
RP
