Groups > comp.lang.python > #103568 > unrolled thread
| Started by | subhabangalore@gmail.com |
|---|---|
| First post | 2016-02-27 01:17 -0800 |
| Last post | 2016-02-29 09:23 -0800 |
| Articles | 4 — 3 participants |
Error in Tree Structure subhabangalore@gmail.com - 2016-02-27 01:17 -0800
Re: Error in Tree Structure Steven D'Aprano <steve@pearwood.info> - 2016-02-27 22:29 +1100
Re: Error in Tree Structure Rustom Mody <rustompmody@gmail.com> - 2016-02-27 08:13 -0800
Re: Error in Tree Structure subhabangalore@gmail.com - 2016-02-29 09:23 -0800
| From | subhabangalore@gmail.com |
|---|---|
| Date | 2016-02-27 01:17 -0800 |
| Subject | Error in Tree Structure |
| Message-ID | <33c93316-0ae3-4b8f-b2f6-29f76c8b3f32@googlegroups.com> |
I was trying to implement the code,
import nltk
import nltk.tag, nltk.chunk, itertools

def ieertree2conlltags(tree, tag=nltk.tag.pos_tag):
    words, ents = zip(*tree.pos())
    iobs = []
    prev = None
    for ent in ents:
        if ent == tree.node:
            iobs.append('O')
            prev = None
        elif prev == ent:
            iobs.append('I-%s' % ent)
        else:
            iobs.append('B-%s' % ent)
            prev = ent
    words, tags = zip(*tag(words))
    return itertools.izip(words, tags, iobs)

def ieer_chunked_sents(tag=nltk.tag.pos_tag):
    for doc in ieer.parsed_docs():
        tagged = ieertree2conlltags(doc.text, tag)
        yield nltk.chunk.conlltags2tree(tagged)


from chunkers import ieer_chunked_sents, ClassifierChunker
from nltk.corpus import treebank_chunk
ieer_chunks = list(ieer_chunked_sents())
chunker = ClassifierChunker(ieer_chunks[:80])
print chunker.parse(treebank_chunk.tagged_sents()[0])
score = chunker.evaluate(ieer_chunks[80:])
print score.accuracy()

It runs fine.
But when I try to rewrite the code as
chunker = ClassifierChunker(list1)
where list1 has the same value as
ieer_chunks[:80]
except that I paste the value in directly:
[Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','), Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION', [(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL', [(u'Thousands', 'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'), (u'students', 'NNS'), (u'and', 'CC'), (u'opposition', 'NN'), (u'politicians', 'NNS'), (u'on', 'IN'), Tree('DATE', [(u'Saturday', 'NNP')]), (u'protested', 'VBD'), (u'tax', 'NN'), (u'hikes', 'NNS'), (u'imposed', 'VBN'), (u'by', 'IN'), (u'their', 'PRP$'), (u'cash-strapped', 'JJ'), (u'government,', 'NN'), (u'which', 'WDT'), (u'they', 'PRP'), (u'accused', 'VBD'), (u'of', 'IN'), (u'failing', 'VBG'), (u'to', 'TO'), (u'provide', 'VB'), (u'basic', 'JJ'), (u'services.', 'NN'),....(u'(cm-kjd)', 'NN')])]
When I paste the value of the whole list directly like this, I get a syntax error.
I also tried pasting it into the Python IDE outside the code, and it gives a syntax error there too.
If I do not paste the value and instead assign ieer_chunks[:80] to list1, there is no error.
I may be making some mistake while copying and pasting the value,
but I did not change anything in it.
Is this an error in Python or in NLTK?
If anyone could tell me what mistake I am making and how I can solve it, I would be grateful.
Thanks in advance.
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-02-27 22:29 +1100 |
| Message-ID | <56d1888f$0$1586$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #103568 |
On Sat, 27 Feb 2016 08:17 pm, subhabangalore@gmail.com wrote:
> Is it any error in Python part or in NLTK part?
Neither.
Any time you think there is an error in Python, it is 99.9% sure that the
error is in your code, not Python.
If the error is a SyntaxError, that is 99.99999%.
> If any one may guide me what is the error I am doing and how may I solve
> it.
Look at the SyntaxError traceback and read what it says. Does it tell you
what the error is? Does it use a ^ as an arrow to point to the error, or
immediately after the error?
Chances are, the error is that you have added or deleted a bracket or
parenthesis somewhere. Dealing with nested lists like this is usually
awful, because they are unreadable. Can you avoid copying and pasting the
nested list?
If not, try re-formatting it so you can at least read it:
# Unreadable, bad:
thelist = [Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','),
Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION',
[(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL',
[(u'Thousands', 'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'),
(u'students', 'NNS'), (u'and', 'CC'), (u'opposition', 'NN'),
(u'politicians', 'NNS'), (u'on', 'IN'), Tree('DATE',
[(u'Saturday', 'NNP')]), (u'protested', 'VBD'), (u'tax', 'NN'),
(u'hikes', 'NNS'), (u'imposed', 'VBN'), (u'by', 'IN'), (u'their', 'PRP$'),
(u'cash-strapped', 'JJ'), (u'government,', 'NN'), (u'which', 'WDT'),
(u'they', 'PRP'), (u'accused', 'VBD'), (u'of', 'IN'), (u'failing', 'VBG'),
(u'to', 'TO'), (u'provide', 'VB'), (u'basic', 'JJ'), (u'services.', 'NN'),
(u'(cm-kjd)', 'NN')])]
# Slightly more readable, good:
thelist = [
    Tree('S', [
        Tree(
            'LOCATION', [
                (u'NAIROBI', 'NNP')
            ]
        ),
        (u',', ','),
        Tree(
            'LOCATION', [
                (u'Kenya', 'NNP')
            ]
        ),
        (u'(', '('),
        Tree(
            'ORGANIZATION', [
                (u'AP', 'NNP')
            ]
        ),
        (u')', ')'),
        (u'_', 'NNP'),
        Tree(
            'CARDINAL', [
                (u'Thousands', 'NNP')
            ]
        ),
        (u'of', 'IN'),
        (u'laborers,', 'JJ'),
        (u'students', 'NNS'),
        (u'and', 'CC'),
        (u'opposition', 'NN'),
        (u'politicians', 'NNS'),
        (u'on', 'IN'),
        Tree(
            'DATE', [
                (u'Saturday', 'NNP')
            ]
        ),
        (u'protested', 'VBD'),
        (u'tax', 'NN'),
        (u'hikes', 'NNS'),
        (u'imposed', 'VBN'),
        (u'by', 'IN'),
        (u'their', 'PRP$'),
        (u'cash-strapped', 'JJ'),
        (u'government,', 'NN'),
        (u'which', 'WDT'),
        (u'they', 'PRP'),
        (u'accused', 'VBD'),
        (u'of', 'IN'),
        (u'failing', 'VBG'),
        (u'to', 'TO'),
        (u'provide', 'VB'),
        (u'basic', 'JJ'),
        (u'services.', 'NN'),
        (u'(cm-kjd)', 'NN')
        ]
    )
]
If you format your giant list so you can see the structure, then you will
likely find where there is a problem. Perhaps you have:
- missing or extra [ ( ) or ]
- a missing or extra comma
- missing or extra quotation marks
- unexpected symbols like .... inside the list
--
Steven
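The first item in the checklist above (missing or extra brackets) can also be checked mechanically. What follows is only a rough sketch in Python 3, not something posted in the thread: the function name find_bracket_problem is made up for illustration, and the scanner assumes the pasted text contains only simple single- or double-quoted strings (no triple quotes, no comments), which is what a printed list normally looks like.

```python
# Closing bracket -> the opening bracket it must match.
PAIRS = {')': '(', ']': '[', '}': '{'}
OPENERS = set(PAIRS.values())

def find_bracket_problem(source):
    """Return a description of the first bracket problem, or None if balanced.

    Hypothetical helper: brackets inside quoted strings such as (u'(', '(')
    are ignored, so they are not mistaken for structure.
    """
    stack = []      # (opening character, position) of every unclosed bracket
    quote = None    # the active quote character while inside a string
    i = 0
    while i < len(source):
        ch = source[i]
        if quote:
            if ch == '\\':
                i += 1          # skip the character after a backslash
            elif ch == quote:
                quote = None    # end of the string literal
        elif ch in '\'"':
            quote = ch
        elif ch in OPENERS:
            stack.append((ch, i))
        elif ch in PAIRS:
            if not stack or stack[-1][0] != PAIRS[ch]:
                return "unexpected %r at position %d" % (ch, i)
            stack.pop()
        i += 1
    if quote:
        return "unterminated string literal"
    if stack:
        ch, pos = stack[-1]
        return "%r opened at position %d is never closed" % (ch, pos)
    return None

print(find_bracket_problem("[(u'(', '('), (u')', ')')]"))   # None
print(find_bracket_problem("[(1, 2])"))   # unexpected ']' at position 6
```

A check like this only covers brackets and quotes. Stray commas or symbols such as .... still need the SyntaxError message or a careful read of the re-formatted list.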
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2016-02-27 08:13 -0800 |
| Message-ID | <573e58fa-36af-4e32-9460-5335d889df26@googlegroups.com> |
| In reply to | #103568 |
On Saturday, February 27, 2016 at 2:47:53 PM UTC+5:30, subhaba...@gmail.com wrote:
> [Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','), Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION', [(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL', [(u'Thousands', 'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'), (u'students', 'NNS'), (u'and', 'CC'), (u'opposition', 'NN'), (u'politicians', 'NNS'), (u'on', 'IN'), Tree('DATE', [(u'Saturday', 'NNP')]), (u'protested', 'VBD'), (u'tax', 'NN'), (u'hikes', 'NNS'), (u'imposed', 'VBN'), (u'by', 'IN'), (u'their', 'PRP$'), (u'cash-strapped', 'JJ'), (u'government,', 'NN'), (u'which', 'WDT'), (u'they', 'PRP'), (u'accused', 'VBD'), (u'of', 'IN'), (u'failing', 'VBG'), (u'to', 'TO'), (u'provide', 'VB'), (u'basic', 'JJ'), (u'services.', 'NN'),....(u'(cm-kjd)', 'NN')])]
> the value of whole list directly I am getting syntax error.
Dunno how literally you intend this but there is a "...." near the end
of the list. Intended?
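To see what the "...." does, one option is to hand the pasted text to the built-in compile(), which parses it without running it and reports where parsing stopped. This is a minimal sketch in Python 3, not code from the thread: locate_syntax_error is a made-up helper name, and the string below is a shortened stand-in for the pasted list, using plain tuples so that it does not need NLTK.

```python
def locate_syntax_error(source):
    """Parse source without running it; return (line, column, message) or None."""
    try:
        compile(source, "<pasted list>", "eval")
    except SyntaxError as err:
        return err.lineno, err.offset, err.msg
    return None

# Shortened stand-in for the pasted value, keeping the same "...." marker.
pasted = "[('basic', 'JJ'), ('services.', 'NN'),....('(cm-kjd)', 'NN')]"

# The same text with the marker replaced by an ordinary separator.
cleaned = pasted.replace(",....", ", ")

print(locate_syntax_error(pasted))    # a (line, column, message) tuple
print(locate_syntax_error(cleaned))   # None
```

The reported column and message differ between Python versions, so they are best treated as a pointer to the neighbourhood of the problem rather than an exact location.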
| From | subhabangalore@gmail.com |
|---|---|
| Date | 2016-02-29 09:23 -0800 |
| Message-ID | <07d3149e-9867-478a-bc1c-65263ad50659@googlegroups.com> |
| In reply to | #103589 |
On Saturday, February 27, 2016 at 9:43:56 PM UTC+5:30, Rustom Mody wrote:
> On Saturday, February 27, 2016 at 2:47:53 PM UTC+5:30, subhaba...@gmail.com wrote:
> > [Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','), Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION', [(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL', [(u'Thousands', 'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'), (u'students', 'NNS'), (u'and', 'CC'), (u'opposition', 'NN'), (u'politicians', 'NNS'), (u'on', 'IN'), Tree('DATE', [(u'Saturday', 'NNP')]), (u'protested', 'VBD'), (u'tax', 'NN'), (u'hikes', 'NNS'), (u'imposed', 'VBN'), (u'by', 'IN'), (u'their', 'PRP$'), (u'cash-strapped', 'JJ'), (u'government,', 'NN'), (u'which', 'WDT'), (u'they', 'PRP'), (u'accused', 'VBD'), (u'of', 'IN'), (u'failing', 'VBG'), (u'to', 'TO'), (u'provide', 'VB'), (u'basic', 'JJ'), (u'services.', 'NN'),....(u'(cm-kjd)', 'NN')])]
> > the value of whole list directly I am getting syntax error.
>
> Dunno how literally you intend this but there is a "...." near the end
> of the list. Intended?
It is intended, as the actual list was large.
And I could most likely solve the problem with
from nltk.tree import Tree
which I had missed in my code.
Thank you for your kind time and discussion.
Regards,
RP
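The point about the import can be illustrated without NLTK installed. The sketch below is Python 3 and rests on assumptions: the Tree class here is a hypothetical stand-in written only for this example (the real class is nltk.tree.Tree), and try_eval is a made-up helper. It shows that a pasted Tree(...) expression only evaluates when the name Tree is in scope.

```python
class Tree(list):
    """Hypothetical stand-in for nltk.tree.Tree so the example runs without NLTK."""
    def __init__(self, label, children=()):
        super().__init__(children)
        self.label = label

# Shortened stand-in for the pasted value, with no "...." in it.
pasted = "[Tree('S', [Tree('LOCATION', [('NAIROBI', 'NNP')]), (',', ',')])]"

def try_eval(source, namespace):
    """Evaluate source with only the given names; return the error name on failure."""
    try:
        return eval(source, dict(namespace))
    except (NameError, SyntaxError) as err:
        return type(err).__name__

print(try_eval(pasted, {}))                 # NameError
trees = try_eval(pasted, {"Tree": Tree})
print(trees[0].label, len(trees[0]))        # S 2
```

Note that a missing name shows up as a NameError rather than a SyntaxError, so the "...." marker also has to be removed before the pasted text is valid Python.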