| Started by | subhabangalore@gmail.com |
|---|---|
| First post | 2016-04-25 07:56 -0700 |
| Last post | 2016-04-25 20:31 +0000 |
| Articles | 7 — 4 participants |
Question on List processing subhabangalore@gmail.com - 2016-04-25 07:56 -0700
Re: Question on List processing Steven D'Aprano <steve@pearwood.info> - 2016-04-26 02:36 +1000
Re: Question on List processing subhabangalore@gmail.com - 2016-04-26 08:38 -0700
Re: Question on List processing Random832 <random832@fastmail.com> - 2016-04-26 11:49 -0400
Re: Question on List processing Steven D'Aprano <steve@pearwood.info> - 2016-04-27 02:29 +1000
Re: Question on List processing subhabangalore@gmail.com - 2016-04-26 08:44 -0700
Re: Question on List processing Matt Wheeler <m@funkyhat.org> - 2016-04-25 20:31 +0000
| From | subhabangalore@gmail.com |
|---|---|
| Date | 2016-04-25 07:56 -0700 |
| Subject | Question on List processing |
| Message-ID | <d0b4c737-3922-4b49-8f69-2564ba472950@googlegroups.com> |
Dear Group,
I have a list of tuples, as follows,
list1=[u"('koteeswaram/BHPERSN engaged/NA himself/NA in/NA various/NA philanthropic/NA activities/NA ','class1')",
u"('koteeswaram/BHPERSN is/NA a/NA very/NA nice/NA person/NA ','class1')",
u"('koteeswaram/BHPERSN came/NA to/NA mumbai/LOC but/NA could/NA not/NA attend/NA the/ARTDEF board/NA meeting/NA ','class1')",
u"('the/ARTDEF people/NA of/NA the/ARTDEF company ABCOMP did/NA not/NA vote/NA for/NA koteeswaram/LOC ','class2')",
u"('the/ARTDEF director AHT of/NA the/ARTDEF company,/NA koteeswaram/BHPERSN had/NA been/NA advised/NA to/NA take/NA rest/NA for/NA a/NA while/NA ','class2')",
u"('animesh/BHPERSN chauhan/BHPERSN arrived/NA by/NA his/PRNM3PAS private/NA aircraft/NA in/NA mumbai/LOC ','class2')",
u"('animesh/BHPERSN chauhan/BHPERSN met/NA the/ARTDEF prime/HPLPERST minister/AHT of/NA india/LOCC over/NA some/NA issues/NA ','class2')",
u"('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA set/NA up/NA a/NA plant/NA in/NA uk/LOCC ','class3')",
u"('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA launch/NA a/NA new/ABCOMP office/AHT in/NA burdwan/LOC ','class3')",
u"('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA work/NA out/NA the/ARTDEF launch/NA of/NA a/NA new/ABCOMP product/NA in/NA india/LOCC ','class3')"]
I want to make it like,
[('koteeswaram/BHPERSN engaged/NA himself/NA in/NA various/NA philanthropic/NA activities/NA','class1'),
('koteeswaram/BHPERSN is/NA a/NA very/NA nice/NA person/NA ','class1'),
('koteeswaram/BHPERSN came/NA to/NA mumbai/LOC but/NA could/NA not/NA attend/NA the/ARTDEF board/NA meeting/NA','class1'),
('the/ARTDEF people/NA of/NA the/ARTDEF company ABCOMP did/NA not/NA vote/NA for/NA koteeswaram/LOC ','class2'),
('the/ARTDEF director AHT of/NA the/ARTDEF company,/NA koteeswaram/BHPERSN had/NA been/NA advised/NA to/NA take/NA rest/NA for/NA a/NA while/NA ','class2'),
('animesh/BHPERSN chauhan/BHPERSN arrived/NA by/NA his/PRNM3PAS private/NA aircraft/NA in/NA mumbai/LOC','class2'),
('animesh/BHPERSN chauhan/BHPERSN met/NA the/ARTDEF prime/HPLPERST minister/AHT of/NA india/LOCC over/NA some/NA issues/NA','class2'),
('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA set/NA up/NA a/NA plant/NA in/NA uk/LOCC','class3'),
('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA launch/NA a/NA new/ABCOMP office/AHT in/NA burdwan/LOC','class3'),
('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA work/NA out/NA the/ARTDEF launch/NA of/NA a/NA new/ABCOMP product/NA in/NA india/LOCC','class3')]
I tried to make it as follows,
list2=[]
for i in train_sents:
    a1=unicodedata.normalize('NFKD', i).encode('ascii','ignore')
    a2=a1.replace('"',"")
    list2.append(a2)
and,
for i in list1:
    a3=i[1:-1]
    list2.append(a3)
but not helping.
If any one may kindly suggest how may I approach it?
Thanks in Advance,
Regards,
Subhabrata Banerjee.
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-04-26 02:36 +1000 |
| Message-ID | <571e47aa$0$1588$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #107610 |
On Tue, 26 Apr 2016 12:56 am, subhabangalore@gmail.com wrote:
> Dear Group,
>
> I have a list of tuples, as follows,
>
> list1=[u"('koteeswaram/BHPERSN engaged/NA himself/NA in/NA various/NA
[... 17 more lines of data ...]
Hi Subhabrata, and thanks for the question.
Please remember that we are offering help for free, in our own time. If you
want help from us, you should help us to help you.
It is very unlikely that many people will spend the time to study your data
in close enough detail to understand your requirements. Please give a
*simplified* example. Instead of 17 lines of repetitive data, use a "toy"
example that matches the format but without all the complicated details.
And format it so that it is easy to read:
input = [u"('a/b/ ','A')",
         u"('z/x/ ','B')",
         u"('b/d/ ','C')",
         ]
output = ????
> I tried to make it as follows,
[...]
> but not helping.
What do you mean, "not helping"? What happens when you try?
Please show a *simple* example, with no more than four or five lines of
*short, easy to read* text.
Remember, we are giving you advice and consulting for free. We are not paid
to do this. If your questions are too difficult, boring, tedious, or
unpleasant, we will just ignore them, so please help us to help you by
simplifying them as much as possible.
Thank you.
--
Steven
| From | subhabangalore@gmail.com |
|---|---|
| Date | 2016-04-26 08:38 -0700 |
| Message-ID | <594d024b-7880-4d45-911e-342ec90ac54c@googlegroups.com> |
| In reply to | #107617 |
Dear Steven,
Thank you for your kind suggestion.
I will keep it in mind.
I am trying to send you a revised example.
list1=[u"('koteeswaram/BHPERSN engaged/NA ','class1')", u"('koteeswaram/BHPERSN is/NA ','class1')"]
[('koteeswaram/BHPERSN engaged/NA ','class1'),
('koteeswaram/BHPERSN is/NA ','class1')]
I tried to make it as follows,
list2=[]
for i in list1:
    a1=unicodedata.normalize('NFKD', i).encode('ascii','ignore')
    a2=a1.replace('"',"")
    list2.append(a2)
and,
for i in list1:
    a3=i[1:-1]
    list2.append(a3)
but I am not getting the desired output.
Could anyone kindly suggest how I might approach this?
Regards,
Subhabrata
| From | Random832 <random832@fastmail.com> |
|---|---|
| Date | 2016-04-26 11:49 -0400 |
| Message-ID | <mailman.111.1461685776.32212.python-list@python.org> |
| In reply to | #107658 |
On Tue, Apr 26, 2016, at 11:38, subhabangalore@gmail.com wrote:
> I am trying to send you a revised example.
> list1=[u"('koteeswaram/BHPERSN engaged/NA ','class1')",
> u"('koteeswaram/BHPERSN is/NA ','class1')"]
>
> [('koteeswaram/BHPERSN engaged/NA ','class1'),
> ('koteeswaram/BHPERSN is/NA ','class1')]
>
> I tried to make it as follows,
> list2=[]
> for i in list1:
>     a1=unicodedata.normalize('NFKD', i).encode('ascii','ignore')
>     a2=a1.replace('"',"")
>     list2.append(a2)
I think you're still a bit confused. The values don't actually contain
'"' (or 'u'), that's just an indicator that they're strings. You can't
turn a string into something else just by removing the quotes. Look at
the ast.literal_eval function as others have recommended.
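A minimal sketch of that suggestion, using the shortened data from the quoted message. The names `raw_items` and `pairs` are illustrative and do not come from the thread. `ast.literal_eval` is part of the standard library and only accepts Python literals, so it does not execute arbitrary code the way `eval` would.

```python
import ast

# Each element is a *string* whose text happens to look like a tuple literal.
raw_items = [
    u"('koteeswaram/BHPERSN engaged/NA ','class1')",
    u"('koteeswaram/BHPERSN is/NA ','class1')",
]

# literal_eval parses the text of each string as a Python literal,
# producing a real tuple that holds two strings.
pairs = [ast.literal_eval(item) for item in raw_items]

print(pairs)
print(type(pairs[0]).__name__)  # tuple
```

Because the parsing works on the text of the literal, any trailing space inside the first element is kept exactly as it appears in the input.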
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-04-27 02:29 +1000 |
| Message-ID | <571f976e$0$1619$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #107658 |
On Wed, 27 Apr 2016 01:38 am, subhabangalore@gmail.com wrote:
> I am trying to send you a revised example.
> list1=[u"('koteeswaram/BHPERSN engaged/NA ','class1')",
> u"('koteeswaram/BHPERSN is/NA ','class1')"]
Please don't use generic names that mean nothing like "list1". We can see it
is a list, but what is it for? Use a name that describes what the purpose
of the list is. Even "input" and "output" are better names.
> [('koteeswaram/BHPERSN engaged/NA ','class1'),
> ('koteeswaram/BHPERSN is/NA ','class1')]
What is this? The output? Don't make us guess what things are.
My *guess* is that you have a list of Unicode strings that look like this:
u"('aaa/TAG bbb/TAG ','class1')"
and you want to do six things:
- normalise the string;
- convert the Unicode string to ASCII, ignoring anything that isn't ASCII;
- delete the parentheses in the string;
- delete the leading and trailing single quotes;
- split the string on the comma;
- combine them into a tuple.
So let's make some functions:
# Untested; written for Python 2, where `unicode` is a built-in type
import unicodedata

def remove_parentheses(string):
    if string.startswith("(") and string.endswith(")"):
        string = string[1:-1]
    return string

def remove_single_quotes(string):
    if string.startswith("'") and string.endswith("'"):
        string = string[1:-1]
    return string

def convert(string):
    if not isinstance(string, unicode):
        raise TypeError("expected unicode, but got %s"
                        % type(string).__name__)
    string = unicodedata.normalize('NFKD', string)
    string = string.encode('ascii','ignore')
    string = remove_parentheses(string)
    # Split on the last comma only, because the tagged text may contain commas.
    first_part, second_part = string.rsplit(",", 1)
    first_part = remove_single_quotes(first_part)
    second_part = remove_single_quotes(second_part)
    return (first_part, second_part)

input = [ ... ]  # your input strings
output = []
for string in input:
    output.append(convert(string))
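For readers on Python 3, where the `unicode` type no longer exists, the same six steps might look like the sketch below. It is an adaptation rather than the code posted above: the helper `strip_pair` and the `samples` data are made up for illustration, and the split is done on the last comma only, on the assumption that the tagged text can itself contain commas (as in `company,/NA` in the original data).

```python
import unicodedata


def strip_pair(text, opening, closing):
    """Remove one matching opening/closing character pair, if present."""
    if text.startswith(opening) and text.endswith(closing):
        return text[1:-1]
    return text


def convert(text):
    if not isinstance(text, str):
        raise TypeError("expected str, but got %s" % type(text).__name__)
    # Decompose accented characters, then drop whatever is not ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    text = strip_pair(text, "(", ")")
    # Split on the last comma only: the tagged text may contain commas too.
    first_part, second_part = text.rsplit(",", 1)
    return (strip_pair(first_part, "'", "'"),
            strip_pair(second_part, "'", "'"))


# Hypothetical sample data in the same format as the thread's input.
samples = [
    "('aaa/TAG bbb/TAG ','class1')",
    "('the/ARTDEF company,/NA koteeswaram/BHPERSN ','class2')",
]
converted = [convert(sample) for sample in samples]
print(converted)
```

This string surgery still assumes every input has exactly the shape shown; anything else either passes through partly unchanged or raises a `ValueError` at the unpacking step.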
--
Steven
| From | subhabangalore@gmail.com |
|---|---|
| Date | 2016-04-26 08:44 -0700 |
| Message-ID | <513f320f-c4ca-406b-b203-b21fdbf0c1fc@googlegroups.com> |
| In reply to | #107617 |
Dear Steven,
Thank you for your kind suggestion.
I am trying to send you a revised example.
I have a list like this:
list1=[u"('koteeswaram/BHPERSN engaged/NA ','class1')", u"('koteeswaram/BHPERSN is/NA ','class1')"]
I would like to convert it to:
list1=[('koteeswaram/BHPERSN engaged/NA ','class1'),
('koteeswaram/BHPERSN is/NA ','class1')]
I tried to make it as follows,
list2=[]
for i in list1:
    a1=unicodedata.normalize('NFKD', i).encode('ascii','ignore')
    a2=a1.replace('"',"")
    list2.append(a2)
and,
for i in list1:
    a3=i[1:-1]
    list2.append(a3)
but I am not getting the desired output.
Could anyone kindly suggest how I might approach this?
Regards,
Subhabrata
| From | Matt Wheeler <m@funkyhat.org> |
|---|---|
| Date | 2016-04-25 20:31 +0000 |
| Message-ID | <mailman.93.1461616308.32212.python-list@python.org> |
| In reply to | #107610 |
On Mon, 25 Apr 2016 15:56 , <subhabangalore@gmail.com> wrote:
> Dear Group,
>
> I have a list of tuples, as follows,
>
> list1=[u"('koteeswaram/BHPERSN engaged/NA himself/NA in/NA various/NA
> philanthropic/NA activities/NA ','class1')", u"('koteeswaram/BHPERSN is/NA
> a/NA very/NA nice/NA person/NA ','class1')", u"('koteeswaram/BHPERSN
> came/NA to/NA mumbai/LOC but/NA could/NA not/NA attend/NA the/ARTDEF
> board/NA meeting/NA ','class1')", u"('the/ARTDEF people/NA of/NA
> the/ARTDEF company ABCOMP did/NA not/NA vote/NA for/NA koteeswaram/LOC
> ','class2')", u"('the/ARTDEF director AHT of/NA the/ARTDEF company,/NA
> koteeswaram/BHPERSN had/NA been/NA advised/NA to/NA take/NA rest/NA for/NA
> a/NA while/NA ','class2')", u"('animesh/BHPERSN chauhan/BHPERSN arrived/NA
> by/NA his/PRNM3PAS private/NA aircraft/NA in/NA mumbai/LOC ','class2')",
> u"('animesh/BHPERSN chauhan/BHPERSN met/NA the/ARTDEF prime/HPLPERST
> minister/AHT of/NA india/LOCC over/NA some/NA issues/NA ','class2')",
> u"('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA set/NA up/NA a/NA
> plant/NA in/NA uk/LOCC ','class3')", u"('animesh/BHPERSN
> chauhan/BHPERSN is/NA trying/NA to/NA launch/NA a/NA new/ABCOMP office/AHT
> in/NA burdwan/LOC ','class3')", u"('animesh/BHPERSN chauhan/BHPERSN is/NA
> trying/NA to/NA work/NA out/NA the/ARTDEF launch/NA of/NA a/NA new/ABCOMP
> product/NA in/NA india/LOCC ','class3')"]
>
What you have is a list of strings, not tuples.
>
> I want to make it like,
>
> [('koteeswaram/BHPERSN engaged/NA himself/NA in/NA various/NA
> philanthropic/NA activities/NA','class1'),
> ('koteeswaram/BHPERSN is/NA a/NA very/NA nice/NA person/NA ','class1'),
> ('koteeswaram/BHPERSN came/NA to/NA mumbai/LOC but/NA could/NA not/NA
> attend/NA the/ARTDEF board/NA meeting/NA','class1'), ('the/ARTDEF people/NA
> of/NA the/ARTDEF company ABCOMP did/NA not/NA vote/NA for/NA
> koteeswaram/LOC ','class2'), ('the/ARTDEF director AHT of/NA
> the/ARTDEF company,/NA koteeswaram/BHPERSN had/NA been/NA advised/NA to/NA
> take/NA rest/NA for/NA a/NA while/NA ','class2'), ('animesh/BHPERSN
> chauhan/BHPERSN arrived/NA by/NA his/PRNM3PAS private/NA aircraft/NA in/NA
> mumbai/LOC','class2'), ('animesh/BHPERSN chauhan/BHPERSN met/NA the/ARTDEF
> prime/HPLPERST minister/AHT of/NA india/LOCC over/NA some/NA
> issues/NA','class2'), ('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA
> to/NA set/NA up/NA a/NA plant/NA in/NA uk/LOCC','class3'),
> ('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA launch/NA a/NA
> new/ABCOMP office/AHT in/NA burdwan/LOC','class3'),
> ('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA work/NA out/NA
> the/ARTDEF launch/NA of/NA a/NA new/ABCOMP product/NA in/NA
> india/LOCC','class3')]
>
> I tried to make it as follows,
> list2=[]
> for i in train_sents:
>     a1=unicodedata.normalize('NFKD', i).encode('ascii','ignore')
>     a2=a1.replace('"',"")
>     list2.append(a2)
> and,
>
> for i in list1:
>     a3=i[1:-1]
>     list2.append(a3)
>
In both of these you seem to be trying to remove the double quote marks
from the strings, but they aren't part of the strings in the first place,
just delimiters.
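A quick way to check this in an interpreter, using one of the shortened strings from later in the thread (the variable name `item` is only for illustration):

```python
item = u"('koteeswaram/BHPERSN is/NA ','class1')"

# The double quotes in the source code only delimit the string literal;
# the value itself contains no double quote character.
print('"' in item)         # False

# The value starts with "(" and ends with ")".
print(item[0], item[-1])

# So slicing with [1:-1] drops the parentheses, and the result is
# still one string, not a tuple.
print(item[1:-1])
print(type(item[1:-1]).__name__)
```

This is why neither `replace('"', "")` nor `i[1:-1]` gets any closer to a list of tuples: the first has nothing to remove, and the second only shortens the string.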
>
>
> but not helping.
> If any one may kindly suggest how may I approach it?
>
Check out the documentation for ast.literal_eval
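If some of the strings might not be well-formed tuple literals, the call can be wrapped defensively. The following is only a sketch for Python 3: the helper name `parse_pair`, the choice to return `None` for bad rows, and the sample `rows` are all assumptions for illustration, not something suggested in the thread.

```python
import ast


def parse_pair(text):
    """Return a (sentence, label) tuple, or None if text is not such a literal."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a valid Python literal at all.
        return None
    # Accept only a 2-tuple whose elements are both strings.
    if (isinstance(value, tuple) and len(value) == 2
            and all(isinstance(part, str) for part in value)):
        return value
    return None


rows = [
    "('animesh/BHPERSN chauhan/BHPERSN is/NA ','class3')",
    "not a tuple at all",
    "('only one element',)",
]
parsed = [parse_pair(row) for row in rows]
print(parsed)
```

Whether to skip, log, or raise on a bad row depends on how the data was produced, so the `None` return here is just one option.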