From: Matt Wheeler
Newsgroups: comp.lang.python
Subject: Re: Question on List processing
Date: Mon, 25 Apr 2016 20:31:28 +0000

On Mon, 25 Apr 2016 15:56 , wrote:

> Dear Group,
>
> I have a list of tuples, as follows,
>
> list1=[u"('koteeswaram/BHPERSN engaged/NA himself/NA in/NA various/NA
> philanthropic/NA activities/NA ','class1')", u"('koteeswaram/BHPERSN is/NA
> a/NA very/NA nice/NA person/NA ','class1')", u"('koteeswaram/BHPERSN
> came/NA to/NA mumbai/LOC but/NA could/NA not/NA attend/NA the/ARTDEF
> board/NA meeting/NA ','class1')", u"('the/ARTDEF people/NA of/NA
> the/ARTDEF company ABCOMP did/NA not/NA vote/NA for/NA koteeswaram/LOC
> ','class2')", u"('the/ARTDEF director AHT of/NA the/ARTDEF company,/NA
> koteeswaram/BHPERSN had/NA been/NA advised/NA to/NA take/NA rest/NA for/NA
> a/NA while/NA ','class2')", u"('animesh/BHPERSN chauhan/BHPERSN arrived/NA
> by/NA his/PRNM3PAS private/NA aircraft/NA in/NA mumbai/LOC ','class2')",
> u"('animesh/BHPERSN chauhan/BHPERSN met/NA the/ARTDEF prime/HPLPERST
> minister/AHT of/NA india/LOCC over/NA some/NA issues/NA ','class2')",
> u"('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA set/NA up/NA a/NA
> plant/NA in/NA uk/LOCC ','class3')", u"('animesh/BHPERSN
> chauhan/BHPERSN is/NA trying/NA to/NA launch/NA a/NA new/ABCOMP office/AHT
> in/NA burdwan/LOC ','class3')", u"('animesh/BHPERSN chauhan/BHPERSN is/NA
> trying/NA to/NA work/NA out/NA the/ARTDEF launch/NA of/NA a/NA new/ABCOMP
> product/NA in/NA india/LOCC ','class3')"]
>

What you have is a list of strings, not tuples.
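
To see this for yourself, here is a minimal sketch, assuming Python 3 and a shortened two-element stand-in for your list1 (the variable name first is just for illustration). It checks the type of an element and looks at its first and last characters:

```python
# Shortened stand-in for the list1 quoted above (two elements only).
list1 = [
    u"('koteeswaram/BHPERSN is/NA a/NA very/NA nice/NA person/NA ','class1')",
    u"('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA set/NA up/NA a/NA plant/NA in/NA uk/LOCC ','class3')",
]

first = list1[0]

# Each element is a text string, not a tuple.
print(type(first).__name__)

# The string's own first and last characters are the parentheses.
print(first[0], first[-1])

# The double quotes only delimit the literal; they are not in the string.
print('"' in first)
```

On Python 3 this should report str, then the two parentheses, then False.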

>
> I want to make it like,
>
> [('koteeswaram/BHPERSN engaged/NA himself/NA in/NA various/NA
> philanthropic/NA activities/NA','class1'),
> ('koteeswaram/BHPERSN is/NA a/NA very/NA nice/NA person/NA ','class1'),
> ('koteeswaram/BHPERSN came/NA to/NA mumbai/LOC but/NA could/NA not/NA
> attend/NA the/ARTDEF board/NA meeting/NA','class1'), ('the/ARTDEF people/NA
> of/NA the/ARTDEF company ABCOMP did/NA not/NA vote/NA for/NA
> koteeswaram/LOC ','class2'), ('the/ARTDEF director AHT of/NA
> the/ARTDEF company,/NA koteeswaram/BHPERSN had/NA been/NA advised/NA to/NA
> take/NA rest/NA for/NA a/NA while/NA ','class2'), ('animesh/BHPERSN
> chauhan/BHPERSN arrived/NA by/NA his/PRNM3PAS private/NA aircraft/NA in/NA
> mumbai/LOC','class2'), ('animesh/BHPERSN chauhan/BHPERSN met/NA the/ARTDEF
> prime/HPLPERST minister/AHT of/NA india/LOCC over/NA some/NA
> issues/NA','class2'), ('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA
> to/NA set/NA up/NA a/NA plant/NA in/NA uk/LOCC','class3'),
> ('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA launch/NA a/NA
> new/ABCOMP office/AHT in/NA burdwan/LOC','class3'),
> ('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA work/NA out/NA
> the/ARTDEF launch/NA of/NA a/NA new/ABCOMP product/NA in/NA
> india/LOCC','class3')]
>
> I tried to make it as follows,
> list2=[]
> for i in train_sents:
>     a1=unicodedata.normalize('NFKD', i).encode('ascii','ignore')
>     a2=a1.replace('"',"")
>     list2.append(a2)
> and,
>
> for i in list1:
>     a3=i[1:-1]
>     list2.append(a3)
>

In both of these you seem to be trying to remove the double quote marks
from the strings, but they aren't part of the strings in the first place;
they are just the delimiters of the string literals. The first and last
characters of each string are the parentheses, so i[1:-1] strips those
instead.

>
>
> but not helping.
> If any one may kindly suggest how may I approach it?
>

Check out the documentation for ast.literal_eval, which safely evaluates a
string containing a Python literal (such as a tuple) and returns the
corresponding object.
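
A minimal sketch of that approach, assuming Python 3 and a shortened two-element stand-in for your list1. The names list2 and cleaned are only illustrative, and the strip() step is an assumption based on your desired output, where most of the trailing spaces are gone:

```python
import ast

# Shortened stand-in for the list1 quoted above (two elements only).
list1 = [
    u"('koteeswaram/BHPERSN is/NA a/NA very/NA nice/NA person/NA ','class1')",
    u"('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA set/NA up/NA a/NA plant/NA in/NA uk/LOCC ','class3')",
]

# Parse each string as a Python literal; each one becomes a real 2-tuple.
list2 = [ast.literal_eval(s) for s in list1]

# Optional: drop the trailing space inside the tagged sentence.
cleaned = [(text.strip(), label) for text, label in list2]

for item in cleaned:
    print(item)
```

Unlike eval, ast.literal_eval only accepts literals (strings, numbers, tuples, lists, dicts and so on), so it will raise an error rather than run arbitrary code if one of the strings is not a plain literal.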