Re: Question on Python Split

Date	Sun, 07 Oct 2012 21:01:07 +0100
From	MRAB <python@mrabarnett.plus.com>
User-Agent	Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20120907 Thunderbird/15.0.1
MIME-Version	1.0
To	python-list@python.org
Subject	Re: Question on Python Split
Newsgroups	comp.lang.python
Message-ID	<mailman.1933.1349640075.27098.python-list@python.org> (permalink)


On 2012-10-07 20:30, subhabangalore@gmail.com wrote:
> Dear Group,
>
> Suppose I have a string as,
>
> "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
>
> I am terming it as,
>
> str1= "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
>
> I am working now with a split function,
>
> str_words=str1.split()
> so, I would get the result as,
> ['Project', 'Gutenberg', 'has', '36000', 'free', 'ebooks', 'for', 'Kindle', 'Android', 'iPad', 'iPhone.']
>
> But I am looking for,
>
> ['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone']
>
> This can be done if we assign the string as,
>
> str1= "Project Gutenberg, has 36000, free ebooks, for Kindle, Android iPad, iPhone,"
>
> and then assign the split statement as,
>
> str1_word=str1.split(",")
>
> would produce,
>
> ['Project Gutenberg', ' has 36000', ' free ebooks', ' for Kindle', ' Android iPad', ' iPhone', '']
>
It can also be done like this:

 >>> str1 = "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
 >>> # Splitting into words:
 >>> s = str1.split()
 >>> s
['Project', 'Gutenberg', 'has', '36000', 'free', 'ebooks', 'for', 'Kindle', 'Android', 'iPad', 'iPhone.']
 >>> # Using slicing with a stride of 2 gives:
 >>> s[0 : : 2]
['Project', 'has', 'free', 'for', 'Android', 'iPhone.']
 >>> # Similarly for the other words gives:
 >>> s[1 : : 2]
['Gutenberg', '36000', 'ebooks', 'Kindle', 'iPad']
 >>> # Combining them in pairs, and adding an extra empty string in case there's an odd number of words:
 >>> [(x + ' ' + y).rstrip() for x, y in zip(s[0 : : 2], s[1 : : 2] + [''])]
['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone.']
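
The same steps can be wrapped in a small function. This is only a sketch, written for Python 3; the name pair_words is made up for illustration and is not a standard function.

```python
def pair_words(text):
    """Group the whitespace-separated words of text two at a time."""
    words = text.split()
    firsts = words[0::2]            # 1st, 3rd, 5th, ... word
    seconds = words[1::2] + ['']    # 2nd, 4th, ... word, padded for an odd count
    # zip stops at the shorter sequence, so the padding is simply
    # ignored when the number of words is even.
    return [(x + ' ' + y).rstrip() for x, y in zip(firsts, seconds)]


str1 = "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
print(pair_words(str1))
```

The rstrip() is what removes the trailing space from the last item when a word is left without a partner.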

> My objective generally is achieved, but I want to convert each group here in tuple so that it can be embedded, like,
>
> [(Project Gutenberg), (has 36000), (free ebooks), (for Kindle), ( Android iPad), (iPhone), '']
>
> as I see if I assign it as
>
> for i in str1_word:
>         print i
>         ti=tuple(i)
>         print ti
>
> I am not getting the desired result.
>
> If I work again from tuple point, I get it as,
> >>> tup1=('Project Gutenberg')
> >>> tup2=('has 36000')
> >>> tup3=('free ebooks')
> >>> tup4=('for Kindle')
> >>> tup5=('Android iPad')
> >>> tup6=tup1+tup2+tup3+tup4+tup5
> >>> print tup6
> Project Gutenberghas 36000free ebooksfor KindleAndroid iPad
>
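It's the comma that makes the tuple, not the parentheses, except for the
empty tuple, which is just empty parentheses, i.e. (). So ('Project Gutenberg')
is simply a string, and adding such values concatenates strings, which is why
the words run together. A one-element tuple is written ('Project Gutenberg',).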
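
A short sketch of the difference, written for Python 3 (the variable names are only for illustration). It also shows why tuple(i) in the quoted loop does not give the desired result: tuple() iterates over its argument, and iterating over a string yields its characters.

```python
not_a_tuple = ('Project Gutenberg')     # parentheses only: still a str
one_tuple = ('Project Gutenberg',)      # trailing comma: a 1-element tuple

print(type(not_a_tuple).__name__)
print(type(one_tuple).__name__)

# Adding strings joins their characters...
joined = ('Project Gutenberg') + ('has 36000')
print(joined)

# ...whereas adding tuples joins their elements.
combined = ('Project Gutenberg',) + ('has 36000',)
print(combined)

# tuple() iterates over its argument, so a string is split into characters.
chars = tuple('has')
print(chars)
```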

> Then how may I achieve it? If any one of the learned members can kindly guide me.

 >>> [((x + ' ' + y).rstrip(), ) for x, y in zip(s[0 : : 2], s[1 : : 2] + [''])]
[('Project Gutenberg',), ('has 36000',), ('free ebooks',), ('for Kindle',), ('Android iPad',), ('iPhone.',)]

Is this what you want?

If you want it to be a list of pairs of words, then:

 >>> [(x, y) for x, y in zip(s[0 : : 2], s[1 : : 2] + [''])]
[('Project', 'Gutenberg'), ('has', '36000'), ('free', 'ebooks'), ('for', 'Kindle'), ('Android', 'iPad'), ('iPhone.', '')]
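
As an alternative sketch, assuming Python 3: itertools.zip_longest pads the shorter sequence itself, so the extra [''] is not needed (in Python 2 the function is called izip_longest). The name word_pairs is hypothetical.

```python
from itertools import zip_longest


def word_pairs(text):
    """Return the words of text as a list of 2-tuples."""
    words = text.split()
    # fillvalue supplies the missing partner when the word count is odd.
    return list(zip_longest(words[0::2], words[1::2], fillvalue=''))


str1 = "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
print(word_pairs(str1))
```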
