Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #60590 > unrolled thread
| Started by | amjadcsu@gmail.com |
|---|---|
| First post | 2013-11-27 02:46 -0800 |
| Last post | 2013-11-27 08:12 -0800 |
| Articles | 9 — 7 participants |
Back to article view | Back to comp.lang.python
Wrapping around a list amjadcsu@gmail.com - 2013-11-27 02:46 -0800
Re: Wrapping around a list Chris Angelico <rosuav@gmail.com> - 2013-11-27 21:53 +1100
Re: Wrapping around a list Amjad Syed <amjadcsu@gmail.com> - 2013-11-27 03:07 -0800
Re: Wrapping around a list Chris Angelico <rosuav@gmail.com> - 2013-11-27 22:14 +1100
Re: Wrapping around a list Amjad Syed <amjadcsu@gmail.com> - 2013-11-27 10:14 -0800
Re: Wrapping around a list Ned Batchelder <ned@nedbatchelder.com> - 2013-11-27 06:26 -0500
Re: Wrapping around a list Neil Cerutti <mr.cerutti@gmail.com> - 2013-11-27 08:33 -0500
Re: Wrapping around a list Peter Pearson <ppearson@nowhere.invalid> - 2013-11-27 17:42 +0000
Re: Wrapping around a list rusi <rustompmody@gmail.com> - 2013-11-27 08:12 -0800
| From | amjadcsu@gmail.com |
|---|---|
| Date | 2013-11-27 02:46 -0800 |
| Subject | Wrapping around a list |
| Message-ID | <35a56651-33b3-454e-a936-439196989d3b@googlegroups.com> |
Hello,
I am working on a problem (Bioinformatics domain) where all possible combinations of input string needs to be printed as sublist
For example:
Input string : "LEQN"
Output= "[L","E","Q","N"]["LE","EQ","QN","NL"] ["LEQ","EQN","QNE","NLE"]
["LEQN"]
The code i have written for this is as follows:
from itertools import chain, repeat, islice
from collections import deque
def sliding_window(iterable, n, fill=False, fillvalue=None):
it = iter(iterable)
if fill:
it = chain(it, repeat(fillvalue, n - 1))
w = deque(islice(it, n - 1))
for x in it:
w.append(x)
yield w
w.popleft()
input="LENQ"
lstinput= list(input)
lenlstinput=len(lstinput)
list1=[''.join(x) for x in sliding_window(lstinput, 2)]
list2= [''.join(x) for x in sliding_window(lstinput, 3)]
list3= [''.join(x) for x in sliding_window(lstinput, 4)]
The output i get as follows:
List 1 is ['LE', 'EN', 'NQ'] Should be ['LE','EN','NQ','QL']
List 2 is ['LEN', 'ENQ'] Should be ['LEN','ENQ','NQL','QLE']
So the question i am asking , how can i add wrapping around sublist in my sliding window function.
Thanks
[toc] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-11-27 21:53 +1100 |
| Message-ID | <mailman.3288.1385549643.18130.python-list@python.org> |
| In reply to | #60590 |
On Wed, Nov 27, 2013 at 9:46 PM, <amjadcsu@gmail.com> wrote: > So the question i am asking , how can i add wrapping around sublist in my sliding window function. By the look of what you now have, the easiest way would probably be to duplicate the list before you slide. So instead of working on "LEQN", you work on "LEQNLEQN" - or possibly "LEQNLEQ", or other shortened versions, if it's easier than pulling out the duplicates. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Amjad Syed <amjadcsu@gmail.com> |
|---|---|
| Date | 2013-11-27 03:07 -0800 |
| Message-ID | <f01581f8-25b8-4217-aec6-2b93fcc12cbb@googlegroups.com> |
| In reply to | #60591 |
On Wednesday, November 27, 2013 1:53:54 PM UTC+3, Chris Angelico wrote: > On Wed, Nov 27, 2013 at 9:46 PM, <amjadcsu@gmail.com> wrote: > > > So the question i am asking , how can i add wrapping around sublist in my sliding window function. > > > > By the look of what you now have, the easiest way would probably be to > > duplicate the list before you slide. So instead of working on "LEQN", > > you work on "LEQNLEQN" - or possibly "LEQNLEQ", or other shortened > > versions, if it's easier than pulling out the duplicates. > > > > ChrisA Thanks Chris for the reply. But i would like sliding function to be scalable, as input string can be of 100 letters.
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-11-27 22:14 +1100 |
| Message-ID | <mailman.3289.1385550867.18130.python-list@python.org> |
| In reply to | #60592 |
On Wed, Nov 27, 2013 at 10:07 PM, Amjad Syed <amjadcsu@gmail.com> wrote: > Thanks Chris for the reply. But i would like sliding function to be scalable, as input string can be of 100 letters. A hundred isn't much to work with, and your code will be fairly simple. Give it a try with small strings, see how it goes; then try it on your full-size - I doubt there'll be a problem. Now, if you were talking about a hundred million letters, then maybe there'd be a problem. But even there, I'd start with the simple and clean approach, and only optimize for performance when it becomes obviously necessary (like when my MUD client was able to saturate one core of my CPU, just by me typing a lot of commands very rapidly - that needed fixing!). ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Amjad Syed <amjadcsu@gmail.com> |
|---|---|
| Date | 2013-11-27 10:14 -0800 |
| Message-ID | <62234fd2-2f37-4750-b1e5-64813e51fa5e@googlegroups.com> |
| In reply to | #60594 |
On Wednesday, November 27, 2013 2:14:18 PM UTC+3, Chris Angelico wrote: > On Wed, Nov 27, 2013 at 10:07 PM, Amjad Syed <amjadcsu@gmail.com> wrote: > > > Thanks Chris for the reply. But i would like sliding function to be scalable, as input string can be of 100 letters. > > > > A hundred isn't much to work with, and your code will be fairly > > simple. Give it a try with small strings, see how it goes; then try it > > on your full-size - I doubt there'll be a problem. > > > > Now, if you were talking about a hundred million letters, then maybe > > there'd be a problem. But even there, I'd start with the simple and > > clean approach, and only optimize for performance when it becomes > > obviously necessary (like when my MUD client was able to saturate one > > core of my CPU, just by me typing a lot of commands very rapidly - > > that needed fixing!). > > > > ChrisA Thanks Chris, the duplication did solve the problem.
[toc] | [prev] | [next] | [standalone]
| From | Ned Batchelder <ned@nedbatchelder.com> |
|---|---|
| Date | 2013-11-27 06:26 -0500 |
| Message-ID | <mailman.3290.1385551604.18130.python-list@python.org> |
| In reply to | #60592 |
On 11/27/13 6:14 AM, Chris Angelico wrote:
> On Wed, Nov 27, 2013 at 10:07 PM, Amjad Syed <amjadcsu@gmail.com> wrote:
>> Thanks Chris for the reply. But i would like sliding function to be scalable, as input string can be of 100 letters.
>
> A hundred isn't much to work with, and your code will be fairly
> simple. Give it a try with small strings, see how it goes; then try it
> on your full-size - I doubt there'll be a problem.
>
> Now, if you were talking about a hundred million letters, then maybe
> there'd be a problem. But even there, I'd start with the simple and
> clean approach, and only optimize for performance when it becomes
> obviously necessary (like when my MUD client was able to saturate one
> core of my CPU, just by me typing a lot of commands very rapidly -
> that needed fixing!).
>
> ChrisA
>
Using ChrisA's idea:
def sliding_window(iterable, n, fill=None):
values = list(iterable)
num_values = len(values)
if fill is not None:
values.extend([fill]*(n-1))
need_more = (2*num_values-1) - len(values)
values.extend(values[:need_more])
for start in range(num_values):
yield values[start:start+n]
l = list("LEQNABC")
for n in range(2, len(l)+1):
print [''.join(x) for x in sliding_window(l, n)]
for n in range(2, len(l)+1):
print [''.join(x) for x in sliding_window(l, n, fill="_")]
Produces:
['LE', 'EQ', 'QN', 'NA', 'AB', 'BC', 'CL']
['LEQ', 'EQN', 'QNA', 'NAB', 'ABC', 'BCL', 'CLE']
['LEQN', 'EQNA', 'QNAB', 'NABC', 'ABCL', 'BCLE', 'CLEQ']
['LEQNA', 'EQNAB', 'QNABC', 'NABCL', 'ABCLE', 'BCLEQ', 'CLEQN']
['LEQNAB', 'EQNABC', 'QNABCL', 'NABCLE', 'ABCLEQ', 'BCLEQN', 'CLEQNA']
['LEQNABC', 'EQNABCL', 'QNABCLE', 'NABCLEQ', 'ABCLEQN', 'BCLEQNA',
'CLEQNAB']
['LE', 'EQ', 'QN', 'NA', 'AB', 'BC', 'C_']
['LEQ', 'EQN', 'QNA', 'NAB', 'ABC', 'BC_', 'C__']
['LEQN', 'EQNA', 'QNAB', 'NABC', 'ABC_', 'BC__', 'C___']
['LEQNA', 'EQNAB', 'QNABC', 'NABC_', 'ABC__', 'BC___', 'C____']
['LEQNAB', 'EQNABC', 'QNABC_', 'NABC__', 'ABC___', 'BC____', 'C_____']
['LEQNABC', 'EQNABC_', 'QNABC__', 'NABC___', 'ABC____', 'BC_____',
'C______']
100 elements is really nothing, don't try to over-optimize it. And if
your inputs are really strings, not general iterables, you can use the
same logic but with string operations, and you'll likely have better
performance anyway. Less general, true, but better for your actual problem.
--Ned.
[toc] | [prev] | [next] | [standalone]
| From | Neil Cerutti <mr.cerutti@gmail.com> |
|---|---|
| Date | 2013-11-27 08:33 -0500 |
| Message-ID | <mailman.3299.1385559225.18130.python-list@python.org> |
| In reply to | #60590 |
On Wed, Nov 27, 2013 at 5:46 AM, <amjadcsu@gmail.com> wrote:
> I am working on a problem (Bioinformatics domain) where all
> possible combinations of input string needs to be printed as
> sublist
>
> For example:
> Input string : "LEQN"
> Output= "[L","E","Q","N"]
> ["LE","EQ","QN","NL"]["LEQ","EQN","QNE","NLE"]["LEQN"]
How about itertools.combinations?
import itertools
s = "LEQN"
for i in range(len(s)):
for comb in itertools.combinations(s, i+1):
print(comb)
Output:
('L',)
('E',)
('Q',)
('N',)
('L', 'E')
('L', 'Q')
('L', 'N')
('E', 'Q')
('E', 'N')
('Q', 'N')
('L', 'E', 'Q')
('L', 'E', 'N')
('L', 'Q', 'N')
('E', 'Q', 'N')
('L', 'E', 'Q', 'N')
For some reason I've got more 2-character combinations than you,
though.
--
Neil Cerutti
[toc] | [prev] | [next] | [standalone]
| From | Peter Pearson <ppearson@nowhere.invalid> |
|---|---|
| Date | 2013-11-27 17:42 +0000 |
| Message-ID | <bfmp86F1fl3U1@mid.individual.net> |
| In reply to | #60606 |
On Wed, 27 Nov 2013 08:33:43 -0500, Neil Cerutti <mr.cerutti@gmail.com> wrote:
> On Wed, Nov 27, 2013 at 5:46 AM, <amjadcsu@gmail.com> wrote:
>> I am working on a problem (Bioinformatics domain) where all
>> possible combinations of input string needs to be printed as
>> sublist
>>
>> For example:
>> Input string : "LEQN"
>> Output= "[L","E","Q","N"]
>> ["LE","EQ","QN","NL"]["LEQ","EQN","QNE","NLE"]["LEQN"]
>
> How about itertools.combinations?
>
> import itertools
>
> s = "LEQN"
> for i in range(len(s)):
> for comb in itertools.combinations(s, i+1):
> print(comb)
> Output:
> ('L',)
> ('E',)
> ('Q',)
> ('N',)
> ('L', 'E')
> ('L', 'Q')
> ('L', 'N')
> ('E', 'Q')
> ('E', 'N')
> ('Q', 'N')
> ('L', 'E', 'Q')
> ('L', 'E', 'N')
> ('L', 'Q', 'N')
> ('E', 'Q', 'N')
> ('L', 'E', 'Q', 'N')
>
> For some reason I've got more 2-character combinations than you,
> though.
The original poster's combinations comprise letters that are *contiguous*,
in a circular sense. Yours include non-contiguous sets, like LQ.
--
To email me, substitute nowhere->spamcop, invalid->net.
[toc] | [prev] | [next] | [standalone]
| From | rusi <rustompmody@gmail.com> |
|---|---|
| Date | 2013-11-27 08:12 -0800 |
| Message-ID | <eab8a5a4-8413-4baa-9340-e13c515f3bb0@googlegroups.com> |
| In reply to | #60590 |
On Wednesday, November 27, 2013 4:16:50 PM UTC+5:30, Amjad Syed wrote:
> Hello,
>
> I am working on a problem (Bioinformatics domain) where all possible combinations of input string needs to be printed as sublist
If we take the standard combinations (Pascal triangle) result
nCr + nCr-1 = n+1Cr
and make it into a recursive python function we get:
def c(n,r):
return (1 if r == 0 else
0 if n == 0 else
c(n-1,r) + c(n-1,r-1))
(yeah the base cases are tricky)
Now instead of enumerating the *number* of combinations if we want the
combinations *themselves* we can do almost isomorphically:
[Writing d analogous to c]
def d(l,r):
if r == 0: return [[]]
if not l: return []
x = l[0]
xs = l[1:]
return d(xs,r) + [[x]+c for c in d(xs,(r-1))]
Note the difference in types:
>>> c(4,2)
6
>>> d("LEQN", 2)
[['Q', 'N'], ['E', 'N'], ['E', 'Q'], ['L', 'N'], ['L', 'Q'], ['L', 'E']]
>>>
Now we can generator-ize it like so:
def e(l,r):
if r == 0:
yield []
elif l:
x = l[0]
xs = l[1:]
for y in chain(e(xs,r),([x]+c for c in e(xs,(r-1)))):
yield y
# python 3: yield from chain(e(xs,r),([x]+c for c in e(xs,(r-1))))
called as
>>> list(e("LEQN", 2))
[['Q', 'N'], ['E', 'N'], ['E', 'Q'], ['L', 'N'], ['L', 'Q'], ['L', 'E']]
>>>
BTW: This is neater in Haskell than in Python.
c n 0 = 1
c 0 r = 0
c n r = c (n-1) r + c (n-1) (r-1)
d l 0 = [[]]
d [] r = []
d (x:xs) r = d xs r ++ [x:c | c <- d xs (r-1)]
In particular the 'd' has list elegance of the python 'd' and generator
efficiency of python 'e'
[toc] | [prev] | [standalone]
Back to top | Article view | comp.lang.python
csiph-web