Re: finditer

References <d580e76b-793e-435d-917b-613ae912a93f@googlegroups.com>
Date 2014-07-07 21:38 -0600
Subject Re: finditer
From Jason Friedman <jsf80238@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.11615.1404790710.18130.python-list@python.org>


On Mon, Jul 7, 2014 at 1:19 AM, gintare <g.statkute@gmail.com> wrote:
> If somebody has time, could you advise how to accomplish this task in a faster way?
>
> I have a text = """ word{vb}
> wordtransl {vb}
>
> sent1.
>
> sent1trans.
>
> sent2
>
> sent2trans... """
>
> I need to match wordtransl once, and then, many times, a repeating pattern consisting of sent and senttrans.

You might try itertools.groupby
(https://docs.python.org/3/library/itertools.html#module-itertools).

text = """ word{vb}
wordtransl {vb}

sent1

sent1trans

sent2

sent2trans
"""

import itertools
import re
result_list = list()
lines = text.split("\n")

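# Discard the header lines that come before the first sentence.
# Iterate over a copy (lines[:]) because the loop removes items from lines.
for line in lines[:]:
    if line.startswith("sent"):
        break
    lines.pop(0)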

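# Compile once, outside the function, rather than on every call.
pattern = re.compile(r"sent\d+$")

def is_start(x):
    # True for a source sentence line such as "sent1", False otherwise.
    return bool(pattern.search(x))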

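# groupby starts a new group each time the key changes, so sentence lines
# and the lines that follow them (blank lines included) alternate.
for key, mygroup in itertools.groupby(lines, is_start):
    result_list.append(list(mygroup))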

print(result_list)
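
The groups printed above still contain the blank separator lines. If you would rather get (sentence, translations) pairs, something along these lines might work. It is only a sketch: pair_sentences and SENTENCE are names I made up, and it assumes every source sentence looks like "sent<N>" as in the sample text.

```python
import itertools
import re

text = """ word{vb}
wordtransl {vb}

sent1

sent1trans

sent2

sent2trans
"""

# Assumption: a source sentence line is exactly "sent" plus digits.
SENTENCE = re.compile(r"sent\d+$")

def pair_sentences(text):
    """Return a list of (sentence, [translation lines]) tuples."""
    # Strip whitespace and drop blank lines.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # Skip the header lines that precede the first sentence.
    lines = list(itertools.dropwhile(lambda s: not s.startswith("sent"), lines))
    pairs = []
    current = None
    for is_sentence, group in itertools.groupby(
            lines, lambda s: bool(SENTENCE.match(s))):
        items = list(group)
        if is_sentence:
            # If several sentence lines are adjacent, only the last is kept.
            current = items[-1]
        elif current is not None:
            pairs.append((current, items))
            current = None
    return pairs

pairs = pair_sentences(text)
print(pairs)
```

Each translation group comes back as a list, so a sentence followed by more than one translation line keeps all of them.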
