Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #2326

Re: Extracting subsequences composed of the same character

Date 2011-04-01 02:16 +0100
From MRAB <python@mrabarnett.plus.com>
Subject Re: Extracting subsequences composed of the same character
References <4d952008$0$3943$426a74cc@news.free.fr>
Newsgroups comp.lang.python
Message-ID <mailman.59.1301620676.2990.python-list@python.org> (permalink)

Show all headers | View raw


On 01/04/2011 01:43, candide wrote:
> Suppose you have a string, for instance
>
> "pyyythhooonnn ---> ++++"
>
> and you search for the subquences composed of the same character, here
> you get :
>
> 'yyy', 'hh', 'ooo', 'nnn', '---', '++++'
>
> It's not difficult to write a Python code that solves the problem, for
> instance :
>
[snip]
>
> I should confess that this code is rather cumbersome so I was looking
> for an alternative. I imagine that a regular expressions approach could
> provide a better method. Does a such code exist ? Note that the string
> is not restricted to the ascii charset.

 >>> import re
 >>> re.findall(r"((.)\2+)", s)
[('yyy', 'y'), ('hh', 'h'), ('ooo', 'o'), ('nnn', 'n'), ('---', '-'), 
('++++', '+')]
 >>> [m[0] for m in re.findall(r"((.)\2+)", s)]
['yyy', 'hh', 'ooo', 'nnn', '---', '++++']

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Extracting subsequences composed of the same character candide <candide@free.invalid> - 2011-04-01 02:43 +0200
  Re: Extracting subsequences composed of the same character MRAB <python@mrabarnett.plus.com> - 2011-04-01 02:16 +0100
  Re: Extracting subsequences composed of the same character Roy Smith <roy@panix.com> - 2011-03-31 21:40 -0400
  Re: Extracting subsequences composed of the same character Tim Chase <python.list@tim.thechases.com> - 2011-03-31 20:58 -0500
  Re: Extracting subsequences composed of the same character Tim Chase <python.list@tim.thechases.com> - 2011-03-31 21:20 -0500
  Re: Extracting subsequences composed of the same character Terry Reedy <tjreedy@udel.edu> - 2011-04-01 00:18 -0400
  Re: Extracting subsequences composed of the same character candide <candide@free.invalid> - 2011-04-01 21:39 +0200

csiph-web