Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #107440

Re: Detecting repeated subsequences of identical items

Path csiph.com!fu-berlin.de!uni-berlin.de!not-for-mail
From Michael Selik <michael.selik@gmail.com>
Newsgroups comp.lang.python
Subject Re: Detecting repeated subsequences of identical items
Date Thu, 21 Apr 2016 07:05:29 +0000
Lines 24
Message-ID <mailman.6.1461222341.23626.python-list@python.org> (permalink)
References <571843f9$0$1585$c3e8da3$5496439d@news.astraweb.com> <CAHzaPEPGvxoMAO7f0=DLBDf=dS1AhTP55Pzm_5ANbkOQ-sZAUA@mail.gmail.com> <CAGgTfkN9T6m51gMuOHAGDEvHi26MAyRVoUC_YX__RV8S7+SsEw@mail.gmail.com>
Mime-Version 1.0
Content-Type text/plain; charset=UTF-8
X-Trace news.uni-berlin.de kNGN27XfkYjnRAGaFFQLYAAFDFT+EJ+hhpthPJHtpeRA==
Return-Path <michael.selik@gmail.com>
X-Original-To python-list@python.org
Delivered-To python-list@mail.python.org
X-Spam-Status OK 0.007
X-Spam-Evidence '*H*': 0.99; '*S*': 0.00; 'patterns': 0.04; '21,': 0.07; 'repeated': 0.07; 'thu,': 0.15; '2016': 0.16; 'received:io': 0.16; 'received:psf.io': 0.16; 'repetitions': 0.16; 'subject:Detecting': 0.16; 'subsequences': 0.16; 'wrote:': 0.16; 'obviously': 0.16; 'nested': 0.18; '&gt;': 0.18; 'email addr:gmail.com&gt;': 0.18; 'to:2**1': 0.21; 'context.': 0.22; 'latter': 0.22; 'seems': 0.23; 'header:In-Reply-To:1': 0.24; 'skip:" 20': 0.26; 'right.': 0.27; 'message-id:@mail.gmail.com': 0.27; 'grouping': 0.29; 'skip:& 30': 0.30; 'maybe': 0.33; 'problem': 0.33; "d'aprano": 0.33; 'steven': 0.33; 'skip:& 20': 0.35; 'received:google.com': 0.35; 'received:74.125.82': 0.35; 'but': 0.36; 'should': 0.36; 'depends': 0.36; 'to:addr:python- list': 0.36; 'subject:: ': 0.37; 'say': 0.37; 'missing': 0.37; 'rather': 0.39; 'to:addr:python.org': 0.40; 'more': 0.63; 'biggest': 0.67; 'or:': 0.84; 'to:name:python': 0.84; 'abc': 0.91
DKIM-Signature v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=ml/gewk3bIli0eIUPsxcgwpnnkdBniF64fYn+Bus9Kw=; b=UdnEpDrzWVusmuf6eIrwan469WcQ1Yl0rQgeO6vHQFeR8hifwVgpu2ql2ulaH7XQY+ 7LVuN6uxME/d3Kl7Biq3/HMazgSPfXjfWDdLWy7B1tWNno6TRiBJHrRIUhLxurCiqIP1 F8UFzAB8nkZM5dqHmfy4QM/jz1YwgXHDMj6QsK19H8Yan5Po+xqBqq5f7KWxhdiW7SJ7 CVzaxTACG2VuWIJJxuPAbuTXGXLcSHngqMGTNcUgdza5dVFAvQpDOLYPJ3s6e55yWqlU rAQ5JVatBcsXWDYlZWEr3JFH3ILGZ6+QP0Px1Te2r8rBAH0nDWsPRr6dqAUEXMBxw8ra SfFw==
X-Google-DKIM-Signature v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=ml/gewk3bIli0eIUPsxcgwpnnkdBniF64fYn+Bus9Kw=; b=OTl6GgsEjqI6S6Eha5Gt5AiWb6EGfIr/BvkXxY7+T98q1FRXvnJUv6m6Liv0eyKubS jzlLQwCFoMNnnjv0zpucaym6/PuozwiJB5kvdKHlPiOdZUpTDux5OwOF/pvpMS1nOACm Fjce+E9jpV0paZgJYsnX4DYTpCKTZxvVTW9St0a0l0DRH0x2USEuqB0b1zO2C2aqWHD+ RrfLgH7H68EIPUtw8KOZjruz98LqNxmA5wb0pA5T3Q0xxS9ChB05UZd9REOIgcwTBcoR ix9m1diBDB8rqLreU5sEE9Dmpe2cBoKOBxjKK44Tkj+YE2s2Z0CrqxmtXIvLC92QhW4j Eb5w==
X-Gm-Message-State AOPr4FVwrtXen148lSpuaTLvJaxJTpVUX7YqlXuea+qvHEDbKM4cP0Sp9C/rKeXRS3uXNTlb4XQqzCTVfkVcrw==
X-Received by 10.194.184.38 with SMTP id er6mr11935306wjc.93.1461222339548; Thu, 21 Apr 2016 00:05:39 -0700 (PDT)
In-Reply-To <CAHzaPEPGvxoMAO7f0=DLBDf=dS1AhTP55Pzm_5ANbkOQ-sZAUA@mail.gmail.com>
X-Content-Filtered-By Mailman/MimeDel 2.1.22
X-BeenThere python-list@python.org
X-Mailman-Version 2.1.22
Precedence list
List-Id General discussion list for the Python programming language <python-list.python.org>
List-Unsubscribe <https://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive <http://mail.python.org/pipermail/python-list/>
List-Post <mailto:python-list@python.org>
List-Help <mailto:python-list-request@python.org?subject=help>
List-Subscribe <https://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe>
X-Mailman-Original-Message-ID <CAGgTfkN9T6m51gMuOHAGDEvHi26MAyRVoUC_YX__RV8S7+SsEw@mail.gmail.com>
X-Mailman-Original-References <571843f9$0$1585$c3e8da3$5496439d@news.astraweb.com> <CAHzaPEPGvxoMAO7f0=DLBDf=dS1AhTP55Pzm_5ANbkOQ-sZAUA@mail.gmail.com>
Xref csiph.com comp.lang.python:107440

Show key headers only | View raw


On Thu, Apr 21, 2016 at 2:55 AM Vlastimil Brom <vlastimil.brom@gmail.com>
wrote:

> 2016-04-21 5:07 GMT+02:00 Steven D'Aprano <steve@pearwood.info>:
> > I want to group subsequences.
> > "ABCABCABCDEABCDEFABCABCABCB"
> > ABC ABC ABCDE ABCDE F ABC ABC ABC B
> > or:
> > ABC ABC ABC D E A B C D E F ABC ABC ABC B
>
> if I am not missing something, the latter form of grouping might be
> achieved with the following regex: [snip]
> The former one seems to be more tricky...
>

Right. If the problem is constrained to say that repeated subsequences can
have no nested repeated subsequences, it's much easier to solve.

If you had "ABCABCABCABC" should that result in
ABC ABC ABC ABC, with 4 repetitions
or ABCABC ABCABC with 2 repetitions?
In this example, one might say the higher count is obviously better, but I
think it depends on the context. Maybe the user is looking for the biggest
patterns rather than the biggest counts.

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Detecting repeated subsequences of identical items Steven D'Aprano <steve@pearwood.info> - 2016-04-21 13:07 +1000
  Re: Detecting repeated subsequences of identical items Ethan Furman <ethan@stoneleaf.us> - 2016-04-20 20:57 -0700
  Re: Detecting repeated subsequences of identical items Ethan Furman <ethan@stoneleaf.us> - 2016-04-20 21:15 -0700
  Re: Detecting repeated subsequences of identical items Chris Angelico <rosuav@gmail.com> - 2016-04-21 15:37 +1000
  Re: Detecting repeated subsequences of identical items Michael Selik <michael.selik@gmail.com> - 2016-04-21 06:35 +0000
    Re: Detecting repeated subsequences of identical items Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-04-21 18:05 +1000
      Re: Detecting repeated subsequences of identical items Nobody <nobody@nowhere.invalid> - 2016-04-21 13:02 +0100
  Re: Detecting repeated subsequences of identical items Michael Selik <michael.selik@gmail.com> - 2016-04-21 06:49 +0000
  Re: Detecting repeated subsequences of identical items Vlastimil Brom <vlastimil.brom@gmail.com> - 2016-04-21 08:54 +0200
  Re: Detecting repeated subsequences of identical items Michael Selik <michael.selik@gmail.com> - 2016-04-21 07:05 +0000
  Re: Detecting repeated subsequences of identical items Alain Ketterlin <alain@universite-de-strasbourg.fr.invalid> - 2016-04-21 09:25 +0200
  Re: Detecting repeated subsequences of identical items Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-04-21 09:53 +0100
    Re: Detecting repeated subsequences of identical items Steven D'Aprano <steve@pearwood.info> - 2016-04-21 22:15 +1000
      Re: Detecting repeated subsequences of identical items Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-04-21 15:01 +0100
      Re: Detecting repeated subsequences of identical items Chris Angelico <rosuav@gmail.com> - 2016-04-22 00:12 +1000
        Re: Detecting repeated subsequences of identical items Steven D'Aprano <steve@pearwood.info> - 2016-04-23 01:00 +1000
      Re: Detecting repeated subsequences of identical items Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-04-21 15:30 +0100
      Re: Detecting repeated subsequences of identical items Chris Angelico <rosuav@gmail.com> - 2016-04-22 01:02 +1000
  Re: Detecting repeated subsequences of identical items Serhiy Storchaka <storchaka@gmail.com> - 2016-04-21 14:56 +0300

csiph-web