Re: Extracting parts of string between anchor points

From Denis McMahon <denismfmcmahon@gmail.com>
Newsgroups comp.lang.python
Subject Re: Extracting parts of string between anchor points
Date Fri, 28 Feb 2014 00:55:01 +0000 (UTC)
Organization A noiseless patient Spider
Lines 42
Message-ID <leomp5$7ev$1@dont-email.me>
References <mailman.7437.1393531705.18130.python-list@python.org>

On Thu, 27 Feb 2014 20:07:56 +0000, Jignesh Sutar wrote:

> I've kind of got this working but my code is very ugly. I'm sure it's
> regular expression I need to achieve this more but not very familiar
> with use regex, particularly retaining part of the string that is being
> searched/matched for.
> 
> Notes and code below to demonstrate what I am trying to achieve. Any
> help,
> much appreciated.

It seems you have a string which may be split into between 1 and 3
substrings by the presence of up to 2 delimiters, and that if both
delimiters are present, they are in a specified order.

You have several possible cases which, broadly speaking, break down into 
4 groups:

(a) no delimiters present
(b) delimiter 1 present
(c) delimiter 2 present
(d) both delimiters present
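
As a rough sketch of that breakdown, something like the following would
sort a string into one of the four groups. The delimiter strings and the
name classify() are placeholders chosen for illustration, not taken from
the original question:

```python
# Hypothetical delimiter strings; substitute the real anchor text.
DELIM1 = "<delim1>"
DELIM2 = "<delim2>"


def classify(s, d1=DELIM1, d2=DELIM2):
    """Return 'a', 'b', 'c' or 'd' for the four broad groups."""
    has1 = d1 in s
    has2 = d2 in s
    if has1 and has2:
        return "d"  # both delimiters present
    if has1:
        return "b"  # only delimiter 1 present
    if has2:
        return "c"  # only delimiter 2 present
    return "a"      # no delimiters present


for sample in ("text", "x<delim1>y", "x<delim2>y", "x<delim1>y<delim2>z"):
    print(classify(sample), repr(sample))
```

This only tells you which group you are in; it says nothing yet about
whether the pieces around the delimiters are empty.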

It is important when coding for such scenarios to consider the possible 
cases that are not specified, as well as the ones that are.

For example, consider the string:

"<delim1><delim2>"

where you have both delims, in sequence, but no other data elements.
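
A minimal sketch of how such edge cases show up, using str.partition()
rather than a regex. This is not the code from the file linked further
down; split_anchors() and its return convention (None for "delimiter
absent", '' for "delimiter present but nothing there") are assumptions
made for the example, and it assumes delimiter 2 never precedes
delimiter 1:

```python
def split_anchors(s, d1="<delim1>", d2="<delim2>"):
    """Split s into (before, between, after) around d1 and d2.

    A part is None when the delimiter that would introduce it is absent,
    and '' when the delimiter is present but no text sits next to it.
    Assumes d2, if present, comes after d1.
    """
    before, sep1, rest = s.partition(d1)
    if not sep1:
        # No d1 at all: split on d2 only, there is no 'between' part.
        before, sep2, after = s.partition(d2)
        return before, None, (after if sep2 else None)
    between, sep2, after = rest.partition(d2)
    return before, between, (after if sep2 else None)


print(split_anchors("<delim1><delim2>"))   # ('', '', '')
print(split_anchors("a<delim1>b"))         # ('a', 'b', None)
print(split_anchors("a<delim2>b"))         # ('a', None, 'b')
print(split_anchors("abc"))                # ('abc', None, None)
```

Distinguishing None from '' is one way to keep "delimiter missing" and
"delimiter present with empty data" apart; whether that distinction
matters depends on what the data is supposed to mean.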

I believe there are at least 17 possible combinations, and maybe another 
8 if you allow for the delims being out of sequence.

The code in the file at the url below processes 17 different cases. It 
may help, or it may confuse.

http://www.sined.co.uk/tmp/strparse.py.txt

-- 
Denis McMahon, denismfmcmahon@gmail.com
