Split on multiple delimiters, and also treat consecutive delimiters as a single delimiter?

Newsgroups comp.lang.python
Date Tue, 28 Jul 2015 06:55:08 -0700 (PDT)
Subject Split on multiple delimiters, and also treat consecutive delimiters as a single delimiter?
From Victor Hooi <victorhooi@gmail.com>

I have a line that looks like this:

    14     *0    330     *0     760   411|0       0   770g  1544g   117g   1414 computedshopcartdb:103.5%          0      30|0     0|1    19m    97m  1538 ComputedCartRS  PRI   09:40:26

I'd like to split this line on multiple separators - in this case, consecutive whitespace, as well as the pipe symbol (|).

If I run .split() on the line, it will split on consecutive whitespace:

In [17]: f.split()
Out[17]:
['14',
 '*0',
 '330',
 '*0',
 '760',
 '411|0',
 '0',
 '770g',
 '1544g',
 '117g',
 '1414',
 'computedshopcartdb:103.5%',
 '0',
 '30|0',
 '0|1',
 '19m',
 '97m',
 '1538',
 'ComputedCartRS',
 'PRI',
 '09:40:26']

If I try to run .split(' |'), however, I get:

In [18]: f.split(' |')
Out[18]: ['    14     *0    330     *0     760   411|0       0   770g  1544g   117g   1414 computedshopcartdb:103.5%          0      30|0     0|1    19m    97m  1538 ComputedCartRS  PRI   09:40:26']

I know the re module also has a split function. Unfortunately, it does not collapse consecutive whitespace:

In [19]: re.split(' |', f)
Out[19]:
['',
 '',
 '',
 '',
 '14',
 '',
 '',
 '',
 '',
 '*0',
 '',
 '',
 '',
 '330',
 '',
 '',
 '',
 '',
 '*0',
 '',
 '',
 '',
 '',
 '760',
 '',
 '',
 '411|0',
 '',
 '',
 '',
 '',
 '',
 '',
 '0',
 '',
 '',
 '770g',
 '',
 '1544g',
 '',
 '',
 '117g',
 '',
 '',
 '1414',
 'computedshopcartdb:103.5%',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '0',
 '',
 '',
 '',
 '',
 '',
 '30|0',
 '',
 '',
 '',
 '',
 '0|1',
 '',
 '',
 '',
 '19m',
 '',
 '',
 '',
 '97m',
 '',
 '1538',
 'ComputedCartRS',
 '',
 'PRI',
 '',
 '',
 '09:40:26']

Is there an easy way to split on multiple characters, and also treat consecutive delimiters as a single delimiter?
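
One possible approach, offered as a sketch rather than a definitive answer: in a regular expression an unescaped | means alternation, so the pattern ' |' reads as "a space or the empty string", which would explain why the pipes survive in the output above. Putting the delimiters in a character class and adding + should make any run of whitespace or pipes count as a single separator. The names line, fields and fields_alt are only illustrative, and the sample string is the line from this post.

```python
import re

# The sample line from the post, split across literals for readability.
line = ("    14     *0    330     *0     760   411|0       0   770g  1544g"
        "   117g   1414 computedshopcartdb:103.5%          0      30|0"
        "     0|1    19m    97m  1538 ComputedCartRS  PRI   09:40:26")

# [\s|] matches one whitespace character or a literal pipe (inside a
# character class, | is not alternation). The + makes a run of such
# characters act as one delimiter. strip() drops the leading and trailing
# whitespace first so no empty string appears at either end.
fields = re.split(r"[\s|]+", line.strip())

# Regex-free alternative: turn every pipe into a space, then let
# str.split() with no argument collapse the whitespace.
fields_alt = line.replace("|", " ").split()

print(fields)
```

Both variants should give the same list here, with '411|0', '30|0' and '0|1' each broken into two fields. The replace-then-split form only works because whitespace is already one of the delimiters; the regex form generalizes to other delimiter sets by changing the character class.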
