From: Chris Angelico
Cc: "python-list@python.org"
Newsgroups: comp.lang.python
Date: Wed, 29 Jul 2015 10:08:21 +1000
Subject: Re: Split on multiple delimiters, and also treat consecutive delimiters as a single delimiter?

On Tue, Jul 28, 2015 at 11:55 PM, Victor Hooi wrote:
> I have a line that looks like this:
>
> 14 *0 330 *0 760 411|0 0 770g 1544g 117g 1414 computedshopcartdb:103.5% 0 30|0 0|1 19m 97m 1538 ComputedCartRS PRI 09:40:26
>
> I'd like to split this line on multiple separators - in this case, consecutive whitespace, as well as the pipe symbol (|).

Correct me if I'm misanalyzing this, but it sounds to me like a simple
transform-then-split would do the job:

f.replace("|"," ").split()

Turn those pipe characters into spaces, then split on whitespace. Or,
reading it differently: Declare that pipe is another form of
whitespace, then split on whitespace. Python lets you declare anything
you like, same as mathematics does :)

ChrisA
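
To make the transform-then-split idea concrete, here is a minimal sketch run against the sample line from the question. The function names split_fields and split_fields_re are made up for illustration; the second one is not from the post above, just a regex-based alternative using the standard library's re.split, shown for comparison.

```python
import re

# Sample line quoted in the original question.
line = ("14 *0 330 *0 760 411|0 0 770g 1544g 117g 1414 "
        "computedshopcartdb:103.5% 0 30|0 0|1 19m 97m 1538 "
        "ComputedCartRS PRI 09:40:26")


def split_fields(text):
    """Turn pipes into spaces, then split on runs of whitespace.

    str.split() with no arguments already collapses consecutive
    whitespace and ignores leading/trailing whitespace.
    """
    return text.replace("|", " ").split()


def split_fields_re(text):
    """Alternative (an assumption, not the approach in the post):
    split on any run of whitespace and/or pipe characters.

    re.split can yield empty strings when the text starts or ends
    with a delimiter, so those are filtered out.
    """
    return [tok for tok in re.split(r"[\s|]+", text) if tok]


fields = split_fields(line)
print(len(fields))
print(fields[5:8])
```

Both functions should give the same list for this input. The replace-then-split version is probably the easier one to read when pipe is the only extra delimiter; the regex version may be handier if the set of delimiters grows.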