Re: Keeping the split token in a Java regular expression

From Robert Klemme <shortcutter@googlemail.com>
Newsgroups comp.lang.java.programmer
Subject Re: Keeping the split token in a Java regular expression
Date Wed, 28 Mar 2012 07:28:13 +0200
Message-ID <9tflrdF259U1@mid.individual.net>


On 03/27/2012 11:27 PM, Robert Klemme wrote:
> On 03/27/2012 01:26 AM, Lew wrote:
>> Stefan Ram wrote:
>>> laredotornado writes:
>>>> What I would like to do is split the expression wherever I have an
>>>
>>> public class Main
...
>>
>> This excellent (except for layout) example deserves to be archived.
>
> What do you find excellent about this? I find it has some deficiencies:
>
> - the separator is included in the match (which goes against the
> requirements despite the thread subject)
> - spaces after a separator comma are included in the next token as
> leading text
> - the method really does more than splitting (namely printing), so the
> name does not reflect what's going on
> - the Pattern is compiled on _every_ invocation of the method
> - the method is unnecessarily restricted, argument type CharSequence is
> sufficient
>
> Test output for
> "Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM"
> "Fri 8 PM, Sat 1, 3, and 5 PM"
>
> Fri 7:30 PM,
> Sat 2 PM,
> Sun 2:30 PM
> ---
> Fri 8 PM,
> Sat 1, 3, and 5 PM
> ---
>
> I would change that to

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
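     // Compiled once. Group 1 is the token: it starts at a non-space
     // character and runs lazily up to the first "am"/"pm" (any case).
     // The optional non-capturing group consumes the separator comma and
     // the spaces after it.
     private static final Pattern SPLIT_PATTERN = Pattern.compile(
             "(\\S.*?[ap]m)(?:,\\s*)?", Pattern.CASE_INSENSITIVE);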

     public static void splitPrint(final CharSequence text) {
         for (final Matcher m = SPLIT_PATTERN.matcher(text); m.find();) {
             System.out.println(m.group(1));
         }
     }

     public static List<String> split(final CharSequence text) {
         final List<String> result = new ArrayList<String>();

         for (final Matcher m = SPLIT_PATTERN.matcher(text); m.find();) {
             result.add(m.group(1));
         }

         return result;
     }

     public static void main(final String[] args) {
         splitPrint("Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM");
         System.out.println("---");
         splitPrint("Fri 8 PM, Sat 1, 3, and 5 PM");
         System.out.println("---");
     }
}

I had overlooked a fairly obvious improvement with regard to am/pm parsing.

> I might even sneak a "\\s*" in between "pm)" and "(?:," to even catch
> cases where there are spaces before the separator.
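
A minimal sketch of that variant, for illustration only: the class name TolerantSplit and the test string are made up here, and the only change to the pattern is the extra "\\s*" after the capturing group. With the pattern as posted, an input such as "Fri 7:30 PM , Sat 2 PM" would produce a second token starting with ", " because the comma is the next non-space character; the extra "\\s*" should avoid that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, not part of the original post.
class TolerantSplit {
    // Same pattern as in the posted code, plus "\\s*" between the token
    // group and the optional separator, so that whitespace in front of a
    // comma is skipped as well.
    private static final Pattern SPLIT_PATTERN = Pattern.compile(
            "(\\S.*?[ap]m)\\s*(?:,\\s*)?", Pattern.CASE_INSENSITIVE);

    public static List<String> split(final CharSequence text) {
        final List<String> result = new ArrayList<String>();

        for (final Matcher m = SPLIT_PATTERN.matcher(text); m.find();) {
            result.add(m.group(1));
        }

        return result;
    }

    public static void main(final String[] args) {
        // Made-up input with spaces before the separators.
        for (final String token : split("Fri 7:30 PM , Sat 2 PM ,Sun 2:30 PM")) {
            System.out.println(token);
        }
    }
}
```

The two test strings from the earlier post go through this pattern unchanged, since they have no whitespace before a comma.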

Kind regards

	robert
