Groups > comp.lang.java.programmer > #13190 > unrolled thread
| Started by | laredotornado <laredotornado@zipmail.com> |
|---|---|
| First post | 2012-03-26 11:54 -0700 |
| Last post | 2012-03-28 07:51 +0200 |
| Articles | 20 on this page of 50 — 13 participants |
Keeping the split token in a Java regular expression laredotornado <laredotornado@zipmail.com> - 2012-03-26 11:54 -0700
Re: Keeping the split token in a Java regular expression Lew <lewbloch@gmail.com> - 2012-03-26 12:22 -0700
Re: Keeping the split token in a Java regular expression Robert Klemme <shortcutter@googlemail.com> - 2012-03-26 22:01 +0200
Re: Keeping the split token in a Java regular expression Arne Vajhøj <arne@vajhoej.dk> - 2012-03-26 21:46 -0400
Re: Keeping the split token in a Java regular expression Robert Klemme <shortcutter@googlemail.com> - 2012-03-27 23:01 +0200
Re: Keeping the split token in a Java regular expression Arne Vajhøj <arne@vajhoej.dk> - 2012-03-27 17:18 -0400
Re: Keeping the split token in a Java regular expression Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-03-27 14:21 -0700
Re: Keeping the split token in a Java regular expression Robert Klemme <shortcutter@googlemail.com> - 2012-03-28 07:38 +0200
Re: Keeping the split token in a Java regular expression Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-03-28 10:24 -0700
Re: Keeping the split token in a Java regular expression markspace <-@.> - 2012-03-26 13:49 -0700
Re: Keeping the split token in a Java regular expression laredotornado@gmail.com - 2012-03-26 14:21 -0700
Re: Keeping the split token in a Java regular expression markspace <-@.> - 2012-03-26 15:02 -0700
Re: Keeping the split token in a Java regular expression Knute Johnson <nospam@knutejohnson.com> - 2012-03-26 15:56 -0700
Re: Keeping the split token in a Java regular expression markspace <-@.> - 2012-03-26 16:02 -0700
Re: Keeping the split token in a Java regular expression Knute Johnson <nospam@knutejohnson.com> - 2012-03-26 17:33 -0700
Re: Keeping the split token in a Java regular expression Martin Gregorie <martin@address-in-sig.invalid> - 2012-03-27 01:17 +0000
Re: Keeping the split token in a Java regular expression Martin Gregorie <martin@address-in-sig.invalid> - 2012-03-27 21:57 +0000
Re: Keeping the split token in a Java regular expression Gene Wirchenko <genew@ocis.net> - 2012-03-26 18:26 -0700
Re: Keeping the split token in a Java regular expression Lew <lewbloch@gmail.com> - 2012-03-26 19:07 -0700
Re: Keeping the split token in a Java regular expression Knute Johnson <nospam@knutejohnson.com> - 2012-03-26 20:40 -0700
Re: Keeping the split token in a Java regular expression Gene Wirchenko <genew@ocis.net> - 2012-03-27 09:10 -0700
Re: Keeping the split token in a Java regular expression Lew <lewbloch@gmail.com> - 2012-03-27 11:09 -0700
Re: Keeping the split token in a Java regular expression Gene Wirchenko <genew@ocis.net> - 2012-03-27 13:32 -0700
Re: Keeping the split token in a Java regular expression Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-03-27 14:29 -0700
Re: Keeping the split token in a Java regular expression Gene Wirchenko <genew@ocis.net> - 2012-03-27 16:22 -0700
Re: Keeping the split token in a Java regular expression Gene Wirchenko <genew@ocis.net> - 2012-03-27 18:20 -0700
Re: Keeping the split token in a Java regular expression Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-03-27 18:27 -0700
Re: Keeping the split token in a Java regular expression Gene Wirchenko <genew@ocis.net> - 2012-03-27 21:31 -0700
Re: Keeping the split token in a Java regular expression Robert Klemme <shortcutter@googlemail.com> - 2012-03-28 07:41 +0200
Re: Keeping the split token in a Java regular expression Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-03-28 10:28 -0700
Re: Keeping the split token in a Java regular expression Lew <lewbloch@gmail.com> - 2012-03-26 16:26 -0700
Re: Keeping the split token in a Java regular expression Knute Johnson <nospam@knutejohnson.com> - 2012-03-26 17:36 -0700
Re: Keeping the split token in a Java regular expression Robert Klemme <shortcutter@googlemail.com> - 2012-03-27 23:27 +0200
Re: Keeping the split token in a Java regular expression Robert Klemme <shortcutter@googlemail.com> - 2012-03-28 07:28 +0200
Re: Keeping the split token in a Java regular expression "John B. Matthews" <nospam@nospam.invalid> - 2012-03-26 20:49 -0400
Re: Keeping the split token in a Java regular expression Arne Vajhøj <arne@vajhoej.dk> - 2012-03-26 21:58 -0400
Re: Keeping the split token in a Java regular expression Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-03-26 21:14 -0700
Re: Keeping the split token in a Java regular expression Arne Vajhøj <arne@vajhoej.dk> - 2012-03-27 17:21 -0400
Re: Keeping the split token in a Java regular expression Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-03-27 15:20 -0700
Re: Keeping the split token in a Java regular expression Arne Vajhøj <arne@vajhoej.dk> - 2012-03-27 18:48 -0400
Re: Keeping the split token in a Java regular expression Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-03-27 17:07 -0700
Re: Keeping the split token in a Java regular expression Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-03-27 21:49 -0300
Re: Keeping the split token in a Java regular expression Arne Vajhøj <arne@vajhoej.dk> - 2012-03-27 20:56 -0400
Re: Keeping the split token in a Java regular expression Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-03-27 22:01 -0300
Re: Keeping the split token in a Java regular expression Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-03-27 18:27 -0700
Re: Keeping the split token in a Java regular expression Jim Janney <jjanney@shell.xmission.com> - 2012-03-27 08:15 -0600
Re: Keeping the split token in a Java regular expression laredotornado <laredotornado@zipmail.com> - 2012-03-27 07:58 -0700
Re: Keeping the split token in a Java regular expression Jim Janney <jjanney@shell.xmission.com> - 2012-03-27 09:21 -0600
Re: Keeping the split token in a Java regular expression Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-03-27 09:43 -0700
Re: Keeping the split token in a Java regular expression Robert Klemme <shortcutter@googlemail.com> - 2012-03-28 07:51 +0200
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2012-03-27 09:10 -0700 |
| Message-ID | <pfp3n79hbjpqt1d7h3cbtbf8gggms38ud8@4ax.com> |
| In reply to | #13219 |
On Mon, 26 Mar 2012 20:40:24 -0700, Knute Johnson
<nospam@knutejohnson.com> wrote:
>On 3/26/2012 7:07 PM, Lew wrote:
>> Gene Wirchenko wrote:
>>> What about "Sun 9, 11 AM, and 1 PM"?
>>> Or "Sun 9 and 11 AM, and 1 and 3 PM"?
>>>
>>> I think you had better be quite sure of all of the variants. For
>>> that matter, people often omit the comma before "and" which would give
>>> "Sun 9, 11 AM and 1 PM" for my first example. Such people have
>>> probably not seen
>>> http://www.outsidethebeltway.com/oxford-comma-cartoon/
>>> or other such references.
>>
>> The point is that you need a precise, perhaps formal statement of the
>> exact rules to parse the input, and what to do when the input format
>> fails quality checks.
>>
>> Parsing is a Dark Art in programming - not really the hardest of them,
>> but worthy of close attention.
>>
>> It does require a careful, methodical approach.
>You've been awfully poetic lately Lew.
I prefer the "new" Lew. He has dropped the antagonism that I
often saw, and it has made his posts much more readable and useful.
Sincerely,
Gene Wirchenko
| From | Lew <lewbloch@gmail.com> |
|---|---|
| Date | 2012-03-27 11:09 -0700 |
| Message-ID | <16745393.406.1332871796181.JavaMail.geo-discussion-forums@pbij6> |
| In reply to | #13228 |
Gene Wirchenko wrote:
> I prefer the "new" Lew. He has dropped the antagonism that I
> often saw, and it has made his posts much more readable and useful.
I give your preference all the consideration that it is due.
--
Lew
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2012-03-27 13:32 -0700 |
| Message-ID | <9j84n7htui6ahhd7fd6e1rudl1cnuatfjr@4ax.com> |
| In reply to | #13230 |
On Tue, 27 Mar 2012 11:09:56 -0700 (PDT), Lew <lewbloch@gmail.com>
wrote:
>Gene Wirchenko wrote:
>> I prefer the "new" Lew. He has dropped the antagonism that I
>> often saw, and it has made his posts much more readable and useful.
>
>I give your preference all the consideration that it is due.
As manners are a social lubricant and a fairly inexpensive one,
that would be quite a lot. Thank you. If you did not mean that,
consider meaning that. You are quite knowledgeable, and without an
antagonistic curve, your posts are very good indeed. This same
statement applies to many people posting on USENET.
Call my preference the USENET Manners Project if you want.
Disagreeing is one thing; being disagreeable is quite another.
http://xkcd.com/386/
is a good joke but a poor reality.
I look forward to your next politely informative post, Lew. Your
recent one clarifying a sentence of yours was very nice indeed.
Sincerely,
Gene Wirchenko
| From | Daniel Pitts <newsgroup.nospam@virtualinfinity.net> |
|---|---|
| Date | 2012-03-27 14:29 -0700 |
| Message-ID | <2jqcr.27255$QC3.7246@newsfe16.iad> |
| In reply to | #13232 |
On 3/27/12 1:32 PM, Gene Wirchenko wrote:
> On Tue, 27 Mar 2012 11:09:56 -0700 (PDT), Lew<lewbloch@gmail.com>
> wrote:
>
>> Gene Wirchenko wrote:
>>> I prefer the "new" Lew. He has dropped the antagonism that I
>>> often saw, and it has made his posts much more readable and useful.
>>
>> I give your preference all the consideration that it is due.
>
> As manners are a social lubricant and a fairly inexpensive one,
> that would be quite a lot. Thank you. If you did not mean that,
> consider meaning that. You are quite knowledgeable, and without an
> antagonistic curve, your posts are very good indeed. This same
> statement applies to many people posting on USENET.
At the same time, it is one's personal loss to ignore something because
of who said it or how it was said. Part of the problem is the jadedness
that some of the old-timers on this group have, due to certain
trolls-who-shall-not-be-named. Lew is a very analytical and structured
person; arguing facts logically, with references, is more likely to
persuade him than talking about feelings. I'm very much the same way,
though I have tried to include my understanding of psychology in my
responses.
> Call my preference the USENET Manners Project if you want.
> Disagreeing is one thing; being disagreeable is quite another.
> http://xkcd.com/386/
> is a good joke but a poor reality.
>
> I look forward to your next politely informative post, Lew. Your
> recent one clarifying a sentence of yours was very nice indeed.
I just want to point out that while your intentions *may* be good, the
tone of your message comes off just as smug as what you're attempting to
decry. I'm not trying to stir up a flame war, but I'm hoping that you
can see the other side of this as well. Lew has been a long-time
contributor to the Java newsgroups, and I have never found any of his
posts personally distasteful in any way. This is the internet, and some
slight thickness of skin is expected.
So, please, stop baiting each other, and keep these messages on topic.
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2012-03-27 16:22 -0700 |
| Message-ID | <dpi4n7tdmt9pe7l9me3heubt1g8ii3v0nm@4ax.com> |
| In reply to | #13238 |
On Tue, 27 Mar 2012 14:29:33 -0700, Daniel Pitts
<newsgroup.nospam@virtualinfinity.net> wrote:
[snip]
>At the same time, it is ones personal loss to ignore something because
>of who said it or how it was said. Part of the problem is the jadedness
One must balance the loss of missing something with the loss of
spending time trying to uncurve a response.
[snip]
>I just want to point out that while your intentions *may* be good, the
>tone of your message comes off just as smug as what you're attempting to
>decry. I'm not trying to stir up a flame war, but I'm hoping that you
>can see the other side of this as well. Lew has been a long time
>contributor to the Java newsgroups, and I have never found any of this
>posts personally distasteful in any way. This is the internet, and some
>slight thickness of skin is expected.
"slight". And that does mean that being rude is good.
>So, please, stop baiting each other, and keep these messages on topic.
I am not baiting him. I like the polite Lew. There is no reason
why people can not be polite on USENET. They just have to decide to
do so.
Sincerely,
Gene Wirchenko
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2012-03-27 18:20 -0700 |
| Message-ID | <0qp4n7pohramm6lrbvhvc08k0v5cj5lg7e@4ax.com> |
| In reply to | #13242 |
On Tue, 27 Mar 2012 16:22:29 -0700, Gene Wirchenko <genew@ocis.net>
wrote:
>On Tue, 27 Mar 2012 14:29:33 -0700, Daniel Pitts
><newsgroup.nospam@virtualinfinity.net> wrote:
>
>[snip]
>
>>At the same time, it is ones personal loss to ignore something because
>>of who said it or how it was said. Part of the problem is the jadedness
>
> One must balance the loss of missing something with the loss of
>spending time trying to uncurve a response.
>
>[snip]
>
>>I just want to point out that while your intentions *may* be good, the
>>tone of your message comes off just as smug as what you're attempting to
>>decry. I'm not trying to stir up a flame war, but I'm hoping that you
>>can see the other side of this as well. Lew has been a long time
>>contributor to the Java newsgroups, and I have never found any of this
>>posts personally distasteful in any way. This is the internet, and some
>>slight thickness of skin is expected.
>
> "slight". And that does mean that being rude is good.
^
I missed a "not" here.
>>So, please, stop baiting each other, and keep these messages on topic.
>
> I am not baiting him. I like the polite Lew. There is no reason
>why people can not be polite on USENET. They just have to decide to
>do so.
Sincerely,
Gene Wirchenko
| From | Daniel Pitts <newsgroup.nospam@virtualinfinity.net> |
|---|---|
| Date | 2012-03-27 18:27 -0700 |
| Message-ID | <yOtcr.6782$V94.4319@newsfe19.iad> |
| In reply to | #13247 |
On 3/27/12 6:20 PM, Gene Wirchenko wrote:
> On Tue, 27 Mar 2012 16:22:29 -0700, Gene Wirchenko<genew@ocis.net>
> wrote:
>> "slight". And that does mean that being rude is good.
> ^
> I missed a "not" here.
I had wondered ;-)
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2012-03-27 21:31 -0700 |
| Message-ID | <vv45n7dtnoq0k3u696eeed3sbrobig31nv@4ax.com> |
| In reply to | #13249 |
On Tue, 27 Mar 2012 18:27:58 -0700, Daniel Pitts
<newsgroup.nospam@virtualinfinity.net> wrote:
>On 3/27/12 6:20 PM, Gene Wirchenko wrote:
>> On Tue, 27 Mar 2012 16:22:29 -0700, Gene Wirchenko<genew@ocis.net>
>> wrote:
>>> "slight". And that does mean that being rude is good.
>> ^
>> I missed a "not" here.
>I had wondered ;-)
I have noted over the years, that if there is one word that
people will miss in posts, it is "not".
Sincerely,
Gene Wirchenko
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2012-03-28 07:41 +0200 |
| Message-ID | <9tfmk0F5ooU2@mid.individual.net> |
| In reply to | #13251 |
On 03/28/2012 06:31 AM, Gene Wirchenko wrote:
> On Tue, 27 Mar 2012 18:27:58 -0700, Daniel Pitts
> <newsgroup.nospam@virtualinfinity.net> wrote:
>
>> On 3/27/12 6:20 PM, Gene Wirchenko wrote:
>>> On Tue, 27 Mar 2012 16:22:29 -0700, Gene Wirchenko<genew@ocis.net>
>>> wrote:
>>>> "slight". And that does mean that being rude is good.
>>> ^
>>> I missed a "not" here.
>> I had wondered ;-)
>
> I have noted over the years, that if there is one word that
> people will miss in posts, it is "not".
I don't remember the details but I once heard that people cannot
remember "not" - seems to be a psychological thing or a "feature" of the
mind. You kind of focus on the main message and then you forget to store
the negation as well.
Kind regards
robert
| From | Daniel Pitts <newsgroup.nospam@virtualinfinity.net> |
|---|---|
| Date | 2012-03-28 10:28 -0700 |
| Message-ID | <DSHcr.14713$532.10656@newsfe14.iad> |
| In reply to | #13254 |
On 3/27/12 10:41 PM, Robert Klemme wrote:
> On 03/28/2012 06:31 AM, Gene Wirchenko wrote:
>> On Tue, 27 Mar 2012 18:27:58 -0700, Daniel Pitts
>> <newsgroup.nospam@virtualinfinity.net> wrote:
>>
>>> On 3/27/12 6:20 PM, Gene Wirchenko wrote:
>>>> On Tue, 27 Mar 2012 16:22:29 -0700, Gene Wirchenko<genew@ocis.net>
>>>> wrote:
>>>>> "slight". And that does mean that being rude is good.
>>>> ^
>>>> I missed a "not" here.
>>> I had wondered ;-)
>>
>> I have noted over the years, that if there is one word that
>> people will miss in posts, it is "not".
>
> I don't remember the details but I once heard that people cannot
> remember "not" - seems to be a psychological thing or a "feature" of the
> mind. You kind of focus on the main message and then you forget to store
> the negation as well.
I wonder if this is really a true phenomenon, or even if it is frequent
enough to contort your point to avoid negating the text of it.
If there is any chance that your point will be pulled out of context
(such as with dubious reporters), then you may want to choose your words
in such a way that the "not" isn't elided. However, in day-to-day
conversation, I think some concepts are so much easier to convey as what
they are not, instead of what they are.
| From | Lew <lewbloch@gmail.com> |
|---|---|
| Date | 2012-03-26 16:26 -0700 |
| Message-ID | <21500379.296.1332804401740.JavaMail.geo-discussion-forums@pbbpk10> |
| In reply to | #13190 |
Stefan Ram wrote:
> laredotornado writes:
>>What I would like to do is split the expression wherever I have an
>
> public class Main
> {
> public static void split
> ( final java.lang.String text )
> { java.util.regex.Pattern pattern =
> java.util.regex.Pattern.compile
> ( ".*?(?:am|pm),?", java.util.regex.Pattern.CASE_INSENSITIVE );
> java.util.regex.Matcher matcher = pattern.matcher( text );
> while( matcher.find() )
> java.lang.System.out.println( matcher.group( 0 )); }
>
> public static void main( final java.lang.String[] args )
> { split( "Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM" ); }}
This excellent (except for layout) example deserves to be archived.
--
Lew
| From | Knute Johnson <nospam@knutejohnson.com> |
|---|---|
| Date | 2012-03-26 17:36 -0700 |
| Message-ID | <jkr21q$iql$2@dont-email.me> |
| In reply to | #13206 |
On 3/26/2012 4:26 PM, Lew wrote:
> Stefan Ram wrote:
>> laredotornado writes:
>>> What I would like to do is split the expression wherever I have an
>>
>> public class Main
>> {
>> public static void split
>> ( final java.lang.String text )
>> { java.util.regex.Pattern pattern =
>> java.util.regex.Pattern.compile
>> ( ".*?(?:am|pm),?", java.util.regex.Pattern.CASE_INSENSITIVE );
>> java.util.regex.Matcher matcher = pattern.matcher( text );
>> while( matcher.find() )
>> java.lang.System.out.println( matcher.group( 0 )); }
>>
>> public static void main( final java.lang.String[] args )
>> { split( "Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM" ); }}
>
> This excellent (except for layout) example deserves to be archived.
>
I like that too. I tried it but I didn't get this.
--
Knute Johnson
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2012-03-27 23:27 +0200 |
| Message-ID | <9tepmvFhjvU1@mid.individual.net> |
| In reply to | #13206 |
On 03/27/2012 01:26 AM, Lew wrote:
> Stefan Ram wrote:
>> laredotornado writes:
>>> What I would like to do is split the expression wherever I have an
>>
>> public class Main
>> {
>> public static void split
>> ( final java.lang.String text )
>> { java.util.regex.Pattern pattern =
>> java.util.regex.Pattern.compile
>> ( ".*?(?:am|pm),?", java.util.regex.Pattern.CASE_INSENSITIVE );
>> java.util.regex.Matcher matcher = pattern.matcher( text );
>> while( matcher.find() )
>> java.lang.System.out.println( matcher.group( 0 )); }
>>
>> public static void main( final java.lang.String[] args )
>> { split( "Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM" ); }}
>
> This excellent (except for layout) example deserves to be archived.
What do you find excellent about this? I find it has some deficiencies
- the separator is included in the match (which goes against the
requirements despite the thread subject)
- spaces after a separator comma are included in the next token as
leading text
- the method really does more than splitting (namely printing), so the
name does not reflect what's going on
- the Pattern is compiled on _every_ invocation of the method
- the method is unnecessarily restricted, argument type CharSequence is
sufficient
Test output for
"Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM"
"Fri 8 PM, Sat 1, 3, and 5 PM"
Fri 7:30 PM,
Sat 2 PM,
Sun 2:30 PM
---
Fri 8 PM,
Sat 1, 3, and 5 PM
---
I would change that to
    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Main {
        private static final Pattern SPLIT_PATTERN = Pattern.compile(
                "(\\S.*?(?:am|pm))(?:,\\s*)?", Pattern.CASE_INSENSITIVE);

        public static void splitPrint(final CharSequence text) {
            for (final Matcher m = SPLIT_PATTERN.matcher(text); m.find();) {
                System.out.println(m.group(1));
            }
        }

        public static List<String> split(final CharSequence text) {
            final List<String> result = new ArrayList<String>();
            for (final Matcher m = SPLIT_PATTERN.matcher(text); m.find();) {
                result.add(m.group(1));
            }
            return result;
        }

        public static void main(final java.lang.String[] args) {
            splitPrint("Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM");
            System.out.println("---");
            splitPrint("Fri 8 PM, Sat 1, 3, and 5 PM");
            System.out.println("---");
        }
    }
I might even sneak a "\\s*" in between "pm)" and "(?:," to even catch
cases where there are spaces before the separator.
Kind regards
robert
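
The closing remark about sneaking in a "\\s*" can be sketched as follows. This is only an illustration of that suggestion, not code from the post: the class name SpaceTolerantSplit and the second test string are made up here, and the pattern is the one above with "\\s*" placed between the capturing group and the optional comma.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; the pattern is the one from the post plus "\\s*".
final class SpaceTolerantSplit {
    // Group 1 is the token; the trailing "\\s*(?:,\\s*)?" swallows optional
    // spaces before the comma, the comma itself, and spaces after it.
    private static final Pattern SPLIT_PATTERN = Pattern.compile(
            "(\\S.*?(?:am|pm))\\s*(?:,\\s*)?", Pattern.CASE_INSENSITIVE);

    static List<String> split(final CharSequence text) {
        final List<String> result = new ArrayList<String>();
        for (final Matcher m = SPLIT_PATTERN.matcher(text); m.find();) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(final String[] args) {
        // Made-up input with a space before one comma and none after another.
        for (final String token : split("Fri 7:30 PM , Sat 2 PM,Sun 2:30 PM")) {
            System.out.println(token);
        }
    }
}
```

Without the extra "\\s*", the comma in "PM , Sat" would not be consumed by the first match, so the next token would start at the comma instead of at "Sat".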
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2012-03-28 07:28 +0200 |
| Message-ID | <9tflrdF259U1@mid.individual.net> |
| In reply to | #13237 |
On 03/27/2012 11:27 PM, Robert Klemme wrote:
> On 03/27/2012 01:26 AM, Lew wrote:
>> Stefan Ram wrote:
>>> laredotornado writes:
>>>> What I would like to do is split the expression wherever I have an
>>>
>>> public class Main
...
>>
>> This excellent (except for layout) example deserves to be archived.
>
> What do you find excellent about this? I find it has some deficiencies
>
> - the separator is included in the match (which goes against the
> requirements despite the thread subject)
> - spaces after a separator comma are included in the next token as
> leading text
> - the method really does more than splitting (namely printing), so the
> name does not reflect what's going on
> - the Pattern is compiled on _every_ invocation of the method
> - the method is unnecessary restricted, argument type CharSequence is
> sufficient
>
> Test output for
> "Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM"
> "Fri 8 PM, Sat 1, 3, and 5 PM"
>
> Fri 7:30 PM,
> Sat 2 PM,
> Sun 2:30 PM
> ---
> Fri 8 PM,
> Sat 1, 3, and 5 PM
> ---
>
> I would change that to
    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Main {
        private static final Pattern SPLIT_PATTERN = Pattern.compile(
                "(\\S.*?[ap]m)(?:,\\s*)?", Pattern.CASE_INSENSITIVE);

        public static void splitPrint(final CharSequence text) {
            for (final Matcher m = SPLIT_PATTERN.matcher(text); m.find();) {
                System.out.println(m.group(1));
            }
        }

        public static List<String> split(final CharSequence text) {
            final List<String> result = new ArrayList<String>();
            for (final Matcher m = SPLIT_PATTERN.matcher(text); m.find();) {
                result.add(m.group(1));
            }
            return result;
        }

        public static void main(final java.lang.String[] args) {
            splitPrint("Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM");
            System.out.println("---");
            splitPrint("Fri 8 PM, Sat 1, 3, and 5 PM");
            System.out.println("---");
        }
    }
I had overlooked a fairly obvious improvement with regard to am/pm parsing.
> I might even sneak a "\\s*" in between "pm)" and "(?:," to even catch
> cases where there are spaces before the separator.
Kind regards
robert
| From | "John B. Matthews" <nospam@nospam.invalid> |
|---|---|
| Date | 2012-03-26 20:49 -0400 |
| Message-ID | <nospam-884B21.20492426032012@news.aioe.org> |
| In reply to | #13190 |
In article
<48d35bc3-a391-4ccf-a222-dac64775a2f2@oq7g2000pbb.googlegroups.com>,
laredotornado <laredotornado@zipmail.com> wrote:
> I'm using Java 6. I want to split a Java string on a regular
> expression, but I would like to keep part of the string used to split
> in the results. What I have are Strings like
>
> Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM
>
> What I would like to do is split the expression wherever I have an
> expression matching /(am|pm),?/i . Hopefully I got that right. In
> the above example, I would like the results to be
>
> Fri 7:30 PM
> Sat 2 PM
> Sun 2:30 PM
>
> But with String.split, the split token is not kept within the
> results. How would I write a Java parsing expression to do what I
> want?
Instead of split, why not parse and format?
--
John B. Matthews
trashgod at gmail dot com
<http://sites.google.com/site/drjohnbmatthews>
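
One possible reading of "parse and format", sketched under assumptions: each entry is taken to be a three-letter day, an hour, optional minutes, and an AM/PM marker. The class name ShowtimeParser, the entry pattern, and the normalized "h:mm" output are all illustrative choices, not something proposed in the thread. Inputs such as "Sat 1, 3, and 5 PM" would need more rules than this sketch has.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the fields out of each entry, then re-formats them.
final class ShowtimeParser {
    // Groups: 1 = day name, 2 = hour, 3 = minutes (optional), 4 = AM/PM marker.
    private static final Pattern ENTRY = Pattern.compile(
            "([A-Za-z]{3})\\s+(\\d{1,2})(?::(\\d{2}))?\\s*([AP]M)",
            Pattern.CASE_INSENSITIVE);

    static List<String> parseAndFormat(final CharSequence text) {
        final List<String> result = new ArrayList<String>();
        final Matcher m = ENTRY.matcher(text);
        while (m.find()) {
            final String day = m.group(1);
            final int hour = Integer.parseInt(m.group(2));
            // A missing minutes field is treated as :00 (an assumption).
            final int minute = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
            final String marker = m.group(4).toUpperCase(Locale.ROOT);
            result.add(String.format(Locale.ROOT, "%s %d:%02d %s", day, hour, minute, marker));
        }
        return result;
    }

    public static void main(final String[] args) {
        for (final String entry : parseAndFormat("Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM")) {
            System.out.println(entry);
        }
    }
}
```

Because the fields are parsed rather than copied, the separator never has to be "kept"; the price is that "Sat 2 PM" comes back as "Sat 2:00 PM" unless the format step is written to preserve the original shape.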
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-03-26 21:58 -0400 |
| Message-ID | <4f711ee1$0$294$14726298@news.sunsite.dk> |
| In reply to | #13190 |
On 3/26/2012 2:54 PM, laredotornado wrote:
> I'm using Java 6. I want to split a Java string on a regular
> expression, but I would like to keep part of the string used to split
> in the results. What I have are Strings like
>
> Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM
>
> What I would like to do is split the expression wherever I have an
> expression matching /(am|pm),?/i . Hopefully I got that right. In
> the above example, I would like the results to be
>
> Fri 7:30 PM
> Sat 2 PM
> Sun 2:30 PM
>
> But with String.split, the split token is not kept within the
> results. How would I write a Java parsing expression to do what I
> want?
A hackish solution:
String[] p = s.replaceAll("[AP]M", "$0X$0").split("X[AP]M");
Arne
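
For comparison, here is the one-liner above next to a lookbehind-based split. The lookbehind variant is an assumed alternative, not something posted in the thread, and the class and method names are made up. With the sample string, the one-liner leaves the ", " separator at the front of the second and third pieces, while the lookbehind version consumes the comma and whitespace and keeps the AM/PM marker.

```java
import java.util.Arrays;

// Hypothetical class and method names, for illustration only.
final class KeepTokenSplit {
    // The one-liner from the post, unchanged: duplicate each AM/PM marker
    // around an "X", then split on the "X" plus the second copy.
    // Assumes the input does not already contain "XAM" or "XPM".
    static String[] hackSplit(final String s) {
        return s.replaceAll("[AP]M", "$0X$0").split("X[AP]M");
    }

    // Assumed alternative: split on the optional comma and whitespace that
    // follow an AM/PM marker. The lookbehind (?<=[ap]m) is zero-width, so the
    // marker stays in the preceding piece.
    static String[] lookbehindSplit(final String s) {
        return s.split("(?i)(?<=[ap]m),?\\s*");
    }

    public static void main(final String[] args) {
        final String s = "Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM";
        System.out.println(Arrays.toString(hackSplit(s)));
        System.out.println(Arrays.toString(lookbehindSplit(s)));
    }
}
```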
| From | Daniel Pitts <newsgroup.nospam@virtualinfinity.net> |
|---|---|
| Date | 2012-03-26 21:14 -0700 |
| Message-ID | <K8bcr.41913$%P4.6823@newsfe05.iad> |
| In reply to | #13215 |
On 3/26/12 6:58 PM, Arne Vajhøj wrote:
> On 3/26/2012 2:54 PM, laredotornado wrote:
>> I'm using Java 6. I want to split a Java string on a regular
>> expression, but I would like to keep part of the string used to split
>> in the results. What I have are Strings like
>>
>> Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM
>>
>> What I would like to do is split the expression wherever I have an
>> expression matching /(am|pm),?/i . Hopefully I got that right. In
>> the above example, I would like the results to be
>>
>> Fri 7:30 PM
>> Sat 2 PM
>> Sun 2:30 PM
>>
>> But with String.split, the split token is not kept within the
>> results. How would I write a Java parsing expression to do what I
>> want?
>
> A hackish solution:
>
> String[] p = s.replaceAll("[AP]M", "$0X$0").split("X[AP]M");
>
> Arne
>
Nice. As far as hackish, using "split" for this purpose at all is
hackish. Stefan Ram had the right algorithm (though strange formatting)
Stefan Ram wrote:
> public class Main
> {
> public static void split
> ( final java.lang.String text )
> { java.util.regex.Pattern pattern =
> java.util.regex.Pattern.compile
> ( ".*?(?:am|pm),?", java.util.regex.Pattern.CASE_INSENSITIVE );
> java.util.regex.Matcher matcher = pattern.matcher( text );
> while( matcher.find() )
> java.lang.System.out.println( matcher.group( 0 )); }
>
> public static void main( final java.lang.String[] args )
> { split( "Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM" ); }}
>
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-03-27 17:21 -0400 |
| Message-ID | <4f722f41$0$290$14726298@news.sunsite.dk> |
| In reply to | #13220 |
On 3/27/2012 12:14 AM, Daniel Pitts wrote:
> On 3/26/12 6:58 PM, Arne Vajhøj wrote:
>> On 3/26/2012 2:54 PM, laredotornado wrote:
>>> I'm using Java 6. I want to split a Java string on a regular
>>> expression, but I would like to keep part of the string used to split
>>> in the results. What I have are Strings like
>>>
>>> Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM
>>>
>>> What I would like to do is split the expression wherever I have an
>>> expression matching /(am|pm),?/i . Hopefully I got that right. In
>>> the above example, I would like the results to be
>>>
>>> Fri 7:30 PM
>>> Sat 2 PM
>>> Sun 2:30 PM
>>>
>>> But with String.split, the split token is not kept within the
>>> results. How would I write a Java parsing expression to do what I
>>> want?
>>
>> A hackish solution:
>>
>> String[] p = s.replaceAll("[AP]M", "$0X$0").split("X[AP]M");
>
> Nice. As far as hackish, using "split" for this purpose at all is
> hackish.
That type of split is the typical way in most modern languages
(though usually in a non regex flavor).
Arne
| From | Daniel Pitts <newsgroup.nospam@virtualinfinity.net> |
|---|---|
| Date | 2012-03-27 15:20 -0700 |
| Message-ID | <y2rcr.42984$%P4.35732@newsfe05.iad> |
| In reply to | #13235 |
On 3/27/12 2:21 PM, Arne Vajhøj wrote:
> On 3/27/2012 12:14 AM, Daniel Pitts wrote:
>> On 3/26/12 6:58 PM, Arne Vajhøj wrote:
>>> On 3/26/2012 2:54 PM, laredotornado wrote:
>>>> I'm using Java 6. I want to split a Java string on a regular
>>>> expression, but I would like to keep part of the string used to split
>>>> in the results. What I have are Strings like
>>>>
>>>> Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM
>>>>
>>>> What I would like to do is split the expression wherever I have an
>>>> expression matching /(am|pm),?/i . Hopefully I got that right. In
>>>> the above example, I would like the results to be
>>>>
>>>> Fri 7:30 PM
>>>> Sat 2 PM
>>>> Sun 2:30 PM
>>>>
>>>> But with String.split, the split token is not kept within the
>>>> results. How would I write a Java parsing expression to do what I
>>>> want?
>>>
>>> A hackish solution:
>>>
>>> String[] p = s.replaceAll("[AP]M", "$0X$0").split("X[AP]M");
> >
>> Nice. As far as hackish, using "split" for this purpose at all is
>> hackish.
>
> That type of split is the typical way in most modern languages
> (though usually in a non regex flavor).
For functional languages, yes, but those modern languages don't
necessarily return an array. Ideally they would return "iterable" of
some sort.
And in any case, this particular problem is not a "split" kind of
problem, but a "parse" kind of problem. So, split for this is hackish,
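
A minimal sketch of what an "iterable of some sort" could look like in Java 6, assuming the caller supplies a pattern whose group 1 is the token, as in Robert Klemme's version elsewhere in this thread. The class name MatchIterable is hypothetical. Matches are found lazily, one per call to next(), instead of being collected into an array up front.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class: exposes group 1 of each successive match as an Iterable.
final class MatchIterable implements Iterable<String> {
    private final Pattern pattern;
    private final CharSequence text;

    MatchIterable(final Pattern pattern, final CharSequence text) {
        this.pattern = pattern;
        this.text = text;
    }

    public Iterator<String> iterator() {
        final Matcher m = pattern.matcher(text);
        return new Iterator<String>() {
            private boolean looked; // true once find() has run for the pending element
            private boolean found;  // result of that find()

            public boolean hasNext() {
                if (!looked) {
                    found = m.find();
                    looked = true;
                }
                return found;
            }

            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                looked = false; // force a fresh find() for the following element
                return m.group(1);
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    public static void main(final String[] args) {
        final Pattern p = Pattern.compile(
                "(\\S.*?[ap]m)(?:,\\s*)?", Pattern.CASE_INSENSITIVE);
        for (final String piece : new MatchIterable(p, "Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM")) {
            System.out.println(piece);
        }
    }
}
```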
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-03-27 18:48 -0400 |
| Message-ID | <4f7243cf$0$289$14726298@news.sunsite.dk> |
| In reply to | #13240 |
On 3/27/2012 6:20 PM, Daniel Pitts wrote:
> On 3/27/12 2:21 PM, Arne Vajhøj wrote:
>> On 3/27/2012 12:14 AM, Daniel Pitts wrote:
>>> On 3/26/12 6:58 PM, Arne Vajhøj wrote:
>>>> On 3/26/2012 2:54 PM, laredotornado wrote:
>>>>> I'm using Java 6. I want to split a Java string on a regular
>>>>> expression, but I would like to keep part of the string used to split
>>>>> in the results. What I have are Strings like
>>>>>
>>>>> Fri 7:30 PM, Sat 2 PM, Sun 2:30 PM
>>>>>
>>>>> What I would like to do is split the expression wherever I have an
>>>>> expression matching /(am|pm),?/i . Hopefully I got that right. In
>>>>> the above example, I would like the results to be
>>>>>
>>>>> Fri 7:30 PM
>>>>> Sat 2 PM
>>>>> Sun 2:30 PM
>>>>>
>>>>> But with String.split, the split token is not kept within the
>>>>> results. How would I write a Java parsing expression to do what I
>>>>> want?
>>>>
>>>> A hackish solution:
>>>>
>>>> String[] p = s.replaceAll("[AP]M", "$0X$0").split("X[AP]M");
>> >
>>> Nice. As far as hackish, using "split" for this purpose at all is
>>> hackish.
>>
>> That type of split is the typical way in most modern languages
>> (though usually in a non regex flavor).
> For functional languages, yes, but those modern languages don't
> necessarily return an array. Ideally they would return "iterable" of
> some sort.
.NET String Split return string[] (non regex)
.NET Regex Split return string[] (regex)
PHP split return array (regex)
PHP explode return array (non regex)
PHP preg_split return array (regex)
> And in any case, this particular problem is not a "split" kind of
> problem, but a "parse" kind of problem. So, split for this is hackish,
I think it would be rather common in practice.
Arne