| From | Eric Sosman <esosman@ieee-dot-org.invalid> |
|---|---|
| Newsgroups | comp.lang.java.programmer |
| Subject | Re: regex capability |
| Date | 2011-04-04 08:03 -0400 |
| Organization | A noiseless patient Spider |
| Message-ID | <incc59$dn$1@dont-email.me> |
| References | <u3sip61i207oesd83ckbrt3vjm66p948kd@4ax.com> <jYqdnTTVm5ob6QTQnZ2dnUVZ87-dnZ2d@telenor.com> <futip6thcc2sshjanfkv9s18hqeo69qrsn@4ax.com> |
On 4/4/2011 3:50 AM, Roedy Green wrote:
> On Mon, 04 Apr 2011 02:34:30 -0500, Leif Roar Moldskred
> <leifm@dimnakorr.com> wrote, quoted or indirectly quoted someone who
> said :
>
>>
>> Easiest is to just use split. You can always do a regex of the type
>> "(\\d+)/((\\d+)/)?((\\d+)/)?((\\d+)/)?" but that's just pointlessly
>> complicated. There's no reason why you should use a regex when "normal"
>> string parsing is simpler and easier to read.
>
> (xxx|yyy)+ seems to generate only one group item, no matter how many
> repetitions there are. That strikes me as a bug, but likely someone
> can explain why it is a feature or inevitability.
A (section of a) regex matches a (section of a) string, and the
Matcher machinery can tell you what substring was matched. A group
that repeats still has only one slot, so group(n) reports the text
of its most recent repetition. The machinery has no provision for
doing further processing on that matched substring, like saying
"Oh, your regex didn't match a string this time, but an array of
strings."
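To make that concrete, here is a minimal sketch of what java.util.regex does with the (xxx|yyy)+ pattern from the question. The class and method names (RepeatedGroupDemo, lastCapture, allRepetitions) are made up for illustration; only Pattern and Matcher are real API. The second method shows one possible workaround, which is to loop with find() over the inner alternative instead of repeating the group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedGroupDemo {

    // Matches the whole input against (xxx|yyy)+ and returns group(1).
    // The pattern has exactly one capturing group, so there is one slot,
    // and it holds whatever the group matched most recently.
    static String lastCapture(String input) {
        Matcher m = Pattern.compile("(xxx|yyy)+").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // Workaround sketch: drop the repetition from the pattern and let
    // find() walk the input, collecting one match per iteration.
    // Note that find() skips over text that does not match, so this
    // does not validate the input the way matches() does.
    static List<String> allRepetitions(String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = Pattern.compile("xxx|yyy").matcher(input);
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(lastCapture("xxxxxxyyy"));    // yyy
        System.out.println(allRepetitions("xxxxxxyyy")); // [xxx, xxx, yyy]
    }
}
```

Three repetitions went in, but group(1) gives back only the final "yyy"; the earlier two are overwritten as the match proceeds.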
You could, perhaps, cook up substitutes for Pattern and Matcher
to do such a thing. But I'm not sure you'd want to, because it
could make the API rather complicated. For example, consider a
fanex (for "fancy expression," like "regular expression" only
more so) along the lines of "(pat1)(pat2)" where "pat1" and "pat2"
can match and return arrays of substrings. The FancyMatcher says
"I matched five substrings." So you call group(3) to get the
third of them -- was it matched by "pat1" or by "pat2"? Yes, you
could invent an API to deal with this -- maybe FancyMatcher returns
a tree of nodes that point to other nodes and/or to substrings --
but I'm not confident this would be an unqualified improvement.
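
For what it's worth, a rough sketch of the nested-result idea, built on the ordinary Pattern and Matcher rather than replacing them. Everything named here (FancyMatchSketch, match, repetitions) is hypothetical, and it assumes each sub-pattern is simple enough that rescanning its matched text with find() splits it the same way the original match did, which is not true of every regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FancyMatchSketch {

    // Hypothetical helper: matches input against (pat1)+(pat2)+ and
    // returns one list of repetitions per sub-pattern, so the caller
    // can tell which sub-pattern produced each substring.
    // Returns an empty list if the input does not match as a whole.
    static List<List<String>> match(String pat1, String pat2, String input) {
        Pattern whole = Pattern.compile(
                "((?:" + pat1 + ")+)((?:" + pat2 + ")+)");
        Matcher m = whole.matcher(input);
        List<List<String>> result = new ArrayList<>();
        if (!m.matches()) {
            return result;
        }
        result.add(repetitions(pat1, m.group(1)));
        result.add(repetitions(pat2, m.group(2)));
        return result;
    }

    // Rescans the text captured for one sub-pattern and collects each
    // individual repetition (assumption: find() splits it unambiguously).
    static List<String> repetitions(String pat, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(pat).matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Five substrings in total, grouped by the sub-pattern that matched them.
        System.out.println(match("a\\d", "b\\d", "a1a2a3b1b2"));
        // [[a1, a2, a3], [b1, b2]]
    }
}
```

With a flat list the "third of five" would be ambiguous; here it is plainly the last entry of the first sub-list. The price is that the caller now walks a list of lists, and deeper nesting in the pattern would need a deeper structure still.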
--
Eric Sosman
esosman@ieee-dot-org.invalid