| From | Janis Papanagnou <janis_papanagnou+ng@hotmail.com> |
|---|---|
| Newsgroups | comp.lang.awk |
| Subject | Re: Experiences with match() subexpressions? |
| Date | 2025-04-10 13:55 +0200 |
| Organization | A noiseless patient Spider |
| Message-ID | <vt8bit$2uiq5$1@dont-email.me> |
| References | <vt7qlq$2ge70$1@dont-email.me> <vt7qs4$2gior$1@dont-email.me> <vt88s7$1ghd2$1@news.xmission.com> |
On 10.04.2025 13:08, Kenny McCormack wrote:
> In article <vt7qs4$2gior$1@dont-email.me>,
> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
>> On 10.04.2025 09:06, Janis Papanagnou wrote:
>>> I'm looking for subexpressions of regexp-matches using GNU Awk's
>>> third parameter of match(). For example
>>>
>>> data = "R=r1,R=r2,R=r3,E=e"
>>> match (data, /^(R=([^,]+),){2,5}E=(.+)$/, arr)
>>>
>>> The result stored in 'arr' seems to be determined by the static
>>> parenthesis structure, so with the pattern repetition {2,5} only
>>> the last matched data in the subexpression (r3) seems to persist
>>> in arr. - I suppose there's no cute way to achieve what I wanted?
>>
>> To clarify; what I wanted is access of the values "r1", "r2", "r3",
>> and "e" through 'arr'.
>
> I have to admit that I (still) don't really understand how this match third
> arg stuff works.
I've never used that before but it seems to be quite simple; for every
parenthesized group in the regexp it provides (statically, as the
parentheses are written, from left to right) an array element holding
the text matched by that subexpression.
> I.e., I can never predict what will happen, so I always
> just dump out the array and try to reverse-engineer it each time I need to
> use it.
>
> I adapted your code into the following test script:
>
> --- Cut Here ---
> #!/bin/sh
> gawk 'BEGIN {
> data = "R=r1,R=r2,R=r3,E=e"
> match (data, /^(R=([^,]+),){2,5}E=(.+)$/, arr)
> for (i in arr) print i,arr[i]
> }'
>
> # To clarify; what I wanted is access of the values "r1", "r2", "r3",
> # and "e" through 'arr'.
> --- Cut Here ---
>
> The output I get is:
>
> --- Cut Here ---
> 0start 1
> 0length 18
> 3start 18
> 1start 11
> 2start 13
> 3length 1
> 2length 2
> 1length 5
The output above appears because 'arr' also holds additional elements
describing the match positions (the "start" and "length" entries).
I don't need those, so I'm only interested in the data elements below,
which I iterate over with an index-counted loop...
> 0 R=r1,R=r2,R=r3,E=e
the whole expression
> 1 R=r3,
the expression in the first parenthesis
> 2 r3
the expression in the second, embedded parenthesis
> 3 e
the expression in the final parenthesis
> --- Cut Here ---
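As a minimal sketch of the index-counted loop mentioned above (show_groups
is a hypothetical helper name, and gawk is assumed to be installed), the
numeric elements can be walked from 0 upward, which leaves out the "start"
and "length" entries:

```shell
#!/bin/sh
# show_groups is a hypothetical name used only for this illustration.
show_groups() {
    gawk -v data="$1" 'BEGIN {
        if (match(data, /^(R=([^,]+),){2,5}E=(.+)$/, arr)) {
            # Count upward through the numeric subscripts only; this
            # skips the [i, "start"] and [i, "length"] position elements.
            for (i = 0; (i in arr); i++)
                print i, arr[i]
        }
    }'
}
```

Called as show_groups "R=r1,R=r2,R=r3,E=e" this should print only the
four data lines for indices 0 to 3 from the listing above.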
>
> After playing around a bit, I could not come up with any sensible way of
> getting what you want to get.
Yeah, Arnold just told me the same; that it's impossible because the
underlying GNU regexp library doesn't support what I'm looking for.
What I considered a possible workaround (in this case) is to unroll
the (...){2,5} expression into a sequence of explicit (...) groups
followed by optional (...)? groups, so that each repetition gets its
own parentheses. (But in the general case, for larger ranges than 2-5,
that's neither feasible nor sensible any more.)
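A rough sketch of that workaround for this particular 2-5 case, assuming
gawk is installed (unrolled_groups is a made-up name for illustration
only, not anything gawk provides):

```shell
#!/bin/sh
# unrolled_groups is a hypothetical name used only for this illustration.
unrolled_groups() {
    gawk -v data="$1" 'BEGIN {
        # {2,5} written out: two mandatory R= groups, then three optional ones.
        re = "^R=([^,]+),R=([^,]+),(R=([^,]+),)?(R=([^,]+),)?(R=([^,]+),)?E=(.+)$"
        if (!match(data, re, arr)) exit 1
        # Value groups are 1, 2, 4, 6, 8 (R values) and 9 (E value);
        # groups 3, 5, 7 are only the optional wrappers.
        n = split("1 2 4 6 8 9", idx, " ")
        for (k = 1; k <= n; k++)
            if ((idx[k] in arr) && arr[idx[k]] != "")
                print arr[idx[k]]
    }'
}
```

The group numbers have to be tracked by hand here, which already hints
at why this does not scale to larger ranges.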
>
> As an alternative, it sounds like you could just split the
> string on the comma; that would get you:
Yes, that was also how I did such things in the past. Only when I saw
that "third argument" to match() did I hope the two-level parsing could
be simplified into one step. The reason was that I thought I had seen
other languages (Perl, maybe?) that supported such a feature.
>
> R=r1
> R=r2
> R=r3
> E=e
>
> Or, for finer control, you could use patsplit().
I think I'll do the parsing the straightforward two-step way as I did
before the GNU Awk specific functions were available; it's probably
also the clearest way to program that functionality.
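For the record, a minimal sketch of that two-step way using only
standard awk functions (parse_record is a hypothetical name; the sketch
assumes that no value contains a comma, which the original (.+) for E
would have allowed, and its validity check is only a rough stand-in for
the {2,5} bound):

```shell
#!/bin/sh
# Two-step parse: split on commas, then split each field at the first "=".
# parse_record is a hypothetical name used only for this illustration.
parse_record() {
    awk -v data="$1" 'BEGIN {
        n = split(data, field, ",")
        for (i = 1; i <= n; i++) {
            eq = index(field[i], "=")
            key = substr(field[i], 1, eq - 1)
            val = substr(field[i], eq + 1)
            if (key == "R") nr++
            print key, val
        }
        # Roughly mirror the {2,5} bound and the trailing E= field.
        if (nr < 2 || nr > 5 || key != "E") exit 1
    }'
}

parse_record "R=r1,R=r2,R=r3,E=e"
```

The function prints one "key value" pair per line and returns a
non-zero status if the record does not have 2 to 5 R fields followed
by a final E field.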
Janis