Re: substr() - copying or not copying, that is here the question.

From Janis Papanagnou <janis_papanagnou+ng@hotmail.com>
Newsgroups comp.lang.awk
Subject Re: substr() - copying or not copying, that is here the question.
Date 2025-06-01 13:43 +0200
Organization A noiseless patient Spider
Message-ID <101hecq$22ab2$1@dont-email.me>
References <101f9oo$18edp$1@dont-email.me> <683b5389$0$683$14726298@news.sunsite.dk> <101fv4s$1g5c8$1@dont-email.me> <87h60zrbea.fsf@bsb.me.uk>


On 01.06.2025 12:42, Ben Bacarisse wrote:
> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
> 
>> On 31.05.2025 21:07, Mack The Knife wrote:
>>> In article <101f9oo$18edp$1@dont-email.me>,
>>> Janis Papanagnou  <janis_papanagnou+ng@hotmail.com> wrote:
>>>> In the context   p=index(substr(t,s),r)
>>>> it would not be necessary to copy the substr(t,s),
>>>> the index() function could operate on the original
>>>> using some access "descriptor" (say, a pointer and
>>>> a length) in read-only mode.
>>>>
>>>> Will (GNU) Awk do a copy of the data value or does
>>>> it use a read-only descriptor access to the already
>>>> existing substring of variable "t"?
>>>>
>>>> Currently I'm playing with some huge data and copies
>>>> of MB-sized data are costly (if they're repeatedly made
>>>> with various substr() subscripts).
>>>
>>> substr() makes a copy. This is clear in the code.
>>
>> Okay. Thanks for checking that!
> ...
>> Okay, maybe I could write an extension to work on memory
>> mapped files - the data originally stems from a file -
>> and seek/read through "C" mechanisms. (But that's huge
>> effort compared to some natively available function. And
>> then I'd probably better implement that directly in "C"
>> instead of using Awk, in the first place, since I'd have
>> to implement the GNU Awk Extension anyway in "C".)
> 
> An alternative (depending on the context) would be to consider an
> extension that provides an index function with a third argument giving
> the initial offset.  I've not looked at how extensions get access to
> GAWK strings, so this may not be as easy as it sounds, but I would
> guess that it might be relatively simple to do.

This, first of all, sounds like a good idea! It would make it
unnecessary to (mis-)use the substr() function as (sort of) a
costly copying-descriptor.[*]

I'm unsure about using an extension here. Would there be a name
clash between a built-in index(haystack,needle) function and an
extension index(haystack,needle,start) function? Should they be
separate functions in the first place? (I don't think so.)

In the past, new extended functionality was supported by additional
optional parameters in the core Awk code. (Which seems to be the
best place [for optional controlling arguments].) - There are quite
a few examples where optional parameters in the core functions seem
to have worked well. The changes were obviously local, and the
feared side effects did not arise, it seems.

But we've read the recent links (with the interview) or already
know what Arnold thinks about that; and it is ambivalent. For one
thing, there was a complaint about quality issues of contributed
code and the maintainer's reluctance to add such code - which is
very understandable! But then there's also the problem that
maintainers don't want to "jump" when arbitrary wishes on
functionality arise.

Janis

[*] N.B.: To be consistent it should probably support a substring,
as in index(haystack,needle [,start[,end]]), since the application
example given above p=index(substr(t,s),r) in its generalized form
would have been p=index(substr(t,s,e),r).
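
To pin down the semantics of such an extended index(), here is a
minimal sketch in plain Awk. The name index_from() and the meaning
of its optional arguments are assumptions for illustration only,
not an existing (GNU) Awk function: "last" is taken to be an
inclusive end position, and the result is taken to be relative to
the whole haystack rather than to the substring. Note that this
sketch still copies through substr(); it only shows the interface,
not the copy-free implementation asked for above.

```awk
# index_from(haystack, needle [, start [, last]])
# Hypothetical helper, NOT a gawk built-in. Searches only the
# characters haystack[start..last] (1-based, inclusive) and returns
# the position of needle relative to the whole haystack, or 0.
# Assumption: omitted start defaults to 1, omitted last to length.
function index_from(haystack, needle, start, last,    len, p) {
    len = length(haystack)
    if (start == "" || start < 1) start = 1
    if (last == "" || last > len) last = len
    if (start > last) return 0
    # This substr() call is exactly the copy a native version would avoid.
    p = index(substr(haystack, start, last - start + 1), needle)
    return p ? p + start - 1 : 0
}

BEGIN {
    t = "abcabcabc"
    print index_from(t, "bc")          # 2
    print index_from(t, "bc", 3)       # 5
    print index_from(t, "bc", 3, 5)    # 0  ("cab" contains no "bc")
    print index_from(t, "bc", 3, 6)    # 5
}
```

Returning a position relative to the whole haystack is a design
choice: it saves the caller from adding the offset back, which the
p=index(substr(t,s),r) idiom requires.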
