


Re: getline timeout (revisited)

From Janis Papanagnou <janis_papanagnou@hotmail.com>
Newsgroups comp.lang.awk
Subject Re: getline timeout (revisited)
Date 2011-04-20 14:20 +0200
Organization Aioe.org NNTP Server
Message-ID <iomj2j$j9o$1@speranza.aioe.org>
References <inmun6$lq1$1@speranza.aioe.org> <CsqdnTvy49IoCjDQnZ2dnUVZ_t2dnZ2d@mchsi.com> <ioka93$13e$1@news.m-online.net> <xpKdne1Tg-IuVTPQnZ2dnUVZ_hCdnZ2d@mchsi.com>



On 20.04.2011 13:51, j.eh@mchsi.com wrote:
> In article<ioka93$13e$1@news.m-online.net>, Janis Papanagnou wrote:
>> On 19.04.2011 16:09, j.eh@mchsi.com wrote:
>>> In article<inmun6$lq1$1@speranza.aioe.org>, Janis Papanagnou wrote:
>>>> I've been currently looking for a timeout option for getline in
>>>> the context of an /inet/tcp/... socket communication with gawk.
>>>>
>>>> This topic had already been addressed here in c.l.a many many
>>>> years ago, as my google search showed, but I haven't found any
>>>> positive answers. Has something been incorporated in gawk
>>>> or xgawk in the meantime, or is the status unchanged? I suppose the
>>>> latter, but I am asking for confirmation anyway.
>>>
>>> [ Janis, I hit the wrong button. My intention was to post to
>>> the group, not to send a personal email. Sorry about that. ]
>>
>> Don't worry; in my mailbox only spam isn't welcome. :-)
>>
>>>
>>>
>>> I find the discussions in the old threads interesting, but I have
>>> a question. Let's assume we use PROCINFO to specify timeout like
>>> this:
>>>
>>>       PROCINFO["/inet/tcp/..", "TIMEOUT"] = 1000 (ms)
>>>
>>> and a value of 0 means the default behavior i.e. no timeout.
>>
>> Yes.
>>
>>>
>>> Also, assume the existence of a builtin either in the gawk source
>>> or in an extension:
>>>
>>>       readline("/inet/tcp/..")
>>
>> Hmm.. - why a new function or builtin 'readline()' and not extending
>> the functionality of 'getline'? (Shouldn't conflict given the PROCINFO
>> approach.)
>
> I don't know what extensions of 'getline' (or 'RS', or any other
> builtin feature of the language) functionality will be required,
> but all I know is that if those are in conflict with the required
> semantics and rules of the language, Arnold simply isn't going to
> accept it. So, one has to prove that the required functionality can't be
> provided by a separate builtin, and/or any proposed changes to 'getline'
> functionality aren't going to violate the rules of the language etc. etc.
> Besides, a separate function provides a lot more flexibility; just
> consider:
>
>    readline("/inet/tcp/...", var, timeout)
>
> 'var' is passed by reference. You won't need PROCINFO or an environment
> variable for the timeout specification, and won't even be limited to
> returning only -1, 0 or 1.

Yes, if you can easily avoid PROCINFO that would be fine.
But if, as an alternative, you'd introduce a new function
that just reimplements 99% of an existing language construct,
that wouldn't be my preferred choice.

>
>>
>>>
>>> which other than being a function call behaves exactly like getline
>>> w.r.t. RS, RT and setting fields. We modify 'readline' to handle
>>> timeouts in any way you think is suitable to serve our purpose. The
>>> question is then whether a script using 'readline' is going to look
>>> any different from one that uses getline with exactly the same
>>> modifications.
>>
>> I don't understand what you're saying above, and what you're aiming at.
>>
>
> If the use of 'getline' does not simplify things in the script compared
> to a new function, then why bother trying to extend 'getline'?

What I'd least like to see are half a dozen new functions,
*if* we can use the existing interface without breaking anything,
and if it fits nicely (as I think it does) in the existing concepts.

>
> Ok, let's just stick with getline and see if it can provide a
> general solution;

(LOL - you certainly cannot :-)

> A solution specific to a particular problem
> at hand isn't going to be good enough. The following example from
> gawk-inet documentation, I believe, illustrates the most general
> usage of getline. There isn't any need to throw input file and
> related pattern/action, or even fields into the mix.
>
>       BEGIN {
>         RS = ORS = "\r\n"
>         HttpService = "/inet/tcp/0/proxy/80"
>         print "GET http://www.yahoo.com" |& HttpService
>         PROCINFO[HttpService, "TIMEOUT"] = 1000
>         while ((HttpService |& getline var) > 0)
>            print var
>         close(HttpService)
>       }

Change the while() loop to something more appropriate.

   while (whatever_condition) {
     if ((Service |& getline var) >= 0) do_sth_w( var )
     else report_a_provided_error_by_any_means();
     # or distinguish return values <0 and ==0
   }
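
As a minimal sketch of the "distinguish return values" variant, assuming
gawk: the loop below tells apart a record (> 0), end of input (== 0) and
an error (< 0). The "TIMEOUT" element of PROCINFO is only the name proposed
in this thread and is assumed here, so gawk merely stores the value. "sort"
is a local stand-in for the /inet/tcp/... service so that the sketch runs
without a network.

```awk
# gawk sketch: tell apart the three getline outcomes on a two-way pipe.
# "TIMEOUT" is the PROCINFO element proposed in this thread (an assumption);
# gawk only stores the value and does not act on it.
BEGIN {
    Service = "sort"                      # stand-in for "/inet/tcp/..."
    PROCINFO[Service, "TIMEOUT"] = 1000   # hypothetical, in milliseconds
    print "b" |& Service
    print "a" |& Service
    close(Service, "to")                  # send EOF so sort emits its output
    for (;;) {
        rc = (Service |& getline var)
        if (rc > 0)
            print "data:", var            # a complete record arrived
        else if (rc == 0) {
            print "end of input"          # peer closed the connection
            break
        } else {
            print "error:", ERRNO         # gawk puts the error text in ERRNO
            break
        }
    }
    close(Service)
}
```

With a timeout facility in place, the error branch is where a timed-out
read would show up, and 'var' would hold whatever the implementation
decides to leave there.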

>
> I added the line with the PROCINFO entry, and used 'getline var'
> instead of 'getline'.

(The latter doesn't contribute to the question.)

> What other changes are needed so that we still
> get the desired output with timeouts? If the answer is none whatsoever,
> then we probably won't need to consider the following:
>
> 1. Dealing with partial output in case of a timeout.

Be aware that you *need* some channel to pass an error from
the underlying OS or language level anyway!

Getline will fill $0 or the provided var, so the user can
decide what to do with the (partial or not) data.

> 2. Handling of non-recoverable errors. What is getline
> supposed to return if a timeout occurs?

An error indication (" <0 ") and some hint WRT the error;
this can be a coded error number ("<0" is differentiated
into "-1","-2",...,"-n") and/or some error text (any channel
to provide that is necessary; PROCINFO, predefined variable,
...).

> 3. Making sure there is a finite number of retries in the
> event of timeouts.

On application level see "whatever_condition" in my example
above. On awk/library level see Arnold's reply upthread.
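
A minimal sketch of such an application-level limit, assuming a timeout is
reported as -2 (one of the coded values suggested above, not an existing
gawk return value). try_read() and MaxRetries are hypothetical names;
try_read() stands in for (Service |& getline var) so that the sketch runs
without a socket.

```awk
# Bounded-retry sketch. try_read() is a hypothetical stand-in for
# (Service |& getline var): it reports a timeout (-2, an assumed code)
# on the first two calls and a successful read (1) afterwards.
function try_read() {
    return (++Calls < 3) ? -2 : 1
}

BEGIN {
    MaxRetries = 5                  # assumed application-level limit
    for (tries = 0; tries <= MaxRetries; tries++) {
        rc = try_read()
        if (rc != -2)               # anything but a timeout ends the loop
            break
    }
    if (rc > 0)
        print "got a record after", tries, "retries"
    else if (rc == 0)
        print "end of input"
    else
        print "giving up, last return value", rc
}
```

The loop condition plays the role of "whatever_condition": it is the script,
not getline, that decides how many timeouts are acceptable.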

Janis

> 4. ?
>
> Thanks,
>
> John
>
>>> If possible, please consider providing the outline
>>> of such a script illustrating the usage of 'readline' or getline
>>> with timeout.
>>
>> In my specific primitive application - which is by no means meant to be
>> a general example, covering all needs, or considering any corner cases -
>> it was just something like (simplified code)...
>>
>>    BEGIN { P = "/inet/tcp/0/a.b.c.d/e" }
>>          { print some_funct($0) |& P }
>>    /pat/ { P |& getline ; do_sth_w($0) }
>>    END   { close(P) }
>>
>> which might, with timeouts, morph to something like...
>>
>>    BEGIN { P = "/inet/tcp/0/a.b.c.d/e"
>>            PROCINFO[ P, "TIMEOUT"] = 1000
>>          }
>>
>>          { print some_funct($0) |& P }
>>
>>    /pat/ { if( (P |& getline) > 0) do_sth_w($0)
>>            else print "Error:", PROCINFO[ P, "ERROR"]
>>          }
>>
>>    END   { close(P) }
>>
>>
>> Using PROCINFO as a means to return the error is just an ad hoc thought
>> based on your timeout setting example. (So feel free to ignore it and
>> introduce something else.)
>>
>>> To keep things uniform, assume we parse ERRNO to
>>> find out the cause of the error. I'm not very gifted when it comes to
>>> writing awk scripts that don't look like C, so I hope you understand my
>>> request in this context.
>>
>> If I've misunderstood your request, please clarify.
>>
>> Thanks.
>>
>> Janis
>>
>>>
>>> Thanks,
>>>
>>> John
>>>
>>>>
>>>> Thanks.
>>>>
>>>> Janis
>>


