| From | Janis Papanagnou <janis_papanagnou@hotmail.com> |
|---|---|
| Newsgroups | comp.lang.awk |
| Subject | Re: getline timeout (revisited) |
| Date | 2011-04-20 14:20 +0200 |
| Organization | Aioe.org NNTP Server |
| Message-ID | <iomj2j$j9o$1@speranza.aioe.org> |
| References | <inmun6$lq1$1@speranza.aioe.org> <CsqdnTvy49IoCjDQnZ2dnUVZ_t2dnZ2d@mchsi.com> <ioka93$13e$1@news.m-online.net> <xpKdne1Tg-IuVTPQnZ2dnUVZ_hCdnZ2d@mchsi.com> |
On 20.04.2011 13:51, j.eh@mchsi.com wrote:
> In article<ioka93$13e$1@news.m-online.net>, Janis Papanagnou wrote:
>> On 19.04.2011 16:09, j.eh@mchsi.com wrote:
>>> In article<inmun6$lq1$1@speranza.aioe.org>, Janis Papanagnou wrote:
>>>> I've been currently looking for a timeout option for getline in
>>>> the context of an /inet/tcp/... socket communication with gawk.
>>>>
>>>> This topic had already been addressed here in c.l.a many many
>>>> years ago, as my google search showed, but I haven't found any
>>>> positive answers. Has something been incorporated in gawk
>>>> or xgawk in the meantime, or is the status unchanged? I suppose the
>>>> latter, but I'm asking for confirmation anyway.
>>>
>>> [ Janis, I hit the wrong button. My intention was to post to
>>> the group, not to send a personal email. Sorry about that. ]
>>
>> Don't worry; in my mailbox only spam isn't welcome. :-)
>>
>>>
>>>
>>> I find the discussions in the old threads interesting, but I have
>>> a question. Let's assume we use PROCINFO to specify timeout like
>>> this:
>>>
>>> PROCINFO["/inet/tcp/..", "TIMEOUT"] = 1000 (ms)
>>>
>>> and a value of 0 means the default behavior i.e. no timeout.
>>
>> Yes.
>>
>>>
>>> Also, assume the existence of a builtin either in the gawk source
>>> or in an extension:
>>>
>>> readline("/inet/tcp/..")
>>
>> Hmm.. - why a new function or builtin 'readline()' and not extending
>> the functionality of 'getline'? (Shouldn't conflict given the PROCINFO
>> approach.)
>
> I don't know what extensions of 'getline' (or 'RS', or any other
> builtin feature of the language) functionality will be required,
> but all I know is that if those are in conflict with the required
> semantics and rules of the language, Arnold simply isn't going to
> accept it. So, one has to prove that the required functionality can't be
> provided by a separate builtin, and/or that any proposed changes to 'getline'
> functionality aren't going to violate the rules of the language etc. etc.
> Besides, a separate function provides a lot more flexibility; just
> consider:
>
> readline("/inet/tcp/...", var, timeout)
>
> 'var' is passed by reference. You won't need PROCINFO or an environment
> variable for timeout specification, and won't even be limited to
> returning only -1, 0 or 1.
Yes, if you can easily avoid PROCINFO that would be fine.
But if, as an alternative, you'd introduce a new function
that just implements 99% of an existing language construct,
that wouldn't be my preferred choice.
>
>>
>>>
>>> which other than being a function call behaves exactly like getline
>>> w.r.t. RS, RT and setting fields. We modify 'readline' to handle
>>> timeout in any way you think is suitable to serve our purpose. The question
>>> is then whether a script using 'readline' is going to look any different
>>> from one that uses getline with exactly the same
>>> modifications.
>>
>> I don't understand what you're saying above, and what you're aiming at.
>>
>
> If the use of 'getline' does not simplify things in the script compared
> to a new function, then why bother trying to extend 'getline'?
What I would least like to see are half a dozen new functions,
*if* we can use the existing interface without breaking anything,
and if it fits nicely (as I think it does) in the existing concepts.
>
> Ok, let's just stick with getline and see if it can provide a
> general solution;
(LOL - you certainly cannot :-)
> A solution specific to a particular problem
> at hand isn't going to be good enough. The following example from
> gawk-inet documentation, I believe, illustrates the most general
> usage of getline. There isn't any need to throw input file and
> related pattern/action, or even fields into the mix.
>
> BEGIN {
>     RS = ORS = "\r\n"
>     HttpService = "/inet/tcp/0/proxy/80"
>     print "GET http://www.yahoo.com" |& HttpService
>     PROCINFO[HttpService, "TIMEOUT"] = 1000
>     while ((HttpService |& getline var) > 0)
>         print var
>     close(HttpService)
> }
Change the while() loop to something more appropriate.
  while (whatever_condition) {
      if ((Service |& getline var) >= 0) do_sth_w( var )
      else report_a_provided_error_by_any_means();
      # or distinguish return values <0 and ==0
  }
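
A minimal sketch of such a loop, assuming a plain file stands in for the socket so that it runs without a network and with any POSIX awk. The names read_bounded, src and max_errors are made up for this illustration and are not part of gawk or of the proposal. The loop condition plays the role of "whatever_condition": it bounds the number of failed reads, and the three getline results (>0, ==0, <0) are handled separately.

```sh
#!/bin/sh
# read_bounded SRC MAX_ERRORS  (hypothetical helper, for illustration only)
# Reads SRC record by record with getline and gives up after
# MAX_ERRORS failed reads. Exit status is 1 if the error limit was hit.
read_bounded() {
    awk -v src="$1" -v max_errors="$2" '
    BEGIN {
        errors = 0
        while (errors < max_errors) {      # the "whatever_condition"
            rc = (getline line < src)
            if (rc > 0) {                  # a record was read
                print "data: " line
            } else if (rc == 0) {          # end of input, not an error
                print "eof"
                break
            } else {                       # rc < 0: open or read failed
                errors++
                print "error " errors " of " max_errors
            }
        }
        close(src)
        exit (errors >= max_errors)
    }'
}
```

Called on an existing file, this prints one "data:" line per record followed by "eof"; called on a path that cannot be opened, getline fails on every attempt and the loop stops once the error limit is reached. With a timeout-aware getline the same structure would apply, with the error branch also covering timeouts.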
>
> I added the line with the PROCINFO entry, and used 'getline var'
> instead of 'getline'.
(The latter doesn't contribute to the question.)
> What other changes are needed so that we still
> get the desired output with timeouts? If the answer is none whatsoever,
> then we probably won't need to consider the following:
>
> 1. Dealing with partial output in case of a timeout.
Be aware that you *need* some channel to pass an error from
the underlying OS or language level anyway!
Getline will fill $0 or the provided var, so the user can
decide what to do with the (partial or not) data.
> 2. Handling of non-recoverable errors. What is getline
> supposed to return if a timeout occurs?
An error indication (" <0 ") and some hint WRT the error;
this can be a coded error number ("<0" is differentiated
into "-1","-2",...,"-n") and/or some error text (some channel
to provide that is necessary: PROCINFO, a predefined variable,
...).
> 3. Making sure there is a finite number of retries in the
> event of timeouts.
At the application level, see "whatever_condition" in my example
above. At the awk/library level, see Arnold's reply upthread.
Janis
> 4. ?
>
> Thanks,
>
> John
>
>>> If possible, please consider providing the outline
>>> of such a script illustrating the usage of 'readline' or getline
>>> with timeout.
>>
>> In my specific primitive application - which is by no means meant to be
>> a general example, covering all needs, or considering any corner cases -
>> it was just something like (simplified code)...
>>
>> BEGIN { P = "/inet/tcp/0/a.b.c.d/e" }
>>       { print some_funct($0) |& P }
>> /pat/ { P |& getline ; do_sth_w($0) }
>> END   { close(P) }
>>
>> which might, with timeouts, morph to something like...
>>
>> BEGIN { P = "/inet/tcp/0/a.b.c.d/e"
>>         PROCINFO[P, "TIMEOUT"] = 1000
>>       }
>>
>>       { print some_funct($0) |& P }
>>
>> /pat/ { if ((P |& getline) > 0) do_sth_w($0)
>>         else print "Error:", PROCINFO[P, "ERROR"]
>>       }
>>
>> END   { close(P) }
>>
>>
>> Using PROCINFO as a means to return the error is just an ad hoc thought
>> based on your timeout setting example. (So feel free to ignore it and
>> introduce something else.)
>>
>>> To keep things uniform, assume we parse ERRNO to
>>> find out the cause of the error. I'm not very gifted when it comes to writing
>>> awk scripts that don't look like C, so I hope you understand my
>>> request in this context.
>>
>> If I've misunderstood your request, please clarify.
>>
>> Thanks.
>>
>> Janis
>>
>>>
>>> Thanks,
>>>
>>> John
>>>
>>>>
>>>> Thanks.
>>>>
>>>> Janis
>>
getline timeout (revisited) Janis Papanagnou <janis_papanagnou@hotmail.com> - 2011-04-08 14:23 +0200
Re: getline timeout (revisited) arnold@skeeve.com (Aharon Robbins) - 2011-04-10 18:49 +0000
Re: getline timeout (revisited) j.eh@mchsi.com - 2011-04-19 09:09 -0500
Re: getline timeout (revisited) Janis Papanagnou <janis_papanagnou@hotmail.com> - 2011-04-19 17:38 +0200
Re: getline timeout (revisited) j.eh@mchsi.com - 2011-04-20 06:51 -0500
Re: getline timeout (revisited) Janis Papanagnou <janis_papanagnou@hotmail.com> - 2011-04-20 14:20 +0200
Re: getline timeout (revisited) j.eh@mchsi.com - 2011-04-20 08:21 -0500
Re: getline timeout (revisited) j.eh@mchsi.com - 2011-04-21 05:46 -0500
Re: getline timeout (revisited) arnold@skeeve.com (Aharon Robbins) - 2011-04-22 13:55 +0000
Re: getline timeout (revisited) Grant <omg@grrr.id.au> - 2011-04-23 06:50 +1000