Re: Linux O_NONBLOCK bug/ quirk

From unixb4coffee <unixb4coffee@gmail.com>
Newsgroups comp.os.linux.development.apps
Subject Re: Linux O_NONBLOCK bug/ quirk
Date 2014-04-03 11:43 -0700
Organization Netfront http://www.netfront.net/
Message-ID <lhka4f$11fe$1@adenine.netfront.net>
References <878urvu0gx.fsf@sable.mobileactivedefense.com> <87siq23wwg.fsf@sable.mobileactivedefense.com> <lhhk2o$2rhs$1@adenine.netfront.net> <87d2gyzaq4.fsf@sable.mobileactivedefense.com>


On 04/03/2014 08:36 AM, Rainer Weikusat wrote:
> unixb4coffee <unixb4coffee@gmail.com> writes:
>> On 03/28/2014 01:12 PM, Rainer Weikusat wrote:
>>> Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
>>>
>>> [...]
>>>
>>>> a receive operation on a socket in
>>>> non-blocking mode can actually be blocked forever on Linux, example
>>>> code:
>
> [...]
>
>>> ------------
>>> #include <fcntl.h>
>>> #include <string.h>
>>> #include <sys/socket.h>
>>> #include <sys/un.h>
>>> #include <unistd.h>
>>>
>>> int main(void)
>>> {
>>>       struct sockaddr_un sun;
>>>       int fd;
>>>
>>>       fd = socket(AF_UNIX, SOCK_DGRAM, 0);
>>>       sun.sun_family = AF_UNIX;
>>>       strncpy(sun.sun_path, "/tmp/bla", sizeof(sun.sun_path));
>>>       bind(fd, (struct sockaddr *)&sun, sizeof(sun));
>>>
>>>       if (fork() == 0) recv(fd, &fd, sizeof(fd), 0);
>>>
>>>       sleep(1);
>>>
>>>       recv(fd, &fd, sizeof(fd), MSG_DONTWAIT);
>>>
>>>       return 0;
>>> }
>>> -------------
>>>
>>> What happens here is that the 2nd recv stops on a mutex held by the
>>> first and thus, effectively blocks until a message is received on this
>>> socket, IOW, the implementation of MSG_DONTWAIT is completely broken for
>>> datagram sockets (certainly AF_UNIX, likely, AF_INET, too). Further, it
>>> is not going to be fixed because the so-called 'networking maintainer'
>>> 'completely disagrees' with the idea that 'non-blocking operation'
>>> actually means 'the call will not block' (that's the state of 2 days
>>> ago).
>>>
>> If the MSG_DONTWAIT triggers an EAGAIN or EWOULDBLOCK, it's easier to
>> maintain a connection in a stable state - as long as the forking
>> process knows the status of the forked process.  Communicating the
>> status of a child process is not normally performed by the socket
>> connection (though somebody probably has and will say that it can be
>> done) - The wait () and waitpid () man pages are just one place to
>> begin reading about communicating between endpoints.
>
> I don't quite understand this in the given context. The reason I
> used fork in the example above was just because it's the easiest way to
> create an independent thread of execution which has access to the same
> socket.
>
>> If I'm reading this correctly, then the mutex would cause a hang.
>
> It will cause the supposedly non-blocking second recv to block until a
> message is received on the socket. This message will then be returned by
> the first recv and the second will return an EAGAIN error (unless
> another message becomes available 'quickly'). This should work like the
> pipe-handling code in fs/pipe.c does -- a thread acquires the mutex
> using mutex_lock instead of mutex_lock_interruptible, looks for
> available messages and in case there are none, it unlocks the mutex and
> either goes to sleep (reacquiring the mutex after it was woken up) if it
> was a blocking call or returns EAGAIN if it wasn't. That's a better
> solution than the 'trylock' I suggested 'elsewhere' but I didn't think of
> it until the next morning (and checked pipe.c afterwards because I
> couldn't imagine this to be broken everywhere). Not that this would have made
> any difference (pointing at pipe.c might have 'worked' in the sense that
> 'appeals to authority' tend to be less futile than 'appeals to reason'
> ...).
>
>> But hangs aren't that unusual if a program doesn't take care to monitor
>> the forked processes' status.  It also doesn't really say anything about the
>> connection's state, or whether it's still possible to monitor it.
>
> There is no connection here, just a datagram socket shared by multiple
> threads of execution.
>
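
For reference, the pipe.c-style scheme described in the quoted text (take the lock, look for a message, and if there is none, drop the lock and either sleep or return EAGAIN) can be sketched in userspace with pthreads. This is only an analogy and not kernel code: struct msgq, msgq_put and msgq_get are made-up names, and the "queue" holds a single int. The point is that a non-blocking caller never sleeps while holding the lock, and a blocking caller gives the lock up while it sleeps.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical single-slot message queue, for illustration only. */
struct msgq {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    int             have_msg;   /* 1 if msg holds an unread message */
    int             msg;
};

/* Store a message and wake one sleeping reader. */
void msgq_put(struct msgq *mq, int msg)
{
    pthread_mutex_lock(&mq->lock);
    mq->msg = msg;
    mq->have_msg = 1;
    pthread_cond_signal(&mq->nonempty);
    pthread_mutex_unlock(&mq->lock);
}

/*
 * Fetch a message.  Returns 0 on success, or -EAGAIN if the queue is
 * empty and the caller asked for non-blocking behaviour.
 */
int msgq_get(struct msgq *mq, int *out, bool nonblock)
{
    pthread_mutex_lock(&mq->lock);
    while (!mq->have_msg) {
        if (nonblock) {
            /* Nothing there: unlock and return instead of waiting. */
            pthread_mutex_unlock(&mq->lock);
            return -EAGAIN;
        }
        /* Blocking caller: the mutex is released while sleeping and
         * reacquired on wakeup, so other callers are not held up. */
        pthread_cond_wait(&mq->nonempty, &mq->lock);
    }
    *out = mq->msg;
    mq->have_msg = 0;
    pthread_mutex_unlock(&mq->lock);
    return 0;
}
```

With this layout, a non-blocking msgq_get () on an empty queue comes back with -EAGAIN right away, even if another thread is already sleeping in a blocking msgq_get () on the same queue.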

The bind () call gives the connection endpoints.  A more basic approach 
might be to use sendto () and recvfrom (), though I'm not sure that 
would have any effect on socket sharing.  It might at least mean less 
worry about handling an EAGAIN errno, or about packets arriving 
"quickly", once the process wakes up again.
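
A rough sketch of what a single non-blocking recvfrom () attempt with EAGAIN handling might look like follows. The helper name try_recv_dgram and its return codes are my own invention, and it assumes only one reader is using the socket, so it says nothing about the case upthread where another process is already blocked in recv () on the same socket.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: make one non-blocking receive attempt.
 * Returns the number of bytes received (>= 0), -1 if no datagram
 * is queued right now, or -2 on any other error (errno is left set).
 */
ssize_t try_recv_dgram(int fd, void *buf, size_t len)
{
    /* NULL, NULL: the sender's address is not needed here. */
    ssize_t n = recvfrom(fd, buf, len, MSG_DONTWAIT, NULL, NULL);

    if (n >= 0)
        return n;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return -1;      /* nothing to read yet, try again later */
    return -2;
}
```

The caller can treat -1 as "come back after poll () or select () says the socket is readable" and anything else negative as a real failure.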

-- 
Ctalk - An Object Oriented Language for C Programmers
Download:  http://sourceforge.net/projects/ctalk/files
Wiki:      http://sourceforge.net/projects/ctalk/Wiki/home

--- news://freenews.netfront.net/ - complaints: news@netfront.net ---
