Path: csiph.com!xmission!news.glorb.com!usenet.stanford.edu!not-for-mail
From: Stephane Chazelas <stephane.chazelas@gmail.com>
Newsgroups: gnu.bash.bug
Subject: Re: SIGINT handling
Date: Sun, 20 Sep 2015 16:52:19 +0100
Lines: 98
Approved: bug-bash@gnu.org
Message-ID: <mailman.1457.1442764350.19560.bug-bash@gnu.org>
References: <20150918151439.GA16455@chaz.gmail.com> <55FDC8B4.4000505@case.edu> <20150919213101.GA4393@chaz.gmail.com> <55FE0BB8.8040500@case.edu>
NNTP-Posting-Host: lists.gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: bug-bash <bug-bash@gnu.org>
To: Chet Ramey <chet.ramey@case.edu>
Envelope-to: bug-bash@gnu.org
Content-Disposition: inline
In-Reply-To: <55FE0BB8.8040500@case.edu>
User-Agent: Mutt/1.5.21 (2010-09-15)
Precedence: list
Xref: csiph.com gnu.bash.bug:11514

2015-09-19 21:28:24 -0400, Chet Ramey:
> On 9/19/15 5:31 PM, Stephane Chazelas wrote:
> > 2015-09-19 16:42:28 -0400, Chet Ramey:
> > [...]
> >> I'm surprised you've managed to avoid the dozen or so discussions on the
> >> topic.
> >>
> >> http://lists.gnu.org/archive/html/bug-bash/2014-03/msg00108.html
> > [...]
> > 
> > Thanks for the links. I still think the comments on the second
> > article I sent
> > (http://thread.gmane.org/gmane.comp.shells.bash.bugs/24178/focus=24183)
> > still hold though and from a quick read I don't see those points
> > being mentioned in the past discussions (but that was a quick
> > read).
> > 
> > I notice that you mention the race conditions have been fixed,
> > but I'm still seeing some non-deterministic behaviour.
> 
> I can't reproduce this on Mac OS X and RHEL 6 and 7, the systems I have
> readily available today.
> 
> The shell notes when it sees SIGINT and whether or not waitpid returns
> -1/EINTR.  If the sleep exits due to SIGINT, even after the waitpid
> returns -1, the shell assumes it didn't catch and handle the SIGINT and
> the shell calls the trap handler.
[...]

To clarify,

In

bash -c 'sh -c "trap exit INT; sleep 99; :"; echo hi'

The command under test is "bash", not "sh". The "sh" is only
there as a command that calls exit() upon receiving SIGINT.

It's just:

bash -c 'cmd; echo hi'

You can replace "cmd" with:

perl -e '$SIG{INT}= sub{exit}; sleep'

(or

mksh -c 'sleep 10; :'

(which does an exit(130) upon receiving SIGINT))

The problem here is that when you press CTRL-C, SIGINT is sent
to all the processes in the process group, so to "bash" and
"cmd".
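
To check on a given system that a non-interactive shell and the
command it starts really share a process group, here is a minimal
sketch. It assumes a POSIX sh and a ps that accepts "-o pgid=" and
"-p"; the variable names are mine, not anything from bash.

```shell
#!/bin/sh
# Start a child the way a non-interactive shell does (no job control),
# so that it stays in the shell's own process group.
sleep 1 &
child=$!

# Ask ps for the process group ID of the shell and of the child.
shell_pgid=$(ps -o pgid= -p "$$" | tr -d ' ')
child_pgid=$(ps -o pgid= -p "$child" | tr -d ' ')

echo "shell pgid=$shell_pgid child pgid=$child_pgid"

# Reap the child so nothing is left behind.
wait "$child"
```

If both values are the same, a CTRL-C typed at the terminal while
that group is in the foreground reaches both processes.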

Now, bash works as expected (it carries on and prints "hi") only
if it handles its own SIGINT before the child has caught its
SIGINT and exited.

When the above code exits without printing "hi", we see this
call stack for instance (breakpoint on kill() in gdb):

#0  kill () at ../sysdeps/unix/syscall-template.S:81
#1  0x000000000045dd8e in termsig_handler (sig=<optimized out>) at sig.c:588
#2  0x000000000045ddef in termsig_handler (sig=<optimized out>) at sig.c:554
#3  0x00000000004466bb in set_job_status_and_cleanup (job=0) at jobs.c:3539
#4  waitchld (block=block@entry=1, wpid=20802) at jobs.c:3316
#5  0x000000000044733b in wait_for (pid=20802) at jobs.c:2485
#6  0x0000000000437992 in execute_command_internal (command=command@entry=0x70aa48, asynchronous=asynchronous@entry=0, pipe_in=pipe_in@entry=-1, pipe_out=pipe_out@entry=-1,
    fds_to_close=fds_to_close@entry=0x70bb68) at execute_cmd.c:829
#7  0x0000000000437b0e in execute_command (command=0x70aa48) at execute_cmd.c:390
#8  0x0000000000435f23 in execute_connection (fds_to_close=0x70bb48, pipe_out=-1, pipe_in=-1, asynchronous=0, command=0x70bb08) at execute_cmd.c:2494
#9  execute_command_internal (command=0x70bb08, asynchronous=asynchronous@entry=0, pipe_in=pipe_in@entry=-1, pipe_out=pipe_out@entry=-1, fds_to_close=fds_to_close@entry=0x70bb48)
    at execute_cmd.c:945
#10 0x000000000047955b in parse_and_execute (string=<optimized out>, from_file=from_file@entry=0x4b5f96 "-c", flags=flags@entry=4) at evalstring.c:387
#11 0x00000000004205d7 in run_one_command (command=<optimized out>) at shell.c:1348
#12 0x000000000041f524 in main (argc=3, argv=0x7fffffffe198, env=0x7fffffffe1b8) at shell.c:695

That is, SIGINT is being handled *after* the SIGINT handler has
been restored to its default of exiting the shell.

Now, I'm not sure how best to fix that, as I suppose we don't
get any guarantee of when SIGINT will be delivered (it may be
why ksh93 ignores SIGINT altogether and relies solely on
WIFSIGNALED).
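
For reference, the difference between a child that dies of SIGINT
(WIFSIGNALED true) and one that catches it and exits by itself can
be seen from the shell through $?. This is a minimal sketch, assuming
"sh" is bash or dash, where death by signal N is reported as 128+N.
The exit code 3 and the variable names are arbitrary choices of mine.

```shell
#!/bin/sh
# Child 1 keeps the default SIGINT disposition and is killed by the
# signal, so the parent's wait status has WIFSIGNALED true.
died_status=0
sh -c 'kill -INT $$' || died_status=$?

# Child 2 traps SIGINT and calls exit itself, like "cmd" above, so it
# terminates normally even though a SIGINT was delivered to it.
handled_status=0
sh -c 'trap "exit 3" INT; kill -INT $$; :' || handled_status=$?

# With bash or dash, the first status is 128 + 2 (SIGINT) = 130 and
# the second is the code passed to exit.
echo "killed by SIGINT:      status $died_status"
echo "caught SIGINT, exited: status $handled_status"
```

A shell relying only on WIFSIGNALED would presumably treat just
the first child as interrupted, regardless of when (or whether) its
own SIGINT is delivered.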

The above scenario suggests SIGCHLD is being delivered before
SIGINT, which is strange. I'd expect SIGINT to be inserted by the
kernel in both cmd and bash queues upon CTRL-C, and the SIGCHLD
would necessarily come after those SIGINTs. Could it be that
SIGCHLD jumps the queue?

Note that I'm not seeing this equally often on every system. It
seems I can make it more likely by making the system busier.

-- 
Stephane