From: Stephane Chazelas
To: Chet Ramey
Cc: bug-bash
Newsgroups: gnu.bash.bug
Subject: Re: SIGINT handling
Date: Sun, 20 Sep 2015 16:52:19 +0100

2015-09-19 21:28:24 -0400, Chet Ramey:
> On 9/19/15 5:31 PM, Stephane Chazelas wrote:
> > 2015-09-19 16:42:28 -0400, Chet Ramey:
> > [...]
> >> I'm surprised you've managed to avoid the dozen or so discussions on the
> >> topic.
> >>
> >> http://lists.gnu.org/archive/html/bug-bash/2014-03/msg00108.html
> > [...]
> >
> > Thanks for the links. I still think the comments on the second
> > article I sent
> > (http://thread.gmane.org/gmane.comp.shells.bash.bugs/24178/focus=24183)
> > still hold though and from a quick read I don't see those points
> > being mentioned in the past discussions (but that was a quick
> > read).
> >
> > I notice that you mention the race conditions have been fixed,
> > but I'm still seeing some non-deterministic behaviour.
>
> I can't reproduce this on Mac OS X and RHEL 6 and 7, the systems I have
> readily available today.
>
> The shell notes when it sees SIGINT and whether or not waitpid returns
> -1/EINTR. If the sleep exits due to SIGINT, even after the waitpid
> returns -1, the shell assumes it didn't catch and handle the SIGINT and
> the shell calls the trap handler.
[...]

To clarify, in

  bash -c 'sh -c "trap exit INT; sleep 99; :"; echo hi'

the command under test is "bash", not "sh". The "sh" is just
there as a cmd that does exit() upon receiving SIGINT.

It's just:

  bash -c 'cmd; echo hi'

You can replace "cmd" with:

  perl -e '$SIG{INT}= sub{exit}; sleep'

(or

  mksh -c 'sleep 10; :'

which does an exit(130) upon receiving SIGINT)

The problem here is that when you press CTRL-C, SIGINT is sent
to every process in the foreground process group, so to both
"bash" and "cmd".

Now, bash works as expected only if it handles its own SIGINT
before the child has caught its own and exited.
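To make the two kinds of "cmd" concrete, here is a minimal sketch of
what a parent shell gets back from each. It is not taken from bash:
describe_status is a made-up helper, the two "sh -c" lines are
stand-ins that send SIGINT to themselves instead of relying on CTRL-C,
and it assumes a shell such as bash or dash that reports death by
signal n as 128+n in $?.

```shell
#!/bin/sh
# Sketch only: describe_status is a hypothetical helper, not part of bash.
# It decodes "$?" under the assumption that the shell reports "killed by
# signal n" as 128+n (bash and dash do; ksh93 uses 256+n instead).
describe_status() {
    if [ "$1" -gt 128 ]; then
        echo "killed by signal $(($1 - 128))"
    else
        echo "exited with status $1"
    fi
}

# Stand-in for a cmd that leaves SIGINT at its default disposition.
# It signals itself, so no terminal is needed.
sh -c 'kill -INT $$; sleep 1'
default_status=$?

# Stand-in for a cmd that catches SIGINT and calls exit, in the same
# spirit as the perl one-liner above.
sh -c 'trap "exit 0" INT; kill -INT $$; sleep 1'
caught_status=$?

echo "default disposition: $(describe_status "$default_status")"
echo "trap + exit:         $(describe_status "$caught_status")"
```

Provided SIGINT was not already ignored when the script started, the
first stand-in should be reported as killed by signal 2 and the second
as a normal exit. Note that $? alone cannot tell a cmd that called
exit(130) from one that was killed by SIGINT; only the WIFSIGNALED
check on the waitpid status can.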

When the bash -c 'cmd; echo hi' command exits without printing "hi",
we see this call stack for instance (breakpoint on kill() in gdb):

#0  kill () at ../sysdeps/unix/syscall-template.S:81
#1  0x000000000045dd8e in termsig_handler (sig=) at sig.c:588
#2  0x000000000045ddef in termsig_handler (sig=) at sig.c:554
#3  0x00000000004466bb in set_job_status_and_cleanup (job=0) at jobs.c:3539
#4  waitchld (block=block@entry=1, wpid=20802) at jobs.c:3316
#5  0x000000000044733b in wait_for (pid=20802) at jobs.c:2485
#6  0x0000000000437992 in execute_command_internal (command=command@entry=0x70aa48, asynchronous=asynchronous@entry=0, pipe_in=pipe_in@entry=-1, pipe_out=pipe_out@entry=-1, fds_to_close=fds_to_close@entry=0x70bb68) at execute_cmd.c:829
#7  0x0000000000437b0e in execute_command (command=0x70aa48) at execute_cmd.c:390
#8  0x0000000000435f23 in execute_connection (fds_to_close=0x70bb48, pipe_out=-1, pipe_in=-1, asynchronous=0, command=0x70bb08) at execute_cmd.c:2494
#9  execute_command_internal (command=0x70bb08, asynchronous=asynchronous@entry=0, pipe_in=pipe_in@entry=-1, pipe_out=pipe_out@entry=-1, fds_to_close=fds_to_close@entry=0x70bb48) at execute_cmd.c:945
#10 0x000000000047955b in parse_and_execute (string=, from_file=from_file@entry=0x4b5f96 "-c", flags=flags@entry=4) at evalstring.c:387
#11 0x00000000004205d7 in run_one_command (command=) at shell.c:1348
#12 0x000000000041f524 in main (argc=3, argv=0x7fffffffe198, env=0x7fffffffe1b8) at shell.c:695

That is, SIGINT is being handled *after* the SIGINT handler has
been restored to its default of exiting the shell.

Now, I'm not sure how best to fix that, as I suppose we don't get
any guarantee of when SIGINT will be delivered (it may be why
ksh93 ignores SIGINT altogether and relies solely on
WIFSIGNALED).

The above scenario suggests SIGCHLD is being delivered before
SIGINT, which is strange. I'd expect SIGINT to be inserted by the
kernel in both the cmd and bash queues upon CTRL-C, and the
SIGCHLD would necessarily come after those SIGINTs.
Could it be that SIGCHLD jumps the queue?

Note that I'm not seeing that equally often on every system. It
seems I can make it more likely by making the system busier.

-- 
Stephane