| From | Vladimir Marek <Vladimir.Marek@oracle.com> |
|---|---|
| Newsgroups | gnu.bash.bug |
| Subject | Re: FIFO race condition on SunOS kernels |
| Date | 2019-01-01 23:47 +0100 |
| Message-ID | <mailman.6666.1546382888.1284.bug-bash@gnu.org> |
| References | <eda3efb7-98d4-6a2d-a9d1-cb82f43a5f02@inlv.org> |
Hi,
> You'd think that establishing a pipe between two processes is a very basic
> UNIX feature that should work reliably on all UNIX variants.
One would think that _opening_ a file is a very basic UNIX feature ...
:)
> But the following script seems to break consistently on Solaris and variants
> (SunOS kernels) when executed by bash, ksh93, or dash. All it does is make
> 100 FIFOs and read a line from each -- it should be trivial.
>
> And it does work fine on (recent versions of) Linux, macOS, and all the
> BSDs, on all shells.
In horror, I quickly tested on S11.3 SRU34 x86, S11.3 SRU18 sparc,
S11.4 FCS x86, S11.4 FCS sparc, S10 (mostly FCS), and even a development
Solaris version. I also tested S11.3 SRU 35 on an LDOM. In all cases
it prints the lines
this is FIFO 1
...
this is FIFO 100
I haven't seen a single issue. Then I tried VirtualBox, and it printed
only two lines. Under truss, it gets stuck at
1546/1: open("/tmp/FIFOs1546/FIFO4", O_RDONLY|O_XPG4OPEN) (sleeping...)
That said, I do use VirtualBox 5.1.22r115126, which is pretty old.
Putting a 0.5s delay anywhere in the loop makes the problem disappear.
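For reference, a minimal sketch of the kind of loop being discussed. The original script is not quoted in this message, so this is a reconstruction under assumptions: the function name fifo_repro, its two parameters, and the placement of the optional delay are all hypothetical. Only the FIFOs<pid>/FIFO<n> naming and the "this is FIFO n" lines are taken from the output above.

```shell
#!/bin/sh
# Hypothetical reconstruction of the test loop; not the original script.
# fifo_repro COUNT DELAY: make COUNT FIFOs, read one line from each,
# optionally sleeping DELAY seconds per iteration (0 disables the sleep).
fifo_repro() {
    count=$1
    delay=$2
    dir=${TMPDIR:-/tmp}/FIFOs$$
    mkdir "$dir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        fifo=$dir/FIFO$i
        mkfifo "$fifo" || return 1
        # Writer runs in the background; its open() blocks until a
        # reader opens the same FIFO.
        echo "this is FIFO $i" > "$fifo" &
        # Reader side: this read-only open is the call seen hanging in truss.
        read -r line < "$fifo"
        printf '%s\n' "$line"
        # Assumed placement of the delay; fractional sleep is not
        # guaranteed by POSIX, so 0.5 depends on the local sleep(1).
        [ "$delay" = 0 ] || sleep "$delay"
        i=$((i + 1))
    done
    wait
    rm -rf "$dir"
}

# Default: 100 FIFOs, no delay. Pass e.g. "100 0.5" to try the workaround.
fifo_repro "${1:-100}" "${2:-0}"
```

Running it with no arguments exercises the failing case; running it with arguments 100 0.5 adds the delay that hid the problem here.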
Looking more closely at the truss output, I can see
5030/1: 4.142629 open("/var/tmp/FIFOs5030/FIFO7", O_RDONLY|O_XPG4OPEN) Err#91 ERESTART
and shortly after it gets stuck. Process 5030 is bash.
I don't have debug symbols right now, but I guess from the disassembly
that we are in the redir_open function. I tried to protect the loop,
taking both EINTR and ERESTART into consideration, but that didn't help.
Let's try to look into the kernel.
$ mdb -k
> 0t1775::pid2proc|::whatthread|::findstack
stack pointer for thread ffffa1002910c840: ffffe33001d95990
[ ffffe33001d95990 _resume_from_idle+0x192() ]
ffffe33001d959c0 swtch+0x19d()
ffffe33001d95a30 cv_wait_sig_swap_core+0x19c()
ffffe33001d95a50 cv_wait_sig_swap+0x18()
ffffe33001d95ae0 fifo_open+0x43e()
ffffe33001d95b60 fop_open+0x18f()
ffffe33001d95d50 vn_openat+0x974()
ffffe33001d95ec0 copen+0x4fd()
ffffe33001d95ef0 openat+0x31()
ffffe33001d95f00 sys_syscall+0x247()
Hmm, so we are waiting on a condition variable in fifo_open. That needs
to be investigated more deeply. It would be great if you could open a
case for this, as we have to prioritize our work ...
At the moment it looks like a bug in Solaris to me, but it shows up only
on VirtualBox. I'll try to look at it more. Or rather, I'll try to find
someone who knows the filesystem code.
Cheers
--
Vlad