


Async processes started in functions not reliably started

From Steffen Nurpmeso <steffen@sdaoden.eu>
Newsgroups gnu.bash.bug
Subject Async processes started in functions not reliably started
Date Sun, 04 Aug 2019 00:40:08 +0200
Message-ID <mailman.768.1564881130.1985.bug-bash@gnu.org>



Hello.

For the MUA I maintain I have already implemented parallel tests, and now
wanted to add a reaper process which automatically kills tests
that take longer than X seconds.  That turns out to be more
complicated than I thought: it works just fine in mksh, but does
not work at all in dash (which, it seems, also cannot access
variables from within traps), and in bash it works 100% reliably
only when the code is placed inline, not when it is placed in a
function jobreaper_start().
This is on CRUX Linux 3.5 (GNU C based), with a self-compiled
bash 5.0.7 (standard /etc/pkgmk.conf flags -O2 -march=x86-64 -pipe).
Imagine

  echo shell is $SHELL/$0
   (
      int= hot=
      echo 'Starting job reaper'
      trap 'int=1 hot=1' HUP
      trap 'int=1 hot=' INT
      trap 'echo "Stopping job reaper"; exit 0' TERM
      trap '' EXIT

      while [ 1 ]; do
         int=
         sleep ${JOBWAIT} &
         wait
         if [ -z "${int}" ] && [ -n "${hot}" ]; then
            i=0 l=
            while [ ${i} -lt ${MAXJOBS} ]; do
               i=`add ${i} 1`

               if [ -f t.${i}.pid ] && read p < t.${i}.pid; then
                  kill -KILL ${p}
                  ${rm} -f t.${i}.result
                  l="${l} ${i}"
               fi
            done
            [ -n "${l}" ] &&
               printf '%s!! Reaped job(s)%s after %s seconds%s\n' \
                  "${COLOR_ERR_ON}" "${l}" ${JOBWAIT} "${COLOR_ERR_OFF}"
         fi
      done
   ) </dev/null & #>/dev/null 2>&1 &
   JOBREAPER=$!

This works one hundred percent reliably if placed inline alongside
the normal code, but if I place it in a function jobreaper_start(),
my success rate is only about 60 percent:

  Spawing up to 4 tests in parallel
  shell is /bin/bash//home/steffen/src/nail.git/mx-test.sh
  Starting job reaper
  ... [1=vpospar] [2=atxplode] .. wait(1)
  /home/steffen/src/nail.git/mx-test.sh: line 401:  2487 Killed                  ( if ${mkdir} t.${JOBS}.d; then
      cd t.${JOBS}.d; eval t_${1} ${JOBS} ${1};
  fi; ${rm} -f ../t.${JOBS}.pid ) > t.${JOBS}.io 2>&1 < /dev/null
  [vpospar]
  1:ok ifs:ok
  !! Reaped job(s) 2 after 3 seconds
  ..
versus
  Spawing up to 4 tests in parallel
  shell is /bin/bash//home/steffen/src/nail.git/mx-test.sh
  ... [1=vpospar] [2=atxplode] .. wait(1)
  /home/steffen/src/nail.git/mx-test.sh: line 405: kill: (2560) - No such process
  [vpospar]
  1:ok ifs:ok
  [atxplode]
  1:ok
  /home/steffen/src/nail.git/mx-test.sh: line 364: kill: (2560) - No such process
  ...
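
The two runs above differ in whether the reaper printed its startup line before the first signal reached it. The post does not establish why the reaper is sometimes gone, so the following is only a sketch under an assumption: that the parent may signal the subshell before its traps are installed. It shows one way to start a background subshell from a function and wait for a handshake first. READY_FILE and the five-try limit are hypothetical names and values that do not appear in mx-test.sh.

```shell
#!/bin/sh
# Sketch only: READY_FILE and the 5-try limit are hypothetical and do not
# appear in mx-test.sh.
READY_FILE="${TMPDIR:-/tmp}/jobreaper.$$.ready"

jobreaper_start() {
   rm -f "${READY_FILE}"
   (
      trap 'exit 0' TERM
      trap ':' HUP
      # Traps are installed; tell the parent it may send signals now.
      : > "${READY_FILE}"
      while :; do
         sleep 1 &
         wait
      done
   ) </dev/null &
   JOBREAPER=$!

   # Wait up to about 5 seconds for the handshake file to appear.
   tries=0
   while [ ! -f "${READY_FILE}" ] && [ "${tries}" -lt 5 ]; do
      sleep 1
      tries=$((tries + 1))
   done
   [ -f "${READY_FILE}" ]
}

if jobreaper_start; then started=yes; else started=no; fi
kill -HUP "${JOBREAPER}"      # handled by the trap, so the reaper survives
sleep 1
if kill -0 "${JOBREAPER}" 2>/dev/null; then survived=yes; else survived=no; fi
kill -TERM "${JOBREAPER}"
wait "${JOBREAPER}"
rm -f "${READY_FILE}"
echo "started=${started} survived=${survived}"
```

Polling a file keeps the sketch portable across sh implementations; it is not meant as a statement about how bash schedules the subshell.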

The code is driven via

  else
     if have_feat debug; then
        if have_feat devel; then
           DUMPERR=y
           ARGS="${ARGS} -Smemdebug"
        fi
     elif have_feat devel; then
        LOOPS_MAX=${LOOPS_BIG}
     fi
     color_init

     if [ -z "${RUN_TEST}" ] || [ ${#} -eq 0 ]; then
        jobs_max
        printf 'Spawing up to %s tests in parallel\n' ${MAXJOBS}
        jobreaper_start

(Pasting the above code inline here, instead of calling the function, makes everything work, it seems.)

        t_all
     else
  ...
     fi

     jobreaper_stop

Pasting the code inline is a bit painful because I need it in two places.
I hope I am not missing something, though I most likely am, since I would
have thought that the builtin kill would recognize that I am
actually killing something that came in via $! (though it seems it
never started up).
What I find interesting is that dash shows exactly the same errors (but
in practice always).
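
When kill reports "No such process" for a PID taken from $!, the status returned by wait for that PID can help tell whether the child exited on its own or was terminated by a signal. The following is a minimal sketch of that check, not code from mx-test.sh; the sleep child merely stands in for a reaper that has no HUP trap installed yet, which is an assumption made for illustration.

```shell
#!/bin/sh
# Sketch only: this child stands in for a reaper whose traps are not yet set.
sleep 3 </dev/null &
child=$!
kill -HUP "${child}"          # default action for HUP: terminate

# Collect the exit status; the `||` form also works under `set -e`.
status=0
wait "${child}" || status=$?

# A status above 128 means the child was terminated by a signal.
if [ "${status}" -gt 128 ]; then
   verdict="terminated by a signal"
else
   verdict="exited normally"
fi
echo "child ${child}: ${verdict} (wait status ${status})"
```

Calling wait on the reaper's PID in jobreaper_stop, before or instead of a bare kill, would give the same kind of information for the real script.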

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
