Re: Combination of "eval set -- ..." and $() command substitution is slow

From Chet Ramey <chet.ramey@case.edu>
Newsgroups gnu.bash.bug
Date Fri, 12 Jul 2019 10:44:27 -0400

On 7/10/19 1:21 PM, astian wrote:

> Bash Version: 5.0
> Patch Level: 3
> Release Status: release
> 
> Description:
> 
>   I discovered a curious performance degradation in the combined usage of the
>   constructs "eval set -- ..." and new-style command substitution.  In short,
>   setting the positional arguments via eval and then iterating over each one
>   while performing $() command substitution(s) is significantly slower than
>   not using eval, or not making command substitution, or using `` instead.
> 
>   I include below a reduced test script that illustrates the issue.  A few
>   notes:
>     - The pathological case is "1 1 0".
>     - I did not observe performance difference in unoptimised builds (-O0).
> 

>     --------------------------
>     case 1 1 0
>     eval set
>     real    0m0.002s
>     user    0m0.000s
>     sys     0m0.000s
>     for loop cmdsubst-currency
>     real    0m0.968s
>     user    0m0.432s
>     sys     0m0.148s
>     --------------------------

> 
>   Observations:
>     - The pathological case "1 1 0" spends about 10 times more time doing
>       something in userspace during the loop, relative to the comparable cases
>       "0 1 0", "0 1 1", and "1 1 1".
>     - $() seems generally slightly slower than ``, but becomes pathologically
>       so when preceded with "eval set -- ...".

It is slightly slower -- POSIX requires that the shell parse the contents
of $(...) to determine that it's a valid script as part of finding the
closing `)'. The rules for finding the closing "`" don't have that
requirement.
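
A small sketch of that difference, assuming bash 4.0 or later (older versions are known to mishandle the first line). The variable names are only for illustration:

```shell
# $(...) form: the shell has to parse the case statement to learn that the
# ")" after the pattern "x" does not close the command substitution.
new_style=$(case x in x) echo matched ;; esac)

# `...` form: the shell only scans ahead for the next unescaped backquote.
old_style=`case x in x) echo matched ;; esac`

echo "$new_style $old_style"
```

Both forms run the same command; only the work needed to find the closing delimiter differs.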

>     - "eval set -- ..." itself doesn't seem slow at all, but obviously it has
>       side-effects not captured by the "time" measurement tool.

What happens is you end up with a 4900-character command string that you
have to parse multiple times. But that's not the worst of it.
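
The reporter's test script is not reproduced in this message. As a rough sketch of the pattern under discussion, assuming 700 words of seven characters each to land on 4900 characters (the word format and the count are hypothetical, chosen only to match that length):

```shell
# Hypothetical reconstruction; this is not the reporter's script.
# Build 700 words of the form "a00001 " (7 characters each, 4900 in total).
args=
i=1
while [ "$i" -le 700 ]; do
  args="$args$(printf 'a%05d' "$i") "
  i=$((i + 1))
done

# Set the positional parameters through eval, as in the report.
eval "set -- $args"
nargs=$#

# Iterate over them, running one $() command substitution per argument.
count=0
for arg in "$@"; do
  value=$(echo "$arg")
  count=$((count + 1))
done

echo "string length: ${#args}, arguments: $nargs, iterations: $count"
```

Wrapping the loop in `time`, once with the eval form and once with a plain `set -- $args`, is one way to check whether a given build is affected. No timings are claimed for this sketch.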

The gprof output provides a clue.


>       case 1 1 0 (pathological):
>        %   cumulative   self              self     total
>       time   seconds   seconds    calls  us/call  us/call  name
>       38.89      0.21     0.21    28890     7.27     7.27  set_line_mbstate

set_line_mbstate() runs through each command line before parsing, creating
a bitmap that indicates whether each element is a single-byte character or
part of a multi-byte character. The scanner uses this to determine whether
a shell metacharacter should act as a delimiter or get skipped over as part
of a multibyte character. For a single run with args `1 1 0', it gets
called around 7300 times, with around 2400 of them for the 4900-character
string with all the arguments.

When you're in a multibyte locale (en_US.UTF-8 is one such), each one of
those characters requires a call to mbrlen/mbrtowc. So that ends up being
2400 * 4900 calls to mbrlen.
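
To put a number on that product, using the rounded figures above (both are approximations from this message, and the variable names are illustrative):

```shell
# Approximate figures quoted above.
scans=2400     # set_line_mbstate() passes over the long string
length=4900    # characters in that string
calls=$((scans * length))
echo "$calls"
```

That works out to 11760000, so close to 12 million mbrlen calls for a single run.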

There is something happening here -- there's no way there should be that
many calls to set_line_mbstate(), even when you have to save and restore
the input line because you have to parse the contents of $(). There must
be some combination of the effect of `eval' on the line bitmap and the
long string. I'll see what I can figure out.

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
