Re: Optimize bash string handling?

Path csiph.com!goblin1!goblin.stu.neva.ru!usenet.stanford.edu!not-for-mail
From Chet Ramey <chet.ramey@case.edu>
Newsgroups gnu.bash.bug
Subject Re: Optimize bash string handling?
Date Wed, 31 Jul 2019 15:15:37 -0400
Lines 66
Approved bug-bash@gnu.org
Message-ID <mailman.380.1564600546.1985.bug-bash@gnu.org>
References <73adb906-3594-8c38-c820-2bf00b295e22@gmail.com> <6114f957-3862-b264-0a64-f5a12a4c522f@case.edu>
Reply-To chet.ramey@case.edu
NNTP-Posting-Host lists.gnu.org
Mime-Version 1.0
Content-Type text/plain; charset=utf-8
Content-Transfer-Encoding 8bit
X-Trace usenet.stanford.edu 1564600547 30595 209.51.188.17 (31 Jul 2019 19:15:47 GMT)
X-Complaints-To action@cs.stanford.edu
Cc chet.ramey@case.edu
To Alkis Georgopoulos <alkisg@gmail.com>, bug-bash@gnu.org
Envelope-to bug-bash@gnu.org
User-Agent Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:60.0) Gecko/20100101 Thunderbird/60.8.0
In-Reply-To <73adb906-3594-8c38-c820-2bf00b295e22@gmail.com>
Content-Language en-US
X-detected-operating-system by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic]
X-Received-From 129.22.103.227
X-BeenThere bug-bash@gnu.org
X-Mailman-Version 2.1.23
Precedence list
List-Id Bug reports for the GNU Bourne Again SHell <bug-bash.gnu.org>
List-Unsubscribe <https://lists.gnu.org/mailman/options/bug-bash>, <mailto:bug-bash-request@gnu.org?subject=unsubscribe>
List-Archive <https://lists.gnu.org/archive/html/bug-bash>
List-Post <mailto:bug-bash@gnu.org>
List-Help <mailto:bug-bash-request@gnu.org?subject=help>
List-Subscribe <https://lists.gnu.org/mailman/listinfo/bug-bash>, <mailto:bug-bash-request@gnu.org?subject=subscribe>
X-Mailman-Original-Message-ID <6114f957-3862-b264-0a64-f5a12a4c522f@case.edu>
X-Mailman-Original-References <73adb906-3594-8c38-c820-2bf00b295e22@gmail.com>
Xref csiph.com gnu.bash.bug:15285



On 7/26/19 5:55 AM, Alkis Georgopoulos wrote:
> While handling some big strings, I noticed that bash is a lot slower
> than other shells like dash, posh and busybox ash.
> I came up with the following little benchmark and results.
> While the specific benchmark isn't important, maybe some developer
> would like to use it to pinpoint and optimize some internal bash
> function that is a lot slower than in other shells?

Thanks for the report. There are places in bash where it copies and
re-processes strings too many times, and you uncovered a couple.

> 
> # Avoid UTF-8 complications
> export LANG=C
> 
> # Run the following COMMANDs with `time bash -c`
> # or `time busybox ash -c`
> # The time columns are in seconds, on an i5-4440 CPU
> 
> ASH BASH  COMMAND
> 0.1  0.1  printf "%100000000s" "." >/dev/null
> 0.7  1.1  x=$(printf "%100000000s" ".")

The first assignment is dominated by the command substitution and
reading the data through a pipe.
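
As a rough illustration of the difference between the two commands (a
sketch only, not part of the original benchmark; N is an assumed,
scaled-down size so it runs quickly):

```shell
# Assumption: N is scaled down from the 100000000 used in the benchmark.
N=100000

# Case 1: the output goes straight to /dev/null; the shell captures nothing.
printf "%${N}s" "." >/dev/null

# Case 2: the shell runs printf in a command substitution, reads all of
# its output back through a pipe, and stores the result in x.
x=$(printf "%${N}s" ".")

echo "${#x}"    # prints 100000
```

Prefixing each case with `time` at the full size should show where the
extra cost of the second case comes from.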

> 0.8  2.4  x=$(printf "%100000000s" "."); echo ${#x}
> 0.9  3.7  x=$(printf "%100000000s" "."); echo ${#x}; echo ${#x}

The length function was too general, and didn't optimize for the common
case. Bash would expand the parameter name following the `#' as if the
`#' were not present, then take the length of the results. Most uses don't
need that generality, or the common error handling if `set -u' is enabled.
Factoring out the common case provides substantial improvement:

$ time ./bash ./x1a

real	0m1.215s
user	0m0.959s
sys	0m0.248s
$ time ./bash ./x1b
100000000

real	0m1.242s
user	0m0.982s
sys	0m0.256s
$ time ./bash ./x1c
100000000
100000000

real	0m1.290s
user	0m1.020s
sys	0m0.265s

where the three scripts are the three cases above.
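
For anyone who wants to repeat the measurement, the scripts can
presumably be recreated along these lines. This is a hypothetical
reconstruction: the file names come from the timings above and the
bodies from the quoted benchmark lines, while SIZE and the temporary
directory are assumptions added so the sketch can be tried at a smaller
scale first.

```shell
# Avoid UTF-8 complications, as in the quoted benchmark.
export LANG=C

# Assumption: SIZE is a hypothetical knob; the thread used 100000000.
SIZE=${SIZE:-1000000}
dir=$(mktemp -d)

# x1a: the command substitution alone.
cat > "$dir/x1a" <<EOF
x=\$(printf "%${SIZE}s" ".")
EOF

# x1b: the assignment plus one length expansion.
cat > "$dir/x1b" <<EOF
x=\$(printf "%${SIZE}s" "."); echo \${#x}
EOF

# x1c: the assignment plus two length expansions.
cat > "$dir/x1c" <<EOF
x=\$(printf "%${SIZE}s" "."); echo \${#x}; echo \${#x}
EOF

# Run each script; put "time" in front of "bash" to compare timings.
for s in x1a x1b x1c; do
    echo "== $s"
    bash "$dir/$s"
done
```

Running it with SIZE=100000000 and `time` in front of each invocation
should correspond to the three cases timed above; actual numbers will
depend on the machine and the bash version.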

There's always more work to do, though.

Chet


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
