Re: Optimize bash string handling?

From Chet Ramey <chet.ramey@case.edu>
Newsgroups gnu.bash.bug
Subject Re: Optimize bash string handling?
Date 2019-07-31 15:15 -0400
Message-ID <mailman.380.1564600546.1985.bug-bash@gnu.org>
References <73adb906-3594-8c38-c820-2bf00b295e22@gmail.com> <6114f957-3862-b264-0a64-f5a12a4c522f@case.edu>


On 7/26/19 5:55 AM, Alkis Georgopoulos wrote:
> While handling some big strings, I noticed that bash is a lot slower
> than other shells like dash, posh and busybox ash.
> I came up with the following little benchmark and results.
> While the specific benchmark isn't important, maybe some developer
> would like to use it to pinpoint and optimize some internal bash
> function that is a lot slower than in other shells?

Thanks for the report. There are places in bash where it copies and
re-processes strings too many times, and you uncovered a couple.

> 
> # Avoid UTF-8 complications
> export LANG=C
> 
> # Run the following COMMANDs with `time bash -c`
> # or `time busybox ash -c`
> # The time columns are in seconds, on an i5-4440 CPU
> 
> ASH BASH  COMMAND
> 0.1  0.1  printf "%100000000s" "." >/dev/null
> 0.7  1.1  x=$(printf "%100000000s" ".")

The first assignment is dominated by the command substitution and
reading the data through a pipe.

> 0.8  2.4  x=$(printf "%100000000s" "."); echo ${#x}
> 0.9  3.7  x=$(printf "%100000000s" "."); echo ${#x}; echo ${#x}

The length function was too general, and didn't optimize for the common
case. Bash would expand the parameter name following the `#' as if the
`#' were not present, then take the length of the results. Most uses don't
need that generality, or the common error handling if `set -u' is enabled.
Factoring out the common case provides substantial improvement:

$ time ./bash ./x1a

real	0m1.215s
user	0m0.959s
sys	0m0.248s
$ time ./bash ./x1b
100000000

real	0m1.242s
user	0m0.982s
sys	0m0.256s
$ time ./bash ./x1c
100000000
100000000

real	0m1.290s
user	0m1.020s
sys	0m0.265s

where the three scripts (x1a, x1b, x1c) are, in order, the three
`x=$(printf ...)' cases quoted above.
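
For anyone who wants to repeat the measurement, here is a minimal sketch of
the three cases as shell functions. The actual contents of x1a, x1b and x1c
are not shown in this message, so this is a reconstruction from the quoted
commands, not the real scripts. The function names and the N variable are
hypothetical; N only exists so the test can be tried with a smaller string
than the 100000000 used in the report.

```shell
# Avoid UTF-8 complications, as in the original report.
export LANG=C

# N is an assumed knob, not part of the original benchmark.
# The report used 100000000; the default here is much smaller.
N=${N:-100000}

# Case 1: command substitution only (what x1a is assumed to contain).
case_a() {
    x=$(printf "%${N}s" ".")
}

# Case 2: one length expansion (assumed x1b).
case_b() {
    x=$(printf "%${N}s" ".")
    echo "${#x}"
}

# Case 3: two length expansions (assumed x1c).
case_c() {
    x=$(printf "%${N}s" ".")
    echo "${#x}"
    echo "${#x}"
}
```

With the definitions loaded in an interactive bash, running `time case_a`,
`time case_b` and `time case_c` with N=100000000 gives figures that can be
compared against the ones above. If the length expansion is cheap, the three
times should be close to each other, since the command substitution dominates.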

There's always more work to do, though.

Chet


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
