| From | Chet Ramey <chet.ramey@case.edu> |
|---|---|
| Newsgroups | gnu.bash.bug |
| Subject | Re: Combination of "eval set -- ..." and $() command substitution is slow |
| Date | 2019-07-12 10:44 -0400 |
| Message-ID | <mailman.1011.1562942676.2688.bug-bash@gnu.org> |
| References | <bcd08f6c-1c13-0eb4-92b2-4e904b19a0ce@e-nautia.com> <7ba227f9-ed87-8224-6f07-fd444488d472@case.edu> |
On 7/10/19 1:21 PM, astian wrote:

> Bash Version: 5.0
> Patch Level: 3
> Release Status: release
>
> Description:
>
> I discovered a curious performance degradation in the combined usage of the
> constructs "eval set -- ..." and new-style command substitution.  In short,
> setting the positional arguments via eval and then iterating over each one
> while performing $() command substitution(s) is significantly slower than
> not using eval, or not making command substitution, or using `` instead.
>
> I include below a reduced test script that illustrates the issue.  A few
> notes:
>   - The pathological case is "1 1 0".
>   - I did not observe performance difference in unoptimised builds (-O0).
>
> --------------------------
> case 1 1 0
> eval set
> real 0m0.002s
> user 0m0.000s
> sys 0m0.000s
> for loop cmdsubst-currency
> real 0m0.968s
> user 0m0.432s
> sys 0m0.148s
> --------------------------
>
> Observations:
>   - The pathological case "1 1 0" spends about 10 times more time doing
>     something in userspace during the loop, relative to the comparable cases
>     "0 1 0", "0 1 1", and "1 1 1".
>   - $() seems generally slightly slower than ``, but becomes pathologically
>     so when preceded with "eval set -- ...".

It is slightly slower -- POSIX requires that the shell parse the contents of
$(...) to determine that it's a valid script as part of finding the closing
`)'. The rules for finding the closing "`" don't have that requirement.

>   - "eval set -- ..." itself doesn't seem slow at all, but obviously it has
>     side-effects not captured by the "time" measurement tool.

What happens is you end up with a 4900-character command string that you
have to parse multiple times. But that's not the worst of it.

The gprof output provides a clue.

> case 1 1 0 (pathological):
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  us/call  us/call  name
>  38.89      0.21     0.21    28890     7.27     7.27  set_line_mbstate

set_line_mbstate() runs through each command line before parsing, creating
a bitmap that indicates whether each element is a single-byte character or
part of a multi-byte character. The scanner uses this to determine whether
a shell metacharacter should act as a delimiter or get skipped over as part
of a multibyte character.

For a single run with args `1 1 0', it gets called around 7300 times, with
around 2400 of them for the 4900-character string with all the arguments.

When you're in a multibyte locale (en_US.UTF-8 is one such), each one of
those characters requires a call to mbrlen/mbrtowc. So that ends up being
2400 * 4900 calls to mbrlen.

There is something happening here -- there's no way there should be that
many calls to set_line_mbstate(), even when you have to save and restore
the input line because you have to parse the contents of $(). There must
be some combination of the effect of `eval' on the line bitmap and the long
string. I'll see what I can figure out.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
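
The reduced test script from the report is not reproduced in this message, so the following is only a minimal sketch of the pattern under discussion, not the reporter's script. It builds a long list of quoted words, sets them as positional parameters through `eval set --`, and then performs one `$()` substitution per parameter. The helper `make_words`, the word count of 700, and the variable names are all assumptions chosen for illustration. Whether it actually triggers the slowdown will depend on the bash version, build options, and locale.

```shell
# Hypothetical reproducer sketch; names and sizes are assumptions,
# not taken from the original report.

# Emit N single-quoted words on one line: 'w1' 'w2' ... 'wN'
make_words() {
    n=$1
    i=1
    words=
    while [ "$i" -le "$n" ]; do
        words="$words'w$i' "
        i=$((i + 1))
    done
    printf '%s' "$words"
}

# One long command string of a few thousand characters.
words=$(make_words 700)
printf 'command string length: %s\n' "${#words}"

# Set the positional parameters through eval, as in the report.
eval "set -- $words"

# Iterate over them, doing one new-style command substitution each time.
count=0
for word in "$@"; do
    out=$(printf '%s' "$word")
    count=$((count + 1))
done
printf 'iterations: %s, last value: %s\n' "$count" "$out"
```

To compare variants, one could save this as a file and run it under `time bash`, then repeat with the `$(...)` replaced by backticks, which the report says avoids the slowdown. Since the explanation above attributes the cost to per-character mbrlen calls in a multibyte locale, comparing a run under `LC_ALL=C` with one under a UTF-8 locale may also be informative, though that is an inference and not something measured in this thread.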