Re: Locale not Obeyed by Parameter Expansion with Pattern Substitution

From Stephane Chazelas <stephane.chazelas@gmail.com>
Newsgroups gnu.bash.bug
Subject Re: Locale not Obeyed by Parameter Expansion with Pattern Substitution
Date 2019-11-18 20:46 +0000
Message-ID <mailman.1890.1574109997.13325.bug-bash@gnu.org>
References <0acc4767-e87f-f163-b39e-f137effdfea2.ref@sbcglobal.net> <0acc4767-e87f-f163-b39e-f137effdfea2@sbcglobal.net> <20191118204626.33smprow5zae4apl@chaz.gmail.com>


2019-11-17 01:25:31 -0800, Chris Carlen:
[...]
> # write 'REVERSE PILCROW SIGN' to B, then repeat as above:
> printf -v B '\u204B'
> set -- ${B//?()/ }
> echo "${@@Q}"       #-> $'\342' $'\201' $'\213'
> 
> # NOTE: Since there is only one character (under the UTF-8 locale),
> # this should have set only the first positional parameter with the
> # character REVERSE PILCROW SIGN, not split it into bytes (AFAIK).
[...]

Yes, the question is where to resume searching after a match of
an empty string in ${var//pattern/replacement}.
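A rough way to see why the resume point matters: the number of places an empty match can land depends on whether the shell steps one byte or one character at a time. The sketch below only counts those units; count_units is a hypothetical helper, not anything from bash itself, and the UTF-8 locale name in the commented-out line is an assumption about what is installed.

```shell
# U+204B written as its three UTF-8 bytes, so nothing here depends on \u support.
B=$(printf '\342\201\213')

# Hypothetical helper: length of string $2 as seen by the shell under locale $1.
# The subshell body keeps the locale change from leaking into the caller.
count_units() (
  LC_ALL=$1
  s=$2
  printf '%s\n' "${#s}"
)

count_units C "$B"              # byte-wise view: prints 3
# count_units en_US.UTF-8 "$B"  # assumes that locale exists; bash would then count characters
```

Stepping by byte gives three places to split, which lines up with the three positional parameters in the quoted report; stepping by character would give one.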

Note that it's even worse in ksh93, from which bash copied that
syntax:

$ A=$'\u2048\u2048' ksh93 -c 'printf "%q\n" "${A//?()/:}"'
$':\u[2048]:\x81:\x88:\u[2048]:\x81:\x88:'

(here with ksh93u+)

Then there's the question of what

${B/$'\201'/}

should do. Should that $'\201' match the second byte of the UTF-8
encoding of U+204B (\342 \201 \213)?

It seems to me that zsh's approach is best:

$ A=$'\u2048\201\u2048' zsh  -c "printf '%q\n' \"\${A//$'\201'/:}\""
⁈:⁈

That is, replace that \201 byte, except when it's part of a
properly encoded character.

Compare with:

$ A=$'\u2048\201\u2048' bash  -c "printf '%q\n' \"\${A//$'\201'/:}\""
$'\342:\210:\342:\210'

$ A=$'\u2048\201\u2048' ksh93  -c "printf '%q\n' \"\${A//$'\201'/:}\""
$'\u[2048]:\x88:\u[2048]:\x88'

(or yash, which can't deal with that \201 byte at all, as it can't
form a valid character).
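
To make the bash result above easier to read, here is a minimal sketch that reproduces it byte by byte. It forces LC_ALL=C so the substitution is certain to work on bytes; that is an assumption made for reproducibility, not necessarily the locale of the run shown above. replace_bytewise is a hypothetical helper name, and the code needs bash (not plain sh) for the ${var//pattern/replacement} and $'...' syntax.

```shell
# A = U+2048, a stray \201 byte, U+2048, written out as explicit UTF-8 bytes.
A=$(printf '\342\201\210\201\342\201\210')

# Hypothetical helper: replace every \201 byte in $1 with ':' while matching
# byte by byte, then dump the result as octal bytes.
replace_bytewise() (
  LC_ALL=C
  s=$1
  printf '%s' "${s//$'\201'/:}" | od -An -to1
)

replace_bytewise "$A"   # bytes 342 072 210 072 342 072 210 (072 is ':')
```

All three \201 bytes are replaced, including the two inside the encodings of U+2048, which leaves the orphaned \342 and \210 bytes seen in the bash output.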

-- 
Stephane
