| From | Stephane Chazelas <stephane.chazelas@gmail.com> |
|---|---|
| Newsgroups | gnu.bash.bug |
| Subject | Re: Locale not Obeyed by Parameter Expansion with Pattern Substitution |
| Date | 2019-11-19 07:56 +0000 |
| Message-ID | <mailman.1918.1574150225.13325.bug-bash@gnu.org> |
| References | <0acc4767-e87f-f163-b39e-f137effdfea2.ref@sbcglobal.net> <0acc4767-e87f-f163-b39e-f137effdfea2@sbcglobal.net> <20191118204626.33smprow5zae4apl@chaz.gmail.com> <20191119075657.paz4zvwskfvr3ptt@chaz.gmail.com> |
2019-11-18 20:46:26 +0000, Stephane Chazelas:
[...]
> > printf -v B '\u204B'
> > set -- ${B//?()/ }
> > echo "${@@Q}" #-> $'\342' $'\201' $'\213'
[...]
> It seems to me that zsh's approach is best:
>
> $ A=$'\u2048\201\u2048' zsh -c "printf '%q\n' \"\${A//$'\201'/:}\""
> ⁈:⁈
>
> That is replace that \201 byte, except when it's part of a
> properly encoded character.
[...]
Actually, zsh also breaks a character if the byte to be
replaced is the first byte of that character:
$ A=$'\u2048\342\u2048' zsh -c "printf '%q\n' \"\${A//$'\342'/:}\""
:$'\201'$'\210'::$'\201'$'\210'
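To see where the leftover $'\201'$'\210' comes from, here is a minimal sketch that computes the UTF-8 bytes of a code point with shell arithmetic. The helper name utf8_octal is made up for illustration, and it only handles the three-byte range U+0800..U+FFFF:

```shell
# Hypothetical helper (not part of any shell): print, in octal, the three
# UTF-8 bytes of a code point in the range U+0800..U+FFFF.
utf8_octal() {
  cp=$(( $1 ))
  printf '%o %o %o\n' \
    $(( 0xE0 | (cp >> 12) )) \
    $(( 0x80 | ((cp >> 6) & 0x3F) )) \
    $(( 0x80 | (cp & 0x3F) ))
}

utf8_octal 0x2048   # 342 201 210
utf8_octal 0x204B   # 342 201 213
```

So U+2048 starts with the byte \342, and replacing that byte alone leaves the two continuation bytes \201 \210 orphaned.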
Note that in charsets like BIG5, GB18030, etc., where the encoding
of some characters contains the encoding of other characters, bash
seems to behave better than in UTF-8.
For instance, the encoding of é in BIG5-HKSCS is 0x88 0x6d, where
0x6d is also the encoding of "m", as in ASCII.
$ printf é | iconv -t big5-hkscs | od -tc -tx1
0000000 210   m
         88  6d
0000002
$ LC_ALL=zh_HK.big5hkscs luit
$ U=Stéphane bash -c 'printf "%s\n" "${U//m}"'
Stéphane
$ U=Stéphane ksh93 -c 'printf "%s\n" "${U//m}"'
Stéphane
$ U=Stéphane zsh -c 'printf "%s\n" "${U//m}"'
Stéphane
All 3 shells are OK here (the 0x6d byte inside é is left alone), but:
$ U=Stéphane bash -c 'printf "%s\n" "${U//$'\''\210'\''}"'
Stmphane
$ U=Stéphane ksh -c 'printf "%s\n" "${U//$'\''\210'\''}"'
Stmphane
$ U=Stéphane zsh -c 'printf "%s\n" "${U//$'\''\210'\''}"'
Stmphane
All 3 shells "break" that é character there.
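The "Stmphane" result can be reproduced at the byte level without a BIG5-HKSCS locale. This is only a sketch using tr in the C locale, not any shell's parameter expansion, and the function name strip_lead_byte is made up for illustration:

```shell
# Byte-level sketch: delete the lead byte 0x88 (octal 210) from the
# BIG5-HKSCS bytes of "Stéphane" (S t 0x88 0x6d p h a n e).
strip_lead_byte() {
  # LC_ALL=C makes tr treat its input as single bytes.
  printf 'St\210mphane\n' | LC_ALL=C tr -d '\210'
}

strip_lead_byte   # Stmphane
```

With the lead byte gone, the trail byte 0x6d stands on its own and reads as "m".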
--
Stephane