| From | Stephane Chazelas <stephane.chazelas@gmail.com> |
|---|---|
| Newsgroups | gnu.bash.bug |
| Subject | Re: Locale not Obeyed by Parameter Expansion with Pattern Substitution |
| Date | Tue, 19 Nov 2019 07:56:57 +0000 |
| Message-ID | <mailman.1918.1574150225.13325.bug-bash@gnu.org> |
| In-Reply-To | <20191118204626.33smprow5zae4apl@chaz.gmail.com> |
| To | Chris Carlen <crobc@sbcglobal.net> |
| Cc | bug-bash@gnu.org |
2019-11-18 20:46:26 +0000, Stephane Chazelas:
[...]
> > printf -v B '\u204B'
> > set -- ${B//?()/ }
> > echo "${@@Q}" #-> $'\342' $'\201' $'\213'
[...]
> It seems to me that zsh's approach is best:
>
> $ A=$'\u2048\201\u2048' zsh -c "printf '%q\n' \"\${A//$'\201'/:}\""
> ⁈:⁈
>
> That is, replace that \201 byte, except when it's part of a
> properly encoded character.
[...]
Actually, zsh would also break a character if the byte to be
replaced is the first byte of that character:
$ A=$'\u2048\342\u2048' zsh -c "printf '%q\n' \"\${A//$'\342'/:}\""
:$'\201'$'\210'::$'\201'$'\210'
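To see where the stray $'\201'$'\210' comes from: U+2048 is encoded in UTF-8 as the three bytes 0xe2 0x81 0x88 (octal 342 201 210), so replacing the lead byte \342 leaves the two continuation bytes orphaned. A minimal sketch that only dumps those bytes with printf and od; the variable name `hex` is mine, not something from the thread:

```shell
# UTF-8 encoding of U+2048, written as octal escapes so that the
# result does not depend on the current locale.
hex=$(printf '\342\201\210' | od -An -tx1 | tr -d ' \n')
printf '%s\n' "$hex"   # e28188
```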
Note that in charsets like BIG5/GB18030... which have characters
whose encoding contains the encoding of other characters, bash
seems to behave better than in UTF-8.
For instance, the encoding of é in BIG5-HKSCS is 0x88 0x6d, where
0x6d is also the encoding of "m", as in ASCII.
$ printf é | iconv -t big5-hkscs | od -tc -tx1
0000000 210   m
         88  6d
0000002
$ LC_ALL=zh_HK.big5hkscs luit
$ U=Stéphane bash -c 'printf "%s\n" "${U//m}"'
Stéphane
$ U=Stéphane ksh93 -c 'printf "%s\n" "${U//m}"'
Stéphane
$ U=Stéphane zsh -c 'printf "%s\n" "${U//m}"'
Stéphane
All 3 shells OK, but:
$ U=Stéphane bash -c 'printf "%s\n" "${U//$'\''\210'\''}"'
Stmphane
$ U=Stéphane ksh -c 'printf "%s\n" "${U//$'\''\210'\''}"'
Stmphane
$ U=Stéphane zsh -c 'printf "%s\n" "${U//$'\''\210'\''}"'
Stmphane
All 3 shells "break" that é character there.
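The byte-level effect can be reproduced without a BIG5-HKSCS locale installed. The sketch below is only a stand-in and not what the shells do internally: it assumes tr in the C locale, where every byte is its own character, and feeds it the BIG5-HKSCS bytes of Stéphane (0x88 0x6d for é). Deleting byte \210 leaves the trail byte, which then reads as "m". Deleting "m" byte-wise would instead strip the trail byte, which is the breakage all three shells avoid in the first test. The variable names are mine:

```shell
# "Stéphane" in BIG5-HKSCS: é is the byte pair 0x88 0x6d (octal 210, then "m").
# LC_ALL=C makes tr treat every byte as a separate character.
no_lead=$(printf 'St\210mphane' | LC_ALL=C tr -d '\210')
printf '%s\n' "$no_lead"    # Stmphane

# Deleting "m" byte-wise removes the trail byte of é; dump the result as hex.
no_trail=$(printf 'St\210mphane' | LC_ALL=C tr -d 'm' | od -An -tx1 | tr -d ' \n')
printf '%s\n' "$no_trail"   # 5374887068616e65
```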
--
Stephane