Re: Bash removes unrequested characters in bracket expressions (not a range).

From Chet Ramey <chet.ramey@case.edu>
Newsgroups gnu.bash.bug
Subject Re: Bash removes unrequested characters in bracket expressions (not a range).
Date 2018-11-24 17:32 -0500
Message-ID <mailman.5069.1543846704.1284.bug-bash@gnu.org>
References <CAFra36hcAjBHGgd_8sHjOV4wSzjmdCyLV2aQo8Ww1bwJqkxYQA@mail.gmail.com> <1c24a279-f439-a13c-be60-901096ccd4e1@case.edu> <CAFra36hdkG+5qq94cf-sbKrnw6roJWez03aKJPU=Z=Vad2LaXg@mail.gmail.com>


On 11/24/18 4:32 PM, Bize Ma wrote:

>     > Bash is removing characters not explicitly listed in a bracket
>     > expression (character range).
>     > In this example, it is removing digits from other languages.
> 
>     What is your locale?
> 
>  
> The locale used was en_US.utf-8, but this also happens with 459
> locales out of 868 available under Debian (not in C, for example).
> 
> Also in all locales affected (except one), setting either
> LC_ALL=$loc or LC_COLLATE=$loc did the same.
> Except in zh_CN.gb18030
> 
> But IMO locale collation should not be used for an explicit list.

Collation order is used for each individual character in a bracket
expression when compared against the string, as POSIX specifies.

> I have been made aware that there is a
>       cstart = cend = FOLD (cstart);
> inside the `sm_loop.c` file that will convert many individual
> characters into a range. If that understanding is correct, that is the
> source of the difference with other shells.

I'm not sure what you mean by "convert into a range." If cstart and cend
were treated as a range, the start and end characters would be the
same. If cstart == cend, a character that collates >= cstart and <= cend
would have to collate equal to cstart and cend.
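
The argument above can be illustrated with a minimal sketch. This is not
bash's code: range_match, collate_fn, coll_codepoint, toy_weight and
coll_toy are hypothetical names, and the "toy" collation is an assumption
made up for illustration (it gives the Arabic-Indic digits U+0660..U+0669
the same weight as the ASCII digits). It only shows that a range test
with cstart == cend reduces to "collates equal to cstart", so what
matches depends entirely on the collation in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical comparator type: returns <0, 0 or >0, like strcoll(). */
typedef int (*collate_fn)(uint32_t a, uint32_t b);

/* Range test as described in the message: c matches when it collates
 * >= cstart and <= cend. With cstart == cend this is true only for
 * characters that collate equal to cstart. */
int range_match(uint32_t c, uint32_t cstart, uint32_t cend, collate_fn coll)
{
    assert(coll != NULL);
    return coll(c, cstart) >= 0 && coll(c, cend) <= 0;
}

/* Plain code point order, comparable to what the C locale does. */
int coll_codepoint(uint32_t a, uint32_t b)
{
    return (a > b) - (a < b);
}

/* Toy weights (assumption, not real locale data): Arabic-Indic digits
 * share the weight of the ASCII digit with the same value. */
uint32_t toy_weight(uint32_t c)
{
    if (c >= 0x0660 && c <= 0x0669)
        return (uint32_t)'0' + (c - 0x0660);
    return c;
}

/* Toy collation: compare by weight instead of by code point. */
int coll_toy(uint32_t a, uint32_t b)
{
    return coll_codepoint(toy_weight(a), toy_weight(b));
}
```

With coll_codepoint, range_match(0x0663, '3', '3', ...) is false, so only
'3' itself matches. With coll_toy the same call is true, because U+0663
collates equal to '3' under those weights, even though the range still
has identical start and end characters.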

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
