| Path | csiph.com!3.us.feeder.erje.net!feeder.erje.net!news.linkpendium.com!news.linkpendium.com!panix!usenet.stanford.edu!not-for-mail |
|---|---|
| From | Bize Ma <binaryzebra@gmail.com> |
| Newsgroups | gnu.bash.bug |
| Subject | Re: Bash removes unrequested characters in bracket expressions (not a range). |
| Date | Sat, 24 Nov 2018 17:34:55 -0400 |
| Lines | 48 |
| Approved | bug-bash@gnu.org |
| Message-ID | <mailman.4547.1543095311.1284.bug-bash@gnu.org> (permalink) |
| References | <CAFra36hcAjBHGgd_8sHjOV4wSzjmdCyLV2aQo8Ww1bwJqkxYQA@mail.gmail.com> <1c24a279-f439-a13c-be60-901096ccd4e1@case.edu> |
| NNTP-Posting-Host | lists.gnu.org |
| Mime-Version | 1.0 |
| Content-Type | text/plain; charset="UTF-8" |
| X-Trace | usenet.stanford.edu 1543095312 19093 208.118.235.17 (24 Nov 2018 21:35:12 GMT) |
| X-Complaints-To | action@cs.stanford.edu |
| Cc | bug-bash <bug-bash@gnu.org>, bash@packages.debian.org |
| To | Chester Ramey <chet.ramey@case.edu> |
| Envelope-to | bug-bash@gnu.org |
| In-Reply-To | <1c24a279-f439-a13c-be60-901096ccd4e1@case.edu> |
| Xref | csiph.com gnu.bash.bug:14851 |
Chet Ramey (<chet.ramey@case.edu>) wrote:
> On 11/23/18 6:09 PM, Bize Ma wrote:
>
> > Bash Version: 4.4
> > Patch Level: 12
> > Release Status: release
>
> > Description:
> >
> > Bash is removing characters not explicitly listed in a bracket
> > expression (character range).
> > In this example, it is removing digits from other languages.
>
> What is your locale?
>
>
The locale used was en_US.utf-8, but the same thing happens with 459 of the
868 locales available under Debian (not in C, for example).
In every affected locale except one (zh_CN.gb18030), setting either
LC_ALL=$loc or LC_COLLATE=$loc produced the same result.
But IMO locale collation should not be used for an explicit list.

I have been made aware that there is a
    cstart = cend = FOLD (cstart);
inside the `sm_loop.c` file that turns each individual character into a
one-character range. If that understanding is correct, that is the
source of the difference with other shells.
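
A toy model of that effect may help. This is only a sketch in Python, not bash's code: the `WEIGHT` table is hypothetical (not taken from any real locale), and the function names are made up for illustration. It assumes a matcher that compares range endpoints by collation weight, and shows how a listed character treated as the range `c-c` can match a different character that happens to share its weight.

```python
# Hypothetical collation weights, invented for this sketch.
# The ASCII digit '1' and the Arabic-Indic digit '١' (U+0661) share a weight.
WEIGHT = {"0": 10, "1": 11, "١": 11, "2": 12, "a": 20}


def in_range(ch, start, end, weight=WEIGHT):
    """Range test by collation weight, as a locale-aware matcher might do."""
    return weight[start] <= weight[ch] <= weight[end]


def match_as_ranges(ch, listed, weight=WEIGHT):
    """Treat every listed character c as the degenerate range c-c."""
    return any(in_range(ch, c, c, weight) for c in listed)


def match_exact(ch, listed):
    """Treat every listed character as itself only."""
    return ch in listed


if __name__ == "__main__":
    for ch in ("1", "١", "a"):
        print(repr(ch), match_as_ranges(ch, "012"), match_exact(ch, "012"))
```

With the list `012`, the exact matcher rejects `١`, while the range-based matcher accepts it because its weight equals that of `1`.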

I have the perception that a collation table *must have a "total order"*,
in fact, a strict total order. If two characters `a` and `b` could sort as
equal, the order cannot confirm that a character is
absent from the list. Consider characters `a`, `b` and `c`: if `a` and `b`
sort as equal, a sorted list in which we find `a` followed by `c` doesn't
confirm that `b` is absent, as the order could just as well be `b a c`.
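
The strict-total-order condition can be checked mechanically. The following is a minimal Python sketch, assuming a collation table can be represented as a mapping from character to a single weight (real tables are multi-level, so this is a simplification); `TIED` and `STRICT` are invented example tables and the function names are hypothetical.

```python
from collections import defaultdict

# Invented example tables: in TIED, `a` and `b` sort as equal.
TIED = {"a": 1, "b": 1, "c": 2}
STRICT = {"a": 1, "b": 2, "c": 3}


def collation_ties(weight):
    """Return groups of distinct characters that share one weight."""
    groups = defaultdict(list)
    for ch, w in weight.items():
        groups[w].append(ch)
    return [sorted(g) for g in groups.values() if len(g) > 1]


def is_strict_total_order(weight):
    """True when no two distinct characters sort as equal."""
    return not collation_ties(weight)


if __name__ == "__main__":
    print(collation_ties(TIED), is_strict_total_order(TIED))
    print(collation_ties(STRICT), is_strict_total_order(STRICT))
```

Any group reported by `collation_ties` is a set of characters that a weight-based range `x-x` could not tell apart.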

With a strict total order, there must not be any character other than `a` in the
range `a-a`, and using the range `a-a` is equivalent (just slower and
more complex) to the single character `a`.
If this is not the case, the error is in the collation table, not in using
single (faster) characters. And what should be updated is that
collation table, IMO.