Re: Bash removes unrequested characters in bracket expressions (not a range).

From Chet Ramey <chet.ramey@case.edu>
Newsgroups gnu.bash.bug
Subject Re: Bash removes unrequested characters in bracket expressions (not a range).
Date Sat, 24 Nov 2018 17:32:11 -0500
Message-ID <mailman.5069.1543846704.1284.bug-bash@gnu.org>
To Bize Ma <binaryzebra@gmail.com>

On 11/24/18 4:32 PM, Bize Ma wrote:

>     > Bash is removing characters not explicitly listed in a bracket
>     > expression (character range).
>     > In this example, it is removing digits from other languages.
> 
>     What is your locale?
> 
>
> The locale used was en_US.utf-8, but this also happens with 459
> locales out of the 868 available under Debian (not in C, for example).
>
> Also, in all affected locales except one (zh_CN.gb18030), setting either
> LC_ALL=$loc or LC_COLLATE=$loc had the same effect.
> 
> But IMO locale collation should not be used for an explicit list.

Collation order is used for each individual character in a bracket
expression when compared against the string, as POSIX specifies.
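
To illustrate the comparison described above, here is a minimal Python
sketch (not Bash's code) that tests a character against an explicit list
by collation rather than by code-point equality. The helper name
`matches_list` is hypothetical, and the "C" locale is chosen only so the
result is reproducible.

```python
import locale

# Assumption: pin collation to the "C" locale so the outcome is reproducible.
locale.setlocale(locale.LC_COLLATE, "C")


def matches_list(ch, members):
    """Return True if ch collates equal to any character listed in members.

    Hypothetical helper: it mirrors the idea of comparing each listed
    character against the input by collation order, nothing more.
    """
    return any(locale.strcoll(ch, m) == 0 for m in members)


digits = "0123456789"
print(matches_list("5", digits))  # True
print(matches_list("a", digits))  # False
```

In another locale the same call uses that locale's collation rules, so the
set of characters that compare equal to a listed one depends on the locale
definition.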

> I have been made aware that there is a
>       cstart = cend = FOLD (cstart);
> inside the `sm_loop.c` file that will convert into a range many
> individual characters. If that understanding is correct, that is the
> source of the difference with other shells.

I'm not sure what you mean by "convert into a range." If cstart and cend
were treated as a range, the start and end characters would be the
same. If cstart == cend, a character that collates >= cstart and <= cend
would have to collate equal to cstart and cend.
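
The degenerate-range argument above can be sketched in Python as follows.
This is an illustration of the logic only, not the code in `sm_loop.c`:
`in_range`, `TOY_WEIGHTS`, and `toy_weight` are hypothetical names, and the
toy weight table is invented to show what "collates equal" means when two
distinct characters share a weight. It says nothing about how any real
locale is defined.

```python
def in_range(ch, cstart, cend, weight):
    """Range test by collation weight: cstart <= ch <= cend."""
    return weight(cstart) <= weight(ch) <= weight(cend)


# With cstart == cend the test reduces to weight(ch) == weight(cstart).
print(in_range("b", "b", "b", ord))  # True
print(in_range("a", "b", "b", ord))  # False

# Hypothetical collation table in which "X" and "Y" share a weight.
TOY_WEIGHTS = {"a": 1, "b": 2, "X": 5, "Y": 5}


def toy_weight(ch):
    """Look up the invented collation weight of ch."""
    return TOY_WEIGHTS[ch]


# "Y" is not the literal character "X", yet it collates equal to it,
# so the degenerate range X..X accepts it under this toy collation.
print(in_range("Y", "X", "X", toy_weight))  # True
print(in_range("a", "X", "X", toy_weight))  # False
```

Under code-point ordering (`ord`) only the literal character passes; the
result differs only when the collation assigns equal weight to distinct
characters.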

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
