Path: csiph.com!goblin1!goblin.stu.neva.ru!usenet.stanford.edu!not-for-mail
From: Chet Ramey
Newsgroups: gnu.bash.bug
Subject: Re: Bash removes unrequested characters in bracket expressions (not a range).
Date: Wed, 28 Nov 2018 09:09:25 -0800
Approved: bug-bash@gnu.org
References: <1c24a279-f439-a13c-be60-901096ccd4e1@case.edu> <63b8941d-16bc-0761-7272-83eb7347354e@case.edu>
Reply-To: chet.ramey@case.edu
To: Bize Ma
Cc: chet.ramey@case.edu
Content-Type: text/plain; charset=utf-8
X-Organization: ITS, Case Western Reserve University
List-Id: Bug reports for the GNU Bourne Again SHell
Xref: csiph.com gnu.bash.bug:14880

On 11/28/18 2:45 AM, Bize Ma wrote:
> Chet Ramey (>) wrote:
>
> > On 11/24/18 2:32 PM, Chet Ramey wrote:
> >
> > >> But IMO locale collation should not be used for an explicit list.
> > >
> > > Collation order is used for each individual character in a bracket
> > > expression when compared against the string, as posix specifies.
>
> Yes, values resulting from a glob expansion should be compared with strcoll.
>
> How many characters should there be in a range like [0-0] ?
> Or to be more precise: in a [0] bracket expression? one?

There should be one character ("0") that matches as many characters as
collate equal to the character "0", as per the POSIX quote in my previous
message.

> > If I were you, I would file a bug report with Debian against wcscoll.
>
> And I would be told that wcscoll is doing what the collation file 14651 is
> telling it to do.

Sure.

> And, that in any case, that file has been updated in glib2.8 anyway.

That should fix the problem without forcing applications to attempt to
impose a total ordering even when strcoll/wcscoll returns 0.

> > It returns 0 (equal) for L"٠" and L"0" without setting errno. That's
> > clearly a problem with wcscoll (if the character isn't valid in the current
> > locale) or the locale definition.
>
> Both characters collate to the same position as I have already explained.

Yes, so the locale definition files imposing a total ordering will be a
clear improvement.

> I don't follow you about what you mean with: /(if the character isn't valid
> in the current locale)./

There are codepoints that correspond to characters in one locale but don't
map to a valid character in another.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
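
For anyone who wants to check how their own locale treats the two
characters discussed above, here is a rough sketch in Python, whose
locale.strcoll() calls the C library's collation routine. It is only a
simplified model of "a listed character matches whatever collates equal
to it", not bash's actual matching code; the helper names collates_equal
and bracket_matches are made up for this illustration, and the result
depends entirely on the installed locale definitions.

```python
import locale


def collates_equal(a: str, b: str) -> bool:
    """True when the current LC_COLLATE locale reports a and b as equal."""
    return locale.strcoll(a, b) == 0


def bracket_matches(listed: str, candidate: str) -> bool:
    """Simplified model (an assumption, not bash's code): a candidate
    character matches an explicit bracket list such as [0] when it
    collates equal to any character in that list."""
    return any(collates_equal(ch, candidate) for ch in listed)


if __name__ == "__main__":
    try:
        # Use the collation rules from the environment (LANG / LC_ALL).
        locale.setlocale(locale.LC_COLLATE, "")
    except locale.Error:
        # Fall back to the "C" locale if the environment names an
        # unavailable locale.
        locale.setlocale(locale.LC_COLLATE, "C")

    arabic_zero = "\u0660"  # ARABIC-INDIC DIGIT ZERO, the L"٠" from the thread
    # Whether this prints 0 depends on the locale definition in use.
    print("strcoll('0', U+0660) =", locale.strcoll("0", arabic_zero))
    print("[0] would match U+0660:", bracket_matches("0", arabic_zero))
```

If the first line prints 0, the locale definition ranks the two digits at
the same position, which is the situation described in this thread; a
locale definition that imposes a total ordering would print a nonzero
value instead.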