From: Bize Ma
Newsgroups: gnu.bash.bug
Subject: Re: Bash removes unrequested characters in bracket expressions (not a range).
Date: Wed, 28 Nov 2018 06:29:19 -0400
To: Chester Ramey

Chet Ramey wrote:

> On 11/24/18 4:32 PM, Bize Ma wrote:
[...]
> > I have been made aware that there is a
> >
> > cstart = cend = FOLD (cstart);
> >
> > inside the `sm_loop.c` file that will convert into a range many
> > individual character. If that understanding is correct that is the
> > source of the difference with other shells.
>
> I'm not sure what you mean by "convert into a range." If cstart and cend
> were treated as a range, the start end and end characters would be the
> same. If cstart == cend, a character that collates >= cstart and <= cend
> would have to collate equal to cstart and cend.
>

Yes, exactly, a range where the start and the end are the same.

Try:

$ touch 0 1 ٠ ١ ۰ ۱ ߀ ߁ ० १
$ echo [1]
1 ١

It is converted to the same range as this:

$ echo [1-1]
1 ١

That happens because up to glibc 2.27 this has been the collation order of
those characters (search in /usr/share/i18n/locales/iso14651_t1_common):

<0>;;;IGNORE
<0>;;;IGNORE

Both entries collate to exactly the same values.

This breaks the ability to detect that a character is absent from a list
ordered by collation.
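
To make the mechanism concrete without depending on any particular bash or glibc version, here is a small sketch that only emulates the comparison. The names `weight_of`, `in_range`, `match_single` and `match_exact` are made up for this example, and the weights 10 and 11 are invented numbers; the only thing taken from the discussion above is that the ASCII digit and the Arabic-Indic digit of the same value share one weight.

```shell
# Hypothetical collation weights (illustration only, not real glibc data):
# the ASCII digit and the Arabic-Indic digit of equal value share a weight.
weight_of() {
    case $1 in
        0|٠) echo 10 ;;
        1|١) echo 11 ;;
        *)   echo 99 ;;
    esac
}

# Range test by weight: true when weight(lo) <= weight(ch) <= weight(hi).
in_range() {
    w=$(weight_of "$1")
    [ "$w" -ge "$(weight_of "$2")" ] && [ "$w" -le "$(weight_of "$3")" ]
}

# Emulate [c] handled as the range [c-c] over a fixed set of names.
match_single() {
    out=
    for name in 0 1 ٠ ١; do
        if in_range "$name" "$1" "$1"; then out="$out$name "; fi
    done
    printf '%s\n' "${out% }"
}

# Contrast: plain equality, with no collation involved.
match_exact() {
    out=
    for name in 0 1 ٠ ١; do
        if [ "$name" = "$1" ]; then out="$out$name "; fi
    done
    printf '%s\n' "${out% }"
}

match_single 1   # range-by-weight emulation
match_exact 1    # equality emulation
```

With these assumed weights the range-style test accepts both 1 and ١, while the equality test accepts only 1, which is the difference I am describing.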