From: Chet Ramey
Newsgroups: gnu.bash.bug
Subject: Re: Filename Expansion bug
Date: Wed, 8 Jan 2020 10:09:46 -0500
Message-ID: <66b2510f-a2cf-b4d3-4574-9193a9bc89c4@case.edu>
Reply-To: chet.ramey@case.edu
To: Mickael KENIKSSI, bug-bash@gnu.org
Cc: chet.ramey@case.edu

On 1/8/20 2:34 AM, Mickael KENIKSSI wrote:
> Hello,
>
> I found a bug regarding how pathnames are processed during filename
> expansion.
> The result for non-normalized path-patterns may get mangled in
> such a way that it becomes inconsistent and unpredictable, making it
> useless for string comparison or any kind of string manipulation where
> having it in the exact same form as the pattern is required.
>
> How to reproduce:
>
> $ mkdir -p a/b/c d/e/f g/h/e; printf '%s\n' .////*//*///////*
>> .////a/b/c
>> .////d/e/f
>> .////g/h/e
>>
>
> This is correct from a filesystem perspective but not from a string
> perspective, where you'd need each of the computed paths as-is:
>
> .////a//b///////c
>> .////d//e///////f
>> .////g//h///////e

You're not going to get the path with multiple slashes preceding pattern
characters, because the pathname has single slashes, those slashes are,
as POSIX says, "explicitly matched by using one or more characters in the
pattern," and the matched pathnames that replace the pattern don't have
multiple slashes.

The reason that the four leading slashes aren't removed is that those
directory names don't have any pattern characters and are left unchanged.
Since the kernel's filename resolution treats multiple slashes the same
as a single slash, the constructed pathname matches what's in the file
system.

That means, for instance, you have a directory `.////' and a pattern `*'.
You opendir `.////' and read it for every filename matching `*' (a, d, g),
construct the pathnames, and go on with the rest of the pattern.

The intermediate runs of multiple slashes get removed as part of the
matching algorithm, as described above. They're essentially null pathname
components.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
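
If the goal is string comparison, one possible workaround (a sketch, not
something proposed in this thread) is to normalize both strings before
comparing them, since the expansion only keeps slash runs in the literal
prefix. The helper name squeeze_slashes below is made up for illustration;
it relies only on the standard `tr -s' option.

```shell
# squeeze_slashes is a hypothetical helper, not a bash builtin.
# It collapses every run of slashes into a single slash, which matches
# how the kernel resolves the path anyway.
squeeze_slashes() {
    printf '%s\n' "$1" | tr -s '/'
}

pattern_like='.////a//b///////c'   # the form the original report hoped for
expanded='.////a/b/c'              # the form the expansion returned in the report

# Compare the normalized forms rather than the raw strings.
if [ "$(squeeze_slashes "$pattern_like")" = "$(squeeze_slashes "$expanded")" ]; then
    echo "same path: $(squeeze_slashes "$expanded")"
fi
```

This is only meant for relative paths like the ones above: POSIX leaves a
pathname that begins with exactly two slashes implementation-defined, so
squeezing a leading `//' could change its meaning on some systems.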