From: Mickael KENIKSSI
Newsgroups: gnu.bash.bug
Subject: Re: Filename Expansion bug
Date: Thu, 9 Jan 2020 12:09:22 +0100
To: chet.ramey@case.edu
Cc: bug-bash@gnu.org
References: <66b2510f-a2cf-b4d3-4574-9193a9bc89c4@case.edu>

Thanks for your comment. I understand this may not seem of primary
importance to you, since the two forms are canonically equivalent, but
sometimes what we really care about is the path as a literal string (be it
well- or ill-formed), and not the filesystem object it points to.
Normalization upon filename expansion is not the default Bash behavior, so I
see no reason why it should be considered acceptable to have it – partially –
happen in what is no more than an edge case in the end.

zsh (and ksh) provide the expected result:

$ mkdir -p a/b/c d/e/f g/h/e; zsh -c 'printf %s\\n .////a//../*///////*'
> .////a//../a///////b
> .////a//../d///////e
> .////a//../g///////h

I suppose it all comes down to an implementation question.

Best,
Mickaël

On Wed, Jan 8, 2020 at 4:09 PM Chet Ramey wrote:

> On 1/8/20 2:34 AM, Mickael KENIKSSI wrote:
> > Hello,
> >
> > I found a bug regarding how pathnames are processed during filename
> > expansion.
> > The result for non-normalized path-patterns may get mangled in such a
> > way that it becomes inconsistent and unpredictable, making it
> > useless for string comparison or any kind of string manipulation where
> > having it in the exact same form as the pattern is required.
> >
> > How to reproduce:
> >
> > $ mkdir -p a/b/c d/e/f g/h/e; printf '%s\n' .////*//*///////*
> >> .////a/b/c
> >> .////d/e/f
> >> .////g/h/e
> >
> > This is correct from a filesystem perspective but not from a string
> > perspective, where you'd need each of the computed paths as-is:
> >
> >> .////a//b///////c
> >> .////d//e///////f
> >> .////g//h///////e
>
> You're not going to get the path with multiple slashes preceding
> pattern characters, because the pathname has single slashes, those
> slashes are, as POSIX says, "explicitly matched by using one or
> more <slash> characters in the pattern," and the matched pathnames
> that replace the pattern don't have multiple slashes.
>
> The reason that the three leading slashes aren't removed is that those
> directory names don't have any pattern characters and are left
> unchanged. Since the kernel's filename resolution treats multiple
> slashes the same as a single slash, the constructed pathname matches
> what's in the file system.
>
> That means, for instance, you have a directory `.////' and a pattern `*'.
> You opendir `////' and read it for every filename matching `*' (a, d, g),
> construct the pathnames, and go on with the rest of the pattern.
>
> The intermediate runs of multiple slashes get removed as part of the
> matching algorithm, as described above. They're essentially null pathname
> components.
>
> --
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
>                  ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
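
For reference, the behaviour discussed in this thread can be reproduced in a
scratch directory. The sketch below is not taken from either message: the
variable names (scratch, first, second, third, rebuilt) are made up for
illustration, and the nested loop is only one possible workaround. It assumes,
following the explanation quoted above, that a directory part containing no
pattern characters is copied into the result unchanged, so expanding one
component per loop should keep the slash runs of the pattern.

```shell
#!/usr/bin/env bash
# Sketch only: work in a scratch directory so nothing existing is touched.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
trap 'rm -rf "$scratch"' EXIT
mkdir -p a/b/c d/e/f g/h/e

# One glob over the whole pattern. The thread reports that Bash collapses
# the slash runs that precede pattern characters in this case.
printf '%s\n' .////*//*///////*

# Hypothetical workaround (an assumption, not a documented Bash feature):
# expand one component per loop. At every level the directory part is a
# quoted literal without pattern characters, so it is reused as written.
rebuilt=
for first in .////*; do
  for second in "$first"//*; do
    for third in "$second"///////*; do
      rebuilt="$rebuilt$third
"
    done
  done
done
printf '%s' "$rebuilt"
```

If the literal prefix is indeed left untouched at each level, the second
printf should list the paths in the same shape as the pattern, for example
.////a//b///////c. The loop does not handle a level with no match: without
nullglob the unexpanded pattern is passed through as-is.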