From: Jim Meyering
Newsgroups: gnu.utils.bug
Subject: Re: Diff doesn't properly ignore whitespace for this input
Date: Thu, 16 Jul 2015 13:45:55 -0700
To: Tyler Bletsch
Cc: bug-gnu-utils@gnu.org
References: <55A55C88.5080809@ncsu.edu>

On Tue, Jul 14, 2015 at 12:01 PM, Tyler Bletsch wrote:
> I believe I've found a bug in diff's handling of "ignore whitespace" mode.
> I have two test files that differ only in whitespace and newlines; I've
> verified this using a separate tool (WinMerge) plus doing a diff on the
> files after doing s/\s*/ / on the whole file. When I ask for the diff using
> "-wb", it reports a spurious difference only in whitespace if I give the
> arguments in one order, but correctly reports no differences if I give it
> the reverse order. Further, I get consistently correct behavior if I add
> the "-d" option.
>
> Example:
>
> $ diff -wB in1.txt in2.txt
> 3946c4201,4203
> < Exits:
> ---
>>
>>
>> Exits:
> $ diff -wB in2.txt in1.txt
> $ diff -dwB in1.txt in2.txt
> $ diff -dwB in2.txt in1.txt
>
> This came up while using diff to automatically grade a text adventure I'm
> having students do in my class -- this is the ONLY file pair out of over
> 3000 that appears to exhibit the problem. This leads me to believe that it
> must be a fairly rare issue. I'm fixing it on my end by always using -d, but
> I think this should be classified as a bug, because it reports a
> non-whitespace difference in files where none exists.
>
> I'm not sure if this mailing list allows attachments, so I've put the files
> in question here:
>
> https://dl.dropboxusercontent.com/u/68643317/diff-bug-test-files.zip
>
> I tried paring the files down to just demonstrate the bug and nothing else,
> but the behavior would seemingly go away at random as I removed content from
> the files. Therefore, I'm including the files in their original form. The
> files represent test output of the text adventure, specifically navigation
> of the default world from the ROM 2.4b6 MUD (after having been converted to
> a format for my class's assignment). This content is safe to share.
>
> I've confirmed that this behavior is present in the following builds of
> diff:
> - diff (GNU diffutils) 2.8.1 on Red Hat Enterprise Linux Server release 6.5
>   (Santiago)
> - diff (GNU diffutils) 3.2 on Ubuntu 12.04.4 LTS
> - diff (GNU diffutils) 2.9 on Cygwin 32-bit (Windows 7 x64)

Thank you for the report.
I confirm that it also affects diff-3.3, but found that with the very
latest from diff.git (v3.3-30-g29e8de4), the problem does not arise.
I.e., comparing your two files like this produces no output:

  $ src/diff -wBu /t/in{1,2}.txt | wc -c
  0

I suspect that it was fixed via this change by Paul Eggert:

  http://git.savannah.gnu.org/cgit/diffutils.git/commit/?id=9b48bf3d3ed002e32fad
  http://bugs.gnu.org/16848
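
For anyone who wants to cross-check a file pair like this without relying on diff's own whitespace handling, here is a minimal sketch in POSIX shell. The function names (strip_ws, same_ignoring_ws, diff_symmetric) are made up for illustration and are not part of diffutils. Note that strip_ws deletes every whitespace character, newlines included, so it is a coarser test than "diff -wB": it also ignores where lines break.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; none of these names come from diffutils.

# Print the file's content with every whitespace character removed
# (spaces, tabs, newlines, and so on).
strip_ws() {
    tr -d '[:space:]' < "$1"
}

# Succeed if the two files are identical once all whitespace is removed.
# Assumes text files small enough to hold in a shell variable.
same_ignoring_ws() {
    [ "$(strip_ws "$1")" = "$(strip_ws "$2")" ]
}

# Succeed if "diff -wB" returns the same exit status for both argument
# orders. A mismatch is the order dependence described in the report.
diff_symmetric() {
    fwd=0
    rev=0
    diff -wB "$1" "$2" > /dev/null || fwd=$?
    diff -wB "$2" "$1" > /dev/null || rev=$?
    [ "$fwd" -eq "$rev" ]
}
```

A possible way to use it on the two files from the report:

  $ same_ignoring_ws in1.txt in2.txt && echo "same non-whitespace content"
  $ diff_symmetric in1.txt in2.txt || echo "diff result depends on argument order"

If the first check succeeds while the second one fails, that suggests the installed diff still has the order-dependent behavior.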