Re: End of file mark

From Charlie Gibbs <cgibbs@kltpzyxm.invalid>
Newsgroups comp.os.linux.misc
Subject Re: End of file mark
Date 2015-08-12 14:57 +0000
Organization NewsGuy - Unlimited Usenet $23.95
Message-ID <mqfmss01cb6@news3.newsguy.com>
References <mqalh6$1fd$1@dont-email.me> <X9qdnXBEFbyDj1TInZ2dnUU7-amdnZ2d@giganews.com> <mqb4c301hb@news4.newsguy.com> <3445299.1ZzGpYkYrH@gmail.com>


On 2015-08-12, Tom Hardy <rhardy702@gmail.com> wrote:

> Charlie Gibbs wrote:
>
>> Although MS-DOS text file input recognizes 0x1a as an EOF marker,
>> this isn't necessary, since the file system stores the file size
>> to the byte. In fact, all of my MS-DOS software took pains to
>> ensure that the mark was never written, and removed it if present.
>> An early version of MS-DOS (3.1 IIRC) had a bug in COMMAND.COM such
>> that if you redirected output to append to a file (e.g. dir >>foo),
>> an existing EOF mark was not overwritten, causing all new data to
>> be lost.
>> 
>> It's one more bit of backward compatibility that was just plain
>> backward. It lives on in Windows to this day.
>
> Old Wordstar 3.3 worked in 128 byte blocks and would always pad the 
> last block with ^Z^Z^Z^Z...., Control-Z, 0x1A, or ASCII SUB, 
> denoting the end-of-file.

That wasn't WordStar, but CP/M itself; all programs worked that way.

> Plenty of text editors would recognize this, and if saving such a
> file would cut that down to one ^Z.  IIRC, they generally ended files
> with a ^Z.

However, if the size of the file was an exact multiple of 128 bytes,
you could omit any trailing ^Zs; to detect the end of a text file
you'd have to check for either ^Z or no more blocks.
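
A minimal sketch of that two-way check, written in Python purely for
illustration (nothing here is from CP/M itself; the function name and the
idea of passing records as a list of byte strings are assumptions):

```python
RECORD_SIZE = 128     # CP/M files are made up of 128-byte records
CTRL_Z = b"\x1a"      # ASCII SUB, the text end-of-file mark


def cpm_text(records):
    """Return the text held in a sequence of 128-byte records.

    The text ends at the first ^Z, or when the records run out,
    whichever comes first.
    """
    out = bytearray()
    for record in records:
        cut = record.find(CTRL_Z)
        if cut != -1:
            out += record[:cut]   # keep only what precedes the mark
            break
        out += record             # full record, no mark: keep going
    return bytes(out)


# A short file: the last record is padded out with ^Z.
padded = b"hello".ljust(RECORD_SIZE, CTRL_Z)
# A file that fills its last record exactly: no ^Z anywhere.
full = b"A" * RECORD_SIZE
```

With these, cpm_text([full, padded]) yields the 128 bytes of full followed
by b"hello", and cpm_text([full]) yields full unchanged because the loop
simply runs out of records.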

> DOS copy had a text mode which, again IIRC, you didn't usually need 
> to specify, but if you did something like 'copy foo + bar foobar' 
> you had to or you'd get results much like the above.

It also had a /B (binary) option, which would force this behaviour by
copying every byte up to the recorded file size, ^Z included.
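
To make the difference concrete, here is a rough Python model of a
concatenating copy. It is a sketch under assumptions, not a description
of COPY's internals: it assumes text mode reads each source up to its
first ^Z and appends one ^Z to the result, and that binary mode copies
everything verbatim. The function name is made up.

```python
CTRL_Z = b"\x1a"


def concat(sources, text_mode=True):
    """Model 'copy foo + bar foobar' on in-memory byte strings."""
    out = bytearray()
    for data in sources:
        if text_mode:
            cut = data.find(CTRL_Z)
            if cut != -1:
                data = data[:cut]   # stop at the first EOF mark
        out += data
    if text_mode:
        out += CTRL_Z               # assumed: one mark closes the result
    return bytes(out)


foo = b"foo\r\n" + CTRL_Z
bar = b"bar\r\n" + CTRL_Z
```

In this model concat([foo, bar]) gives b"foo\r\nbar\r\n\x1a", while
concat([foo, bar], text_mode=False) leaves a ^Z stranded between the
two halves, which is the situation described in the quoted text.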

> The info wasn't actually lost, but you had to recognize what had
> happened or use software that just didn't care and say "What is
> that ^Z doing in the middle of my file?"

Indeed.  But in automated systems it was usually too late by then.
I still have remnants of the ugly hacks I used to work around this
stuff, such as scanning the entire file for ^Z characters and changing
them to something else before actually trying to read it.  (Consider
the case where you're reading data from a serial port - if a line hit
corrupts a byte to 0x1a, you'll lose the rest of the file if you're
not careful.)
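
The scrubbing pass amounts to something like the following Python sketch.
It is not the original code; the function name and the choice of "?" as
the replacement byte are assumptions, and the data is assumed to have
been read as raw bytes (e.g. with open(path, "rb")).

```python
CTRL_Z = b"\x1a"


def scrub_ctrl_z(data, replacement=b"?"):
    """Replace every ^Z in data and report how many were found.

    Returns (cleaned_bytes, count) so the caller can log corruption.
    """
    count = data.count(CTRL_Z)
    return data.replace(CTRL_Z, replacement), count


# A line hit has turned one byte of a serial capture into 0x1a.
captured = b"record 1\r\nrec\x1ard 2\r\nrecord 3\r\n"
cleaned, hits = scrub_ctrl_z(captured)
```

Returning the count alongside the cleaned data lets an automated job
flag the file as suspect instead of silently losing everything after
the bad byte.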

My gripe with MS-DOS (well, one of them anyway) is that all of this
pain and suffering was avoidable: the file system stored the file
size to the byte, making special EOF markers unnecessary.

Speaking of DOS copy, remember how it would refuse to copy a
zero-length file?  (This was _not_ inherited from CP/M, whose
PIP would happily copy zero-length files.)  We had a beta site
go down and lose a month's worth of data thanks to that one.
Some apologists will try to justify the ^Z marker by saying
that it keeps an "empty" file from having a size of zero, thus
circumventing this "feature" - but that's just another attempt
to say that two wrongs make a right.

-- 
/~\  cgibbs@kltpzyxm.invalid (Charlie Gibbs)
\ /  I'm really at ac.dekanfrus if you read it the right way.
 X   Top-posted messages will probably be ignored.  See RFC1855.
/ \  HTML will DEFINITELY be ignored.  Join the ASCII ribbon campaign!
