Groups > comp.os.linux.misc > #15137 > unrolled thread

End of file mark

Started by: T <T@invalid.invalid>
First post: 2015-08-10 10:05 -0700
Last post: 2015-08-11 10:27 -0700
Articles: 13 (9 participants)


Contents

  End of file mark T <T@invalid.invalid> - 2015-08-10 10:05 -0700
    Re: End of file mark John Hasler <jhasler@newsguy.com> - 2015-08-10 12:40 -0500
      Re: End of file mark Richard Kettlewell <rjk@greenend.org.uk> - 2015-08-10 19:32 +0100
        Re: End of file mark Keith Thompson <kst-u@mib.org> - 2015-08-19 17:38 -0700
    Re: End of file mark The Natural Philosopher <tnp@invalid.invalid> - 2015-08-10 19:52 +0100
      Re: End of file mark Pascal Hambourg <boite-a-spam@plouf.fr.eu.org> - 2015-08-11 12:40 +0200
        Re: End of file mark The Natural Philosopher <tnp@invalid.invalid> - 2015-08-11 11:51 +0100
    Re: End of file mark Robert Heller <heller@deepsoft.com> - 2015-08-10 16:05 -0500
      Re: End of file mark Charlie Gibbs <cgibbs@kltpzyxm.invalid> - 2015-08-10 21:16 +0000
        Re: End of file mark Tom Hardy <rhardy702@gmail.com> - 2015-08-11 23:34 -0500
          Re: End of file mark Charlie Gibbs <cgibbs@kltpzyxm.invalid> - 2015-08-12 14:57 +0000
            Re: End of file mark Tom Hardy <rhardy702@gmail.com> - 2015-08-13 11:28 -0500
    Re: End of file mark T <T@invalid.invalid> - 2015-08-11 10:27 -0700

#15137 — End of file mark

From: T <T@invalid.invalid>
Date: 2015-08-10 10:05 -0700
Subject: End of file mark
Message-ID: <mqalh6$1fd$1@dont-email.me>
Hi All,

Just a trivia question that does not need an answer.

Under VFS and EXT4, do text files have an End Of File
marker?  How do programs know when they are at the
end of a file?

-T


#15138

From: John Hasler <jhasler@newsguy.com>
Date: 2015-08-10 12:40 -0500
Message-ID: <87tws7qbx6.fsf@thumper.dhh.gt.org>
In reply to: #15137
T writes:
> Under VFS and EXT4, do text files have an End Of File
> marker?

No.

> How do programs know when they are at the end of a file?

The filesystem knows how long the file is.  It returns EOF ("end of
file").
-- 
John Hasler 
jhasler@newsguy.com
Dancing Horse Hill
Elmwood, WI USA


#15139

From: Richard Kettlewell <rjk@greenend.org.uk>
Date: 2015-08-10 19:32 +0100
Message-ID: <wwvk2t3ouye.fsf@l1AntVDjLrnP7Td3DQJ8ynzIq3lJMueXf87AxnpFoA.invalid>
In reply to: #15138
John Hasler <jhasler@newsguy.com> writes:
> T writes:
>> Under VFS and EXT4, do text files have an End Of File
>> marker?
>
> No.
>
>> How do programs know when they are at the end of a file?
>
> The filesystem knows how long the file is.  It returns EOF ("end of
> file").

The EOF return value is synthesized by stdio.  The kernel just stops
returning bytes when the end of the file is reached.

-- 
http://www.greenend.org.uk/rjk/


#15171

From: Keith Thompson <kst-u@mib.org>
Date: 2015-08-19 17:38 -0700
Message-ID: <lnvbca3ib2.fsf@kst-u.example.com>
In reply to: #15139
Richard Kettlewell <rjk@greenend.org.uk> writes:
> John Hasler <jhasler@newsguy.com> writes:
>> T writes:
>>> Under VFS and EXT4, do text files have an End Of File
>>> marker?
>>
>> No.
>>
>>> How do programs know when they are at the end of a file?
>>
>> The filesystem knows how long the file is.  It returns EOF ("end of
>> file").
>
> The EOF return value is synthesized by stdio.  The kernel just stops
> returning bytes when the end of the file is reached.

The read() system call takes an argument specifying how many bytes it
should attempt to read, and returns the number of bytes it was able to
read.  It returns 0 to indicate end-of-file.  (Returning a positive
count smaller than the size requested does not necessarily indicate
end-of-file.)  read() returns -1 on error.

The stdio input functions (on POSIX systems) are implemented on top of
read().  Each function (getchar(), fgets(), fread(), etc.) has its own
way to indicate an end-of-file condition.  Read the man page for each
input function to see how it does this.
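
A minimal sketch of the read() behaviour described above, written in Python
because os.read() is a thin wrapper around the system call.  The helper names
count_bytes and demo_count and the scratch file are made up for illustration.
One difference from C: Python raises OSError where read() would return -1.

```python
import os
import tempfile


def count_bytes(path, chunk_size=4096):
    """Count the bytes in a file by calling read() until it reports end-of-file."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)  # asks for up to chunk_size bytes
            if not chunk:                    # empty result: read() returned 0 (EOF)
                break
            total += len(chunk)              # a short chunk alone does not mean EOF
    finally:
        os.close(fd)
    return total


def demo_count():
    """Hypothetical demo: write 10000 bytes to a scratch file and count them back."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 10000)
    try:
        return count_bytes(tmp.name)
    finally:
        os.unlink(tmp.name)
```

The loop ends only on an empty result, never on a short one, which matches the
parenthetical remark above about positive counts smaller than the request.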

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"


#15140

From: The Natural Philosopher <tnp@invalid.invalid>
Date: 2015-08-10 19:52 +0100
Message-ID: <mqaru2$d27$1@news.albasani.net>
In reply to: #15137
On 10/08/15 18:05, T wrote:
> Hi All,
>
> Just a trivia question that does not need an answer.
>
> Under VFS and EXT4, do text files have an End Of File
> marker?  How do programs know when they are at the
> end of a file?
>
File size to the nearest byte is part of the directory information IIRC

> -T
>


-- 
New Socialism consists essentially in being seen to have your heart in 
the right place whilst your head is in the clouds and your hand is in 
someone else's pocket.


#15143

From: Pascal Hambourg <boite-a-spam@plouf.fr.eu.org>
Date: 2015-08-11 12:40 +0200
Message-ID: <mqcjfa$2d56$2@saria.nerim.net>
In reply to: #15140
The Natural Philosopher a écrit :
>>
>> Under VFS and EXT4, do text files have an End Of File
>> marker?  How do programs know when they are at the
>> end of a file?
>>
> File size to the nearest byte is part of the directory information IIRC

No, it is part of the inode information.


#15144

From: The Natural Philosopher <tnp@invalid.invalid>
Date: 2015-08-11 11:51 +0100
Message-ID: <mqck4g$oam$1@news.albasani.net>
In reply to: #15143
On 11/08/15 11:40, Pascal Hambourg wrote:
> The Natural Philosopher a écrit :
>>>
>>> Under VFS and EXT4, do text files have an End Of File
>>> marker?  How do programs know when they are at the
>>> end of a file?
>>>
>> File size to the nearest byte is part of the directory information IIRC
>
> No, it is part of the inode information.
>
I stand corrected.


-- 
New Socialism consists essentially in being seen to have your heart in 
the right place whilst your head is in the clouds and your hand is in 
someone else's pocket.


#15141

From: Robert Heller <heller@deepsoft.com>
Date: 2015-08-10 16:05 -0500
Message-ID: <X9qdnXBEFbyDj1TInZ2dnUU7-amdnZ2d@giganews.com>
In reply to: #15137
At Mon, 10 Aug 2015 10:05:13 -0700 T <T@invalid.invalid> wrote:

> 
> Hi All,
> 
> Just a trivia question that does not need an answer.
> 
> Under VFS and EXT4, do text files have an End Of File
> marker?  How do programs know when they are at the
> end of a file?

An EOF marker is only needed for file systems that only record file size in
terms of storage blocks.  CP/M (and MS-DOS, which is ultimately based on CP/M)
are that way.  UNIX file systems have *always* stored the exact size of the
file in bytes as part of the inode structure.  This functionality is retained
in modern Linux file systems, like VFS, EXT4, and others.
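
A small sketch of what "the exact size in bytes" looks like from a program,
assuming Python on a UNIX-like system.  The helper names size_and_content and
demo_size are invented for this example.  os.stat() reports the byte length
recorded by the file system, and a 0x1A byte in the middle of the data is
read back like any other byte.

```python
import os
import tempfile


def size_and_content(data):
    """Write data to a scratch file; return (size from stat, bytes read back)."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
    try:
        size = os.stat(tmp.name).st_size  # file length in bytes, per the file system
        with open(tmp.name, "rb") as f:
            content = f.read()            # reads until the kernel reports end-of-file
        return size, content
    finally:
        os.unlink(tmp.name)


def demo_size():
    """Hypothetical demo: 0x1A (Ctrl-Z) in the middle is ordinary data here."""
    return size_and_content(b"before\x1aafter\n")
```

Calling demo_size() gives back the full byte count and the full content,
including everything after the 0x1A byte.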

> 
> -T
> 
>                                           

-- 
Robert Heller             -- 978-544-6933
Deepwoods Software        -- Custom Software Services
http://www.deepsoft.com/  -- Linux Administration Services
heller@deepsoft.com       -- Webhosting Services
                                                                                                                        


#15142

From: Charlie Gibbs <cgibbs@kltpzyxm.invalid>
Date: 2015-08-10 21:16 +0000
Message-ID: <mqb4c301hb@news4.newsguy.com>
In reply to: #15141
On 2015-08-10, Robert Heller <heller@deepsoft.com> wrote:

> At Mon, 10 Aug 2015 10:05:13 -0700 T <T@invalid.invalid> wrote:
>
>> Hi All,
>> 
>> Just a trivia question that does not need an answer.
>> 
>> Under VFS and EXT4, do text files have an End Of File
>> marker?  How do programs know when they are at the
>> end of a file?
>
> An EOF marker is only needed for file systems that only record file size
> in terms of storage blocks.  CP/M (and MS-DOS, which is ultimately based
> on CP/M) are that way.  UNIX file systems have *always* stored the exact
> size of the file in bytes as part of the inode structure.  This functionality
> is retained in modern Linux file systems, like VFS, EXT4, and others.

Although MS-DOS text file input recognizes 0x1a as an EOF marker, this
isn't necessary, since the file system stores the file size to the byte.
In fact, all of my MS-DOS software took pains to ensure that the mark
was never written, and removed it if present.  An early version of MS-DOS
(3.1 IIRC) had a bug in COMMAND.COM such that if you redirected output to
append to a file (e.g. dir >>foo), an existing EOF mark was not overwritten,
causing all new data to be lost.

It's one more bit of backward compatibility that was just plain backward.
It lives on in Windows to this day.

-- 
/~\  cgibbs@kltpzyxm.invalid (Charlie Gibbs)
\ /  I'm really at ac.dekanfrus if you read it the right way.
 X   Top-posted messages will probably be ignored.  See RFC1855.
/ \  HTML will DEFINITELY be ignored.  Join the ASCII ribbon campaign!


#15146

From: Tom Hardy <rhardy702@gmail.com>
Date: 2015-08-11 23:34 -0500
Message-ID: <3445299.1ZzGpYkYrH@gmail.com>
In reply to: #15142
Charlie Gibbs wrote:

> Although MS-DOS text file input recognizes 0x1a as an EOF marker,
> this isn't necessary, since the file system stores the file size
> to the byte. In fact, all of my MS-DOS software took pains to
> ensure that the mark
> was never written, and removed it if present.  An early version of
> MS-DOS (3.1 IIRC) had a bug in COMMAND.COM such that if you
> redirected output to append to a file (e.g. dir >>foo), an
> existing EOF mark was not overwritten, causing all new data to be
> lost.
> 
> It's one more bit of backward compatibility that was just plain
> backward. It lives on in Windows to this day.

Old WordStar 3.3 worked in 128-byte blocks and would always pad the
last block with ^Z^Z^Z^Z.... (Control-Z, 0x1A, or ASCII SUB),
denoting the end-of-file.  Plenty of text editors would recognize
this and, when saving such a file, would cut that down to one ^Z.
IIRC, they generally ended files with a ^Z.

DOS copy had a text mode which, again IIRC, you didn't usually need 
to specify, but if you did something like 'copy foo + bar foobar' 
you had to or you'd get results much like the above.  The info 
wasn't actually lost, but you had to recognize what had happened or 
use software that just didn't care and say "What is that ^Z doing in 
the middle of my file?"

-- 
Tom Hardy <rhardy702@gmail.com>


#15147

From: Charlie Gibbs <cgibbs@kltpzyxm.invalid>
Date: 2015-08-12 14:57 +0000
Message-ID: <mqfmss01cb6@news3.newsguy.com>
In reply to: #15146
On 2015-08-12, Tom Hardy <rhardy702@gmail.com> wrote:

> Charlie Gibbs wrote:
>
>> Although MS-DOS text file input recognizes 0x1a as an EOF marker,
>> this isn't necessary, since the file system stores the file size
>> to the byte. In fact, all of my MS-DOS software took pains to
>> ensure that the mark
>> was never written, and removed it if present.  An early version of
>> MS-DOS (3.1 IIRC) had a bug in COMMAND.COM such that if you
>> redirected output to append to a file (e.g. dir >>foo), an
>> existing EOF mark was not overwritten, causing all new data to be
>> lost.
>> 
>> It's one more bit of backward compatibility that was just plain
>> backward. It lives on in Windows to this day.
>
> Old Wordstar 3.3 worked in 128 byte blocks and would always pad the 
> last block with ^Z^Z^Z^Z...., Control-Z, 0x1A, or ASCII SUB, 
> denoting the end-of-file.

That wasn't WordStar, but CP/M itself; all programs worked that way.

> Plenty of text editors would recognize this, and if saving such a
> file would cut that down to one ^Z.  IIRC, they generally ended files
> with a ^Z.

However, if the size of the file was an exact multiple of 128 bytes,
you could omit any trailing ^Zs; to detect the end of a text file
you'd have to check for either ^Z or no more blocks.
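
The rule above (stop at ^Z, or at the end of the last block if there is no
^Z) can be sketched as follows.  This is an illustration in Python, not CP/M
code; pad_to_records and logical_text are hypothetical helper names.

```python
RECORD_SIZE = 128  # block size mentioned in the thread
CTRL_Z = 0x1A      # ASCII SUB, used as padding and end-of-text mark


def pad_to_records(text):
    """Pad text with ^Z up to the next multiple of RECORD_SIZE bytes."""
    shortfall = -len(text) % RECORD_SIZE  # 0 when already an exact multiple
    return text + bytes([CTRL_Z]) * shortfall


def logical_text(stored):
    """Return the text up to the first ^Z, or all of it if there is no ^Z."""
    end = stored.find(bytes([CTRL_Z]))
    return stored if end == -1 else stored[:end]
```

A 5-byte text is stored as one 128-byte block with 123 padding bytes, while a
text of exactly 128 bytes gets no padding at all, so a reader has to handle
both cases.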

> DOS copy had a text mode which, again IIRC, you didn't usually need 
> to specify, but if you did something like 'copy foo + bar foobar' 
> you had to or you'd get results much like the above.

It also had a /B option which would force this behaviour.

> The info wasn't actually lost, but you had to recognize what had
> happened or use software that just didn't care and say "What is
> that ^Z doing in the middle of my file?"

Indeed.  But in automated systems it was usually too late by then.
I still have remnants of the ugly hacks I used to work around this
stuff, such as scanning the entire file for ^Z characters and changing
them to something else before actually trying to read it.  (Consider
the case where you're reading data from a serial port - if a line hit
corrupts a byte to 0x1a, you'll lose the rest of the file if you're
not careful.)
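
A rough sketch of that kind of workaround, in Python rather than whatever the
original hacks were written in.  The names scrub_ctrl_z and text_mode_view,
and the "?" replacement byte, are assumptions made for the example.

```python
CTRL_Z = b"\x1a"


def scrub_ctrl_z(data, replacement=b"?"):
    """Replace every ^Z byte so a text-mode reader will not stop early."""
    return data.replace(CTRL_Z, replacement)


def text_mode_view(data):
    """Mimic a reader that treats the first ^Z as end-of-file."""
    return data.split(CTRL_Z, 1)[0]
```

With a corrupted byte in the second line, text_mode_view() on the raw data
stops in mid-line, while the scrubbed copy keeps the rest of the file at the
cost of one wrong character.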

My gripe with MS-DOS (well, one of them anyway) is that all of this
pain and suffering was unnecessary, since the file system stored the
file size to the byte, making special EOF markers unnecessary.

Speaking of DOS copy, remember how it would refuse to copy a
zero-length file?  (This was _not_ inherited from CP/M, whose
PIP would happily copy zero-length files.)  We had a beta site
go down and lose a month's worth of data thanks to that one.
Some apologists will try to justify the ^Z marker by saying
that it keeps an "empty" file from having a size of zero, thus
circumventing this "feature" - but that's just another attempt
to say that two wrongs make a right.

-- 
/~\  cgibbs@kltpzyxm.invalid (Charlie Gibbs)
\ /  I'm really at ac.dekanfrus if you read it the right way.
 X   Top-posted messages will probably be ignored.  See RFC1855.
/ \  HTML will DEFINITELY be ignored.  Join the ASCII ribbon campaign!


#15148

From: Tom Hardy <rhardy702@gmail.com>
Date: 2015-08-13 11:28 -0500
Message-ID: <1934062.nj8Py12zOn@gmail.com>
In reply to: #15147
Charlie Gibbs wrote:

> That wasn't WordStar, but CP/M itself; all programs worked that
> way.

I never used CP/M, but I knew WordStar very well, and DOS fairly
well.  I used to run WS out of a script to set the current directory
because WS was clueless about directories.

> Speaking of DOS copy, remember how it would refuse to copy a
> zero-length file?

I don't recall that one.

How about the BIOS text routine that would insert CR/LF to wrap
lines that reached the end of the screen?  Not exceeded the end of
the screen, but reached it.  It effectively inserted a blank line
after 80 character lines.  It affected all the command line tools 
and other programs that used the BIOS.

That one may still be around.

-- 
Tom Hardy <rhardy702@gmail.com>


#15145

From: T <T@invalid.invalid>
Date: 2015-08-11 10:27 -0700
Message-ID: <mqdb6h$j7o$1@dont-email.me>
In reply to: #15137
On 08/10/2015 10:05 AM, T wrote:
> Hi All,
>
> Just a trivia question that does not need an answer.
>
> Under VFS and EXT4, do text files have an End Of File
> marker?  How do programs know when they are at the
> end of a file?
>
> -T
>

Hi All,

Thank you for your responses.  I learned something!
