Groups > comp.sys.mac.system > #107091 > unrolled thread
| Started by | JF Mezei <jfmezei.spamnot@vaxination.ca> |
|---|---|
| First post | 2017-05-20 20:07 -0400 |
| Last post | 2017-05-23 21:18 +0000 |
| Articles | 20 on this page of 21 — 4 participants |
APFS and software updates JF Mezei <jfmezei.spamnot@vaxination.ca> - 2017-05-20 20:07 -0400
Re: APFS and software updates dempson@actrix.gen.nz (David Empson) - 2017-05-21 17:59 +1200
Re: APFS and software updates Jolly Roger <jollyroger@pobox.com> - 2017-05-21 16:26 +0000
Re: APFS and software updates JF Mezei <jfmezei.spamnot@vaxination.ca> - 2017-05-21 13:00 -0400
Re: APFS and software updates dempson@actrix.gen.nz (David Empson) - 2017-05-22 12:02 +1200
Re: APFS and software updates JF Mezei <jfmezei.spamnot@vaxination.ca> - 2017-05-21 23:36 -0400
Re: APFS and software updates Jolly Roger <jollyroger@pobox.com> - 2017-05-22 04:35 +0000
Re: APFS and software updates Jolly Roger <jollyroger@pobox.com> - 2017-05-22 15:18 +0000
Re: APFS and software updates dempson@actrix.gen.nz (David Empson) - 2017-05-23 14:11 +1200
Re: APFS and software updates JF Mezei <jfmezei.spamnot@vaxination.ca> - 2017-05-22 23:15 -0400
Re: APFS and software updates dempson@actrix.gen.nz (David Empson) - 2017-05-24 23:27 +1200
Re: APFS and software updates JF Mezei <jfmezei.spamnot@vaxination.ca> - 2017-05-24 15:57 -0400
Re: APFS and software updates dempson@actrix.gen.nz (David Empson) - 2017-05-25 12:37 +1200
Re: APFS and software updates JF Mezei <jfmezei.spamnot@vaxination.ca> - 2017-05-24 20:51 -0400
Re: APFS and software updates dempson@actrix.gen.nz (David Empson) - 2017-05-26 13:44 +1200
Re: APFS and software updates Jolly Roger <jollyroger@pobox.com> - 2017-05-26 02:11 +0000
Re: APFS and software updates JF Mezei <jfmezei.spamnot@vaxination.ca> - 2017-05-26 00:08 -0400
Re: APFS and software updates Lewis <g.kreme@gmail.com.dontsendmecopies> - 2017-05-26 01:27 +0000
Re: APFS and software updates Lewis <g.kreme@gmail.com.dontsendmecopies> - 2017-05-23 09:36 +0000
Re: APFS and software updates dempson@actrix.gen.nz (David Empson) - 2017-05-24 01:42 +1200
Re: APFS and software updates Lewis <g.kreme@gmail.com.dontsendmecopies> - 2017-05-23 21:18 +0000
| From | JF Mezei <jfmezei.spamnot@vaxination.ca> |
|---|---|
| Date | 2017-05-20 20:07 -0400 |
| Subject | APFS and software updates |
| Message-ID | <5920da3b$0$17396$c3e8da3$dd9697d2@news.astraweb.com> |
With a totally new way to update files being used for APFS with snapshots etc, does this mean that a system update/upgrade could be done on a running system with a new version of files being created while the running system still has files opened pointing to the old version? (With old versions then deleted at reboot, since the system would then boot with the most recent version of files?)

In a different vein, if file1 is some indexed database file, and process1 updates records in it while process2 accesses records, will process2 continue to access its "snapshot" of the file at the time it opened it, or would it get updated records (written in different blocks)?

Or put it another way, would an application specify, when it opens a file, whether it wants a static snapshot at time of opening versus dynamic access to the file as it is being modified by others?
| From | dempson@actrix.gen.nz (David Empson) |
|---|---|
| Date | 2017-05-21 17:59 +1200 |
| Message-ID | <1n6dxd5.1eiz2hs1avbau9N%dempson@actrix.gen.nz> |
| In reply to | #107091 |
JF Mezei <jfmezei.spamnot@vaxination.ca> wrote:

> With a totally new way to update files being used for APFS with
> snapshots etc, does this mean that a system update/upgrade could be done
> on a running system with a new version of files being created while the
> running system still has files opened pointing to the old version?

I think you need to go away and read/view the APFS presentation at last year's WWDC again. It would also help if you stop obsessing about APFS being some magic that will completely change the way things work.

New features of APFS have little to do with software updates.

The only way snapshots figure in software updates would be to save the state of the file system so an update can be rolled back if something goes wrong while it is being installed, or if the user decides they don't like the result and want to revert to the prior state, or to allow mounting the prior state (read only) for comparison.

Clones might help updates by reducing the amount of data needing to be copied when a file is patched (assuming the patch is replacing a block of data the same size, which is probably rare).

> (with old versions then deleted at reboot since the system would then
> boot with most recent version of files) ?

Unix file systems (including HFS+) can already do that.

The problem with OS updates is nothing to do with how files are replaced, but with coordinating all parts of the system to use the same versions of a related set of files. The easiest way to ensure that is to restart.

> In a different vein, if file1 is some indexed database file, and
> process1 updates records in it while process2 accesses records, will
> process2 continue to access its "snapshot" of the file at the time it
> opened it, or would it get updated records (written in different blocks) ?

Existing databases can already do that without file system help, e.g. SQLite in WAL mode: if reader A starts a transaction, then writer B starts a transaction, makes changes and commits, reader A continues to see the old state of the database until its transaction ends. The changes are visible to reader A when it starts its next transaction.

The clone feature in APFS is somewhat similar but would require the database software to close and reopen the database to see changes done in a clone which then replaced the original file. This would certainly be faster, as it avoids double writing new/updated database pages (initially to the WAL file, then later to the main database at a checkpoint).

Supporting this would require a new journalling method and updated database software, but old journalling methods would still need to be supported for other file systems, therefore it increases the complexity of cross-platform database engines for a feature only available on a limited number of systems. That makes it less likely to be supported by cross-platform database engines like SQLite, unless Apple did a platform-specific branch.

> Or put it another way, would an application specify, when it opens a
> file, whether it wants a static snapshot at time of opening versus
> dynamic access to the file as it is being modified by others ?

Snapshots have nothing to do with opening files. They are for saving the state of entire volumes.

-- 
David Empson
dempson@actrix.gen.nz
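The WAL behaviour described in the post above can be tried with Python's standard `sqlite3` module. This is only a minimal sketch: the table name `t`, the file name `demo.db` and the function name are made up for illustration, and it assumes a local file system on which SQLite can use WAL mode.

```python
import os
import sqlite3
import tempfile


def wal_snapshot_demo():
    """Return what reader A sees before, during and after writer B's commit."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")  # hypothetical file name

        # isolation_level=None puts the connection in autocommit mode, so
        # the BEGIN/COMMIT statements below are the only transaction control.
        setup = sqlite3.connect(path, isolation_level=None)
        setup.execute("PRAGMA journal_mode=WAL")
        setup.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
        setup.execute("INSERT INTO t VALUES (1, 'old')")
        setup.close()

        reader = sqlite3.connect(path, isolation_level=None)
        writer = sqlite3.connect(path, isolation_level=None)
        query = "SELECT v FROM t WHERE id = 1"

        # Reader A starts a transaction; its first read fixes its view.
        reader.execute("BEGIN")
        before = reader.execute(query).fetchall()[0][0]

        # Writer B starts a transaction, makes a change and commits.
        writer.execute("BEGIN IMMEDIATE")
        writer.execute("UPDATE t SET v = 'new' WHERE id = 1")
        writer.execute("COMMIT")

        # Still inside A's transaction: the old state is what A sees.
        during = reader.execute(query).fetchall()[0][0]

        # Once A ends its transaction, its next read sees B's change.
        reader.execute("COMMIT")
        after = reader.execute(query).fetchall()[0][0]

        reader.close()
        writer.close()
        return before, during, after
```

Calling `wal_snapshot_demo()` should show the old value twice and the new value once. The isolation comes entirely from SQLite; nothing in the file system is involved.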
| From | Jolly Roger <jollyroger@pobox.com> |
|---|---|
| Date | 2017-05-21 16:26 +0000 |
| Message-ID | <eodtdjFreetU1@mid.individual.net> |
| In reply to | #107098 |
On 2017-05-21, David Empson <dempson@actrix.gen.nz> wrote:
> JF Mezei <jfmezei.spamnot@vaxination.ca> wrote:
>
>> With a totally new way to update files being used for APFS with
>> snapshots etc, does this mean that a system update/upgrade could be done
>> on a running system with a new version of files being created while the
>> running system still has files opened pointing to the old version?
>
> I think you need to go away and read/view the APFS presentation at last
> year's WWDC again. It would also help if you stop obsessing about APFS
> being some magic that will completely change the way things work.

You have to wonder how many times JF has to be asked nicely to RTFM before he'll actually do it. Maybe the answer is: never.

-- 
E-mail sent to this address may be devoured by my ravenous SPAM filter.
I often ignore posts from Google. Use a real news client instead.

JR
| From | JF Mezei <jfmezei.spamnot@vaxination.ca> |
|---|---|
| Date | 2017-05-21 13:00 -0400 |
| Message-ID | <5921c795$0$12341$b1db1813$65575428@news.astraweb.com> |
| In reply to | #107098 |
On 2017-05-21 01:59, David Empson wrote:

> Snapshots have nothing to do with opening files. They are for saving the
> state of entire volumes.

Certain types of links to files act as snapshots.

If file1 is made to point to file2, it initially acts as a hard link (same blocks).

But as file2 is modified, it contains new blocks, and file1 continues to point to the old blocks, so file1 now has different content from file2.

I was wondering if a process having a file open would have similar "snapshot" behaviour or if by default, they always point to the current "live" file.

Consider an 86 block file, with a contiguous extent from blocks 700 to 786. Process 1 rewrites bytes 1024 to 2047 and those 2 blocks get written to blocks 3456-3457 (APFS never overwrites blocks). The file now has 3 extents: 700-701, 3456-3457, 704-786.

If process 2 has the same file open and wants to read bytes 1024 to 2047 at the same time, then depending on how the file pointers, caches etc. are handled, it could still see the file as having one extent from 700-786, or the new fragmented one. (Whether they are "extents" or just a glorified linked list, it is the same question: whether different processes retain a coherent view of the file as byte ranges inside the file get moved to new blocks.)
| From | dempson@actrix.gen.nz (David Empson) |
|---|---|
| Date | 2017-05-22 12:02 +1200 |
| Message-ID | <1n6f8l5.11tc1uh1qikfpyN%dempson@actrix.gen.nz> |
| In reply to | #107100 |
JF Mezei <jfmezei.spamnot@vaxination.ca> wrote:

> On 2017-05-21 01:59, David Empson wrote:
>
> > Snapshots have nothing to do with opening files. They are for saving the
> > state of entire volumes.
>
> certain types of links to files act as snapshots.

No they don't.

In APFS, a snapshot is the state of the file system on the entire volume at a point in time. It is mountable as a read-only copy of the volume as at that point, and can be used to roll back the entire volume to that point.

As I said but you snipped out: I think you need to go away and read/view the APFS presentation at last year's WWDC again.

> If file1 is made to point to file2, it initially acts as a hard link.
> (same blocks).
>
> But as file2 is modified, it contains new blocks, and file1 continues to
> point to the old blocks, so file1 is now different content from file2.

That is a clone. It is not a hard link, nor is it a snapshot.

> I was wondering if a process having a file open would have similar
> "snapshot" behaviour or if by default, they always point to the current
> "live" file.

Clones are two completely different files as far as applications are concerned.

The purpose of clones is to allow file duplicate operations to take almost no time or disk space, because they don't need to copy any data, just create new entries in the directory structures.

A clone behaves like a copy of the original file. Subsequent modifications to either the original file or the clone are not visible in the other file. Under the hood, they share disk storage for the unmodified portions, but applications have no way to access that level of detail.

This is not the same as a hard link, because a hard link does see changes made via other hard links to the same file.

In traditional Unix file system terms:

- Hard links are directory entries (pathnames) pointing to the same inode, which points to the data blocks. Any change to the inode is visible to all the hard links.

- Clones are directory entries pointing to separate inodes which initially point to the same data blocks, but changes made via one inode result in that inode pointing to different data blocks for modified parts of the file; other inodes in the set of clones still point to the original data blocks so see no changes.

(Clones can also be used to quickly duplicate an entire folder, not just a single file, with the same properties that the original and clone folders subsequently behave like independent copies, not showing any changes made to the other.)

> Consider an 86 block file, with a contiguous extent from blocks 700 to 786.
[snip]

No. You have a completely wrong concept of the purpose of clones.

-- 
David Empson
dempson@actrix.gen.nz
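The hard link versus clone distinction drawn above can be observed from user space. The following is a rough sketch only: it uses an ordinary `shutil.copyfile()` copy as a stand-in for an APFS clone, on the stated premise that a clone behaves like a copy as far as applications are concerned. It does not create a real clone, the file names and function name are invented, and it assumes a file system that supports hard links.

```python
import os
import shutil
import tempfile


def link_vs_copy_demo():
    """Return the contents seen via the original, a hard link and a copy."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")
        hard = os.path.join(tmp, "hardlink.txt")
        copy = os.path.join(tmp, "copy.txt")

        with open(original, "w") as f:
            f.write("version 1")

        os.link(original, hard)          # second pathname, same inode
        shutil.copyfile(original, copy)  # separate inode (clone stand-in)

        # Sanity check on the inode relationships described in the post.
        assert os.stat(original).st_ino == os.stat(hard).st_ino
        assert os.stat(original).st_ino != os.stat(copy).st_ino

        # Rewrite the file in place through the original pathname.
        # Mode "w" truncates the existing inode; it does not replace it.
        with open(original, "w") as f:
            f.write("version 2")

        def read(path):
            with open(path) as f:
                return f.read()

        return read(original), read(hard), read(copy)
```

The hard link reports the new content because it is the same file; the copy keeps the old content because it is a different one. A real clone differs from the copy only in how the storage is shared underneath.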
| From | JF Mezei <jfmezei.spamnot@vaxination.ca> |
|---|---|
| Date | 2017-05-21 23:36 -0400 |
| Message-ID | <59225cc4$0$41892$b1db1813$2411a48f@news.astraweb.com> |
| In reply to | #107105 |
On 2017-05-21 20:02, David Empson wrote:

> That is a clone. It is not a hard link, nor is it a snapshot.

But snapshots use the same underlying technology as a clone. And it is the term I was looking for.

> Clones are two completely different files as far as applications are
> concerned.

Yes. But what I am asking here: if process1 has file1 opened for read, and process2 opens file1 for read-write, do both processes access the same file, or does the system generate a temporary clone so that process1 gets a static snapshot (for lack of a better word) of the file structure at the time it opened it?

During the presentation, Apple mentioned that APFS was designed for desktops, not large servers, so I wonder about concurrency issues when an update to a file causes the file structure to change.

There is also a security issue. Process1 is reading the file sequentially and reads a block just as process2 rewrites that block, causing the original one to be deallocated from the file and moved to the free block list, and a new block to be allocated to contain the updated data. Process1 could end up reading a block which is no longer part of that file and has already been allocated to another file that this user does not have access to.

If a "clone" is made, then the system knows that the blocks deallocated by process2 are still in use by process1 and would keep them intact as part of the version of the file seen by process1 at the time it opened it.

If a clone is not made, then the OS and file system will have work to do for cache/file system coherence to ensure that 2 processes accessing the same file never get blocks that no longer belong to the file.

I am sure Apple has thought of this, but I am curious about how they solved it. That was not mentioned in the presentation.
| From | Jolly Roger <jollyroger@pobox.com> |
|---|---|
| Date | 2017-05-22 04:35 +0000 |
| Message-ID | <eof83oF6qbuU1@mid.individual.net> |
| In reply to | #107106 |
JF Mezei <jfmezei.spamnot@vaxination.ca> wrote:
> On 2017-05-21 20:02, David Empson wrote:
>
>> That is a clone. It is not a hard link, nor is it a snapshot.
>
> But snapshots use the same underlying technology as a clone.

Show the group where Apple states that APFS snapshots use the same underlying technology as a clone.

[foolish ramblings rightfully ignored]

-- 
E-mail sent to this address may be devoured by my ravenous SPAM filter.
I often ignore posts from Google. Use a real news client instead.

JR
| From | Jolly Roger <jollyroger@pobox.com> |
|---|---|
| Date | 2017-05-22 15:18 +0000 |
| Message-ID | <eogdqiFf9fnU1@mid.individual.net> |
| In reply to | #107107 |
On 2017-05-22, Jolly Roger <jollyroger@pobox.com> wrote:
> JF Mezei <jfmezei.spamnot@vaxination.ca> wrote:
>> On 2017-05-21 20:02, David Empson wrote:
>>
>>> That is a clone. It is not a hard link, nor is it a snapshot.
>>
>> But snapshots use the same underlying technology as a clone.
>
> Show the group where Apple states that APFS snapshots use the same
> underlying technology as a clone.

*crickets chirping*...

-- 
E-mail sent to this address may be devoured by my ravenous SPAM filter.
I often ignore posts from Google. Use a real news client instead.

JR
| From | dempson@actrix.gen.nz (David Empson) |
|---|---|
| Date | 2017-05-23 14:11 +1200 |
| Message-ID | <1n6hbwa.1dwmt491syhev2N%dempson@actrix.gen.nz> |
| In reply to | #107106 |
JF Mezei <jfmezei.spamnot@vaxination.ca> wrote:

> On 2017-05-21 20:02, David Empson wrote:
>
> > That is a clone. It is not a hard link, nor is it a snapshot.
>
> But snapshots use the same underlying technology as a clone. And it is
> the term I was looking for.

I'm not bothering to answer your text in detail because you have everything completely wrong.

Again, an APFS "snapshot" has nothing to do with individual files. It is a read-only reference to the state of an entire volume at a point in time.

An APFS "clone" is a method of copying a file without using disk space, resulting in two separate files which happen to share storage until either one is modified, and that modification is invisible to the other file.

A hard link is something that looks like a separate file but actually points to the same file as the original (also a hard link), and changes in either hard link are seen via the other one. This is the same behaviour as existing file systems.

If none of these methods are used, and two processes just happen to have the same file open, then writes by one process will be seen immediately by the other process when it reads the modified portion of the file. This is the same behaviour as existing file systems.

You seem to be inventing a "version" mechanism where two open instances of the same file start to have diverging content just because one process happens to have the file open while another process is writing to that file. No such mechanism exists in APFS as described at WWDC 2016.

A "version" mechanism would have to be implemented at a higher level, e.g. by an application or the OS explicitly cloning the file to preserve a reference to its current state. That results in another file with a different pathname which can be opened separately. The clone happens to initially share storage with the original file, but it won't see any subsequent changes to the original file.

For example, this could be used by the Autosave and Versions mechanism, or by Time Machine Local Storage backups.

-- 
David Empson
dempson@actrix.gen.nz
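The "two processes just happen to have the same file open" case above is easy to try. In this sketch, two independent descriptors opened in a single process stand in for the two processes. That is a simplifying assumption, though each `open()` does create its own open-file description, as it would across processes. The file name and function name are hypothetical, and `os.pread()`/`os.pwrite()` require a POSIX system.

```python
import os
import tempfile


def shared_file_visibility_demo():
    """Return what the reader descriptor sees before and after a write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shared.bin")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"AAAAAAAA")

        reader_fd = os.open(path, os.O_RDONLY)  # stands in for process 1
        writer_fd = os.open(path, os.O_WRONLY)  # stands in for process 2
        try:
            before = os.pread(reader_fd, 8, 0)

            # Overwrite bytes 2..5 through the writer descriptor. There is
            # no fsync() and no close() before the reader looks again.
            os.pwrite(writer_fd, b"BBBB", 2)

            after = os.pread(reader_fd, 8, 0)
        finally:
            os.close(reader_fd)
            os.close(writer_fd)
        return before, after
```

The reader picks up the modified bytes on its very next read. Where the file system chose to put the new data on disk is not visible at this level.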
| From | JF Mezei <jfmezei.spamnot@vaxination.ca> |
|---|---|
| Date | 2017-05-22 23:15 -0400 |
| Message-ID | <5923a964$0$61860$c3e8da3$e074e489@news.astraweb.com> |
| In reply to | #107138 |
On 2017-05-22 22:11, David Empson wrote:

> Again, an APFS "snapshot" has nothing to do with individual files. It is
> a read-only reference to the state of an entire volume at a point in
> time.

The way I saw it, cloning used the same underlying mechanism as snapshots to share unchanged blocks, and when one is changed, the changed file links to the changed blocks, and the unchanged file links to the unchanged blocks.

> If none of these methods are used, and two processes just happen to have
> the same file open, then writes by one process will be seen immediately
> by the other process when it reads the modified portion of the file.

So you're saying Apple has solved the issue of processes getting file structure when they open the file, and those in-process memory structures get dynamically changed whenever another process updates the file, causing the list of blocks containing current data to change.

Note: in many systems, a process opens a file and gets the end of file marker and file size, and those remain static even if another process appends to the file. The "synch()" routine in C was developed to force those structures to be re-read so the process has an updated view of the file's structure.

In the case of APFS, it isn't just end of file that changes, but also the list of already allocated blocks, because when you rewrite a block, it gets written elsewhere. If process1 doesn't get updated information, it will read the block at the old location and get the old data.

> You seem to be inventing a "version" mechanism where two open instances

The underlying structure of APFS is "version" based because an update of a record never overwrites; it is always written elsewhere and the file allocation list changes to include that new block instead of the old block.

A process whose in-memory structures don't get dynamically changed will still have the old "view" of which blocks contain current data and will end up reading unupdated data.
| From | dempson@actrix.gen.nz (David Empson) |
|---|---|
| Date | 2017-05-24 23:27 +1200 |
| Message-ID | <1n6hi75.15jb9r61ue6veeN%dempson@actrix.gen.nz> |
| In reply to | #107147 |
JF Mezei <jfmezei.spamnot@vaxination.ca> wrote:

> On 2017-05-22 22:11, David Empson wrote:
>
> > If none of these methods are used, and two processes just happen to have
> > the same file open, then writes by one process will be seen immediately
> > by the other process when it reads the modified portion of the file.
>
> So you're saying Apple has solved the issue of processes getting file
> structure when they open the file,

Processes do not get file structures when they open files. They get a descriptor to the file, and the internal details of the on-disk structures are managed by the kernel and file system driver.

> and those in-process memory structures get dynamically changed whenever
> another process updates the file causing the list of blocks containing
> current data to change.

The kernel data structures get updated, which is not changing - it was already done that way with HFS+. You seem to be imagining a nonexistent problem.

> Note: in many systems, a process opens a file and gets the end of file
> marker, and file size and those remain static even if another process
> appends to the file. The "synch()" routine in C was developed to force
> those structures to be re-read so the process has an updated view of the
> file's structure.

I don't care what some obscure or obsolete system you've dealt with does. There is no such "synch()" function or anything resembling it in standard C, and that isn't how the BSD file API behaves on any operating system I've worked on. It doesn't matter which file system you are using.

If process 1 and process 2 both have the same file open, and process 2 writes to the file changing its length, the next call by process 1 to the BSD API fstat() or similar will see the updated file length.

If process 1 reads the file length into a variable and then keeps referring to its own variable, then it won't know about the changed length, but that isn't an OS or file system issue - it is an application design issue.

The same principle applies to the content of files, not just the length. If process 1 is repeatedly reading a particular area in a file, and process 2 has the same file open and writes to that area of the file, the next read by process 1 will return the data that was written by process 2.

> In the case of APFS, it isn't just end of file that changes, but also
> the list of already allocated blocks because when you rewrite a block,
> it gets written elsewhere. If process1 doesn't get updated information,
> it will read the block at the old location and get the old data.

Process 1 has absolutely no clue this has happened. It just sees changed data in the file the next time it reads the area that has been modified.

It is all handled inside the kernel, which knows there are two open references to the same file, and updates caches and other in-memory data structures accordingly.

> > You seem to be inventing a "version" mechanism where two open instances
>
> The underlying structure of APFS is "version" based because an update of
> a record never overwrites, it is always written elsewhere and the file
> allocation list changes to include that new block instead of the old block.
>
> A process whose in-memory structures don't get dynamically changed will
> still have the old "view" of which blocks contain current data and will
> end up reading unupdated data.

No it won't, because the "process" doesn't have access to that level of detail.

-- 
David Empson
dempson@actrix.gen.nz
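The fstat() point above, including the "application design issue" of holding a stale length in a variable, can be sketched as follows. As an assumption for brevity, two descriptors in one process again stand in for process 1 and process 2; the file name and function name are invented for illustration.

```python
import os
import tempfile


def length_tracking_demo():
    """Return (length cached by the application, length from a fresh fstat)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "growing.log")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"hello")

        reader_fd = os.open(path, os.O_RDONLY)                # "process 1"
        writer_fd = os.open(path, os.O_WRONLY | os.O_APPEND)  # "process 2"
        try:
            # Process 1 copies the length into its own variable once.
            cached_length = os.fstat(reader_fd).st_size

            # Process 2 appends. No fsync() is involved anywhere.
            os.write(writer_fd, b" world")

            # The variable is stale, but asking the kernel again is not.
            fresh_length = os.fstat(reader_fd).st_size
        finally:
            os.close(reader_fd)
            os.close(writer_fd)
        return cached_length, fresh_length
```

The stale value lives only in the application's own variable. Nothing has to be forced to disk for the second fstat() to report the new length.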
| From | JF Mezei <jfmezei.spamnot@vaxination.ca> |
|---|---|
| Date | 2017-05-24 15:57 -0400 |
| Message-ID | <5925e5ae$0$38718$b1db1813$19ace300@news.astraweb.com> |
| In reply to | #107187 |
On 2017-05-24 07:27, David Empson wrote:
> Processes do not get file structures when they open files. They get a
> descriptor to the file, and the internal details of the on-disk
> structures are managed by the kernel and file system driver.

There is a lot of file context kept in process memory (for instance
where in the file the "current pointer" is for the next read, and many
file attributes). Not sure whether read-ahead buffers are in process or
kernel memory. (If process 1 read the first 12 bytes of a file, it is likely
the first read pulled 2048 or more bytes into a read-ahead buffer.) And that
matters because if process 2 modifies some of those bytes already in a
read-ahead buffer, a buffer in process memory won't get updated, but one in
kernel memory might.
> You seem to be imagining a nonexistent problem.
Since it is a new file system with a new way to store data (moving it
around to different blocks with every rewrite operation), asking
questions on how it works is normal.
As I said before, the WWDC presentation explicitly skipped over
concurrency issues (hint of unresolved issues).
> I don't care what some obscure or obsolete system you've dealt with
> does. There is no such "synch()" function or anything resembling it in
> standard C,
man fsync

FSYNC(2)                BSD System Calls Manual                FSYNC(2)

NAME
     fsync -- synchronize a file's in-core state with that on disk

And there is a good reason for that call: if you do mass writes to
append to the file, an implicit fsync after every write would kill
performance; you are much better off stacking many writes for efficiency
and either having the implicit fsync (or explicit) at regular
intervals or when you're done.
Even in Finder, you will find cases where the number of bytes used
remains at 0 while another process writes to the file, but bytes allocated
increases, and only once the file is closed is the bytes used updated.
Applications that write to log files want an explicit or implicit fsync
to happen after every write since they don't know when the next write
will happen and it isn't a performance hit if the next write won't
happen for another few seconds.
Apple may or may not have implemented live updates to such structures
and this wasn't mentioned in the WWDC presentation which didn't deal
with concurrency. So I don't think my question is unwarranted.
> If process 1 and process 2 both have the same file open, and process 2
> writes to the file changing its length, the next call by process 1 to
> the BSD API fstat() or similar will see the updated file length.
Does fstat cause an implicit fsync?
But if you read sequentially, while another process rewrites random areas
of the file, concurrency starts to matter.
> It is all handled inside the kernel, which knows there are two open
> references to the same file, and updates caches and other in-memory data
> structures accordingly.
Well, the "all handled inside the kernel" is the big question. If you
make low level calls, sure, but even "low level" calls in C for instance
still have the C run-time in between the programmer and the kernel.
It is quite likely that Apple has solved the problem. And you may very
well be right, but just stating that it is solved doesn't give me a
pointer to where Apple explains HOW it has handled concurrency for APFS.
| From | dempson@actrix.gen.nz (David Empson) |
|---|---|
| Date | 2017-05-25 12:37 +1200 |
| Message-ID | <1n6ksac.1u9iucwum32c2N%dempson@actrix.gen.nz> |
| In reply to | #107194 |
JF Mezei <jfmezei.spamnot@vaxination.ca> wrote: > On 2017-05-24 07:27, David Empson wrote: > > > Processes do not get file structures when then open files. They get a > > descriptor to the file, and the internal details of the on-disk > > strucures are managed by the kernel and file system driver. > > There is a lot of file context kept in process memory. (for instance > where in the file the current" pointer" is for the next read, You have this whole concept completely wrong. A process uses system calls like read() and write() which only specify a file descriptor previously returned by the kernel via the open() system call. The file position is managed by the kernel on the other side of the system call. (Whether the kernel stores the file state in process-specific or kernel-specific memory is not relevant, because that data is managed by the kernel and protected so the process has no access to it.) The library code running in the process does not keep track of the current file position, nor does it need to specify the position when doing read/write calls. > and many file attributes. Not sure whetherread ahead buffers are in > process or kernel memory. Higher level APIs like the standard C library fopen(), fread() etc. can do an extra level of buffering, but any buffering done at the block level is managed by the kernel, which is able to correctly update those buffers if two processes have the same file open and one of them modifies it. > > You seem to be imagining a nonexistent problem. > > Since it is a new file system with a new way to store data (moving it > around to different blocks with every rewite operation), asking > questions on how it works is normal. It is not relevant to how processes see files. It is a kernel implementation issue. > As I said before, the WWDC presentation explicitely skipped over > concurrency issues (hint of unresolved issues). > > > I don't care what some obscure or obsolete system you've dealt with > > does. 
> > There is no such "synch()" function or anything resembling it in
> > standard C,
>
> man fsync
>
> FSYNC(2) BSD System Calls Manual
> FSYNC(2)
>
> NAME
> fsync -- synchronize a file's in-core state with that on disk

If you had bothered to read the first paragraph of the DESCRIPTION section you would see that fsync() is for the purpose of making sure data has been written to disk (at least as far as the drive's internal cache) rather than just to the in-memory cache managed by the kernel.

fsync() has nothing to do with reading files, and neither process with the same open file needs to do an fsync() for a write by process 2 to be readable by process 1.

This is not your imagined synch() call. No such call exists in the BSD API.

> And there is a good reason for that call: if you do mass writes to
> append to the file, an implicit fsync after every write would kill
> performance, you are much better off stacking many writes for efficiency
> and either having the implicit fsync (or explicit) at regular
> intervals or when you're done.
>
> Even in Finder, you will find cases where the number of bytes used
> remains at 0 while another process writes to it, but bytes allocated
> increases and only once file closed is the bytes used updated.

That has nothing to do with multiple processes having the same file open, nor does it have anything to do with fsync(). Finder doesn't have the file open.

Finder reads the directory at the point you open the window, reporting the current size and other metadata for each file in the directory. It doesn't update its view of the directory until it is told by the system that the directory has changed. If a file in the directory is being modified, that does not signal a change to the directory until the modified file is closed.

This is easily tested by having something periodically write to a file without closing it, and comparing what you see in Finder vs ls -l in Terminal. ls looks at the current state. Finder only updates its reported file size when the file is closed (or if Finder refreshes its view of the directory for some other reason such as quit and relaunch of Finder).

I've written some test programs to prove this. ls shows the current state as an open file is periodically written via write() with no other system calls, but Finder shows the initial file size unchanged until the file is closed. Adding an fsync() after the write makes no difference. To get Finder to update dynamically it is necessary to close() the file after the write() (then open it again prior to the next write).

A companion test program which opens the file and monitors its current length via fstat(), printing changes, correctly shows the file size at the moment it is written by the write() call, indicating that the in-memory state of the file is being updated without needing to flush anything to disk, with nothing resembling a sync call in either the reader or writer.

> Applications that write to log files want an explicit or implicit fsync
> to happen after every write since they don't know when the next write
> will happen and it isn't a performance hit if the next write won't
> happen for another few seconds.

That's because they don't want to lose data that didn't get written to disk if the system crashes or there is a power cut shortly after the write.

> Apple may or may not have implemented live updates to such structures
> and this wasn't mentioned in the WWDC presentation which didn't deal
> with concurrency. So I don't think my question is unwarranted.

It is unwarranted, because current OS versions and file systems do not behave the way you describe.

> > If process 1 and process 2 both have the same file open, and process 2
> > writes to the file changing its length, the next call by process 1 to
> > the BSD API fstat() or similar will see the updated file length.
>
> does fstat cause an implicit fsync?

No. Why would it? fstat() reads information about a file, much of which will be cached in memory, whereas fsync() flushes data from the kernel caches to disk. If fstat() forced a sync operation it would kill performance, because fstat() is heavily used to poll for changes to an open file.

> But if you read sequentially, while another process rewrites random areas
> of file, concurrency starts to matter.

Yes, and the kernel takes care of that.

> > It is all handled inside the kernel, which knows there are two open
> > references to the same file, and updates caches and other in-memory data
> > structures accordingly.
>
> Well, the "all handled inside the kernel" is the big question. If you
> make low level calls, sure, but even "low level" calls in C for instance
> still have the C-run time in between the programmer and the kernel.

Calls like open(), read(), write() etc. are calls to the kernel on UNIX systems. The only support code in the C run-time library is to deal with things like dynamic loading, parameter passing, etc., and has nothing to do with the state of the file which is being accessed.

I can confirm this for macOS because I did an assembly level single-step through my test program calling write(). It did a lot of mucking around with dynamic loading and symbols, then ultimately executed a syscall instruction.

Other platforms such as Windows need a translation layer in the C run-time library to convert the BSD file API calls into the native API, including extra processing for things like files open in text mode with end of line character translation (which involves extra buffering), and converting error results to standardised values.

Even then, the C run-time code for these calls doesn't do any file state management - it is handled in the system call.

For example, on Windows, the read() function in the C library ends up calling the Windows API ReadFile(). The C library code doesn't remember the file position. The kernel deals with that.

> It is likely that Apple has solved the problem. And you may very
> well be right, but just stating that it is solved doesn't give me a
> pointer where Apple explains HOW it has handled concurrency for APFS.

macOS already handles concurrency for "two open instances of the same file" in existing OS versions on HFS+ and other file systems, so there will not be any change in this area for APFS.

--
David Empson
dempson@actrix.gen.nz
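The reader/writer behaviour described in the post above can be illustrated with a minimal sketch. This is not the test program mentioned in the post; it is a Python approximation in which two descriptors in one process stand in for the two processes, and the scratch file and variable names are assumptions made for illustration. Python's os.open(), os.write(), os.fstat() and os.read() are thin wrappers over the corresponding BSD calls.

```python
import os
import tempfile

# Hypothetical scratch file; any writable path on a local file system would do.
tmp_fd, path = tempfile.mkstemp()
os.close(tmp_fd)

# Two independent opens of the same file, standing in for
# "process 1" (reader) and "process 2" (writer).
reader_fd = os.open(path, os.O_RDONLY)
writer_fd = os.open(path, os.O_WRONLY | os.O_APPEND)

size_before = os.fstat(reader_fd).st_size

# Plain write(): no fsync() and no close() afterwards.
os.write(writer_fd, b"hello\n")

# fstat() on the reader's descriptor already reports the new length.
size_after = os.fstat(reader_fd).st_size

# The new bytes are also readable through the reader's descriptor.
data_seen = os.read(reader_fd, 100)

os.close(writer_fd)
os.close(reader_fd)
os.unlink(path)

print(size_before, size_after, data_seen)
```

Neither descriptor issues anything resembling a sync call, which matches the point made in the post: visibility between openers is handled by the kernel, while fsync() is only about getting data to the drive.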
| From | JF Mezei <jfmezei.spamnot@vaxination.ca> |
|---|---|
| Date | 2017-05-24 20:51 -0400 |
| Message-ID | <59262a85$0$22714$c3e8da3$33881b6a@news.astraweb.com> |
| In reply to | #107195 |
On 2017-05-24 20:37, David Empson wrote:

> Other platforms such as Windows need a translation layer in the C
> run-time library to convert the BSD file API calls into the native API,

In terms of C on OS-X, does the fact that C calls are translated to HFS calls not mean there is a translation layer?

> For example, on Windows, the read() function in the C library ends up
> calling the Windows API ReadFile(). The C library code doesn't remember
> the file position. The kernel deals with that.

When you do random read/write, the higher level language deals with file position. And in VMS for instance, you can specify your own RAB/FAB *process storage* to hold the open file context.

Since not all OS are alike it is a fair question to ask. You seem to have answered it with the info that for OS-X, it is all in the kernel and it manages multiple "opens" to the same file and keeps them coherent with each other.

Thanks. Sorry if it took a while.
| From | dempson@actrix.gen.nz (David Empson) |
|---|---|
| Date | 2017-05-26 13:44 +1200 |
| Message-ID | <1n6l1cr.tlzxi31j4665pN%dempson@actrix.gen.nz> |
| In reply to | #107196 |
JF Mezei <jfmezei.spamnot@vaxination.ca> wrote:

> On 2017-05-24 20:37, David Empson wrote:
>
> > Other platforms such as Windows need a translation layer in the C
> > run-time library to convert the BSD file API calls into the native API,
>
> In terms of C on OS-X, does the fact that C calls are translated to HFS
> calls not mean there is a translation layer?

C calls are not "translated to HFS calls".

C calls to the BSD file API are handled by the kernel. The BSD API defines a standard interface for how to access files, independent of the underlying file system.

The kernel is responsible for using the appropriate file system driver to deal with the specifics of the file system on which the file resides, whether it be UFS, HFS+, FAT, APFS, some other installed file system, or a request forwarded to a server for a networked file system.

The HFS+ file system driver implements the kernel's internal file I/O operations as appropriate for the HFS+ file system, e.g. dealing with the catalog tree and other structures, extra layers for Core Storage if appropriate, etc.

> > For example, on Windows, the read() function in the C library ends up
> > calling the Windows API ReadFile(). The C library code doesn't remember
> > the file position. The kernel deals with that.
>
> when you do random read/write, the higher level language deals with file
> position.

Yes, by calling the BSD API lseek() which sets the file position (as managed by the kernel) to whatever the application wants. lseek() also returns the file position, so the application can find the current position if it needs to know.

> And in VMS for instance you can specify your own RAB/FAB *process
> storage* to hold the open file context.

Irrelevant. This thread is about macOS (and by extension, UNIX systems which support the same file I/O API).

I could comment on how ProDOS on the Apple II implements file buffering in memory supplied by the application, but it would be equally irrelevant to how macOS and the BSD API manage file state and buffering, or how APFS will behave.

> Since not all OS are alike it is a fair question to ask.

You assumed that file buffering and state was managed by processes, expressed it as a fact, and jumped to multiple wrong conclusions as a result. I don't call that a "question".

> You seem to have answered it with the info that for OS-X, it is all in
> the kernel and it manages multiple "opens" to the same file and keeps them
> coherent with each other.
>
> Thanks. Sorry if it took a while.

--
David Empson
dempson@actrix.gen.nz
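As a small illustration of the lseek() behaviour described in the post above, here is a hedged Python sketch. The scratch file, its contents and the variable names are assumptions made for the example; os.lseek() wraps the BSD call of the same name.

```python
import os
import tempfile

# Hypothetical scratch file with ten known bytes.
tmp_fd, path = tempfile.mkstemp()
os.write(tmp_fd, b"0123456789")
os.close(tmp_fd)

fd = os.open(path, os.O_RDONLY)

# lseek() sets the offset associated with the open file and returns
# the resulting position.
pos_after_seek = os.lseek(fd, 4, os.SEEK_SET)

# read() starts at that offset and advances it by the number of bytes read.
chunk = os.read(fd, 3)

# Seeking by zero from the current position simply reports where we are.
pos_now = os.lseek(fd, 0, os.SEEK_CUR)

os.close(fd)
os.unlink(path)

print(pos_after_seek, chunk, pos_now)
```

The script never stores a position of its own; it only asks the system for it, which is the division of labour the post describes.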
| From | Jolly Roger <jollyroger@pobox.com> |
|---|---|
| Date | 2017-05-26 02:11 +0000 |
| Message-ID | <eoph5vFj238U1@mid.individual.net> |
| In reply to | #107202 |
On 2017-05-26, David Empson <dempson@actrix.gen.nz> wrote:
> JF Mezei <jfmezei.spamnot@vaxination.ca> wrote:
>
>> Since not all OS are alike it is a fair question to ask.
>
> You assumed that file buffering and state was managed by processes,
> expressed it as a fact, and jumped to multiple wrong conclusions as a
> result. I don't call that a "question".

You have way more patience than most, David!

--
E-mail sent to this address may be devoured by my ravenous SPAM filter.
I often ignore posts from Google. Use a real news client instead.

JR
| From | JF Mezei <jfmezei.spamnot@vaxination.ca> |
|---|---|
| Date | 2017-05-26 00:08 -0400 |
| Message-ID | <5927aa47$0$61830$c3e8da3$e074e489@news.astraweb.com> |
| In reply to | #107202 |
On 2017-05-25 21:44, David Empson wrote:

> Irrelevant. This thread is about macOS (and by extension, UNIX systems
> which support the same file I/O API).

It may be irrelevant to you because you knew the answer, but it is not irrelevant to someone who only knows that OS-X is on a Mach kernel and that there are different Unix kernels out there and different implementations of file systems that happen to share the same higher level APIs to make them "Unix".

BTW, VMS got Unix certification before Solaris. And OS-X also got Unix certification at some point. Not all Unixes are alike.

With regards to "fsync": when you have multiple hosts accessing a shared file system in a disk array (not a file server), the flushing to disk is the only means to signal changes to files so the other computers can see those changes. This is why the WWDC message about APFS being designed for desktops and not necessarily optimal for servers raised my questions.

(I am not debating your answer, just explaining why I had the question).
| From | Lewis <g.kreme@gmail.com.dontsendmecopies> |
|---|---|
| Date | 2017-05-26 01:27 +0000 |
| Message-ID | <slrnoif1ar.1hf.g.kreme@snow.local> |
| In reply to | #107194 |
In message <5925e5ae$0$38718$b1db1813$19ace300@news.astraweb.com> JF Mezei <jfmezei.spamnot@vaxination.ca> wrote:
> On 2017-05-24 07:27, David Empson wrote:
>> Processes do not get file structures when they open files. They get a
>> descriptor to the file, and the internal details of the on-disk
>> structures are managed by the kernel and file system driver.

> There is a lot of file context kept in process memory. (for instance
> where in the file the current "pointer" is for the next read, and many
> file attributes).

No. You are entirely wrong.

> Since it is a new file system with a new way to store data (moving it
> around to different blocks with every rewrite operation), asking
> questions on how it works is normal.

That's not what you are doing, you are making up inane 'problems' that don't exist and 'asking' how APFS deals with them. These are problems that don't exist in any modern file system, because you don't understand how computers, kernels, drivers, and file systems work. At all.

> As I said before, the WWDC presentation explicitly skipped over
> concurrency issues (hint of unresolved issues).

No, you are making shit up.

> man fsync
> FSYNC(2) BSD System Calls Manual
> FSYNC(2)
> NAME
> fsync -- synchronize a file's in-core state with that on disk

Do you know what "in-core" means? Did you READ the man page?

"Note that while fsync() will flush all data from the host to the drive (i.e. the "permanent storage device"), the drive itself may not physically write the data to the platters for quite some time and it may be written in an out-of-order sequence."

fsync is a flush command, and isn't doing at all what you are implying.

> And there is a good reason for that call: if you do mass writes to
> append to the file, an implicit fsync after every write would kill
> performance, you are much better off stacking many writes for efficiency
> and either having the implicit fsync (or explicit) at regular
> intervals or when you're done.

That is not how fsync is used, if it is used.

> Applications that write to log files want an explicit or implicit fsync

No.

--
And what group was that, Gail? The Menstrual Cycles.
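The man page text quoted above describes fsync() as a flush from the host to the drive. A minimal Python sketch of that role follows; append_record() and its flush_to_disk parameter are hypothetical names invented for this illustration, and os.fsync() wraps the system call.

```python
import os
import tempfile

def append_record(fd, record, flush_to_disk=True):
    """Append one record; optionally ask the kernel to push it to the drive.

    Hypothetical helper: the fsync() affects durability after a crash or
    power cut, not whether other openers of the file can see the data.
    """
    written = os.write(fd, record)
    if flush_to_disk:
        os.fsync(fd)
    return written

# Hypothetical scratch log file.
tmp_fd, path = tempfile.mkstemp()
os.close(tmp_fd)

log_fd = os.open(path, os.O_WRONLY | os.O_APPEND)
n1 = append_record(log_fd, b"first entry\n")
n2 = append_record(log_fd, b"second entry\n", flush_to_disk=False)

# Both records count towards the file length, flushed or not.
log_size = os.fstat(log_fd).st_size

os.close(log_fd)
os.unlink(path)

print(n1, n2, log_size)
```

Whether a given application should flush after every record is a trade-off between durability and throughput that this sketch does not try to settle.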
| From | Lewis <g.kreme@gmail.com.dontsendmecopies> |
|---|---|
| Date | 2017-05-23 09:36 +0000 |
| Message-ID | <slrnoi80rg.2umq.g.kreme@snow.local> |
| In reply to | #107138 |
In message <1n6hbwa.1dwmt491syhev2N%dempson@actrix.gen.nz> David Empson <dempson@actrix.gen.nz> wrote:
> A "version" mechanism would have to be implemented at a higher level,
> e.g. by an application or the OS explicitly cloning the file to preserve
> a reference to its current state. That results in another file with a
> different pathname which can be opened separately. The clone happens to
> initially share storage with the original file, but it won't see any
> subsequent changes to the original file.

> For example, this could be used by the Autosave and Versions mechanism,
> or by Time Machine Local Storage backups.

I know at least some people are assuming that the APFS version of Time Machine will do exactly this. I am not so sure, but I've not really read up that much on APFS (only enough to know JF was wrong again, but not enough to point out the exact error in cloned files as you did).

If it does, it means a TM backup disk is going to be able to hold way more history than it does now.

--
You came in that thing? You're braver than I thought!
| From | dempson@actrix.gen.nz (David Empson) |
|---|---|
| Date | 2017-05-24 01:42 +1200 |
| Message-ID | <1n6i8ei.6zwet78mqwtwN%dempson@actrix.gen.nz> |
| In reply to | #107154 |
Lewis <g.kreme@gmail.com.dontsendmecopies> wrote:

> In message <1n6hbwa.1dwmt491syhev2N%dempson@actrix.gen.nz> David Empson
> <dempson@actrix.gen.nz> wrote:
> > A "version" mechanism would have to be implemented at a higher level,
> > e.g. by an application or the OS explicitly cloning the file to preserve
> > a reference to its current state. That results in another file with a
> > different pathname which can be opened separately. The clone happens to
> > initially share storage with the original file, but it won't see any
> > subsequent changes to the original file.
>
> > For example, this could be used by the Autosave and Versions mechanism,
> > or by Time Machine Local Storage backups.
>
> I know at least some people are assuming that the APFS version of Time
> Machine will do exactly this. I am not so sure, but I've not really read
> up that much on APFS (only enough to know JF was wrong again, but not
> enough to point out the exact error in cloned files as you did).
>
> If it does, it means a TM backup disk is going to be able to hold way more
> history than it does now.

Apple didn't say anything last year about Time Machine and APFS, but it makes perfect sense to use clones to implement the Local Backup mechanism. (Keeping snapshots instead would be overkill because they would require keeping the state of the entire volume, whereas local backups are only used for selected files.)

The Autosave and Versions mechanism wouldn't work so well with clones, because it needs to track insertions and deletions, not just identically sized replacements.

As for Time Machine on a separate disk, with both disks using APFS, that is crying out for use of snapshots to get a frozen sample of the source drive. How to do the backup raises questions: there could be new features not yet described (e.g. backing up snapshots with structure retention, including deltas between snapshots), or they could just use clones on the backup drive.

WWDC is going to be interesting.

--
David Empson
dempson@actrix.gen.nz