Groups > comp.lang.forth > #3616 > unrolled thread
| Started by | Jean-François Michaud <cometaj@comcast.net> |
|---|---|
| First post | 2011-06-28 14:19 -0700 |
| Last post | 2011-07-03 08:44 -1000 |
| Articles | 5 — 3 participants |
DPANS94 BLOCK/BUFFER ambiguity? Jean-François Michaud <cometaj@comcast.net> - 2011-06-28 14:19 -0700
Re: DPANS94 BLOCK/BUFFER ambiguity? Elizabeth D Rather <erather@forth.com> - 2011-06-28 14:06 -1000
Re: DPANS94 BLOCK/BUFFER ambiguity? Mark Wills <markrobertwills@yahoo.co.uk> - 2011-06-30 12:46 +0100
Re: DPANS94 BLOCK/BUFFER ambiguity? Jean-François Michaud <cometaj@comcast.net> - 2011-07-03 09:38 -0700
Re: DPANS94 BLOCK/BUFFER ambiguity? Elizabeth D Rather <erather@forth.com> - 2011-07-03 08:44 -1000
| From | Jean-François Michaud <cometaj@comcast.net> |
|---|---|
| Date | 2011-06-28 14:19 -0700 |
| Subject | DPANS94 BLOCK/BUFFER ambiguity? |
| Message-ID | <121cc3fd-5605-4c3e-b145-364058b0a886@d19g2000prh.googlegroups.com> |
I was wondering what you guys' thoughts were regarding the BLOCK/BUFFER definition in the standard, which seems, to me, to be ambiguous at best.

There seems to be an implicit assumption here that ANS94 doesn't address very well. Because BUFFER exists, we can't really assume that the address returned by BLOCK contains mass-storage content unless we enforce it. We are trading speed (buffering) for accuracy (assurance that disk block content is present in the buffer). The standard seems to want to address both accuracy and speed simultaneously but seemingly creates a logical ambiguity instead.

Trivially, "5 BUFFER 5 BLOCK" would yield the buffer address upon calling BLOCK, but without the particular assigned buffer containing mass-storage content.

For content accuracy, per my understanding, BLOCK MUST read from disk *every time*. For speed, accuracy can be sacrificed by returning a buffer address without knowing whether the buffer contains mass-storage content or not, effectively giving precedence to BUFFER over BLOCK (alternatively, accuracy would give precedence to BLOCK over BUFFER except when a BUFFER is UPDATEd).

Interestingly, and quoted from DPANS94, and taken from both the BLOCK and BUFFER definitions, "If block u is already in a block buffer, a-addr is the address of that block buffer." cannot be strictly determined without doing a comparison between the content of the buffer and the mass-storage content for the associated block number, which implies a systematic mass-storage read or a mirror image of mass storage being present and maintained in memory for such a comparison to take place. So we can generally extrapolate that the standard doesn't intend for a very heavy implementation and is thus implicitly giving precedence to BUFFER over BLOCK while assuming speed over accuracy.

Also, DPANS94 ends the definition with: "At the conclusion of the operation, the block buffer pointed to by a-addr is the current block buffer and is assigned to u". This somewhat assumes that mass storage is accurately reflected, when this is strictly incorrect given the above.

Also, doesn't the standard imply implementation details? A simple modular arithmetic implementation for buffer assignment while assuming accuracy over speed, for example, would simply yield a buffer that would either reference the disk block content or not and would have the desired block number assigned to it or not.

Similarly, in the modular arithmetic implementation argument provided above, and still assuming accuracy rather than speed, determining whether block <n> is present or not and associated to a buffer may be irrelevant if mass storage MUST be read when calling BLOCK; BUFFER only gaining precedence over BLOCK in the case where a BUFFER is UPDATEd.

Thoughts?

Regards
Jean-Francois Michaud
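The "5 BUFFER 5 BLOCK" case can be made concrete with a toy model. The following is only a sketch under stated assumptions, not how any particular Forth system works: it models the "modular arithmetic" buffer assignment mentioned above in Python, and every name in it (`BlockSystem`, `NUM_BUFFERS`, `reads`, and so on) is hypothetical. Writing back updated buffers before reuse is deliberately left out.

```python
# Toy model of BLOCK and BUFFER with direct-mapped ("modular") buffer assignment.
# All names are hypothetical; a real system must also write back updated buffers
# before reusing them, which this sketch omits.

BLOCK_SIZE = 1024   # a Forth block is 1024 characters
NUM_BUFFERS = 4     # assumed number of block buffers


class BlockSystem:
    def __init__(self, disk):
        self.disk = disk                      # block number -> bytes on mass storage
        self.data = [bytearray(BLOCK_SIZE) for _ in range(NUM_BUFFERS)]
        self.assigned = [None] * NUM_BUFFERS  # block number assigned to each buffer
        self.reads = 0                        # number of mass-storage reads so far

    def buffer(self, u):
        """BUFFER: assign a buffer to block u without reading mass storage."""
        slot = u % NUM_BUFFERS                # modular buffer assignment
        self.assigned[slot] = u
        return slot

    def block(self, u):
        """BLOCK: read block u only if no buffer is already assigned to it."""
        slot = u % NUM_BUFFERS
        if self.assigned[slot] != u:
            self.data[slot][:] = self.disk[u]
            self.reads += 1
            self.assigned[slot] = u
        return slot


disk = {5: bytes([0x55]) * BLOCK_SIZE}
system = BlockSystem(disk)

a = system.buffer(5)    # 5 BUFFER
b = system.block(5)     # 5 BLOCK

same_slot = (a == b)                              # BLOCK returns BUFFER's buffer
reads_after = system.reads                        # no read was performed
matches_disk = bytes(system.data[b]) == disk[5]   # buffer does not hold disk content
```

In this model `5 BLOCK` returns the buffer that `5 BUFFER` assigned, performs no read, and the buffer does not hold the disk content, which is the situation the post describes.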
| From | Elizabeth D Rather <erather@forth.com> |
|---|---|
| Date | 2011-06-28 14:06 -1000 |
| Message-ID | <TK6dnQAxZvMA8ZfTnZ2dnUVZ_vCdnZ2d@supernews.com> |
| In reply to | #3616 |
On 6/28/11 11:19 AM, Jean-François Michaud wrote:

> I was wondering what you guys' thoughts were regarding the BLOCK/BUFFER definition in the standard, which seems, to me, to be ambiguous at best.
>
> There seems to be an implicit assumption here that ANS94 doesn't address very well. Because BUFFER exists, we can't really assume that the address returned by BLOCK contains mass-storage content unless we enforce it. We are trading speed (buffering) for accuracy (assurance that disk block content is present in the buffer). The standard seems to want to address both accuracy and speed simultaneously but seemingly creates a logical ambiguity instead.

You are correct, there are some implicit assumptions here. The main one is that BUFFER is almost never used in application code. An exception is a disk copy or initialization utility, in which there is no need to do a READ, since you are going to overwrite the content of the block anyway.

Basically, you should think of BUFFER as an underlying factor of BLOCK which is exposed for use in a very narrow range of circumstances, in which the programmer must understand the consequences. Never use BUFFER unless you are absolutely sure you do not care what the contents of that block are because you are about to replace them.

> Trivially, "5 BUFFER 5 BLOCK" would yield the buffer address upon calling BLOCK, but without the particular assigned buffer containing mass-storage content.

One wouldn't want to do that. This is why the text has all the warnings about exactly what these words do.

> For content accuracy, per my understanding, BLOCK MUST read from disk *every time*. For speed, accuracy can be sacrificed by returning a buffer address without knowing whether the buffer contains mass-storage content or not, effectively giving precedence to BUFFER over BLOCK (alternatively, accuracy would give precedence to BLOCK over BUFFER except when a BUFFER is UPDATEd).

To avoid confusion, user code should avoid BUFFER. By sticking to BLOCK you will get both excellent reliability and excellent performance.

> Interestingly, and quoted from DPANS94, and taken from both the BLOCK and BUFFER definitions, "If block u is already in a block buffer, a-addr is the address of that block buffer." cannot be strictly determined without doing a comparison between the content of the buffer and the mass-storage content for the associated block number, which implies a systematic mass-storage read or a mirror image of mass storage being present and maintained in memory for such a comparison to take place. So we can generally extrapolate that the standard doesn't intend for a very heavy implementation and is thus implicitly giving precedence to BUFFER over BLOCK while assuming speed over accuracy.

In fact, it is inappropriate to assume that a block in a buffer should be identical to its disk equivalent, since it may have been modified since it's been in memory, and the buffer marked UPDATEd.

In my experience, accuracy isn't impaired at all. Native (i.e., no host OS) block systems have been used in very large and complex database applications, with many users and intense requirements for data security, and OS-hosted versions have shown equivalent reliability, though with much slower performance.

Even though disks are a lot faster nowadays, a disk read consumes many microseconds compared with memory-only operations, and in a disk-intensive application the number of physical disk accesses becomes a dominant factor in performance. Professional implementations even try to avoid unnecessary disk accesses by doing things like picking the least recently accessed buffer to overwrite when a READ is to be done, and such efforts can show significant performance improvement. On hosted systems the necessity for an OS call to get a block makes the overhead even worse.

Nowadays, we find blocks being implemented in flash, which changes some of the tradeoffs, but the principles are the same.

> Also, DPANS94 ends the definition with: "At the conclusion of the operation, the block buffer pointed to by a-addr is the current block buffer and is assigned to u". This somewhat assumes that mass storage is accurately reflected, when this is strictly incorrect given the above.

It will be correct if the programmer avoids introducing ambiguity by using BUFFER inappropriately.

> Also, doesn't the standard imply implementation details? A simple modular arithmetic implementation for buffer assignment while assuming accuracy over speed, for example, would simply yield a buffer that would either reference the disk block content or not and would have the desired block number assigned to it or not.
>
> Similarly, in the modular arithmetic implementation argument provided above, and still assuming accuracy rather than speed, determining whether block <n> is present or not and associated to a buffer may be irrelevant if mass storage MUST be read when calling BLOCK; BUFFER only gaining precedence over BLOCK in the case where a BUFFER is UPDATEd.
>
> Thoughts?

Not sure what you mean by a modular arithmetic implementation. Of course some implementations will be more successful than others, but I think there's enough info for compliance. ANS Forth tries to focus on semantics, not implementation.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time applications since 1973."
==================================================
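Two ideas from this reply, BUFFER as a factor of BLOCK and least-recently-used buffer replacement, can be sketched in a few lines. This is a hedged illustration in Python rather than a description of any real Forth system; `LruBlockCache`, `num_buffers` and `reads` are invented for the example, and writing back updated buffers is left out to keep it short.

```python
from collections import OrderedDict

BLOCK_SIZE = 1024


class LruBlockCache:
    """Hypothetical model: BLOCK is BUFFER plus a read on a miss."""

    def __init__(self, disk, num_buffers):
        self.disk = disk                 # block number -> bytes on mass storage
        self.num_buffers = num_buffers
        self.buffers = OrderedDict()     # block number -> data, least recently used first
        self.reads = 0                   # number of mass-storage reads so far

    def buffer(self, u):
        """BUFFER: assign a buffer to block u; never reads mass storage."""
        if u in self.buffers:
            self.buffers.move_to_end(u)            # now the most recently used
        else:
            if len(self.buffers) >= self.num_buffers:
                # Reuse the least recently used buffer.
                # (Write-back of updated buffers is omitted in this sketch.)
                self.buffers.popitem(last=False)
            self.buffers[u] = bytearray(BLOCK_SIZE)
        return self.buffers[u]

    def block(self, u):
        """BLOCK: call BUFFER, then read only if block u was not already assigned."""
        hit = u in self.buffers
        data = self.buffer(u)
        if not hit:
            data[:] = self.disk[u]
            self.reads += 1
        return data


disk = {n: bytes([n]) * BLOCK_SIZE for n in range(4)}
cache = LruBlockCache(disk, num_buffers=2)

# Block 0 is touched between every other access, so it is never the LRU victim.
for n in (0, 1, 0, 2, 0, 3, 0):
    cache.block(n)

lru_reads = cache.reads   # 4 reads for 7 accesses; reading every time would take 7
```

With only two buffers, the frequently used block 0 stays resident and is read once, which is the kind of saving the reply attributes to avoiding unnecessary disk accesses.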
| From | Mark Wills <markrobertwills@yahoo.co.uk> |
|---|---|
| Date | 2011-06-30 12:46 +0100 |
| Message-ID | <TdCdnSrTI_qU_5HTnZ2dnUVZ7v2dnZ2d@bt.com> |
| In reply to | #3616 |
On 28/06/2011 22:19, Jean-François Michaud wrote:

> Interestingly, and quoted from DPANS94, and taken from both the BLOCK and BUFFER definitions, "If block u is already in a block buffer, a-addr is the address of that block buffer." cannot be strictly determined without doing a comparison between the content of the buffer and the mass-storage content for the associated block number, which implies a systematic mass-storage read or a mirror image of mass storage being present and maintained in memory for such a comparison to take place. So we can generally extrapolate that the standard doesn't intend for a very heavy implementation and is thus implicitly giving precedence to BUFFER over BLOCK while assuming speed over accuracy.

A comparison is not required. When BLOCK is called to retrieve a block from disk, an internal table of some sort is consulted to determine if the block is already in memory. If it is, the address of the beginning of the block is returned.

In addition, a mechanism is provided to indicate if the contents of the block have been changed in memory, meaning the version on disk is now out of date. In my implementation, I call updated block buffers "dirty". UPDATE will utilise this mechanism to set the last accessed block's status to dirty. FLUSH will also use this information to determine which blocks to flush to disk when no more buffers are available.

Regards
Mark
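The table-plus-dirty-flag mechanism described here might look roughly like the sketch below. It is not Mark's implementation, which the thread does not show; it is a minimal Python model in which `BufferTable`, the entry fields, and the round-robin choice of which buffer to reuse are all assumptions made for illustration.

```python
BLOCK_SIZE = 1024


class BufferTable:
    """Hypothetical buffer table: one entry per buffer, looked up by block number."""

    def __init__(self, disk, num_buffers=4):
        self.disk = disk        # block number -> bytes on mass storage
        # Each entry records the assigned block number, a dirty flag, and the data.
        self.entries = [{"block": None, "dirty": False, "data": bytearray(BLOCK_SIZE)}
                        for _ in range(num_buffers)]
        self.current = None     # entry most recently returned by block() or buffer()
        self.next_victim = 0    # round-robin reuse (an assumption of this sketch)
        self.reads = 0
        self.writes = 0

    def _lookup(self, u):
        # The table alone answers "is block u in a buffer?"; no disk comparison.
        for entry in self.entries:
            if entry["block"] == u:
                return entry
        return None

    def _write_back(self, entry):
        if entry["dirty"]:
            self.disk[entry["block"]] = bytes(entry["data"])
            self.writes += 1
            entry["dirty"] = False

    def _assign(self, u):
        entry = self.entries[self.next_victim]
        self.next_victim = (self.next_victim + 1) % len(self.entries)
        self._write_back(entry)     # save a dirty buffer before reusing it
        entry["block"] = u
        return entry

    def buffer(self, u):
        """BUFFER: assign a buffer to block u without reading."""
        self.current = self._lookup(u) or self._assign(u)
        return self.current["data"]

    def block(self, u):
        """BLOCK: return block u's buffer, reading only on a table miss."""
        entry = self._lookup(u)
        if entry is None:
            entry = self._assign(u)
            entry["data"][:] = self.disk[u]
            self.reads += 1
        self.current = entry
        return entry["data"]

    def update(self):
        """UPDATE: mark the current block buffer dirty."""
        self.current["dirty"] = True

    def flush(self):
        """FLUSH (simplified): write all dirty buffers, then unassign every buffer."""
        for entry in self.entries:
            self._write_back(entry)
            entry["block"] = None
        self.current = None


disk = {3: b"x" * BLOCK_SIZE, 7: b"y" * BLOCK_SIZE}
table = BufferTable(disk)

data = table.block(3)           # table miss: one read
data[0:5] = b"hello"
table.update()                  # buffer now differs from disk, by design
stale_on_disk = disk[3][:5]     # disk still holds the old content
table.block(3)                  # table hit: no second read

fresh = table.buffer(7)         # initialise block 7 without reading it
fresh[:] = b" " * BLOCK_SIZE
table.update()
table.flush()                   # both dirty buffers are written
```

The lookup never compares buffer contents with the disk; the dirty flag is what records that the two differ until the buffer is written back.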
| From | Jean-François Michaud <cometaj@comcast.net> |
|---|---|
| Date | 2011-07-03 09:38 -0700 |
| Message-ID | <c6c3b5ea-8492-4925-bc42-15c7f587c8f2@p19g2000prg.googlegroups.com> |
| In reply to | #3665 |
On Jun 30, 4:46 am, Mark Wills <markrobertwi...@yahoo.co.uk> wrote:

> On 28/06/2011 22:19, Jean-François Michaud wrote:
>
>> Interestingly, and quoted from DPANS94, and taken from both the BLOCK and BUFFER definitions, "If block u is already in a block buffer, a-addr is the address of that block buffer." cannot be strictly determined without doing a comparison between the content of the buffer and the mass-storage content for the associated block number, which implies a systematic mass-storage read or a mirror image of mass storage being present and maintained in memory for such a comparison to take place. So we can generally extrapolate that the standard doesn't intend for a very heavy implementation and is thus implicitly giving precedence to BUFFER over BLOCK while assuming speed over accuracy.
>
> A comparison is not required. When BLOCK is called to retrieve a block from disk, an internal table of some sort is consulted to determine if the block is already in memory.

The angle I'm looking at here is that this cannot be guaranteed because of BUFFER. If BUFFER is used and a block number becomes associated to a buffer, then the mass storage block may very well not be in memory, in which case the address would be returned as if it were. To guarantee that the buffer does indeed contain mass storage content or a modified version thereafter originating from the mass storage content. I'm seeing that BUFFER cannot be used in conjunction with BLOCK without the above-mentioned effects, potentially yielding unreliability with respect to having mass storage content be accurately reflected in a buffer. It all depends on what the requirement is, and I'm seeing that the spec seems to make assumptions about precedence between BLOCK and BUFFER without explicitly stating what those assumptions are.

> If it is, the address of the beginning of the block is returned.
>
> In addition, a mechanism is provided to indicate if the contents of the block have been changed in memory, meaning the version on disk is now out of date. In my implementation, I call updated block buffers "dirty". UPDATE will utilise this mechanism to set the last accessed block's status to dirty. FLUSH will also use this information to determine which blocks to flush to disk when no more buffers are available.

I believe I clearly understand the accepted use of BLOCK and BUFFER; I'm mostly pointing out that an implementation based on the ANS 94 spec, if more strictly followed for these two words, seems to yield an ambiguous condition.

Regards
Jean-Francois Michaud
| From | Elizabeth D Rather <erather@forth.com> |
|---|---|
| Date | 2011-07-03 08:44 -1000 |
| Message-ID | <sbOdnXP4Z4wzJY3TnZ2dnUVZ_j6dnZ2d@supernews.com> |
| In reply to | #3752 |
On 7/3/11 6:38 AM, Jean-François Michaud wrote:

> On Jun 30, 4:46 am, Mark Wills <markrobertwi...@yahoo.co.uk> wrote:
>
>> On 28/06/2011 22:19, Jean-François Michaud wrote:
>>
>>> Interestingly, and quoted from DPANS94, and taken from both the BLOCK and BUFFER definitions, "If block u is already in a block buffer, a-addr is the address of that block buffer." cannot be strictly determined without doing a comparison between the content of the buffer and the mass-storage content for the associated block number, which implies a systematic mass-storage read or a mirror image of mass storage being present and maintained in memory for such a comparison to take place. So we can generally extrapolate that the standard doesn't intend for a very heavy implementation and is thus implicitly giving precedence to BUFFER over BLOCK while assuming speed over accuracy.
>>
>> A comparison is not required. When BLOCK is called to retrieve a block from disk, an internal table of some sort is consulted to determine if the block is already in memory.
>
> The angle I'm looking at here is that this cannot be guaranteed because of BUFFER. If BUFFER is used and a block number becomes associated to a buffer, then the mass storage block may very well not be in memory, in which case the address would be returned as if it were. To guarantee that the buffer does indeed contain mass storage content or a modified version thereafter originating from the mass storage content. I'm seeing that BUFFER cannot be used in conjunction with BLOCK without the above-mentioned effects, potentially yielding unreliability with respect to having mass storage content be accurately reflected in a buffer. It all depends on what the requirement is, and I'm seeing that the spec seems to make assumptions about precedence between BLOCK and BUFFER without explicitly stating what those assumptions are.

Indeed, if you misuse BUFFER you can get in trouble. That is true of many things in programming. You are welcome to suggest clarifying language to the Forth 20xx Committee if it would make you feel better.

>> If it is, the address of the beginning of the block is returned.
>>
>> In addition, a mechanism is provided to indicate if the contents of the block have been changed in memory, meaning the version on disk is now out of date. In my implementation, I call updated block buffers "dirty". UPDATE will utilise this mechanism to set the last accessed block's status to dirty. FLUSH will also use this information to determine which blocks to flush to disk when no more buffers are available.
>
> I believe I clearly understand the accepted use of BLOCK and BUFFER; I'm mostly pointing out that an implementation based on the ANS 94 spec, if more strictly followed for these two words, seems to yield an ambiguous condition.

If you understand the "accepted use of BLOCK and BUFFER" you also understand how to avoid ambiguity. As I said above, if you are concerned you should propose clarifying language for the new Standard (without changing the actual semantics of either word or requiring repeated, unnecessary disk reads).

Cheers,
Elizabeth