DPANS94 BLOCK/BUFFER ambiguity?

Started by: Jean-François Michaud <cometaj@comcast.net>
First post: 2011-06-28 14:19 -0700
Last post: 2011-07-03 08:44 -1000
Articles: 5 (3 participants)



Contents

  DPANS94 BLOCK/BUFFER ambiguity? Jean-François Michaud <cometaj@comcast.net> - 2011-06-28 14:19 -0700
    Re: DPANS94 BLOCK/BUFFER ambiguity? Elizabeth D Rather <erather@forth.com> - 2011-06-28 14:06 -1000
    Re: DPANS94 BLOCK/BUFFER ambiguity? Mark Wills <markrobertwills@yahoo.co.uk> - 2011-06-30 12:46 +0100
      Re: DPANS94 BLOCK/BUFFER ambiguity? Jean-François Michaud <cometaj@comcast.net> - 2011-07-03 09:38 -0700
        Re: DPANS94 BLOCK/BUFFER ambiguity? Elizabeth D Rather <erather@forth.com> - 2011-07-03 08:44 -1000

#3616 — DPANS94 BLOCK/BUFFER ambiguity?

From: Jean-François Michaud <cometaj@comcast.net>
Date: 2011-06-28 14:19 -0700
Subject: DPANS94 BLOCK/BUFFER ambiguity?
Message-ID: <121cc3fd-5605-4c3e-b145-364058b0a886@d19g2000prh.googlegroups.com>

I was wondering what you guys' thoughts were regarding the BLOCK/
BUFFER definition in the standard which seems, to me, to be ambiguous
at best.

There seems to be an implicit assumption here that ANS94 doesn't
address very well. Because BUFFER exists, we can't really assume that
the address returned by BLOCK contains mass-storage content unless we
enforce it. We are trading speed (buffering) for accuracy (assurance
that disk block content is present in the buffer). The standard seems
to want to address both accuracy and speed simultaneously but
seemingly creates a logical ambiguity instead.

Trivially, "5 BUFFER 5 BLOCK" would yield the buffer address upon
calling block but without the particular assigned buffer containing
mass-storage content.
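
A minimal sketch of this scenario, written in Python rather than Forth
so it can be run anywhere. It assumes a toy system with a single block
buffer and no UPDATE handling; the class name BlockCache and the read
counter are hypothetical and exist only for illustration. It shows that
BLOCK, called after BUFFER for the same block number, returns the
buffer without a mass-storage read.

```python
BLOCK_SIZE = 1024

class BlockCache:
    """Toy model of a block system with one buffer (an assumption)."""

    def __init__(self, disk):
        self.disk = disk                    # block number -> bytes on mass storage
        self.assigned = None                # block number the buffer is assigned to
        self.data = bytearray(BLOCK_SIZE)   # the single block buffer
        self.reads = 0                      # number of mass-storage reads

    def buffer(self, u):
        """BUFFER: assign the buffer to block u without reading."""
        self.assigned = u                   # contents for block u are unspecified
        return self.data

    def block(self, u):
        """BLOCK: read block u only if the buffer is not assigned to u."""
        if self.assigned != u:
            self.data[:] = self.disk[u]
            self.reads += 1
            self.assigned = u
        return self.data

disk = {5: bytes([0xAA]) * BLOCK_SIZE}
cache = BlockCache(disk)
cache.buffer(5)                     # 5 BUFFER
addr = cache.block(5)               # 5 BLOCK
print(cache.reads)                  # 0: BLOCK trusted the assignment
print(bytes(addr) == disk[5])       # False: buffer does not hold disk content
```

In this model, calling block(5) on a fresh cache does read the disk
once, and later calls reuse the buffer.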

For content accuracy, per my understanding, block MUST read from disk
*every time*. For speed, accuracy can be sacrificed by returning a
buffer address without knowing whether the buffer contains mass-
storage content or not, effectively giving precedence to BUFFER over
BLOCK (alternatively, accuracy would give precedence to BLOCK over
BUFFER except when a BUFFER is UPDATEd).

Interestingly, and quoted from DPANS94, and taken from both the BLOCK/
BUFFER definition, "If block u is already in a block buffer, a-addr is
the address of that block buffer." cannot be strictly determined
without doing a comparison between the content of the buffer and the
mass-storage content for the associated block number which implies a
systematic mass-storage read or a mirror image of mass-storage being
present and maintained in memory for such a comparison to take place.
So we can generally extrapolate that the standard doesn't intend for a
very heavy implementation and is thus implicitly giving precedence to
BUFFER over BLOCK while assuming speed over accuracy.

Also, DPANS94 ends the definition with: "At the conclusion of the
operation, the block buffer pointed to by a-addr is the current block
buffer and is assigned to u". This somewhat assumes that mass storage
is accurately reflected, which is strictly incorrect given the above.

Also, doesn't the standard imply implementation details? A simple
modular arithmetic implementation for buffer assignment while assuming
accuracy over speed, for example, would simply yield a buffer that
would either reference the disk block content or not and would have
the desired block number assigned to it or not.

Similarly, in the modular arithmetic implementation argument provided
above, and still assuming accuracy rather than speed, determining
whether block <n> is present or not and associated to a buffer may be
irrelevant if a mass storage MUST be read when calling BLOCK; BUFFER
only gaining precedence over BLOCK in the case where a BUFFER is
UPDATEd.

Thoughts?

Regards
Jean-Francois Michaud



#3618

From: Elizabeth D Rather <erather@forth.com>
Date: 2011-06-28 14:06 -1000
Message-ID: <TK6dnQAxZvMA8ZfTnZ2dnUVZ_vCdnZ2d@supernews.com>
In reply to: #3616

On 6/28/11 11:19 AM, Jean-François Michaud wrote:
> I was wondering what you guys' thoughts were regarding the BLOCK/
> BUFFER definition in the standard which seems, to me, to be ambiguous
> at best.
>
> There seems to be an implicit assumption here that ANS94 doesn't
> address very well. Because BUFFER exists, we can't really assume that
> the address returned by BLOCK contains mass-storage content unless we
> enforce it. We are trading speed (buffering) for accuracy (assurance
> that disk block content is present in the buffer). The standard seems
> to want to address both accuracy and speed simultaneously but
> seemingly creates a logical ambiguity instead.

You are correct, there are some implicit assumptions here.  The main one 
is that BUFFER is almost never used in application code.  An exception 
is in a disk copy or initialization utility, in which since you are 
going to overwrite the content of the block anyway there is no need to 
do a READ. Basically, you should think of BUFFER as an underlying factor 
of BLOCK which is exposed for use in a very narrow range of 
circumstances, in which the programmer must understand the consequences. 
  Never use BUFFER unless you are absolutely sure you do not care what 
the contents of that block are because you are about to replace them.
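
A minimal sketch of that relationship, in Python rather than Forth. It
assumes an unbounded pool of buffers held in a dict, which no real
system has; the names Blocks and wipe are hypothetical. Here block() is
built on buffer() plus a read, and wipe() is the kind of initialization
utility where BUFFER alone is enough because every byte is replaced.

```python
BLOCK_SIZE = 1024

class Blocks:
    """Toy block system; unbounded buffer pool is an assumption."""

    def __init__(self, disk):
        self.disk = disk        # block number -> bytes on mass storage
        self.bufs = {}          # assigned block number -> bytearray
        self.dirty = set()      # block numbers marked by UPDATE
        self.current = None     # block number of the current buffer
        self.reads = 0          # number of mass-storage reads

    def buffer(self, u):
        """BUFFER: assign a buffer to block u; never reads."""
        if u not in self.bufs:
            self.bufs[u] = bytearray(BLOCK_SIZE)
        self.current = u
        return self.bufs[u]

    def block(self, u):
        """BLOCK: BUFFER plus a read when block u was not yet assigned."""
        present = u in self.bufs
        buf = self.buffer(u)
        if not present:
            buf[:] = self.disk[u]
            self.reads += 1
        return buf

    def update(self):
        """UPDATE: mark the current buffer as modified."""
        self.dirty.add(self.current)

    def flush(self):
        """FLUSH: write modified buffers, then unassign all buffers."""
        for u in sorted(self.dirty):
            self.disk[u] = bytes(self.bufs[u])
        self.dirty.clear()
        self.bufs.clear()
        self.current = None

def wipe(blocks, first, count, fill=0x20):
    """Fill a range of blocks with one byte value; no reads are needed."""
    for u in range(first, first + count):
        buf = blocks.buffer(u)
        buf[:] = bytes([fill]) * BLOCK_SIZE
        blocks.update()
    blocks.flush()

disk = {u: bytes(BLOCK_SIZE) for u in range(4)}
b = Blocks(disk)
wipe(b, 1, 2)
print(b.reads)              # 0: the old contents were never fetched
print(disk[1][:4])          # b'    '
```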

> Trivially, "5 BUFFER 5 BLOCK" would yield the buffer address upon
> calling block but without the particular assigned buffer containing
> mass-storage content.

One wouldn't want to do that.  This is why the text has all the warnings 
about exactly what these words do.

> For content accuracy, per my understanding, block MUST read from disk
> *every time*. For speed, accuracy can be sacrificed by returning a
> buffer address without knowing whether the buffer contains mass-
> storage content or not, effectively giving precedence to BUFFER over
> BLOCK (alternatively, accuracy would give precedence to BLOCK over
> BUFFER except when a BUFFER is UPDATEd).

To avoid confusion, user code should avoid BUFFER.  By sticking to BLOCK 
you will get both excellent reliability and excellent performance.

> Interestingly, and quoted from DPANS94, and taken from both the BLOCK/
> BUFFER definition, "If block u is already in a block buffer, a-addr is
> the address of that block buffer." cannot be strictly determined
> without doing a comparison between the content of the buffer and the
> mass-storage content for the associated block number which implies a
> systematic mass-storage read or a mirror image of mass-storage being
> present and maintained in memory for such a comparison to take place.
> So we can generally extrapolate that the standard doesn't intend for a
> very heavy implementation and is thus implicitly giving precedence to
> BUFFER over BLOCK while assuming speed over accuracy.

In fact, it is inappropriate to assume that a block in a buffer should 
be identical to its disk equivalent, since it may have been modified 
since it's been in memory, and the buffer marked UPDATEd.

In my experience, accuracy isn't impaired at all.  Native (i.e., no host 
OS) block systems have been used in very large and complex database 
applications, with many users and intense requirements for data 
security, and OS-hosted versions have shown equivalent reliability, 
though with much slower performance.

Even though disks are a lot faster nowadays, a disk read consumes many 
microseconds compared with memory-only operations, and in a 
disk-intensive application the number of physical disk accesses becomes 
a dominant factor in performance. Professional implementations even try 
to avoid unnecessary disk accesses by doing things like picking the 
least recently accessed buffer to overwrite when a READ is to be done, 
and such efforts can show significant performance improvement.  On 
hosted systems the necessity for an OS call to get a block makes the 
overhead even worse.
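
A minimal sketch of the least-recently-accessed policy, in Python. It
only counts how many physical reads a sequence of BLOCK calls would
cause for a given number of buffers; the function name count_reads and
the sample access pattern are hypothetical, and writes are ignored.

```python
from collections import OrderedDict

def count_reads(accesses, nbufs):
    """Count mass-storage reads under least-recently-accessed replacement."""
    resident = OrderedDict()    # block numbers in buffers, oldest access first
    reads = 0
    for u in accesses:
        if u in resident:
            resident.move_to_end(u)             # hit: now most recently accessed
        else:
            reads += 1                          # miss: a physical read is needed
            if len(resident) == nbufs:
                resident.popitem(last=False)    # reuse least recently accessed
            resident[u] = True
    return reads

pattern = [1, 2, 1, 3, 1, 2]
print(count_reads(pattern, nbufs=1))    # 6: every access is a read
print(count_reads(pattern, nbufs=2))    # 4: block 1 stays resident
```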

Nowadays, we find blocks being implemented in flash, which changes some 
of the tradeoffs, but the principles are the same.

> Also, DPANS94 ends the definition with: "At the conclusion of the
> operation, the block buffer pointed to by a-addr is the current block
> buffer and is assigned to u". This somewhat assumes that mass storage
> is accurately reflected, which is strictly incorrect given the above.

It will be correct if the programmer avoids introducing ambiguity by 
using BUFFER inappropriately.

> Also, doesn't the standard imply implementation details? A simple
> modular arithmetic implementation for buffer assignment while assuming
> accuracy over speed, for example, would simply yield a buffer that
> would either reference the disk block content or not and would have
> the desired block number assigned to it or not.
>
> Similarly, in the modular arithmetic implementation argument provided
> above, and still assuming accuracy rather than speed, determining
> whether block <n> is present or not and associated to a buffer may be
> irrelevant if a mass storage MUST be read when calling BLOCK; BUFFER
> only gaining precedence over BLOCK in the case where a BUFFER is
> UPDATEd.
>
> Thoughts?

Not sure what you mean by a modular arithmetic implementation.  Of 
course some implementations will be more successful than others, but I 
think there's enough info for compliance.  ANS Forth tries to focus on 
semantics, not implementation.

Cheers,
Elizabeth

-- 
==================================================
Elizabeth D. Rather   (US & Canada)   800-55-FORTH
FORTH Inc.                         +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================



#3665

From: Mark Wills <markrobertwills@yahoo.co.uk>
Date: 2011-06-30 12:46 +0100
Message-ID: <TdCdnSrTI_qU_5HTnZ2dnUVZ7v2dnZ2d@bt.com>
In reply to: #3616

On 28/06/2011 22:19, Jean-François Michaud wrote:
> Interestingly, and quoted from DPANS94, and taken from both the BLOCK/
> BUFFER definition, "If block u is already in a block buffer, a-addr is
> the address of that block buffer." cannot be strictly determined
> without doing a comparison between the content of the buffer and the
> mass-storage content for the associated block number which implies a
> systematic mass-storage read or a mirror image of mass-storage being
> present and maintained in memory for such a comparison to take place.
> So we can generally extrapolate that the standard doesn't intend for a
> very heavy implementation and is thus implicitly giving precedence to
> BUFFER over BLOCK while assuming speed over accuracy.

A comparison is not required. When BLOCK is called to retrieve a block 
from disk, an internal table of some sort is consulted to determine if 
the block is already in memory. If it is, the address of the beginning 
of the block is returned.

In addition, a mechanism is provided to indicate if the contents of the 
block have been changed in memory, meaning the version on disk is now 
out of date. In my implementation, I call updated block buffers "dirty". 
UPDATE will utilise this mechanism to set the last accessed block's 
status to dirty. FLUSH will also use this information to determine which 
blocks to flush to disk when no more buffers are available.
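
A minimal sketch of such a table and dirty flag, in Python rather than
Forth. It assumes a fixed pool of buffers reused in round-robin order;
the names Buf and BlockSystem are hypothetical and this is not the
implementation described above. A dirty buffer is written back when it
has to be reused for another block.

```python
BLOCK_SIZE = 1024

class Buf:
    def __init__(self):
        self.block = None                   # assigned block number, or None
        self.dirty = False                  # set by UPDATE
        self.data = bytearray(BLOCK_SIZE)

class BlockSystem:
    def __init__(self, disk, nbufs=2):
        self.disk = disk                    # block number -> bytes
        self.bufs = [Buf() for _ in range(nbufs)]
        self.table = {}                     # block number -> Buf (the lookup table)
        self.current = None                 # most recently accessed Buf
        self.victim = 0                     # round-robin index of next buffer to reuse
        self.writes = 0                     # number of mass-storage writes

    def _write_back(self, b):
        if b.dirty:
            self.disk[b.block] = bytes(b.data)
            b.dirty = False
            self.writes += 1

    def block(self, u):
        b = self.table.get(u)               # table lookup, no content comparison
        if b is None:
            b = self.bufs[self.victim]
            self.victim = (self.victim + 1) % len(self.bufs)
            if b.block is not None:
                self._write_back(b)         # save a dirty buffer before reuse
                del self.table[b.block]
            b.data[:] = self.disk[u]
            b.block = u
            self.table[u] = b
        self.current = b
        return b.data

    def update(self):
        self.current.dirty = True           # mark last accessed buffer dirty

    def save_buffers(self):
        for b in self.bufs:
            self._write_back(b)

disk = {u: bytes(BLOCK_SIZE) for u in range(3)}
fs = BlockSystem(disk, nbufs=2)
fs.block(0)[0] = 65
fs.update()
fs.block(1)
fs.block(2)                 # no free buffer: block 0 is written back first
print(disk[0][0])           # 65
```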

Regards

Mark



#3752

From: Jean-François Michaud <cometaj@comcast.net>
Date: 2011-07-03 09:38 -0700
Message-ID: <c6c3b5ea-8492-4925-bc42-15c7f587c8f2@p19g2000prg.googlegroups.com>
In reply to: #3665

On Jun 30, 4:46 am, Mark Wills <markrobertwi...@yahoo.co.uk> wrote:
> On 28/06/2011 22:19, Jean-François Michaud wrote:
>
> > Interestingly, and quoted from DPANS94, and taken from both the BLOCK/
> > BUFFER definition, "If block u is already in a block buffer, a-addr is
> > the address of that block buffer." cannot be strictly determined
> > without doing a comparison between the content of the buffer and the
> > mass-storage content for the associated block number which implies a
> > systematic mass-storage read or a mirror image of mass-storage being
> > present and maintained in memory for such a comparison to take place.
> > So we can generally extrapolate that the standard doesn't intend for a
> > very heavy implementation and is thus implicitly giving precedence to
> > BUFFER over BLOCK while assuming speed over accuracy.
>
> A comparison is not required. When BLOCK is called to retrieve a block
> from disk, an internal table of some sort is consulted to determine if
> the block is already in memory.

The angle I'm looking at here is that this cannot be guaranteed because
of BUFFER. If BUFFER is used and a block number becomes associated with
a buffer, then the mass storage block may very well not be in memory,
in which case the address would be returned as if it were, with no
guarantee that the buffer does indeed contain mass storage content or
a modified version originating from that content. I'm seeing that
BUFFER cannot be used in conjunction with BLOCK without the
above-mentioned effects, potentially yielding unreliability with
respect to having mass storage content accurately reflected in a
buffer. It all depends on what the requirement is, and I'm seeing that
the spec seems to make assumptions about precedence between BLOCK and
BUFFER without explicitly stating what those assumptions are.

> If it is, the address of the beginning
> of the block is returned.
>
> In addition, a mechanism is provided to indicate if the contents of the
> block have been changed in memory, meaning the version on disk is now
> out of date. In my implementation, I call updated block buffers "dirty".
> UPDATE will utilise this mechanism to set the last accessed block's
> status to dirty. FLUSH will also use this information to determine which
> blocks to flush to disk when no more buffers are available.

I believe I clearly understand the accepted use of BLOCK and BUFFER;
I'm mostly pointing out that an implementation based on the ANS 94
spec, if more strictly followed for these two words, seems to yield an
ambiguous condition.

Regards
Jean-Francois Michaud



#3756

From: Elizabeth D Rather <erather@forth.com>
Date: 2011-07-03 08:44 -1000
Message-ID: <sbOdnXP4Z4wzJY3TnZ2dnUVZ_j6dnZ2d@supernews.com>
In reply to: #3752

On 7/3/11 6:38 AM, Jean-François Michaud wrote:
> On Jun 30, 4:46 am, Mark Wills<markrobertwi...@yahoo.co.uk>  wrote:
>> On 28/06/2011 22:19, Jean-François Michaud wrote:
>>
>>> Interestingly, and quoted from DPANS94, and taken from both the BLOCK/
>>> BUFFER definition, "If block u is already in a block buffer, a-addr is
>>> the address of that block buffer." cannot be strictly determined
>>> without doing a comparison between the content of the buffer and the
>>> mass-storage content for the associated block number which implies a
>>> systematic mass-storage read or a mirror image of mass-storage being
>>> present and maintained in memory for such a comparison to take place.
>>> So we can generally extrapolate that the standard doesn't intend for a
>>> very heavy implementation and is thus implicitly giving precedence to
>>> BUFFER over BLOCK while assuming speed over accuracy.
>>
>> A comparison is not required. When BLOCK is called to retrieve a block
>> from disk, an internal table of some sort is consulted to determine if
>> the block is already in memory.
>
> The angle I'm looking at here is that this cannot be guaranteed because
> of BUFFER. If BUFFER is used and a block number becomes associated with
> a buffer, then the mass storage block may very well not be in memory,
> in which case the address would be returned as if it were, with no
> guarantee that the buffer does indeed contain mass storage content or
> a modified version originating from that content. I'm seeing that
> BUFFER cannot be used in conjunction with BLOCK without the
> above-mentioned effects, potentially yielding unreliability with
> respect to having mass storage content accurately reflected in a
> buffer. It all depends on what the requirement is, and I'm seeing that
> the spec seems to make assumptions about precedence between BLOCK and
> BUFFER without explicitly stating what those assumptions are.

Indeed, if you misuse BUFFER you can get in trouble.  That is true of 
many things in programming.  You are welcome to suggest clarifying 
language to the Forth 20xx Committee if it would make you feel better.

>> If it is, the address of the beginning
>> of the block is returned.
>>
>> In addition, a mechanism is provided to indicate if the contents of the
>> block have been changed in memory, meaning the version on disk is now
>> out of date. In my implementation, I call updated block buffers "dirty".
>> UPDATE will utilise this mechanism to set the last accessed block's
>> status to dirty. FLUSH will also use this information to determine which
>> blocks to flush to disk when no more buffers are available.
>
> I believe I clearly understand the accepted use of BLOCK and BUFFER;
> I'm mostly pointing out that an implementation based on the ANS 94
> spec, if more strictly followed for these two words, seems to yield an
> ambiguous condition.

If you understand the "accepted use of BLOCK and BUFFER" you also 
understand how to avoid ambiguity. As I said above, if you are concerned 
you should propose clarifying language for the new Standard (without 
changing the actual semantics of either word or requiring repeated, 
unnecessary disk reads).

Cheers,
Elizabeth

