Groups > comp.lang.forth > #14522 > unrolled thread

Appending definition to another definition

Started by	programmingkidx@gmail.com
First post	2012-07-29 11:28 -0700
Last post	2012-07-29 18:23 -0400
Articles	20 on this page of 70 — 21 participants

Back to article view | Back to comp.lang.forth

  Appending definition to another definition programmingkidx@gmail.com - 2012-07-29 11:28 -0700
    Re: Appending definition to another definition Jason Damisch <jasondamisch@yahoo.com> - 2012-07-29 11:39 -0700
      Re: Appending definition to another definition teammember0x01@gmail.com - 2012-07-29 12:16 -0700
        Re: Appending definition to another definition Jason Damisch <jasondamisch@yahoo.com> - 2012-07-29 12:24 -0700
          Re: Appending definition to another definition programmingkidx@gmail.com - 2012-07-29 13:41 -0700
            Re: Appending definition to another definition Jason Damisch <jasondamisch@yahoo.com> - 2012-07-29 13:46 -0700
              Re: Appending definition to another definition programmingkidx@gmail.com - 2012-07-29 14:00 -0700
                Re: Appending definition to another definition Jason Damisch <jasondamisch@yahoo.com> - 2012-07-29 14:03 -0700
                Re: Appending definition to another definition Paul Rubin <no.email@nospam.invalid> - 2012-07-29 14:21 -0700
                Re: Appending definition to another definition "Elizabeth D. Rather" <erather@forth.com> - 2012-07-30 12:43 -0500
                Re: Appending definition to another definition rickman <gnuarm@gmail.com> - 2012-07-31 02:07 -0700
        Re: Appending definition to another definition Mark Wills <markrobertwills@yahoo.co.uk> - 2012-07-30 01:13 -0700
          Re: Appending definition to another definition Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-07-31 00:42 -0700
            Re: Appending definition to another definition Mark Wills <markrobertwills@yahoo.co.uk> - 2012-07-31 01:05 -0700
              Re: Appending definition to another definition Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-07-31 01:55 -0700
                Re: Appending definition to another definition George Hubert <georgeahubert@yahoo.co.uk> - 2012-07-31 09:55 -0700
                  Re: Appending definition to another definition Andrew Haley <andrew29@littlepinkcloud.invalid> - 2012-07-31 12:38 -0500
                    Re: Appending definition to another definition George Hubert <georgeahubert@yahoo.co.uk> - 2012-07-31 13:20 -0700
                      Re: Appending definition to another definition Andrew Haley <andrew29@littlepinkcloud.invalid> - 2012-08-01 04:34 -0500
                        Re: Appending definition to another definition hughaguilar96@yahoo.com - 2012-08-01 03:27 -0700
                          Re: Appending definition to another definition Coos Haak <chforth@hccnet.nl> - 2012-08-01 15:45 +0200
                            Re: Appending definition to another definition Paul Rubin <no.email@nospam.invalid> - 2012-08-01 07:01 -0700
                              Re: Appending definition to another definition Jason Damisch <jasondamisch@yahoo.com> - 2012-08-01 08:02 -0700
                                Re: Appending definition to another definition Mark Wills <markrobertwills@yahoo.co.uk> - 2012-08-01 08:07 -0700
                                  Re: Appending definition to another definition jim@rainbarrel.com - 2012-08-01 14:06 -0700
                            Re: Appending definition to another definition hughaguilar96@yahoo.com - 2012-08-01 16:59 -0700
                      Re: Appending definition to another definition anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2012-08-01 15:09 +0000
                        Re: Appending definition to another definition Paul Rubin <no.email@nospam.invalid> - 2012-08-01 08:56 -0700
                          Re: Appending definition to another definition anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2012-08-01 16:34 +0000
                            Re: Appending definition to another definition Paul Rubin <no.email@nospam.invalid> - 2012-08-01 10:15 -0700
                              Re: Appending definition to another definition George Hubert <georgeahubert@yahoo.co.uk> - 2012-08-01 11:04 -0700
                                Re: Appending definition to another definition Paul Rubin <no.email@nospam.invalid> - 2012-08-01 12:02 -0700
                                  Re: Appending definition to another definition George Hubert <georgeahubert@yahoo.co.uk> - 2012-08-01 13:51 -0700
                                  Re: Appending definition to another definition Aleksej Saushev <asau@inbox.ru> - 2012-08-02 05:28 +0400
                                    Re: Appending definition to another definition Paul Rubin <no.email@nospam.invalid> - 2012-08-01 19:04 -0700
                                      Re: Appending definition to another definition Aleksej Saushev <asau@inbox.ru> - 2012-08-02 08:33 +0400
                                        Re: Appending definition to another definition Paul Rubin <no.email@nospam.invalid> - 2012-08-01 22:07 -0700
                                          Re: Appending definition to another definition "A. K." <akk@nospam.org> - 2012-08-02 08:03 +0200
                                            Re: Appending definition to another definition Paul Rubin <no.email@nospam.invalid> - 2012-08-01 23:11 -0700
                                          Re: Appending definition to another definition hughaguilar96@yahoo.com - 2012-08-02 00:05 -0700
                                          Re: Appending definition to another definition Aleksej Saushev <asau@inbox.ru> - 2012-08-06 01:44 +0400
                              Re: Appending definition to another definition hughaguilar96@yahoo.com - 2012-08-01 16:48 -0700
                              heap implementation (was: Appending definition to another definition) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2012-08-02 14:48 +0000
                                Re: heap implementation Paul Rubin <no.email@nospam.invalid> - 2012-08-02 14:37 -0700
                                  Re: heap implementation anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2012-08-03 13:58 +0000
                                    Re: heap implementation Alex McDonald <blog@rivadpm.com> - 2012-08-03 08:57 -0700
                                      Re: heap implementation anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2012-08-03 16:12 +0000
                                        Re: heap implementation Alex McDonald <blog@rivadpm.com> - 2012-08-04 08:29 -0700
                                    Re: heap implementation Paul Rubin <no.email@nospam.invalid> - 2012-08-03 21:22 -0700
                                      Re: heap implementation Andrew Haley <andrew29@littlepinkcloud.invalid> - 2012-08-04 03:11 -0500
                                      Re: heap implementation anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2012-08-04 11:40 +0000
                                Re: heap implementation (was: Appending definition to another definition) Bernd Paysan <bernd.paysan@gmx.de> - 2012-08-03 00:04 +0200
                        Re: Appending definition to another definition George Hubert <georgeahubert@yahoo.co.uk> - 2012-08-01 11:20 -0700
                        Re: Appending definition to another definition Bernd Paysan <bernd.paysan@gmx.de> - 2012-08-02 01:55 +0200
                          Re: Appending definition to another definition Bernd Paysan <bernd.paysan@gmx.de> - 2012-08-02 02:09 +0200
                            Re: Appending definition to another definition hughaguilar96@yahoo.com - 2012-08-01 18:57 -0700
                          Re: Appending definition to another definition anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2012-08-02 09:41 +0000
                  Re: Appending definition to another definition hughaguilar96@yahoo.com - 2012-07-31 23:52 -0700
                    Re: Appending definition to another definition Coos Haak <chforth@hccnet.nl> - 2012-08-01 15:46 +0200
              Re: Appending definition to another definition jim@rainbarrel.com - 2012-08-01 14:01 -0700
                Re: Appending definition to another definition Jason Damisch <jasondamisch@yahoo.com> - 2012-08-01 14:39 -0700
          Re: Appending definition to another definition programmingkidx@gmail.com - 2012-08-05 19:35 -0700
    Re: Appending definition to another definition Jason Damisch <jasondamisch@yahoo.com> - 2012-07-29 11:40 -0700
      Re: Appending definition to another definition programmingkidx@gmail.com - 2012-07-29 11:58 -0700
        Re: Appending definition to another definition Jason Damisch <jasondamisch@yahoo.com> - 2012-07-29 12:08 -0700
        Re: Appending definition to another definition Gerry Jackson <gerry@jackson9000.fsnet.co.uk> - 2012-07-29 22:43 +0100
          Re: Appending definition to another definition Arnold Doray <invalid@invalid.com> - 2012-07-30 02:34 +0000
            Re: Appending definition to another definition Gerry Jackson <gerry@jackson9000.fsnet.co.uk> - 2012-07-30 06:59 +0100
        Re: Appending definition to another definition Arnold Doray <invalid@invalid.com> - 2012-07-30 13:42 +0000
    Re: Appending definition to another definition "Rod Pemberton" <do_not_have@notemailnot.cmm> - 2012-07-29 18:23 -0400

Page 3 of 4 — ← Prev page 1 2 [3] 4 Next page →

#14751

From	Aleksej Saushev <asau@inbox.ru>
Date	2012-08-06 01:44 +0400
Message-ID	<87mx29au9r.fsf@inbox.ru>
In reply to	#14647

Paul Rubin <no.email@nospam.invalid> writes:

> Aleksej Saushev <asau@inbox.ru> writes:
>> What is the point of maintaining a list when you can maintain at least
>> balanced tree with faster lookups and insertions?? You can implement
>> self-balancing tree the way you like with minimal effort.
>
> I don't understand what point you were making about the linear ordering
> of addresses in this case.  Structures like self-balancing trees are
> fancier than what one normally sees in Forth, in my limited knowledge.
> The scheme I suggested earlier of a pointer next to each allocated
> region, pointing back to a table of allocated addresses, seems a lot
> simpler than a self-balancing tree.

I'm not sure that information that can be easily overwritten in case of
buffer overrun is useful for correctness check. Extra pointer, whether
appended or prepended to allocated block, can be easily overwritten.
Storing this information in a separate memory region gives more chances
to its survival.

-- 
HE CE3OH...

[toc] | [prev] | [next] | [standalone]

#14638

From	hughaguilar96@yahoo.com
Date	2012-08-01 16:48 -0700
Message-ID	<c608e51f-f1fe-445b-8374-db91a6fcbe4a@googlegroups.com>
In reply to	#14620

On Wednesday, August 1, 2012 10:15:10 AM UTC-7, Paul Rubin wrote:
> anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
> 
> > On a 32-bit system I would expect it to fail for one out of 2^32 FREEs
> 
> > of non-ALLOCATEd addresses.
> 
> 
> 
> Using two or three cells would decrease that likelihood exponentially.
> 
> The alternative is to remember every pointer ALLOCATE has returned until
> 
> it is freed.  I guess that could be tolerable.  Have cell 0 point back
> 
> to a table entry owned by the allocator, where the table entry has the
> 
> allocated block's address, until it is freed.  The table would have to
> 
> expand dynamically, maybe using a tree structure, etc.  Blecch.
> 
> 
> 
> > And if not-crashing was guaranteed, there would likely be programs
> 
> > that make use of this guarantee and FREE arbitrary addresses with
> 
> > abandon,
> 
> 
> 
> Yeah, that would be a pretty horrendous way to program, but I suppose
> 
> issuing a guarantee would invite it.

Using multiple magic-number signatures doesn't help! What happens if you try to FREE a block of memory more than once? The old magic numbers are still sitting there in memory most likely.

The big problem here is that everybody seems to be stuck on using GLIBC with its MALLOC and FREE. According to Anton Ertl, this is why my ALLOCATION was rejected; because C doesn't have it and so neither can Forth. This is despite the fact that the allocated size is obviously stored internally already. Now we are talking about keeping a list or a tree of all the allocated blocks of memory. This is despite the fact that this information is obviously stored internally already. 

Oh, come on! Write your own heap and do a good job of it, with ALLOCATION available and FREE working properly. Stop relying on GLIBC. If you think that GLIBC is so wonderful, then why not just program in C and forget about Forth? 

This reliance on GLIBC is an albatross around your neck. Gforth is a disgrace to Forth --- abandon Gforth! --- your decision to support C-based Forth systems has brought you only bad luck!

I'm not going to support C-based systems in Straight Forth. My goal is to provide a *better* development system than C, so that C programmers will abandon C and switch to Forth --- I'm not trying to provide a Forth that is just a pathetic imitation of C, which is what Gforth is.

[toc] | [prev] | [next] | [standalone]

#14666 — heap implementation (was: Appending definition to another definition)

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2012-08-02 14:48 +0000
Subject	heap implementation (was: Appending definition to another definition)
Message-ID	<2012Aug2.164829@mips.complang.tuwien.ac.at>
In reply to	#14620

Paul Rubin <no.email@nospam.invalid> writes:
>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>> On a 32-bit system I would expect it to fail for one out of 2^32 FREEs
>> of non-ALLOCATEd addresses.
>
>Using two or three cells would decrease that likelihood exponentially.

Two or three cells extra for just this purpose is not what I consider
efficient.  And it would still be probabilistic.

>The alternative is to remember every pointer ALLOCATE has returned until
>it is freed.  I guess that could be tolerable.  Have cell 0 point back
>to a table entry owned by the allocator, where the table entry has the
>allocated block's address, until it is freed.  The table would have to
>expand dynamically, maybe using a tree structure, etc.  Blecch.

That's better: It still costs two cells per ALLOCATEd block, but at
least it's not probabilistic.

I would not use a tree for the table, just a big block; then the check
for validity can be performed in constant time.  Getting the big block
may be a problem in some environments, but with a large address space
(64 bits) and an OS that supports memory overcommitment it's a
non-issue.  In smaller environments one might grow that block towards
bigger addresses and grow the ALLOCATEd memory towards smaller
addresses.

Still, the space overhead of that approach is considerable for small
ALLOCATEd blocks, which are frequent.

I thought a little more about this problem and remembered that I have
already solved it, in the allocator used in my garbage collector.  The
garbage collector is conservative, so I need a way to check if a value
X can be the address of an allocated block.

I solved the problem as follows: for each grain (of, say, 8 bytes),
there are two bits (stored elsewhere): One says if the grain is at the
start of a block (this provides information about the length of the
block), the other says whether the block starting at the grain is
occupied or free.  These bits are stored in separate areas, which has
similar issues as the table discussed above.

The check whether an address points to an ALLOCATEd block can be
performed in constant time.  The space overhead is 3% for 8-byte
granularity.  To achieve the same space efficiency with a system that
needs three cells per block (one for the block size and two for the
two pointers mentioned above), the average block size would have to be
96 cells.

One could also combine the two systems: Have an allocator for blocks
up to 96 cells that uses my approach and one for larger blocks that
uses your approach; put the small blocks in one area of the address
space that does not contain any big blocks, then the address check can
just decide which check to use with a simple WITHIN check.

Implementing all that is left as an exercise to the reader:-)

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2012: http://www.euroforth.org/ef12/

[toc] | [prev] | [next] | [standalone]

#14674 — Re: heap implementation

From	Paul Rubin <no.email@nospam.invalid>
Date	2012-08-02 14:37 -0700
Subject	Re: heap implementation
Message-ID	<7xfw85otzy.fsf@ruckus.brouhaha.com>
In reply to	#14666

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
> I would not use a tree for the table, just a big block; then the check
> for validity can be performed in constant time. 

You mean a preallocated big block that could hold all the pointers
that might potentially be allocated all at once?  You might have
to reserve 100's of MB, or even GB.  Even though it might not allocate
real pages, it still seems awful.

A tree isn't needed: I think a linked list of blocks is good enough.
Just prepend new blocks to the linked list and remember where the newest
one is.  When you delete a pointer from the middle of some block, pop
and copy the most recent pointer from the newest block, into the
now-freed slot.  If that makes the block become empty, release it after
following its link to get to the previous block.  All these operations
are constant time (you do need a way to allocate these blocks...).

> I thought a little more about this problem and remembered that I have
> already solved it, in the allocator used in my garbage collector.  The
> garbage collector is conservative, so I need a way to check if a value
> X can be the address of an allocated block.

Of course that is also probabilistic (the definition of a conservative
gc).

[toc] | [prev] | [next] | [standalone]

#14688 — Re: heap implementation

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2012-08-03 13:58 +0000
Subject	Re: heap implementation
Message-ID	<2012Aug3.155806@mips.complang.tuwien.ac.at>
In reply to	#14674

Paul Rubin <no.email@nospam.invalid> writes:
>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>> I would not use a tree for the table, just a big block; then the check
>> for validity can be performed in constant time. 
>
>You mean a preallocated big block that could hold all the pointers
>that might potentially be allocated all at once?

Yes, but some variations are possible:

1) Start small, and double the size when it becomes too small; this
may incur the costs of relocation, though (either redo the pointers
when the size is doubled, or store offsets into the table instead of
pointers).

2) Use an expandable memory area (e.g., sbrk() in Unix, and mmap for
the data blocks).

>You might have
>to reserve 100's of MB, or even GB.  Even though it might not allocate
>real pages, it still seems awful.

Why?

Unused virtual memory costs very little.  Let's see what 1TB costs:

[~:77867] time gforth -e bye

real    0m0.007s
user    0m0.008s
sys     0m0.004s
[~:77868] time gforth -m 1T -e bye

real    0m0.007s
user    0m0.004s
sys     0m0.004s

I.e., time difference in the noise.

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
anton     6119  0.0  0.0 1073755196 1544 pts/0 S+   16:17   0:00 gforth -m 1T
anton     6142  0.0  0.0  21564  1532 pts/0    S+   16:18   0:00 gforth

And a whopping 12KB difference in resident set size.

The strategy is admittedly limited in the platforms it works on, but
on those platforms it's a good strategy.

>A tree isn't needed: I think a linked list of blocks is good enough.

In order to perform the check we need to determine if the validation
pointer points to an address in one of the blocks (otherwise you can
get false positives).  If we keep the extra data in one block, we can
do this with a simple WITHIN.  If we keep it in a tree, we need a tree
search (no longer constant time); if we keep it in a list, we need a
linear search (linear cost).

>> I thought a little more about this problem and remembered that I have
>> already solved it, in the allocator used in my garbage collector.  The
>> garbage collector is conservative, so I need a way to check if a value
>> X can be the address of an allocated block.
>
>Of course that is also probabilistic (the definition of a conservative
>gc).

The check is reliable.  Of course success still does not guarantee
that the value is a pointer, so I could have just as well used a
probabilistic method that may produce false positives (but must not
produce false negatives), but I didn't; there would be no advantage.

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2012: http://www.euroforth.org/ef12/

[toc] | [prev] | [next] | [standalone]

#14695 — Re: heap implementation

From	Alex McDonald <blog@rivadpm.com>
Date	2012-08-03 08:57 -0700
Subject	Re: heap implementation
Message-ID	<088d4ed8-68be-4ba3-8b04-564cb6a3a583@a4g2000vbt.googlegroups.com>
In reply to	#14688

On Aug 3, 2:58 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> Paul Rubin <no.em...@nospam.invalid> writes:
> >an...@mips.complang.tuwien.ac.at (Anton Ertl) writes:
> >> I would not use a tree for the table, just a big block; then the check
> >> for validity can be performed in constant time.
>
> >You mean a preallocated big block that could hold all the pointers
> >that might potentially be allocated all at once?
>
> Yes, but some variations are possible:
>
> 1) Start small, and double the size when it becomes too small; this
> may incur the costs of relocation, though (either redo the pointers
> when the size is doubled, or store offsets into the table instead of
> pointers).
>
> 2) Use an expandable memory area (e.g., sbrk() in Unix, and mmap for
> the data blocks).
>
> >You might have
> >to reserve 100's of MB, or even GB.  Even though it might not allocate
> >real pages, it still seems awful.
>
> Why?
>
> Unused virtual memory costs very little.  Let's see what 1TB costs:
>
> [~:77867] time gforth -e bye
>
> real    0m0.007s
> user    0m0.008s
> sys     0m0.004s
> [~:77868] time gforth -m 1T -e bye
>
> real    0m0.007s
> user    0m0.004s
> sys     0m0.004s
>
> I.e., time difference in the noise.
>
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> anton     6119  0.0  0.0 1073755196 1544 pts/0 S+   16:17   0:00 gforth -m 1T
> anton     6142  0.0  0.0  21564  1532 pts/0    S+   16:18   0:00 gforth
>
> And a whopping 12KB difference in resident set size.
>
> The strategy is admittedly limited in the platforms it works on, but
> on those platforms it's a good strategy.

It works on Windows too.

The downside is; allocated but uncommitted memory is cheap, but
unallocated and committed memory is wasteful (for some definition of
wasteful; it increases the working set size for one and may result in
page copy-on-write operations in other processes). It takes effort &
an understanding of the relationship of pages to allocated blocks from
the allocator to do an uncommit since it can only be done on whole
pages.

[snipped]

[toc] | [prev] | [next] | [standalone]

#14699 — Re: heap implementation

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2012-08-03 16:12 +0000
Subject	Re: heap implementation
Message-ID	<2012Aug3.181204@mips.complang.tuwien.ac.at>
In reply to	#14695

Alex McDonald <blog@rivadpm.com> writes:
>The downside is; allocated but uncommitted memory is cheap, but
>unallocated and committed memory is wasteful (for some definition of
>wasteful; it increases the working set size for one and may result in
>page copy-on-write operations in other processes). It takes effort &
>an understanding of the relationship of pages to allocated blocks from
>the allocator to do an uncommit since it can only be done on whole
>pages.

I don't know what you mean with "unallocated" in this context
(especially after the way you use "allocated" in the sentence before).
But if you want to reduce the amount of commited memory after a
reduction in the amount of live memory, you could mmap the unused
memory again (using MAP_FIXED), and the old mapping will be
overwritten and the old memory should be reclaimed.  At least on POSIX
systems.

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2012: http://www.euroforth.org/ef12/

[toc] | [prev] | [next] | [standalone]

#14729 — Re: heap implementation

From	Alex McDonald <blog@rivadpm.com>
Date	2012-08-04 08:29 -0700
Subject	Re: heap implementation
Message-ID	<3d6da9c4-cf79-4e1d-a058-45a3bb121595@w8g2000vbx.googlegroups.com>
In reply to	#14699

On Aug 3, 5:12 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> Alex McDonald <b...@rivadpm.com> writes:
> >The downside is; allocated but uncommitted memory is cheap, but
> >unallocated and committed memory is wasteful (for some definition of
> >wasteful; it increases the working set size for one and may result in
> >page copy-on-write operations in other processes). It takes effort &
> >an understanding of the relationship of pages to allocated blocks from
> >the allocator to do an uncommit since it can only be done on whole
> >pages.
>
> I don't know what you mean with "unallocated" in this context
> (especially after the way you use "allocated" in the sentence before).

I should have said "reserved" rather than "allocated" in the first
sentence; and "user allocated blocks that are freed" in the second.

> But if you want to reduce the amount of commited memory after a
> reduction in the amount of live memory, you could mmap the unused
> memory again (using MAP_FIXED), and the old mapping will be
> overwritten and the old memory should be reclaimed.  At least on POSIX
> systems.
>
> - anton
> --
> M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
> comp.lang.forth FAQs:http://www.complang.tuwien.ac.at/forth/faq/toc.html
>      New standard:http://www.forth200x.org/forth200x.html
>    EuroForth 2012:http://www.euroforth.org/ef12/

[toc] | [prev] | [next] | [standalone]

#14714 — Re: heap implementation

From	Paul Rubin <no.email@nospam.invalid>
Date	2012-08-03 21:22 -0700
Subject	Re: heap implementation
Message-ID	<7xehnngubd.fsf@ruckus.brouhaha.com>
In reply to	#14688

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> anton     6119  0.0  0.0 1073755196 1544 pts/0 S+   16:17   0:00 gforth -m 1T
> I.e., time difference in the noise.
> And a whopping 12KB difference in resident set size.

Arggghh... hmmm... I dunno... maybe it's legitimate but seeing a number
like that still makes me sit up in my chair.  I wonder if a lot of page
tables get allocated in the kernel that don't count as user RSS.  I'd
think it prevents the hardware address protection from catching most
stray pointers during debugging.  I wonder if it affects the paging
behavior of other processes.  It never occurred to me that allocating
such a region was even possible.  Wow.

> In order to perform the check we need to determine if the validation
> pointer points to an address in one of the blocks... If we keep it in
> a tree, we need a tree search (no longer constant time)

Hmm, good point, though the tree can have large nodes (100's or 1000's
of pointers) so the number of levels will always be small, effectively
bounded by a constant.  They will probably have better cache locality
than the allocated regions might often cost less to traverse than
getting to the freed region did.

Overall though it seems implausible that the Forth standard intended to
impose a requirement like this.

[toc] | [prev] | [next] | [standalone]

#14716 — Re: heap implementation

From	Andrew Haley <andrew29@littlepinkcloud.invalid>
Date	2012-08-04 03:11 -0500
Subject	Re: heap implementation
Message-ID	<PZadne00EOGrRIHNnZ2dnUVZ8nmdnZ2d@supernews.com>
In reply to	#14714

Paul Rubin <no.email@nospam.invalid> wrote:
> 
> Overall though it seems implausible that the Forth standard intended to
> impose a requirement like this.  

I'm sure the TC intended no such thing.  This discussion is just about
(IMO pointless) thoretical niceties.

Andrew.

[toc] | [prev] | [next] | [standalone]

#14725 — Re: heap implementation

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2012-08-04 11:40 +0000
Subject	Re: heap implementation
Message-ID	<2012Aug4.134006@mips.complang.tuwien.ac.at>
In reply to	#14714

Paul Rubin <no.email@nospam.invalid> writes:
>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>> anton     6119  0.0  0.0 1073755196 1544 pts/0 S+   16:17   0:00 gforth -m 1T
>> I.e., time difference in the noise.
>> And a whopping 12KB difference in resident set size.
>
>Arggghh... hmmm... I dunno... maybe it's legitimate but seeing a number
>like that still makes me sit up in my chair.  I wonder if a lot of page
>tables get allocated in the kernel that don't count as user RSS.

Maybe it's not shown in the RSS, but given that there is hardly any
increment in execution time (see below), it cannot be a lot; we see
nothing in the time results, so let's see if we can see more with
performance counters:

[~:77886] perfex -e 0x4100c0 -e 0x4200c0 gforth -e bye
tsc                                        17680590
event 0x004100C0@0                         11816963
event 0x004200C0@1                          1152795
[~:77887] perfex -e 0x4100c0 -e 0x4200c0 gforth -m 1T -e bye
tsc                                        17921088
event 0x004100C0@0                         11818484
event 0x004200C0@1                          1200593

An increase in cycles (tsc) by very little (same order of magnitude as
the noise), 1500 more instructions executed in user mode (0x4100c0),
and 48000 more instructions executed in system mode (0x4200c0).  There
does not fit a lot in there.

And why should a lot of page tables get allocated?  There's a lot of
sharing possible: The zeroed page can be shared, the first-level page
tables all pointing to that page can be shared, too, and so on.

>I'd
>think it prevents the hardware address protection from catching most
>stray pointers during debugging.

A few GB (or whatever) in an address space of 17179869184 GB will
prevent catching *most* stray pointers?  Not in my experience.  But if
you are worried about that you can first mmap the area with PROT_NONE,
and change the permissions when you actually need the pages for data.

>Hmm, good point, though the tree can have large nodes (100's or 1000's
>of pointers) so the number of levels will always be small, effectively
>bounded by a constant.  They will probably have better cache locality
>than the allocated regions might often cost less to traverse than
>getting to the freed region did.

I cannot follow you here.

>Overall though it seems implausible that the Forth standard intended to
>impose a requirement like this.  

Yes.

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2012: http://www.euroforth.org/ef12/

[toc] | [prev] | [next] | [standalone]

#14675 — Re: heap implementation (was: Appending definition to another definition)

From	Bernd Paysan <bernd.paysan@gmx.de>
Date	2012-08-03 00:04 +0200
Subject	Re: heap implementation (was: Appending definition to another definition)
Message-ID	<10262742.RuWNhMaO5F@sunwukong.fritz.box>
In reply to	#14666

Anton Ertl wrote:
> I thought a little more about this problem and remembered that I have
> already solved it, in the allocator used in my garbage collector.  The
> garbage collector is conservative, so I need a way to check if a value
> X can be the address of an allocated block.
> 
> I solved the problem as follows: for each grain (of, say, 8 bytes),
> there are two bits (stored elsewhere): One says if the grain is at the
> start of a block (this provides information about the length of the
> block), the other says whether the block starting at the grain is
> occupied or free.  These bits are stored in separate areas, which has
> similar issues as the table discussed above.

Actually, for small blocks on a 64 bit system I'd use one area for each 
size.  No overhead required except the mapped space.  If the block is in 
the 8-byte-chunks space, it is an 8 byte chunk.

Each thread has its own starting point to allocate and its own free list 
(this is completely lock-free).  I'd leave the NUMA stuff to the copy-
on-write part for anonymously mmapped pages.  I'd say if thread A 
reclaims memory allocated from thread B, it ends up in thread A's free 
list, and the user gets what he deservers.

IMHO, there are four sorts of objects you want to claim memory for:

* small fixed-size objects which you never will resize, but which need 
very fast allocate and free - these should go into one-region-per-size, 
one-starting-point-per-thread buckets.  These fixed-size objects could 
be under the regime of a garbage colletor (it makes most sense here).

* small variable-sized objects which need some way of compaction 
(editable strings), as they change their size quite often. These should 
go into pages (allocated per-thread), which can be compacted 
individually, and they have a backlink pointer which gets updated on 
compaction.  These objects are under the regime of a compactor, they are 
explicitly deallocated (e.g. when the object that points to them dies).  
These objects know their length - they are strings.

* large objects should be allocated and freed in page granularities, 
using mmap. No compaction necessary if you only allow power-of-two 
aligned starting points.

* large, purgeable objects should be treated the same way, but may be 
deallocated if the system senses a need for that (some operating systems 
tell you when they are short in memory).

I'm not that sure about the strings-to-be-compacted, as it is just as 
easy to keep them in the fixed-size buckets, and move them completely 
into another region if they change size.

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/

[toc] | [prev] | [next] | [standalone]

#14621

From	George Hubert <georgeahubert@yahoo.co.uk>
Date	2012-08-01 11:20 -0700
Message-ID	<70962017-c72d-4824-81d0-05a594ac3c5b@googlegroups.com>
In reply to	#14614

On Wednesday, August 1, 2012 4:09:07 PM UTC+1, Anton Ertl wrote:
> George Hubert <georgeahubert@yahoo.co.uk> writes: >On Tuesday, July 31, 2012 6:38:13 PM UTC+1, Andrew Haley wrote: >> George Hubert <georgeahubert@yahoo.co.uk> wrote: > The Standard states th= >at FREE either succeeds and returns zero or > fails returning non-zero: any= >thing else is clearly a bug. Not exactly. The standard doesn't say anything= > about what happens if you fail to honour your side of the contract, which = >is "a-addr shall indicate a region of data space that was previously obtain= >ed by ALLOCATE or RESIZE." Andrew. > >Surely: > >14.4.1.2 Ambiguous conditions >no additional requirements.=20 > >implies that there aren't any ambiguous conditions Or not any additional ones. >which undefined behavio= >ur in the event of a-addr not coming from ALLOCATE implies. Anyway it doesn= >'t state it will crash, as Hugh said. But crash it does, on a number of systems (see below). And I don't see a reliable and efficient way to prevent this in all cases. - anton ----------------------------------------------------- VFX Forth for Linux IA32 © MicroProcessor Engineering Ltd, 1998-2009 Version: 4.40 [build 0404] Build date: 7 December 2009 Free dictionary = 7922527 bytes [7736kb] here free CS=0023 DS=002B ES=002B SS=002B FS=0000 GS=0063 EAX=0800:0000 EBX=F7FB:4FF4 ECX=0893:4EB1 EDX=0000:0000 ESI=F7FB:6160 EDI=080B:9CA1 EBP=0893:4F94 ESP=0893:4F7C EIP=F7ED:631C EFLAGS=0001:0206 --- RS top --- $0000:0009 $080B:61C9 ADD-HISTORY-LINE $0800:0000 $0000:0000 $0894:4FC0 $0893:4FA0 $0894:4FB0 $0804:AADD NXCALL Signal number SIGSEGV at address F7ED:631C, probably in * outside dictionary * Press E to exit, R to restart, other to continue: ---------------------------------------------------------- [c7:~:31707] sf SwiftForth i386-Linux 3.2.1 28-Dec-2009 here free *** glibc detected *** sf: free(): invalid pointer: 0x080801b1 *** Aborted ---------------------------------------------------------- Gforth 0.6.2, Copyright (C) 1995-2003 Free Software Foundation, Inc. Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license' Type `bye' to exit 0 free . 0 ok here free . *** glibc detected *** gforth: free(): invalid pointer: 0x00007f85c1f28898 *** ======= Backtrace: ========= /lib/libc.so.6[0x7f85c2767948] /lib/libc.so.6(cfree+0x76)[0x7f85c2769a56] gforth(engine+0x2584)[0x406674] gforth(go_forth+0x174)[0x40f914] gforth(main+0x153)[0x4106e3] /lib/libc.so.6(__libc_start_main+0xe6)[0x7f85c27121a6] gforth[0x404059] ======= Memory map: ======== ... Aborted. The Gforth behaviour is wrong. The Forth system should not terminate (it normally does not terminate even if it executes an illegal instruction). - anton -- M. Anton Ertl 

Both Win32forth and Bigforth (at least on Windows XP) return non-zero iors, so it's seems that some systems don't crash. Certainly the Swiftforth and VFX behaviours at least return control to the terminal and give an error message so Programmers know that FREE caused the error and can use that information for debugging (I'd describe it as ABORTing rather than crashing). Certainly the assertation that it WILL crash (rather than may) is plain wrong. I would certainly expect a programmer to know how the forth he uses for testing and debugging to know how their system behaves in order to make sense of the information (or lack of it). 

Anyway as both you and Andrew have pointed out I was wrong.

George Hubert

[toc] | [prev] | [next] | [standalone]

#14639

From	Bernd Paysan <bernd.paysan@gmx.de>
Date	2012-08-02 01:55 +0200
Message-ID	<1790043.2xHZYgURxo@sunwukong.fritz.box>
In reply to	#14614

Anton Ertl wrote:
> The Gforth behaviour is wrong.  The Forth system should not terminate
> (it normally does not terminate even if it executes an illegal
> instruction).

We could set the environment variable MALLOC_CHECK_ to 1.  Then glibc 
prints a diagnostic output on stderr, but does not terminate the 
program.  Or catch SIGABRT (we do, but quit on SIGABRT instead of throw, 
as SIBABRT is intented to do that).  If we set MALLOC_CHECK_ to 2, it 
will only abort, and print no diagnostics.

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/

[toc] | [prev] | [next] | [standalone]

#14640

From	Bernd Paysan <bernd.paysan@gmx.de>
Date	2012-08-02 02:09 +0200
Message-ID	<1567443.UxR9ooIsYs@sunwukong.fritz.box>
In reply to	#14639

Bernd Paysan wrote:
> We could set the environment variable MALLOC_CHECK_ to 1.

Or, as Hugh suggested, we could implement our own memory management 
code, e.g. take the code from bigForth (I did my own memory management 
in bigForth, because the Atari ST's OS-based memory management was piss-
poor).

It does an awful lot for the 20 blocks it is implemented in, a lot more 
than just allocate, resize, and free.

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/

[toc] | [prev] | [next] | [standalone]

#14644

From	hughaguilar96@yahoo.com
Date	2012-08-01 18:57 -0700
Message-ID	<d6c41321-2ccc-4f69-a17a-d48fc82bfb32@googlegroups.com>
In reply to	#14640

On Wednesday, August 1, 2012 5:09:38 PM UTC-7, Bernd Paysan wrote:
> Bernd Paysan wrote:
> 
> > We could set the environment variable MALLOC_CHECK_ to 1.
> 
> 
> 
> Or, as Hugh suggested, we could implement our own memory management 
> 
> code, e.g. take the code from bigForth (I did my own memory management 
> 
> in bigForth, because the Atari ST's OS-based memory management was piss-
> 
> poor).
> 
> 
> 
> It does an awful lot for the 20 blocks it is implemented in, a lot more 
> 
> than just allocate, resize, and free.
> 
> 
> 
> -- 
> 
> Bernd Paysan
> 
> "If you want it done right, you have to do it yourself"
> 
> http://bernd-paysan.de/

Well, as your tag says: "If you want it done right, you have to do it yourself." It is always a bad idea to rely on OPC --- that is what script kiddies do --- they look like geniuses for a while, but then their ignorance trips them up and they fall on their faces. Similarly, I can beat most computer programs at Go --- they play in a dull and unimaginative manner --- they have heuristics but they don't have understanding.

In the days of the Atari ST and its MC68K, a heap was pretty easy. Now with all of these cache systems in use, it is more complicated. It is hardly the kind of problem that should daunt the intrepid Forth-200x members though. Also, there are plenty of people on CLAX who can help if necessary --- it is not like this is the first time anybody has ever implemented a heap.

[toc] | [prev] | [next] | [standalone]

#14658

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2012-08-02 09:41 +0000
Message-ID	<2012Aug2.114113@mips.complang.tuwien.ac.at>
In reply to	#14639

Bernd Paysan <bernd.paysan@gmx.de> writes:
>Anton Ertl wrote:
>> The Gforth behaviour is wrong.  The Forth system should not terminate
>> (it normally does not terminate even if it executes an illegal
>> instruction).
>
>We could set the environment variable MALLOC_CHECK_ to 1.

Setting an environment variable is not a good style for communicating
within a process, because it will also influence child processes.

However, while we are considering workarounds for this misfeature of
glibc, we could call mcheck().

A relatively easy-to-implement approch would be to supply a function
that converts the mcheck_status into a THROW value and then THROWs.

But given the interface of FREE it would be nicer to have FREE return
the throw value as ior (and leave it to the application to throw or do
something else).  I don't see a clean way to implement that, though.

Implementing ALLOCATE based on lower-level stuff would also be
possible (another implementation is part of my garbage collector), but
seems overkill, and worse, I doubt that our own implementation would
be as efficient.

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2012: http://www.euroforth.org/ef12/

[toc] | [prev] | [next] | [standalone]

#14603

From	hughaguilar96@yahoo.com
Date	2012-07-31 23:52 -0700
Message-ID	<264d6917-a52f-46ca-8e61-484d86587101@googlegroups.com>
In reply to	#14586

On Tuesday, July 31, 2012 9:55:20 AM UTC-7, George Hubert wrote:
> On Tuesday, July 31, 2012 9:55:56 AM UTC+1, Hugh Aguilar wrote:
> 
>  --- this is a huge improvement over typical Forth in which FREE will crash if it is given a block of memory in the dictionary rather than the heap. 
> 
> Which Forth are you talking about. The Standard states that FREE either succeeds and returns zero or fails returning non-zero: anything else is clearly a bug. Perhaps you should try reading the Standard instead of coming up with incorrect statements.
> 
> 
> 
> George Hubert

Everything in my novice package is ANS-Forth compliant. You don't know what you are talking about.

[toc] | [prev] | [next] | [standalone]

#14609

From	Coos Haak <chforth@hccnet.nl>
Date	2012-08-01 15:46 +0200
Message-ID	<yvddxqyche21$.16524lpduh7kw$.dlg@40tude.net>
In reply to	#14603

Op Tue, 31 Jul 2012 23:52:24 -0700 (PDT) schreef hughaguilar96@yahoo.com:

> On Tuesday, July 31, 2012 9:55:20 AM UTC-7, George Hubert wrote:
>> On Tuesday, July 31, 2012 9:55:56 AM UTC+1, Hugh Aguilar wrote:
>> 
>>  --- this is a huge improvement over typical Forth in which FREE will crash if it is given a block of memory in the dictionary rather than the heap. 
>> 
>> Which Forth are you talking about. The Standard states that FREE either succeeds and returns zero or fails returning non-zero: anything else is clearly a bug. Perhaps you should try reading the Standard instead of coming up with incorrect statements.
>> 
>> 
>> 
>> George Hubert
> 
> Everything in my novice package is ANS-Forth compliant. You don't know what you are talking about.

lamenielache!
-- 
Coos

CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html

[toc] | [prev] | [next] | [standalone]

#14631

From	jim@rainbarrel.com
Date	2012-08-01 14:01 -0700
Message-ID	<e051e1bb-c306-4d22-9b81-74c907d999ad@googlegroups.com>
In reply to	#14564

> > All of this seems way more complicated than anything that I would
> > normally do when writing a program. What are you trying to accomplish?
> > If you told us what your application is, we would be able to help you
> > better. What you are asking for doesn't seem to make much sense. What
> > is that memory going to be used for?- Hide quoted text -
> Agreed. A general description of the problem you are trying to solve
> might help us to come up with an alternative solution. Forth is
> sufficiently old that most eggs were cracked a long time ago!

Why not add more advanced dynamic memory allocation to the Forth Standard?  Maybe something like string stacks?

Diaperglu Forth has had string stacks for a long time, and the latest version even supports local string variables which are automatically allocated and deallocated for you.

[toc] | [prev] | [next] | [standalone]

Page 3 of 4 — ← Prev page 1 2 [3] 4 Next page →

csiph-web

Appending definition to another definition

Contents

#14751

#14638

#14666 — heap implementation (was: Appending definition to another definition)

#14674 — Re: heap implementation

#14688 — Re: heap implementation

#14695 — Re: heap implementation

#14699 — Re: heap implementation

#14729 — Re: heap implementation

#14714 — Re: heap implementation

#14716 — Re: heap implementation

#14725 — Re: heap implementation

#14675 — Re: heap implementation (was: Appending definition to another definition)

#14621

#14639

#14640

#14644

#14658

#14603

#14609

#14631