                     Conservative Garbage Collection
                     ===============================


This note is to help me while I design and implement one, and comments on
changes from a previous world. This is a fresh version of these notes and
is being started in April 2016. I had done some redesign in October 2017, but
the latest changes are from June 2018.

The work discussed here is to arrange that CSL uses a mostly-copying
conservative collector. The details are substantially tuned to the expected
patterns of memory usage. Although the initial implementation will be
single threaded the intent is to allow for a threaded system in the future.
Following CST project work by Jamie Davenport I now intend to try for a
design that is conservative, generational and somewhat threaded.

Memory is used in pages (of size CSL_PAGE_SIZE). Each page will either hold
CONS cells (and other items of size 2*CELL) or larger items (typically
symbol headers and vectors). Given an arbitrary bit-pattern it will be
possible to tell if it refers to an address within an object in one of these
pages.
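
As a rough sketch of the first half of that test (is the bit-pattern inside
any heap page at all), assuming pages are aligned to a power-of-two page size
and that a sorted table of page base addresses is kept. Both of those are
assumptions made for illustration, and the names PageTable, add_page and
find_page and the page size shown are hypothetical rather than taken from CSL.
Finding the object within the page would need further per-page information.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative value only: the real CSL_PAGE_SIZE is defined elsewhere.
constexpr std::uintptr_t kPageSize = std::uintptr_t{1} << 23;

// Hypothetical table of the base addresses of every page currently in use.
struct PageTable {
    std::vector<std::uintptr_t> bases;  // sorted, each a multiple of kPageSize

    void add_page(std::uintptr_t base) {
        assert(base != 0 && base % kPageSize == 0);
        bases.insert(std::upper_bound(bases.begin(), bases.end(), base), base);
    }

    // Returns the base of the page that contains `bits`, or 0 if the value
    // does not lie within any page of the heap.
    std::uintptr_t find_page(std::uintptr_t bits) const {
        std::uintptr_t candidate = bits & ~(kPageSize - 1);
        bool found = std::binary_search(bases.begin(), bases.end(), candidate);
        return found ? candidate : 0;
    }
};
```

With aligned pages the lookup is a mask followed by a binary search, so the
cost per ambiguous value grows only with the logarithm of the number of pages.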

With each page I will have a bitmap that is used to record a concept "pinned"
that can be associated with any object in the page. An item will be marked
pinned if an ambiguous pointer refers to it, and hence it may not be moved.
I will have an array of size half a page for each thread and will use that
to collect a list of all the pinned items in a single page of memory that
I am evacuating. The size here is only half a page because the smallest items
ever allocated within the heap are 2 pointers large. I rather expect that the
full capacity of this will never be approached.
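
The sizing argument can be checked with a small sketch. It assumes one pinned
bit per 2*CELL granule and a table whose entries are each one pointer wide.
The page size used is deliberately small and purely illustrative, and the
names PinState, pin, is_pinned and clear are hypothetical.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kPageSize = std::size_t{1} << 16;  // illustrative only
constexpr std::size_t kCell = sizeof(void*);
constexpr std::size_t kGranule = 2 * kCell;              // smallest heap item
constexpr std::size_t kMaxObjects = kPageSize / kGranule;

// One pointer-sized entry per possible object occupies exactly half a page.
static_assert(kMaxObjects * sizeof(void*) == kPageSize / 2,
              "pinned table should be half a page");

struct PinState {
    std::bitset<kMaxObjects> pinned;  // one bit per granule in the page
    std::vector<std::size_t> table;   // offsets of pinned objects, in order found

    PinState() { table.reserve(kMaxObjects); }

    // Record that an ambiguous pointer refers to the object at `offset`.
    // Each object is entered in the table at most once.
    void pin(std::size_t offset) {
        assert(offset < kPageSize && offset % kGranule == 0);
        std::size_t index = offset / kGranule;
        if (!pinned.test(index)) {
            pinned.set(index);
            table.push_back(offset);
        }
    }

    bool is_pinned(std::size_t offset) const {
        assert(offset < kPageSize);
        return pinned.test(offset / kGranule);
    }

    void clear() {
        pinned.reset();
        table.clear();
    }
};
```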

Pages of memory are classified as
  Current (C). These are the pages within which the mutator allocates
     new material. There may be several such if allocation of CONS cells
     and vectors use different dedicated current pages and in a multi-thread
     world each thread would have its own current page or pages.
  Recent (R). When a current page becomes full it is replaced with a new
     empty page, and the full page that had been current is re-badged as
     "recent". When that happens the previous recent page will have its
     content evacuated - that process representing a minor garbage collection.
  Stable (S). When live material is moved out of an "old recent" page it is
     copied into the stable region. This will come to be the bulk of the
     active heap and uses as many pages as are called for. At any one time
     one of these pages will be the "stable fringe" where new material is
     added. In a multi-thread world there is just a single pool of stable
     heap.
  Free (F). Pages that are not in use are in this group. When a minor garbage
     collection brings in a fresh stable page such that |S| > |F| then a full
     garbage collection is triggered. At that stage R is empty. The stable
     fringe page is then deemed part of what will become the new heap, and
     all material apart from that is copied into that and subsequent pages
     taken from free. The vacated pages are then re-labelled to form the
     new free region. While doing this the current page does not have
     its content relocated.
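
The classification and the trigger for a full collection might be sketched as
follows. This only tracks page counts, and the names PageClass, HeapCounts,
reclassify and take_stable_page are hypothetical, introduced for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

enum class PageClass { Current, Recent, Stable, Free };

// Hypothetical bookkeeping: how many pages are in each class.
struct HeapCounts {
    std::array<std::size_t, 4> count{};

    std::size_t& of(PageClass c) { return count[static_cast<std::size_t>(c)]; }

    // Re-badge one page, for instance Current -> Recent when it fills up.
    void reclassify(PageClass from, PageClass to) {
        assert(of(from) > 0);
        --of(from);
        ++of(to);
    }

    // Called when a minor collection needs a fresh stable fringe page.
    // Returns true when |S| > |F| afterwards, meaning that a full garbage
    // collection should be triggered.
    bool take_stable_page() {
        reclassify(PageClass::Free, PageClass::Stable);
        return of(PageClass::Stable) > of(PageClass::Free);
    }
};
```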

Within the heap I can maintain "dirty bits" that mark parts of pages where
data has been updated. I will arrange that at the start of a minor garbage
collection I will have two maps of dirty memory, one corresponding to the
time period while R was being filled and a second to the time that C was
filled. Clearly at the end of a minor collection the old C becomes the new
R and a fresh C is allocated, so the previous C-map becomes the new R-map
and the new C-map starts off fresh and clean. These maps will not be needed
or used during a major collection.
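
The rotation of the two maps at the end of a minor collection can be sketched
like this, assuming one flag per protected segment of memory. The names
DirtyMaps, note_write and end_minor_gc are hypothetical, and how a write is
actually detected (storage protection is the plan here) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct DirtyMaps {
    std::vector<bool> r_map;  // segments written while R was being filled
    std::vector<bool> c_map;  // segments written while C is being filled

    explicit DirtyMaps(std::size_t segments)
        : r_map(segments, false), c_map(segments, false) {}

    // Would be called from whatever mechanism notices the first write
    // to a segment.
    void note_write(std::size_t segment) {
        assert(segment < c_map.size());
        c_map[segment] = true;
    }

    // The old C becomes the new R, so the old C-map becomes the new R-map
    // and the new C-map starts off clean.
    void end_minor_gc() {
        r_map.swap(c_map);
        c_map.assign(c_map.size(), false);
    }
};
```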

Now I will be explicit about the expectations that I have that lead to this
plan. They are
(1) Much material that is allocated will only remain active for a short
while. The time that it takes to fill up the C page will be long enough that
when C becomes full everything in R will have had time for this infant
mortality to take effect, and so on a minor garbage collection a substantial
fraction of R will be garbage and hence does not get copied into S.
(2) In CSL the only place where ambiguous pointers reside is on the stack.
An especially large number of entries towards the top of the stack will
refer to data that is just in the process of being worked on, and this will
mean that most ambiguous pointers that refer to anything at all will refer
to locations within C. In particular I hope that there will be few ambiguous
references into R. Neither minor nor major garbage collection will relocate
data that is in C, so my expectation and hope is that there will be only
minor disruption to storage elsewhere because of conservatism.
(3) The schemes I have for identifying dirty regions of memory are based on
storage protection and accepting an exception when a region is first written
to. Because subsequent memory access is unimpeded I expect this will have low
overhead. Memory protection is performed at a granularity substantially
smaller than the size of whole pages. My expectation is that almost all
writes to memory will be in either C or in symbol headers. Symbol headers will
be distributed across S, but I can imagine arranging that the major
garbage collection would copy all symbol headers from the oblist in such
a way as to leave them as a compact block (along with the vectors that
represent the object list itself). The hope is that the amount of memory
identified as dirty will be a rather small fraction of the full size of the
heap.
(4) Given an arbitrary bit pattern it will first be possible to tell if it
could be an address within any active page of the heap, and if it is then
the starting address and Lisp type of any item it points within can be
discerned reasonably efficiently.
(5) Given a region of memory within a page it will be possible to identify all
Lisp objects that overlap it, and on that basis scan all pointers that flow
out from it. Maybe the key issue here is when the region covers the end but
not the start of some Lisp vector and it is thus necessary to search back
through lower-address parts of a page to find the start and hence the length
of the vector. The assumption here requires consideration of any cases where
objects in memory are not strictly laid out one after the other but where
there are gaps between them.
(6) With a conservative collector it is necessary to leave some items
unrelocated in a manner that leaves the free space at the end of garbage
collection fragmented. In pathological cases this could lead to major
inefficiency in attempts to allocate vectors, and premature failure to
allocate. This may include failure to allocate while performing the
copying operations of a major garbage collection! Perhaps I can manage a
temporary recovery if I find myself gummed up during garbage collection by
just giving up and not evacuating some data. That way I can at least get
back to normal operation, albeit without enough memory freed up to be
able to start any subsequent garbage collection with any confidence at all.
But that may be sufficient to allow me to display a diagnostic about
running out of memory and then close down. Note that within CSL I allocate
vectors of size up to (1/4) of the page size, and allowing for the fact
that pages contain headers and bitmaps this means that (only) up to 3
maximum size vector chunks can fit on one page, and a wasted gap can be left.
I need to consider whether vector allocation should have two strands: the
simple one being linear allocation at the end of the current page, but
building any space that has to be skipped either because of pinned data or
the granularity of the pages into a free-chain, and then an alternative scheme
that allocates from within this free-chain so that smaller allocations can
be used to fill in the gaps. I think that sort of plan may be especially
useful for the allocations that are performed for material that is being
copied during a major garbage collection.
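
The two-strand vector allocation considered in (6) might look roughly like the
sketch below. It works in byte offsets within one page, assumes the pinned
blocks are supplied sorted and non-overlapping, and uses first-fit on the
free-chain. The names Block, VectorPage, alloc_linear and alloc_from_chain
are hypothetical, and first-fit is just one possible policy.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

struct Block { std::size_t start, size; };

struct VectorPage {
    std::size_t fringe = 0;         // next address for linear allocation
    std::size_t limit;              // end of the usable part of the page
    std::vector<Block> pinned;      // sorted by start, non-overlapping
    std::size_t next_pinned = 0;    // first pinned block at or after fringe
    std::vector<Block> free_chain;  // gaps that linear allocation skipped

    VectorPage(std::size_t page_bytes, std::vector<Block> pins)
        : limit(page_bytes), pinned(std::move(pins)) {}

    // Strand 1: linear allocation at the fringe. A gap too small for the
    // request is put on the free-chain and the fringe moves past the
    // pinned block that follows it.
    std::optional<std::size_t> alloc_linear(std::size_t n) {
        assert(n > 0);
        for (;;) {
            std::size_t barrier =
                next_pinned < pinned.size() ? pinned[next_pinned].start : limit;
            if (fringe + n <= barrier) {
                std::size_t result = fringe;
                fringe += n;
                return result;
            }
            if (barrier > fringe) free_chain.push_back({fringe, barrier - fringe});
            if (next_pinned == pinned.size()) {
                fringe = limit;     // page exhausted for linear allocation
                return std::nullopt;
            }
            fringe = pinned[next_pinned].start + pinned[next_pinned].size;
            ++next_pinned;
        }
    }

    // Strand 2: first-fit allocation from the gaps on the free-chain.
    std::optional<std::size_t> alloc_from_chain(std::size_t n) {
        assert(n > 0);
        for (Block& b : free_chain) {
            if (b.size >= n) {
                std::size_t result = b.start;
                b.start += n;
                b.size -= n;
                return result;
            }
        }
        return std::nullopt;
    }
};
```

A caller could try alloc_from_chain first for small requests and fall back to
alloc_linear, or the other way round. Which order wastes less space is one of
the things that would need measuring.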


The overall pattern for a minor garbage collection will be
(1) Clear the pinned map for R
(2) Scan all ambiguous bases (ie the stack) and if any item there could
    be a pointer into R then set the pinned bit against the head of the
    relevant object. Keep a list of all those objects in the "pinned table".
(3) Scan all unambiguous bases, all locations within objects that are
    in regions of memory dirty since R was set up and all locations within
    the pinned items in R. In each case if the reference is to a non-pinned
    item in R then on the first visit evacuate that item into S, and on
    a subsequent visit just use the forwarding pointer set up on the first
    visit. Update the base. Note that I expect all of C to be dirty, and
    so scanning it may perhaps be done more elegantly than checking every
    part of it for dirty bits.
(4) Scan the material newly placed in S. If references into R are found
    then evacuate more or follow the forwarding address. This may expand
    the region in S that is used - grab further pages for it as needed.
    Stop when the scan has covered everything moved into S.
(5) Now the only live data in R should be the things that are pinned.
    Use the pinned table to build the structures within it that support
    allocation. [Note: after this step the pin map and table are both no
    longer needed].
(6) Check if S is now over-full and if so trigger a major garbage collection.
(7) Swap the interpretation of C and R, and update the dirty maps to match.
    Well the dirty map issue is maybe ugly. Any part of S that has been
    updated such that it refers into C must now be tagged in the map, and
    all of what used to be R but is now C can have map info cleared.
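
Steps (2) to (4) can be illustrated with a toy model in which R and S are
arrays of CONS-like cells and a reference is a (space, index) pair. This is a
deliberately simplified sketch: pinning is assumed to have been done already,
dirty regions and the scan of C are left out, and the names Space, Ref, Cell,
Heap, relocate and minor_gc are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Space { Nil, R, S };
struct Ref { Space space = Space::Nil; std::size_t index = 0; };

struct Cell {
    Ref car, cdr;
    bool pinned = false;     // set in step (2) by the ambiguous scan
    bool forwarded = false;  // set once the cell has been evacuated
    Ref forward;             // new location in S, valid when forwarded
};

struct Heap {
    std::vector<Cell> r, s;

    // Evacuate an unpinned cell in R on the first visit and use the
    // forwarding pointer on later visits. Anything else is left alone.
    Ref relocate(Ref p) {
        if (p.space != Space::R) return p;
        Cell& c = r[p.index];
        if (c.pinned) return p;
        if (!c.forwarded) {
            s.push_back(Cell{c.car, c.cdr});
            c.forwarded = true;
            c.forward = Ref{Space::S, s.size() - 1};
        }
        return c.forward;
    }

    void minor_gc(std::vector<Ref>& roots) {
        std::size_t scan = s.size();  // only newly copied cells need scanning
        // Step (3): unambiguous bases, then the contents of pinned items.
        for (Ref& root : roots) root = relocate(root);
        for (Cell& c : r) {
            if (c.pinned) {
                c.car = relocate(c.car);
                c.cdr = relocate(c.cdr);
            }
        }
        // Step (4): scan what was copied into S until nothing new appears.
        // Indices are used because relocate may grow s.
        while (scan < s.size()) {
            Ref car = relocate(s[scan].car);
            Ref cdr = relocate(s[scan].cdr);
            s[scan].car = car;
            s[scan].cdr = cdr;
            ++scan;
        }
    }
};
```

After minor_gc the only cells in R that still matter are the pinned ones,
which is the state that step (5) starts from.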


A major garbage collection has a slightly simpler structure because while it
must cope with ambiguous pointers it processes all data apart from C.
(1) Clear pinned map for R and S.
(2) Scan ambiguous bases, marking items referred to in R or S as pinned.
    Build a table of the pinned items first using the pinned-table and if
    that overflows building a linked list in pages from F.
    If there had been a list of pinned items left in S by the previous
    collection then scan down it clearing any pinned bits on its entries,
    because that data is now not needed.
(3) Scan unambiguous bases and pointers out of C relocating anything except
    references into C or to things that are pinned. This copies material to
    new pages in F.
(4) Scan the new material in F much as step (4) in the minor case.
(5) Using information about pinned data in the table and any overflow list
    build up freespace tables/maps/chains in all the blocks from S.
(6) Swap interpretation of S and F, and allocate a new empty block for R
    (which in step 7 of the minor GC will then instantly become the new C).
(7) Consider dirty bits. What is needed is to mark any segment of memory
    containing a reference to C as dirty, and all others as clean. I rather
    hope to be able to build up that information as part of steps (3) and
    (4) since they already need to test for references into C.
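
The end state wanted by step (7) can be sketched as below. For simplicity this
treats every word as a potential pointer, represents C as a single address
range and does the work as a separate pass, whereas the hope expressed above
is to fold it into steps (3) and (4). The names AddressRange,
rebuild_dirty_map and the segment size are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kSegmentWords = 4;  // illustrative granularity only

struct AddressRange {
    std::uintptr_t lo, hi;  // half-open range [lo, hi)
    bool contains(std::uintptr_t p) const { return lo <= p && p < hi; }
};

// Returns one flag per segment of `words`. A flag is true exactly when the
// segment holds at least one value that refers into `current`, so every
// other segment ends up clean.
std::vector<bool> rebuild_dirty_map(const std::vector<std::uintptr_t>& words,
                                    AddressRange current) {
    std::size_t segments = (words.size() + kSegmentWords - 1) / kSegmentWords;
    std::vector<bool> dirty(segments, false);
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (current.contains(words[i])) dirty[i / kSegmentWords] = true;
    }
    return dirty;
}
```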
 