next up previous

5.2 Garbage Collection

The initial motivation for using the ``free locus pointer'' flp was that each new mutation needs to be inserted at a locus that does not contain any other mutations. The straightforward way to make sure this happens is to keep extending the genome and insert new mutations at the locus pointed to by flp. The drawback to this simple approach is that the genome keeps growing as the simulation progresses, and along with it the time to execute procedures that depend on the length of the genome.

Another method for inserting mutations at loci that do not contain other mutations is to search for homozygous loci.

If a locus is homozygous wild (all 0s), no matter if it is to the left or right of flp, we can safely insert new mutations at that locus. Loci to the left of flp often revert to homozygous wild state. For example, a new mutation introduced into a population will not be passed on if the individual that carries it does not reproduce or if none of its offspring survive. Thus this locus can be ``reused'' the next time we need to insert a new mutation.

If a locus is a homozygous mutant, i.e. all 1s, then the mutation is fixed and is automatically part of every new individual created from that point on. We can reuse these loci, also, by keeping a count of the number of fixed mutations. Let nf be the number of fixed mutations. Every time a mutation becomes fixed -- i.e. the corresponding locus has two 1 bits in every individual in the population -- we can increment nf by 1 and clear all the 1s to 0s. With this modification, the equation for the relative fitness of an individual is

where n1 is the number of heterozygous mutations in the individual, n2 is the number of homozygous (double) mutations in this individual, and nf is the number of fixed mutations in the population. (Don't throw away the procedure that uses the old fitness function; it will be resurrected and used again if you implement the third optimization, described below.)