next up previous

5.2 Garbage Collection     continued...

Given the representation of individuals as pairs of bit strings, detecting homozygous loci is straightforward. Let g[i][j][k] be the kth strand in the jth word in the genome for the ith individual in the current generation g. Since there are two strands per individual, . For each word in the genome, compute the bitwise AND across all strands:

for (j = 0; j < gl; j++) {
  a[j] = 0;
  for (i = 0; i < nsur; i++)
    a[j] &= (g[i][j][0] & g[i][j][1]);
}
Now there is a fixed mutation wherever a[j] has a 1 bit; for each such locus go back through the population and set the locus to 0, and also increment the fixed mutation counter.

A similar loop can be used to detect homozygous wild loci: simply take the bitwise OR of each segment, and then look for bits that are 0 in the resulting word.

To implement a ``garbage collection'' algorithm that looks for loci that can be reused, periodically call a procedure that builds a data structure that contains pointers to loci that can be reused. One method is to keep a ``bitmap'', a collection of words that is as long as the genome. The ith bit in the bitmap will be 1 if locus i is free, and 0 if locus i is already in use. The bitmap can be created using the bitwise AND and OR operations described above. The add_mutations procedure should then be modified so that to add n mutations it should first look in the bitmap to see if there are at least n 1 bits, and if so, OR those them with the genome being mutated and clear them out of the bitmap. If there are insufficient bits in the bitmap then the procedure should fall back on the previous method of adding new mutations at the point indicated by flp and incrementing flp.