Simulation Genomics
Main Program

Initial Project: The Wright-Fisher Model

-e output file (default: terminal)
-r number of repeats (default: 3)
-g number of generations (default: 10)
-n number of individuals (default: 100)
no flag: allele frequencies (default: 0.1 0.2 0.3 0.4)

Initially we build only a simple population genetics model, the Wright-Fisher model: there is a population of N individuals, which mate randomly and "mix" their DNA to produce the next generation of N individuals.

Lack of Biology

We model only genetic drift: there is no selection, mutation or migration, and no change in population size. For this basic model we do not need to consider the sex of the individuals, and we can even consider only the 2N alleles, ignoring the fact that they are grouped in pairs.

lackBiology.png
Simple Wright-Fisher model
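
The simplification described above (treating the population as a pool of 2N allele copies) can be sketched as follows. This is only a minimal illustration, not the project's actual code: the function names `drift_step` and `simulate` are hypothetical, and we assume that each allele copy of the next generation is drawn independently from the current frequencies.

```python
import random
from collections import Counter

def drift_step(freqs, n_individuals, rng=random):
    """Return the allele frequencies of the next generation under pure drift."""
    n_copies = 2 * n_individuals  # N diploid individuals carry 2N allele copies
    # Assumption: every copy is drawn independently, weighted by the current frequencies.
    draws = rng.choices(range(len(freqs)), weights=freqs, k=n_copies)
    counts = Counter(draws)
    return [counts[i] / n_copies for i in range(len(freqs))]

def simulate(freqs, n_individuals, n_generations, seed=None):
    """Return the list of frequency vectors, one per generation (initial one included)."""
    rng = random.Random(seed)
    history = [list(freqs)]
    for _ in range(n_generations):
        freqs = drift_step(freqs, n_individuals, rng)
        history.append(freqs)
    return history

# Example with the default values listed above: 100 individuals, 10 generations.
for generation, f in enumerate(simulate([0.1, 0.2, 0.3, 0.4], 100, 10, seed=1)):
    print(generation, f)
```

Note that an allele whose frequency reaches 0 can never come back in this sketch, which is what makes fixation possible.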

Random Mating:

randomMating.png

Results & Plots

RPlots.png
Graphs showing the evolution of the frequencies of the different alleles (starting with 0.8/0.2 and 0.5/0.5, respectively)
FixationTimes.png
Fixation times of simulations with different population sizes, initial frequencies, numbers of generations and numbers of repeats

Mutation Extension

-m mutation probability (default: 0)

To introduce mutation (each allele copy mutates with probability m), we subtract from the number of copies of each allele $ N_i $ the number of copies that mutate away, $ N_i*m $, and add the number of copies that mutate into allele i. The latter is the number of copies of the other alleles times m, $ (N_{tot}-N_i)*m $, divided by the number of alleles they can mutate into, $ A-1 $, where A is the number of distinct alleles (an allele cannot mutate into itself). So: $ N_i'= N_i - N_i*m +\frac{(N_{tot}-N_i)*m}{A-1} $. Finally we divide by the total number of alleles $ N_{tot} $ to obtain frequencies:

\[ f_i'= f_i - f_i*m +\frac{(1-f_i)*m}{A-1}= f_i(1-m-\frac{m}{A-1}) + \frac{m}{A-1} = f_i(1-\frac{A*m}{A-1}) + \frac{m}{A-1} \]
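
As an illustration, the last form of the formula above translates directly into a few lines of code. This is a sketch only: the function name `mutate` is hypothetical, and we assume the frequencies are stored as a plain list, one entry per allele.

```python
def mutate(freqs, m):
    """Apply f_i' = f_i * (1 - A*m/(A-1)) + m/(A-1) to every allele frequency."""
    a = len(freqs)  # A: number of distinct alleles
    if a < 2:
        # With a single allele there is nothing to mutate into.
        return list(freqs)
    keep = 1 - a * m / (a - 1)  # factor applied to the old frequency
    gain = m / (a - 1)          # constant term m/(A-1) of the formula
    return [f * keep + gain for f in freqs]

# Two alleles at 0.8/0.2 with m = 0.1:
# 0.8 * (1 - 0.2) + 0.1 = 0.74 and 0.2 * (1 - 0.2) + 0.1 = 0.26
print(mutate([0.8, 0.2], 0.1))
```

The new frequencies still sum to 1, because what each allele loses is redistributed evenly among the A-1 others.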

Natural Selection Extension

-s allele fitnesses (default: 0 0 0 ...)

To introduce natural selection we multiply $ f_i $ by $ 1+s_i $, where the selection coefficient $ s_i $ can range from -1 to infinity (0 means no selection). To keep the population size constant (frequencies summing to 1), we then divide by $ \sum f_k*(1+s_k) $, so:

\[ f_i'= \frac{f_i*(1+s_i)}{\sum f_k*(1+s_k)} = \frac{f_i*(1+s_i)}{\sum f_k +\sum f_ks_k} = \frac{f_i*(1+s_i)}{1 +\sum f_ks_k} \]
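
The selection step can be sketched in the same way. The function name `select` is hypothetical, and we assume `s` holds one coefficient per allele, in the same order as the frequencies.

```python
def select(freqs, s):
    """Apply f_i' = f_i * (1 + s_i) / (1 + sum_k f_k * s_k)."""
    # Normalisation term: 1 + sum of f_k * s_k, valid when the frequencies sum to 1.
    norm = 1 + sum(f * sk for f, sk in zip(freqs, s))
    return [f * (1 + sk) / norm for f, sk in zip(freqs, s)]

# Two alleles at 0.5/0.5, the first one with s = 1 (twice as fit):
# norm = 1 + 0.5 * 1 = 1.5, so the new frequencies are 1.0/1.5 and 0.5/1.5.
print(select([0.5, 0.5], [1.0, 0.0]))
```

With all coefficients at 0 (the default), the frequencies are unchanged, and an allele with $ s_i = -1 $ disappears after one step.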

Bottleneck Extension

-b bottleneck effect flag (default: no)

The bottleneck effect is a sharp reduction in population size due to environmental events (earthquakes, fires ...) or human activities (genocides ...). It greatly amplifies genetic drift, because the few survivors are a random sample of the original population. In this extension, the program creates a natural disaster which randomly kills 90% of the population.

bottleneck.png
Bottleneck diagram
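
To give an idea of how strongly such an event can shift the frequencies, here is a minimal sketch. It rests on assumptions that are not taken from the program itself: the population is represented as a list of allele copies, the function names `bottleneck` and `frequencies` are hypothetical, and the disaster is modelled as keeping a uniformly random 10% of the copies, with at least one survivor.

```python
import random
from collections import Counter

def bottleneck(population, survival_rate=0.1, rng=random):
    """Keep a uniformly random fraction of the population (10% by default)."""
    n_survivors = max(1, round(len(population) * survival_rate))
    # Sampling without replacement: every member is equally likely to survive.
    return rng.sample(population, n_survivors)

def frequencies(population, n_alleles):
    """Return the frequency of each allele index in the population."""
    counts = Counter(population)
    return [counts[i] / len(population) for i in range(n_alleles)]

# 200 allele copies at 0.8/0.2 before the disaster, 20 copies after it.
before = [0] * 160 + [1] * 40
after = bottleneck(before, rng=random.Random(0))
print(frequencies(before, 2), frequencies(after, 2))
```

Because only a handful of copies survive, the frequencies after the disaster can differ noticeably from those before it, even though no allele was favoured.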

Sickle Cell and Malaria Extension

-a sickle cell anemia flag (default: no)

In this "main" extension we wanted to build a population genetics model more complex than the basic one. We chose to study the population genetics of sickle cell disease while taking malaria into account:

anemiaGenetics.png
Tree diagram to find the probability of getting an N and an S