All code for recreating simulations and analyses in paper New guidance for ex situ gene conservation- sampling realistic population systems and accounting for collection attrition

smhoban/BSAS_sampling

BSAS_sampling

All code for recreating simulations and analyses in the paper New guidance for ex situ gene conservation- sampling realistic population systems and accounting for collection attrition

As explained in the paper's Methods, the process consists of three phases. First, the population system is simulated; more specifically, many simulations are run over a wide parameter space. Second, each simulated system is sampled at different levels of effort (number of individual plants sampled). Third, the genetic diversity in the sample is compared to the genetic diversity in the entire system to determine what proportion of alleles (genetic variants) are actually present (and thus conserved) in the sample. This calculation is done for several types of alleles, both for the criterion of preserving at least one copy and for the criterion of preserving multiple (5, 10, 25, and 50) copies of each allele. The following text provides more detail and instructions for each of these components of the work. Below that is a series of explanations of the code for creating the Figures and other calculations.
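The comparison in the third phase can be illustrated with a small sketch. This is not the repository's R code: it is a Python toy, assuming a single locus and sampling allele copies directly rather than diploid individuals. The function name `proportion_captured` and the toy population are made up for illustration.

```python
import random
from collections import Counter


def proportion_captured(population, sample, min_copies=1):
    """Fraction of the population's distinct alleles that appear in the
    sample at least `min_copies` times."""
    population_alleles = set(population)
    sample_counts = Counter(sample)
    captured = sum(
        1 for allele in population_alleles if sample_counts[allele] >= min_copies
    )
    return captured / len(population_alleles)


# Toy single-locus population (made-up counts): one common allele,
# one intermediate allele, and two rare alleles.
population = ["A"] * 700 + ["B"] * 250 + ["C"] * 40 + ["D"] * 10

rng = random.Random(1)
sample = rng.sample(population, 50)  # sampling effort: 50 allele copies

# Same copy-number criteria as in the text: 1, 5, 10, 25, and 50 copies.
for copies in (1, 5, 10, 25, 50):
    print(copies, proportion_captured(population, sample, min_copies=copies))
```

Raising `min_copies` can only lower (or leave unchanged) the captured proportion, which is why the multi-copy criteria call for larger samples.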

Step 1: Simulations

First you must initiate the simulations. To do this, run the file "do_sim_BSAS.R". You can edit this file to change the parameter values or the number of simulations. This code will loop over all parameter combinations, and for each combination it will create the .par files for simcoal (using the function "simc.write") and run the simulations (using the function "simc.run"). These write and run functions are in the src folder, in the files named "write.run.simcoal...".
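The loop over parameter combinations might look roughly like the sketch below. It is written in Python purely for illustration; the real loop lives in "do_sim_BSAS.R". The parameter names, their values, and the folder naming scheme are all hypothetical, and the write and run steps are only marked by a comment.

```python
from itertools import product

# Hypothetical parameter grid; the real names and values are set in do_sim_BSAS.R.
param_grid = {
    "n_pops": [1, 5, 10],
    "pop_size": [100, 1000],
    "migration_rate": [0.001, 0.01],
}
n_reps = 15  # replicates per parameter combination


def build_scenarios(grid):
    """Return one dict per combination of parameter values."""
    names = list(grid)
    return [
        dict(zip(names, values))
        for values in product(*(grid[name] for name in names))
    ]


scenarios = build_scenarios(param_grid)
for index, scenario in enumerate(scenarios, start=1):
    folder = f"scenario_{index:04d}"  # hypothetical folder naming
    # At this point the R code writes the .par file (simc.write)
    # and runs the simulation (simc.run) for this combination.
    print(folder, scenario, "reps:", n_reps)
```

Each combination gets its own folder, which matches the one-folder-per-scenario layout described in the text.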

This will create 1076 folders, each with 15 (or whatever number of reps you choose) simulation output files. You should then copy all of these into one folder (I called this folder "Simulation6").
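Copying the output files into one folder can be done by hand or with a short script. The Python sketch below is one possible way, assuming the output files end in .arp and that file names are unique across scenario folders (it stops with an error if they are not). The helper name `gather_outputs` and the demo folder names are made up.

```python
import shutil
import tempfile
from pathlib import Path


def gather_outputs(source_root, target_dir, pattern="*.arp"):
    """Copy every file matching `pattern` under source_root into target_dir."""
    source = Path(source_root).resolve()
    target = Path(target_dir).resolve()
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.rglob(pattern)):
        if target in path.parents:
            continue  # skip files that already sit inside the target folder
        destination = target / path.name
        if destination.exists():
            # Assumption: names are unique; fail loudly rather than overwrite.
            raise FileExistsError(f"{path.name} already exists in {target}")
        shutil.copy2(path, destination)
        copied.append(path.name)
    return copied


# Demo on a throwaway directory with two made-up scenario folders.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "runs"
    for scenario in ("scen_a", "scen_b"):
        folder = root / scenario
        folder.mkdir(parents=True)
        for rep in (1, 2):
            (folder / f"{scenario}_{rep}.arp").write_text("placeholder")
    copied = gather_outputs(root, Path(tmp) / "Simulation6")
    print(copied)
```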

Step 2 and 3: Sampling and Comparison

Next you do the sub-sampling at different numbers of individuals. Sampling and calculations are performed using the file "BSAS_sampling.R". This will loop over all files that exist in the folder Simulations6. The code is well commented, but I will briefly explain it. It will first convert all files from .arp to .gen format (it will also delete the original .arp files to save space; you can comment out the line with file.remove() to keep them). (Step 2: sampling) It will then loop over all scenarios, calculate the number and type of alleles in the simulated system, and then loop over all sampling efforts. (Step 3: comparison) For each sampling effort it will create a genind object that represents a sample of that effort, and then compare the number of alleles in the sample to the total number of alleles, for all allele categories. The datum recorded is the amount of genetic diversity captured at each sampling effort, for the criteria of 1, 5, 10, 25, and 50 copies of each allele. Back outside the loop over sampling effort, but still inside the loop over scenarios, a higher-level summary is recorded using the gt95() function: the Ni, or sampling effort, required to capture 95% of the alleles in each allele category. This minimum sampling effort is the result reported in the paper.
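The idea behind the gt95() summary can be sketched as follows. This Python function is an illustrative stand-in, not the actual gt95() from the repository, whose exact rules may differ (for example in how it treats non-monotonic curves). The function name `min_effort_for_target` and the numbers in the example are made up.

```python
def min_effort_for_target(efforts, proportions, target=0.95):
    """Return the smallest sampling effort whose captured proportion
    reaches `target`, or None if no effort reaches it."""
    for effort, proportion in sorted(zip(efforts, proportions)):
        if proportion >= target:
            return effort
    return None


# Made-up example: proportion of alleles captured at each sampling effort.
efforts = [10, 25, 50, 100, 200]
captured = [0.60, 0.82, 0.93, 0.96, 0.99]

print(min_effort_for_target(efforts, captured))
print(min_effort_for_target(efforts, captured, target=0.999))
```

Returning None when the target is never reached keeps "not achieved within the tested efforts" distinct from any real sample size.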

This step uses functions in the "sample_funcs_BSAS.R" file (in the main folder), as well as "arp2gen_edit.R" (in the src folder). The first of these (sample_funcs) contains the thresholds for each allele category (e.g. less than 0.05 locally), which you can change if you like. The latter file (arp2gen_edit) is a modification of the conversion script in adegenet, which corrects a small error in the original function. I have also created another function called "conv_arp_gen.R" but have not yet incorporated it.
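As a rough illustration of a frequency threshold, the Python sketch below flags alleles that are present in a population at a frequency below 0.05, the example cut-off mentioned above. It is not the code in "sample_funcs_BSAS.R"; the function name, the data layout, and the frequencies are assumptions made for the example, and the other allele categories are not shown.

```python
LOCAL_RARE_THRESHOLD = 0.05  # example cut-off from the text; editable like the R thresholds


def locally_rare(freqs_by_pop, threshold=LOCAL_RARE_THRESHOLD):
    """For each population, list alleles that are present (frequency > 0)
    but below `threshold` in that population."""
    return {
        pop: sorted(allele for allele, freq in freqs.items() if 0 < freq < threshold)
        for pop, freqs in freqs_by_pop.items()
    }


# Made-up allele frequencies for one locus in two populations.
freqs_by_pop = {
    "pop1": {"A": 0.90, "B": 0.07, "C": 0.03},
    "pop2": {"A": 0.98, "B": 0.00, "C": 0.02},
}

print(locally_rare(freqs_by_pop))
```

Changing `LOCAL_RARE_THRESHOLD` (or passing `threshold=`) changes which alleles fall into the category, which is the kind of adjustment the text describes.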

Figure creation and calculations

The figure creation and major calculations are in the file "BSAS_analysis2.R". This is well commented and fairly self-explanatory. The first three code segments are for Figures 1, 2, and 3. The next segment checks that the bottleneck did reduce allele diversity. The next segment is the additional analysis of the alternate migration model (stepping stone) and a comparison to the default migration model (island model); the comparison calculation is means_caught_mss/means_caught_mim. Next is a very short calculation of the statistic reported in the Discussion: the percent of alleles that are singletons (only one copy). Next is a section for calculating the FSTs, which are reported in the first paragraph of the Results of the paper. Next is the analysis for the one-population scenario, also reported in the first paragraph of the Results. The final section calculates how much the allele frequencies shift, and how many alleles are lost, with different population sizes and bottlenecks; this is reported in the Discussion.
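The singleton statistic is simple enough to show in a few lines. The Python sketch below is only an illustration of the definition (alleles with exactly one copy, as a percent of all distinct alleles); it is not the calculation in "BSAS_analysis2.R", and the function name and the example data are made up.

```python
from collections import Counter


def percent_singletons(allele_copies):
    """Percent of distinct alleles that occur exactly once."""
    counts = Counter(allele_copies)
    singletons = sum(1 for count in counts.values() if count == 1)
    return 100 * singletons / len(counts)


# Made-up allele copies: B and D are singletons, A and C are not.
allele_copies = ["A", "A", "B", "C", "C", "D"]
print(percent_singletons(allele_copies))
```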
