Input data for D-statistics in ANGSD

7.1 years ago

NicoN64 ▴ 30

Hello, I am interested in using the D-statistic to test for ancient admixture or ILS between different species. I would like to use ANGSD, and also Dfoil to infer the direction of gene flow.

I have a question about the type of input data needed.

Most studies compute D and Dfoil statistics from whole-genome sequence data or RADseq, but can the D-statistic also be used with exome capture data? I have thousands of CDS sequences. Does it make sense to concatenate them (~3 Mbp) and run the D-statistic over sliding windows?
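To make the windowing question concrete, here is a minimal sketch of the quantity involved: D = (ABBA - BABA) / (ABBA + BABA), with a delete-one block jackknife over windows for the standard error. This is an illustration only, not ANGSD's or Dfoil's implementation. The function names and the per-window counts are hypothetical, and the jackknife is unweighted, which assumes the windows carry roughly similar amounts of data.

```python
import math


def d_statistic(abba, baba):
    """Patterson's D from total ABBA and BABA site counts."""
    total = abba + baba
    if total == 0:
        raise ValueError("no informative (ABBA or BABA) sites")
    return (abba - baba) / total


def block_jackknife(blocks):
    """Delete-one block jackknife (unweighted, a simplifying assumption).

    blocks: list of (abba, baba) counts, one tuple per window/block.
    Returns (D over all blocks, jackknife standard error, Z-score).
    """
    n = len(blocks)
    tot_abba = sum(a for a, _ in blocks)
    tot_baba = sum(b for _, b in blocks)
    d_all = d_statistic(tot_abba, tot_baba)

    # Recompute D with each block left out in turn.
    leave_one_out = [
        d_statistic(tot_abba - a, tot_baba - b) for a, b in blocks
    ]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    se = math.sqrt(variance)
    z = d_all / se if se > 0 else float("nan")
    return d_all, se, z


if __name__ == "__main__":
    # Hypothetical per-window counts, invented purely for illustration.
    windows = [(12, 8), (9, 11), (15, 5), (10, 10)]
    d, se, z = block_jackknife(windows)
    print(f"D = {d:.3f}, SE = {se:.3f}, Z = {z:.2f}")
```

The jackknife treats blocks as approximately independent, which is the part I am unsure about when the "blocks" are windows over concatenated CDSs rather than contiguous stretches of genome.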

For the input, I mapped reads to a reference genome, which will be the outgroup in my D-statistic test, and obtained sorted BAM files. Do I need to run SNP calling (e.g. with GATK, including base recalibration and removal of indels and duplicate reads) before running the D-statistic in ANGSD?

Thanks for any advice.

next-gen D statistic abbababa test angsd SNP • 2.3k views
