Hi all,
We have sequenced multiple individuals of one species on the Illumina platform.
The sequencing depth of our data is: 7 individuals at 60X, and 32 individuals at 2~10X.
I have called SNPs for all of these individuals, and now I want to use the SNP data for further analyses, e.g. population structure, LD, FST, etc.
I got strange results when using all individuals in the population structure analysis: the high-coverage individuals clustered together, although they belong to different sub-populations (some of them are cultivated, others wild). All of the other (low-coverage) individuals clustered together as well.
I have checked the SNP results and found that the high-coverage individuals contain many more SNPs than the low-coverage individuals.
So, should I exclude all of the high-coverage individuals from further analysis?
As an alternative to excluding the higher-coverage individuals, I would try downsampling all individuals to your lowest coverage and then rerunning the analysis.
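To make the downsampling idea concrete, here is a minimal sketch in Python. It only computes, for each sample, the fraction of reads to keep so that its mean depth matches the lowest-coverage sample, and builds a matching `samtools view -s` command (where the integer part of the argument is the random seed and the decimal part is the fraction). The sample names, depths, and BAM file names are hypothetical placeholders, and I am assuming you already have one aligned BAM per individual and an estimate of its mean depth.

```python
def subsample_fraction(current_depth, target_depth):
    """Fraction of reads to keep so mean depth drops to target_depth (capped at 1)."""
    if current_depth <= 0 or target_depth <= 0:
        raise ValueError("depths must be positive")
    return min(1.0, target_depth / current_depth)


def samtools_command(bam_in, bam_out, fraction, seed=42):
    """Build a 'samtools view -s SEED.FRACTION' command as an argument list.

    Returns None when no subsampling is needed (fraction rounds to 1).
    """
    rounded = round(fraction, 4)
    if rounded >= 1.0:
        return None
    if rounded <= 0.0:
        raise ValueError("fraction is too small to express with 4 decimals")
    # Integer part = seed, decimal part = fraction of reads to keep.
    digits = f"{rounded:.4f}".split(".")[1]
    return ["samtools", "view", "-b", "-s", f"{seed}.{digits}",
            "-o", bam_out, bam_in]


if __name__ == "__main__":
    # Hypothetical mean depths per individual; replace with your own estimates.
    depths = {"high_01": 60.0, "low_01": 10.0, "low_02": 2.0}
    target = min(depths.values())  # downsample everyone to the lowest coverage

    for sample, depth in depths.items():
        frac = subsample_fraction(depth, target)
        cmd = samtools_command(f"{sample}.bam", f"{sample}.down.bam", frac)
        if cmd is None:
            print(f"{sample}: already at target depth, keep as is")
        else:
            print(f"{sample}: {' '.join(cmd)}")
```

After downsampling, the SNP calling would need to be repeated on the downsampled BAMs so that all individuals are called from comparable depth before rerunning the population structure analysis.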