Hi all, I have extracted the depth of coverage (DP) for some of my populations from the VCF file. Each population has 11 individuals (columns) and 11 million SNPs (rows). I have converted each one into a data.frame and replaced missing values with NA. The first few rows of my data.frame look like this:
> head(pop1)
  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11
1  7  3 NA NA 10 NA NA NA NA  NA  NA
2 14 11  7 NA 12  3  4  5 14   3   6
3 13 11  7 NA 11  4 NA  4 13   3   4
4  3 NA  4  5  4 NA NA  6 17  NA   7
5  3 NA  5  5  4 NA NA  7 20  NA   8
6  6 NA  3  6 NA NA NA  5 16  NA  10
For each column (individual), I want to calculate the proportion of SNPs that have DP greater than 5. I am a bit confused about how to do this in R. I know there are many R experts here; can someone help me with this?
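For anyone reading along, here is a minimal sketch of one way to compute this in base R. It rebuilds only the six rows shown above as toy data, and the names `threshold`, `prop_called` and `prop_all` are made up for illustration. "Proportion" can mean two things here, so both are shown: the proportion among SNPs with a non-missing DP, and the proportion among all SNPs (counting NA as not passing). Which one is wanted is an assumption the analyst has to settle.

```r
# Toy data copied from the head(pop1) output above (first 6 SNPs only).
pop1 <- data.frame(
  V1  = c(7, 14, 13, 3, 3, 6),
  V2  = c(3, 11, 11, NA, NA, NA),
  V3  = c(NA, 7, 7, 4, 5, 3),
  V4  = c(NA, NA, NA, 5, 5, 6),
  V5  = c(10, 12, 11, 4, 4, NA),
  V6  = c(NA, 3, 4, NA, NA, NA),
  V7  = c(NA, 4, NA, NA, NA, NA),
  V8  = c(NA, 5, 4, 6, 7, 5),
  V9  = c(NA, 14, 13, 17, 20, 16),
  V10 = c(NA, 3, 3, NA, NA, NA),
  V11 = c(NA, 6, 4, 7, 8, 10)
)

threshold <- 5  # hypothetical name; DP must be strictly greater than this

# pop1 > threshold is a logical matrix (TRUE / FALSE / NA), one cell per SNP x individual.
passes <- pop1 > threshold

# Option 1: proportion among SNPs with a non-missing DP in that individual.
prop_called <- colMeans(passes, na.rm = TRUE)

# Option 2: proportion among all SNPs, treating a missing DP as not passing.
prop_all <- colSums(passes, na.rm = TRUE) / nrow(pop1)

print(round(prop_called, 3))
print(round(prop_all, 3))
```

With 11 million rows, `pop1 > threshold` builds a full logical matrix in memory. If that is a problem, `sapply(pop1, function(x) mean(x > 5, na.rm = TRUE))` gives the same result as option 1 while working through one column at a time.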
Thanks @Kevin Blighe, I used something like this which worked