Hi all,
I have exome sequencing data for 6 samples and am calling variants with GATK. I call variants for each sample individually, across all sites in the genome (using the dbSNP_135.vcf file). I am also running VQSR on the same data, one sample at a time, with all the training sets that GATK recommends for whole-genome data.
What I observe in the output is that the Ti/Tv ratio is very low compared with what is expected from exome data, and the tranche plots show a lot of false positives. I have the following two questions:
1) Should I call variants on all the samples together, or can I do it for each sample individually?
2) Can I call variants for exome data across all sites of the whole reference genome, or do I need to provide a list of the regions that were captured?
Thanks in advance,
Aanchal
Have you done any filtering of the resulting variants? You'll most likely need to do so.
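One way to see whether filtering helps is to compute the Ti/Tv ratio yourself from the VCF, before and after filtering. Below is a minimal Python sketch, not a GATK tool. The function name `titv_ratio` and the `passing_only` switch are made up for illustration. It assumes an uncompressed VCF with the standard tab-separated columns and counts only biallelic SNPs.

```python
# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def titv_ratio(vcf_lines, passing_only=True):
    """Count transitions and transversions among biallelic SNPs.

    vcf_lines: any iterable of VCF text lines (e.g. an open file handle).
    passing_only: if True, keep only records whose FILTER is PASS or '.'.
    Returns (n_transitions, n_transversions, ratio); ratio is None if
    there are no transversions.
    """
    ti = tv = 0
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        ref, alt, filt = fields[3].upper(), fields[4].upper(), fields[6]
        if passing_only and filt not in ("PASS", "."):
            continue
        # Keep single-base REF and ALT only: this drops indels,
        # multi-allelic sites (ALT contains a comma) and ALT == '.'.
        if len(ref) != 1 or len(alt) != 1:
            continue
        if ref not in "ACGT" or alt not in "ACGT" or ref == alt:
            continue
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    ratio = ti / tv if tv else None
    return ti, tv, ratio


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python titv.py sample.vcf
    with open(sys.argv[1]) as handle:
        ti, tv, ratio = titv_ratio(handle)
    print(f"Ti={ti} Tv={tv} Ti/Tv={ratio}")
```

Run it on the raw calls and again on the filtered calls. As a rough guide, the commonly quoted expectation is a Ti/Tv of about 3 for exome targets and about 2 for whole-genome calls, so a value well below that in the captured regions suggests many false positives remain.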