Hi all,
I have done exome sequencing for 6 samples and am now calling variants with GATK. I am calling variants for each sample individually, using all the sites across the genome (dbSNP_135 VCF file). I am also running VQSR on the same single-sample data, with all the training sets that GATK recommends for whole genomes.
What I observe in the output is that the Ti/Tv ratio is very low compared with what is expected from exome data, and the tranche plots show a lot of false positives. I have the following two questions:
1) Should I call variants using all the samples together, or can I do it for each sample individually?
2) Can I call variants for exome data using all the sites from the whole reference genome, or do I need to provide a list of the regions that were captured?
Thanks in advance.
Have you done any filtering of the resulting variants? You'll most likely need to do so.
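As a quick sanity check alongside filtering, it can help to see how Ti/Tv changes when you count only SNPs inside the captured regions and only those that pass your filters. Below is a rough Python sketch of that idea, not part of GATK: the helper names (`load_bed`, `in_targets`, `titv`) are made up for illustration, and it assumes a plain-text, tab-separated VCF plus a BED file of capture targets, counting only biallelic SNPs. Values around 3 for exome targets versus around 2 genome-wide are commonly quoted, so treat those as a rough guide only.

```python
# Sketch only: Ti/Tv for biallelic SNPs in VCF text, optionally restricted
# to capture targets from a BED file. Helper names are hypothetical.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def load_bed(lines):
    """Parse BED lines (0-based, half-open) into {chrom: [(start, end), ...]}."""
    targets = {}
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        targets.setdefault(chrom, []).append((int(start), int(end)))
    return targets


def in_targets(chrom, pos, targets):
    """True if 1-based VCF position `pos` falls inside any BED interval."""
    return any(start < pos <= end for start, end in targets.get(chrom, []))


def titv(vcf_lines, targets=None, pass_only=False):
    """Return (transitions, transversions, ratio); ratio is None if tv == 0."""
    ti = tv = 0
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        ref, alt, flt = fields[3].upper(), fields[4].upper(), fields[6]
        # Keep only simple single-base substitutions (skips indels, multi-allelics).
        if len(ref) != 1 or len(alt) != 1 or ref not in "ACGT" or alt not in "ACGT":
            continue
        if ref == alt:
            continue
        if pass_only and flt != "PASS":
            continue
        if targets is not None and not in_targets(chrom, pos, targets):
            continue
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    return ti, tv, (ti / tv if tv else None)
```

You would call it along the lines of `titv(open("sample.vcf"), load_bed(open("targets.bed")), pass_only=True)`, where both file names are placeholders. If the ratio is much higher on-target than genome-wide, that suggests many of the low-quality calls come from off-target reads.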