Hello everyone,
I am trying to run the recalibration stage of SNP calling on whole-genome sequencing data, but my reference genome does not have a known-sites VCF file. I therefore removed the -knownSites option from my command line, and I now encounter the following error (picture in attachment):
My question is: for reference genomes that do not have a known-sites VCF file, is it necessary to perform the recalibration step?
Code I ran:
java -jar /home/m.rafiepour222/GenomeAnalysisTK-3.8-1-0-gf15c1c3ef/GenomeAnalysisTK.jar -R /home/m.rafiepour222/GCF_000471725.1_UMD_CASPUR_WB_2.0_genomic.fa -T BaseRecalibrator -I /home/m.rafiepour222/1_BBKHU01_F/1_BBKHU01_F.sort.rmdup.bam -o /home/m.rafiepour222/1_BBKHU01_F/1_BBKHU01_F.grp
My Error:
As seen in the image, the error relates to the same missing known-sites VCF file.
It can be skipped, as discussed here (http://evodify.com/gatk-the-best-practice-for-genotype-calling-in-a-non-model-organism/). The OP raised a similar issue (base recalibration using variant information for a non-model organism) in the GATK forum: https://gatkforums.broadinstitute.org/gatk/discussion/4164/base-re-calibration-when-i-dont-have-a-publicly-available-dbsnp. The OP proposed a method, but it did not work. The suggestion in the OP's post (mentioned above) was to skip base recalibration. I guess the same would work for variant recalibration, as both steps use known variant information.
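If you do skip base recalibration, the next step would be to call variants directly on the sorted, duplicate-removed BAM. Below is a minimal sketch of what that could look like with the same GATK 3.8 jar, reusing the paths from the question. The output file name (`1_BBKHU01_F.raw.vcf`) and the `run` helper are my own assumptions for illustration, and I am assuming the BAM is indexed and has read groups. By default the script only prints the command; set `RUN=1` to actually execute it.

```shell
#!/bin/sh
# Sketch: skip BaseRecalibrator and run HaplotypeCaller (GATK 3.8 syntax)
# directly on the sorted, duplicate-removed BAM.
set -eu

GATK_JAR=/home/m.rafiepour222/GenomeAnalysisTK-3.8-1-0-gf15c1c3ef/GenomeAnalysisTK.jar
REF=/home/m.rafiepour222/GCF_000471725.1_UMD_CASPUR_WB_2.0_genomic.fa
BAM=/home/m.rafiepour222/1_BBKHU01_F/1_BBKHU01_F.sort.rmdup.bam
# Hypothetical output name, not from the original post.
OUT_VCF=/home/m.rafiepour222/1_BBKHU01_F/1_BBKHU01_F.raw.vcf

# Hypothetical helper: print the command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Variant calling without a prior recalibration step.
run java -jar "$GATK_JAR" \
    -T HaplotypeCaller \
    -R "$REF" \
    -I "$BAM" \
    -o "$OUT_VCF"
```

Run it once without `RUN=1` to check that the printed command and paths look right before starting the real job.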
Many thanks for your reply.
Yes, the OP raised a similar issue, but I think the proposed method is both difficult and does not work. I have tried many times without getting any result, and this worries me. I do not know what to do.
The OP suggested skipping base recalibration.
Is this approach approved by the GATK team? If so, can you point me to a document that confirms it?