I am trying to use a VCF of SNP variants to substitute BALB/cJ SNPs into the mouse reference genome (GRCm38, C57BL/6J).
After running this command:
java \
-jar ~/programs/GenomeAnalysisTK.jar \
-T FastaAlternateReferenceMaker \
-R ~/genome/mouse_GRCm38.p4/GRCm38.primary_assembly/GRCm38.primary_assembly.fa \
-o ~/BALBcJ.snp.primary.fa \
-V ~/BALB_cJ.snps.vcf
The following ERROR shows up:
ERROR MESSAGE: Input files /home/tiagocastro/BALB_cJ.snps.vcf and reference have incompatible contigs: The contig order in /home/tiagocastro/BALB_cJ.snps.vcf and referenceis not the same; to fix this please see: (https://www.broadinstitute.org/gatk/guide/article?id=1328), which describes reordering contigs in BAM and VCF files..
ERROR /home/tiagocastro/BALB_cJ.snps.vcf contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, X, Y]
ERROR reference contigs = [1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 3, 4, 5, 6, 7, 8, 9, MT, X, Y, JH584299.1, GL456233.1, JH584301.1, GL456211.1, GL456350.1, JH584293.1, GL456221.1, JH584297.1, JH584296.1, GL456354.1, JH584294.1, JH584298.1, JH584300.1, GL456219.1, GL456210.1, JH584303.1, JH584302.1, GL456212.1, JH584304.1, GL456379.1, GL456216.1, GL456393.1, GL456366.1, GL456367.1, GL456239.1, GL456213.1, GL456383.1, GL456385.1, GL456360.1, GL456378.1, GL456389.1, GL456372.1, GL456370.1, GL456381.1, GL456387.1, GL456390.1, GL456394.1, GL456392.1, GL456382.1, GL456359.1, GL456396.1, GL456368.1, JH584292.1, JH584295.1]
To fix this, I used the Perl script from that link to sort the VCF into the same contig order as the reference:
./sortByRef.pl \
~/BALB_cJ.snps.vcf \
/home/tiagocastro/genome/mouse_GRCm38.p4/GRCm38.primary_assembly/GRCm38.primary_assembly.fa.fai > ~/BALB_cJ.snps_sorted.vcf
With the new VCF file, I get a different error:
ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file '/home/tiagocastro/BALB_cJ.snps_sorted.vcf' could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types:
ERROR Name FeatureType Documentation
ERROR BCF2 VariantContext (this is an external codec and is not documented within GATK)
ERROR VCF VariantContext (this is an external codec and is not documented within GATK)
ERROR VCF3 VariantContext (this is an external codec and is not documented within GATK)
Comparing the head of the original VCF with that of the sorted one, I can see they differ: the sorted file has no header.
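To make clear what I expected the sorting step to produce, here is a rough Python sketch. It is not the sortByRef.pl script, and the function names `sort_vcf_by_fai` and `sort_vcf_file` are hypothetical names of my own. It assumes a plain-text (uncompressed) VCF small enough to hold in memory, and a `.fai` index whose first column is the contig name. It keeps every `#` header line at the top and orders the records by the contig order in the `.fai`, then by position. I have not checked whether any `##contig` lines in the header would also need reordering for GATK to accept the file.

```python
def sort_vcf_by_fai(vcf_lines, fai_lines):
    """Return VCF lines with the header first and records in .fai contig order."""
    # Rank each contig by the row it occupies in the .fai index
    # (the first tab-separated column is the contig name).
    rank = {}
    for i, line in enumerate(fai_lines):
        if line.strip():
            rank[line.split("\t")[0].strip()] = i

    header, records = [], []
    for line in vcf_lines:
        if line.startswith("#"):
            # Meta-information lines (##...) and the #CHROM column line are kept as-is.
            header.append(line)
        elif line.strip():
            chrom, pos = line.split("\t", 2)[:2]
            if chrom not in rank:
                raise ValueError("contig %r is not in the reference index" % chrom)
            records.append((rank[chrom], int(pos), line))

    # Order by reference contig rank, then by position within the contig.
    records.sort(key=lambda rec: (rec[0], rec[1]))
    return header + [rec[2] for rec in records]


def sort_vcf_file(vcf_path, fai_path, out_path):
    """File-based wrapper: read the VCF and .fai, write the sorted VCF."""
    with open(fai_path) as fai, open(vcf_path) as vcf:
        sorted_lines = sort_vcf_by_fai(vcf.readlines(), fai.readlines())
    with open(out_path, "w") as out:
        out.writelines(sorted_lines)
```

With my files, the call would be something like `sort_vcf_file("BALB_cJ.snps.vcf", "GRCm38.primary_assembly.fa.fai", "BALB_cJ.snps_sorted.vcf")`, which should give a sorted file that still starts with the `##fileformat` line.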
Can someone help me?