9.6 years ago
idedios
Right now I'm trying to change the contig names in the dbSNP VCF to match those of my reference genome (from 1, 2, ..., Y, MT to chrM, chr1, chr2, ...).
I'm currently using JDK 1.7 u79 for compatibility with MuTect 1.7.
I ran SortVcf as follows:
java -jar picard.jar SortVcf \
INPUT=dbsnp.vcf \
OUTPUT=dbsnp.fixed.vcf \
SEQUENCE_DICTIONARY=hg19.dict
Here's SortVcf's output:
Exception in thread "main" java.lang.NullPointerException
at htsjdk.variant.variantcontext.VariantContextComparator.compare(VariantContextComparator.java:84)
at htsjdk.variant.variantcontext.VariantContextComparator.compare(VariantContextComparator.java:21)
at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
at java.util.TimSort.sort(TimSort.java:203)
at java.util.Arrays.sort(Arrays.java:727)
at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:218)
at htsjdk.samtools.util.SortingCollection.add(SortingCollection.java:165)
at picard.vcf.SortVcf.sortInputs(SortVcf.java:154)
at picard.vcf.SortVcf.doWork(SortVcf.java:87)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:187)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)
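One plausible reading of this trace (an assumption, not something the output confirms) is that the comparator met a contig name that is absent from the sequence dictionary, which is what you would expect if the VCF still says 1, 2, ..., MT while hg19.dict says chr1, chr2, ..., chrM. A minimal Python sketch for checking this before running SortVcf is below. The function names are made up for illustration, and it assumes the .dict file uses the usual tab-separated @SQ lines with an SN: field.

```python
def dict_contigs(dict_lines):
    """Collect contig names from the @SQ lines of a sequence dictionary."""
    names = set()
    for line in dict_lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def vcf_contigs(vcf_lines):
    """Return the contig names used by VCF records, in order of first appearance."""
    seen = set()
    ordered = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom = line.split("\t", 1)[0]
        if chrom not in seen:
            seen.add(chrom)
            ordered.append(chrom)
    return ordered


def missing_contigs(vcf_lines, dict_lines):
    """List VCF contigs that do not appear in the sequence dictionary."""
    known = dict_contigs(dict_lines)
    return [c for c in vcf_contigs(vcf_lines) if c not in known]


if __name__ == "__main__":
    # File names taken from the command above; adjust as needed.
    with open("dbsnp.vcf") as vcf, open("hg19.dict") as seq_dict:
        print(missing_contigs(vcf, seq_dict))
```

An empty list means every contig in the VCF is known to the dictionary. Anything else is a name that needs renaming or removing first.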
Have you already added the prefix "chr" to your dbSNP file, and are you using the new file as input? I am not sure if this is the problem, but maybe try the following command first on your dbSNP file:
You can now use the dbSNP_new.vcf file as input for Picard. I also have a Python script that sorts a VCF file based on a given order of chromosomes. You can download it from here and modify it accordingly.
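The script mentioned above is not reproduced in this thread. As a rough idea of what sorting by a given chromosome order can look like, here is a minimal sketch, not the original script. The function name is hypothetical, and it loads all records into memory, which may be too much for a full dbSNP file.

```python
def sort_vcf_lines(lines, contig_order):
    """Sort VCF records by a given contig order, then by position.

    Header lines (starting with '#') keep their original order and come first.
    A record whose contig is not in contig_order raises KeyError, so unknown
    contigs fail loudly instead of being misplaced.
    """
    rank = {name: i for i, name in enumerate(contig_order)}
    header = [l for l in lines if l.startswith("#")]
    records = [l for l in lines if l.strip() and not l.startswith("#")]

    def sort_key(line):
        chrom, pos = line.split("\t", 2)[:2]
        return (rank[chrom], int(pos))

    return header + sorted(records, key=sort_key)


if __name__ == "__main__":
    demo = [
        "##fileformat=VCFv4.1\n",
        "#CHROM\tPOS\tID\tREF\tALT\n",
        "chr2\t5\t.\tA\tG\n",
        "chr1\t20\t.\tC\tT\n",
        "chrM\t3\t.\tG\tA\n",
        "chr1\t7\t.\tT\tC\n",
    ]
    for line in sort_vcf_lines(demo, ["chrM", "chr1", "chr2"]):
        print(line, end="")
```

The contig order would normally be read from the reference's .dict or .fai file, so that the output matches the reference exactly.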
On top of adding the prefix, I have to rename the mitochondrial contig from MT to chrM. Then maybe SortVcf can reorder it.
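For what it's worth, both renames (the "chr" prefix and MT to chrM) can be done in one pass. The sketch below is only an illustration with made-up function names. It assumes the only special case is MT and that any ##contig header lines should be renamed too.

```python
import re

# Matches the ID value inside a header line such as ##contig=<ID=MT,length=16569>
_CONTIG_ID = re.compile(r"^(##contig=<.*?\bID=)([^,>]+)")


def rename_contig(name):
    """Map 1, 2, ..., X, Y to chr1, chr2, ..., chrX, chrY and MT to chrM."""
    if name == "MT":
        return "chrM"
    if name.startswith("chr"):
        return name  # already prefixed, leave untouched
    return "chr" + name


def rename_vcf_line(line):
    """Rename the contig in one VCF line (record or ##contig header)."""
    if line.startswith("##contig="):
        return _CONTIG_ID.sub(
            lambda m: m.group(1) + rename_contig(m.group(2)), line
        )
    if line.startswith("#") or not line.strip():
        return line  # other header lines and blank lines pass through
    chrom, tab, rest = line.partition("\t")
    return rename_contig(chrom) + tab + rest


if __name__ == "__main__":
    # File names follow the ones used in this thread.
    with open("dbsnp.vcf") as src, open("dbSNP_new.vcf", "w") as dst:
        for line in src:
            dst.write(rename_vcf_line(line))
```

The output still has the records in the old order (1, 2, ..., MT), so it needs sorting against the reference dictionary afterwards.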
Thanks!
:-) OR
will not produce a new file. The -i flag will edit the file in place.
Thanks, I really need to learn to use awk and sed.
Yes. They are really handy and pretty fast when it comes to text manipulation. I would strongly suggest learning some basic awk one-liners.