Mutation analysis on RNAseq
8.2 years ago
Ron ★ 1.2k

Hi all,

I am doing mutation analysis on RNA-seq data. I used rnaseqmut (https://github.com/davidliwei/rnaseqmut) to call the mutations, and I want to annotate its output with snpEff. I am getting an error while annotating the VCF file.

Command:

snpEff/SnpSift.jar annotate  /BED_GTF_References/dbsnp-V-146-All.vcf  {1}  > $TMPDIR/mutation_anno/{1/.}_dbsnp_annotated.vcf


**ERROR**
VcfFileIterator.parseVcfLine(115):  Fatal error reading file '/tmp/713110.1.standard.q/mutation_anno/top1000.txt' (line: 2):
chr1    13684   C   T   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   2   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: WARNING: Unkown IUB code for SNP '0'
    at ca.mcgill.mcb.pcingola.fileIterator.VcfFileIterator.parseVcfLine(VcfFileIterator.java:116)
    at ca.mcgill.mcb.pcingola.fileIterator.VcfFileIterator.readNext(VcfFileIterator.java:167)
    at ca.mcgill.mcb.pcingola.fileIterator.VcfFileIterator.readNext(VcfFileIterator.java:56)
    at ca.mcgill.mcb.pcingola.fileIterator.FileIterator.hasNext(FileIterator.java:67)
    at ca.mcgill.mcb.pcingola.fileIterator.MarkerFileIterator.hasNext(MarkerFileIterator.java:64)
    at ca.mcgill.mcb.pcingola.snpSift.SnpSiftCmdAnnotate.annotate(SnpSiftCmdAnnotate.java:101)
    at ca.mcgill.mcb.pcingola.snpSift.SnpSiftCmdAnnotate.run(SnpSiftCmdAnnotate.java:290)
    at ca.mcgill.mcb.pcingola.snpSift.SnpSiftCmdAnnotate.run(SnpSiftCmdAnnotate.java:257)
    at ca.mcgill.mcb.pcingola.snpSift.SnpSift.run(SnpSift.java:354)
    at ca.mcgill.mcb.pcingola.snpSift.SnpSift.main(SnpSift.java:69)
Caused by: java.lang.RuntimeException: WARNING: Unkown IUB code for SNP '0'
    at ca.mcgill.mcb.pcingola.vcf.VcfEntry.parseAltSingle(VcfEntry.java:872)
    at ca.mcgill.mcb.pcingola.vcf.VcfEntry.parseAlts(VcfEntry.java:756)
    at ca.mcgill.mcb.pcingola.vcf.VcfEntry.parse(VcfEntry.java:721)
    at ca.mcgill.mcb.pcingola.vcf.VcfEntry.<init>(VcfEntry.java:102)
    at ca.mcgill.mcb.pcingola.fileIterator.VcfFileIterator.parseVcfLine(VcfFileIterator.java:113)

Does anyone have an idea what is causing this? Do I have to filter the VCF file before annotating it?

Thanks,

Ron

RNA-Seq mutation next-gen SNP snpEff • 2.1k views

That quoted line doesn't look like it's in VCF format.

6.9 years ago

chr1 13684 C T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0

This looks like a formatting error, as swbarnes2 commented.

VCF files are formatted like this:

CHROM   POS   ID   REF   ALT ....

Here's the documentation for VCFs https://samtools.github.io/hts-specs/VCFv4.2.pdf

You are missing the "ID" column, which would explain the error: without it, the parser reads C as the ID, T as REF, and the first 0 as ALT, hence "Unkown IUB code for SNP '0'". Your file looks like it's formatted for PLINK or some other software tool. With REF=C and ALT=T you should expect only two alleles, encoded as 0 and 1; your entry contains 0 and 2, which is not valid VCF either. Additionally, if your samples are diploid, the genotypes should be 0/0, 0/1, or 1/1 for the variant you supplied.
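
If you only need SnpSift to accept the file, one option is to rewrite the table as a minimal, sites-only VCF. The sketch below is not something rnaseqmut or SnpSift provides; it assumes the first four columns are chromosome, position, REF and ALT (as in the quoted line), drops the remaining numeric columns, and fills ID, QUAL, FILTER and INFO with "." (the VCF missing value). The function names are made up for illustration. Chromosome names also have to match the ones used in the dbSNP file for any annotation to be found.

```python
def table_row_to_vcf(line):
    """Turn 'chrom pos ref alt counts...' into a minimal 8-column VCF record.

    Assumption: columns 1-4 are chromosome, position, REF, ALT.
    Everything after column 4 is discarded.
    """
    chrom, pos, ref, alt = line.split()[:4]
    # Fixed VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO
    return "\t".join([chrom, pos, ".", ref, alt, ".", ".", "."])


def convert_table_to_vcf(in_path, out_path):
    """Write a sites-only VCF from a whitespace-delimited mutation table."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.write("##fileformat=VCFv4.2\n")
        dst.write("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n")
        for line in src:
            fields = line.split()
            # Skip blank lines and any header row (position is not a number)
            if len(fields) < 4 or not fields[1].isdigit():
                continue
            dst.write(table_row_to_vcf(line) + "\n")
```

For example, convert_table_to_vcf("top1000.txt", "top1000.vcf") would produce a file whose first record is the quoted variant with "." in the ID column.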

Alternatively, your file is almost in the correct input format for ANNOVAR (http://annovar.openbioinformatics.org/en/latest/). For example:

chr1 13684 13684 C T

works for ANNOVAR http://annovar.openbioinformatics.org/en/latest/user-guide/input/#annovar-input-file
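
A rough sketch of that reshaping, under the same assumption that the first four columns are chromosome, position, REF and ALT. It only handles simple substitutions, where the end coordinate is the start plus the REF length minus one; insertions and deletions need the "-" convention described in the ANNOVAR input documentation and are not covered here. The function name is hypothetical.

```python
def table_row_to_annovar(line):
    """Turn 'chrom pos ref alt counts...' into 'chrom start end ref alt'.

    Assumption: substitutions only, so end = start + len(REF) - 1.
    Columns after the fourth are discarded.
    """
    chrom, pos, ref, alt = line.split()[:4]
    end = int(pos) + len(ref) - 1
    return "\t".join([chrom, pos, str(end), ref, alt])


def convert_table_to_annovar(in_path, out_path):
    """Write an ANNOVAR-style input file from a mutation table."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            fields = line.split()
            # Skip blank lines and any header row (position is not a number)
            if len(fields) < 4 or not fields[1].isdigit():
                continue
            dst.write(table_row_to_annovar(line) + "\n")
```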
