5.5 years ago
maricom
Hi,
I annotated variants in a bacterial genome using snpEff, but when I opened snpEff_summary.html, there were 944 warnings highlighted in yellow. The variants were called with GATK HaplotypeCaller.
When I looked at the warnings, almost all of them were WARNING_TRANSCRIPT_NO_START_CODON. However, when I checked both the reference genome and the CDS, both have a start codon.
I have no idea why they were tagged as warnings.
If anyone has any idea, that would help me a lot.
Thank you.
I annotated using this command:
java -jar snpEff.jar -c snpEff.config -i vcf -o vcf bacteria1 SNPs_counted_using_HaplotypeCaller.vcf 1> res.vcf
This is one of the results I got:
bacteria1 99501 . C A 697.6 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=2.555;DP=187;ExcessHet=3.0103;FS=0.784;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=3.73;ReadPosRankSum=0.989;SOR=0.762;ANN=A|upstream_gene_variant|MODIFIER|D9_0073|GENE_D9_0073|transcript|TRANSCRIPT_D9_0073|protein_coding||c.-4774G>T|||||4774|,A|upstream_gene_variant|MODIFIER|D9_0074|GENE_D9_0074|transcript|TRANSCRIPT_D9_0074|protein_coding||c.-4536G>T|||||4536|,A|upstream_gene_variant|MODIFIER|D9_0078|GENE_D9_0078|transcript|TRANSCRIPT_D9_0078|protein_coding||c.-768G>T|||||768|,A|upstream_gene_variant|MODIFIER|D9_0079|null|transcript|D9_0079|protein_coding||c.-698C>A|||||609|WARNING_TRANSCRIPT_NO_START_CODON,A|upstream_gene_variant|MODIFIER|D9_0080|GENE_D9_0080|transcript|TRANSCRIPT_D9_0080|protein_coding||c.-2444C>A|||||2444|,A|upstream_gene_variant|MODIFIER|D9_0081|GENE_D9_0081|transcript|TRANSCRIPT_D9_0081|protein_coding||c.-3194C>A|||||3194|,A|upstream_gene_variant|MODIFIER|D9_0082|GENE_D9_0082|transcript|TRANSCRIPT_D9_0082|protein_coding||c.-4759C>A|||||4759|,A|downstream_gene_variant|MODIFIER|D9_0075|GENE_D9_0075|transcript|TRANSCRIPT_D9_0075|protein_coding||c.*3987C>A|||||3987|,A|downstream_gene_variant|MODIFIER|D9_0076|GENE_D9_0076|transcript|TRANSCRIPT_D9_0076|protein_coding||c.*3083C>A|||||3083|,A|downstream_gene_variant|MODIFIER|D9_0077|GENE_D9_0077|transcript|TRANSCRIPT_D9_0077|protein_coding||c.*2167C>A|||||2167|,A|intergenic_region|MODIFIER|D9_0078-D9_0079|GENE_D9_0078-null|intergenic_region|GENE_D9_0078-null|||n.99501C>A|||||| GT:AD:DP:GQ:PL 0/1:157,30:187:99:705,0,5786
I created my GTF file like this:
seqname source feature start end score strand frame attribute
bacteria1 bacteria1 CDS 101 1507 . + 0 gene id "D9_0001";
bacteria1 bacteria1 CDS 1569 2666 . + 0 gene id "D9_0002";
bacteria1 bacteria1 CDS 2663 4378 . + 0 gene id "D9_0003";
I created my own database and added this to snpEff.config:
bacteria1.genome :bacteria1
bacteria1.chromosomes : bacteria1
bacteria1.bacteria1.codonTable : Bacterial_and_Plant_Plastid
Perhaps also contact the author, Pablo Cingolani (pcingola@users.sourceforge.net), and cross-reference this thread. Be sure to read "Asking for help" first so that you provide the necessary information.
Hi SMK, Thank you for your advice! I've sent the question to him, too.
I'm encountering the same issue. Has there been any resolution to this, or response from the author? Any help is appreciated.
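For anyone who wants to repeat the start-codon check described in the question, here is a minimal Python sketch. It is not part of snpEff and does not reproduce how snpEff decides to emit the warning. It assumes the genome is a single sequence already loaded as a string, that the GTF columns are whitespace- or tab-separated with 1-based inclusive coordinates, and that ATG, GTG and TTG are the start codons of interest (the bacterial codon table allows a few others). The function names are made up for illustration.

```python
# Sketch: report the first codon of every CDS line in a GTF-like table.
# Assumptions (not from the thread): one genome sequence held as a string,
# 1-based inclusive coordinates, columns separated by whitespace or tabs.

# The most common bacterial start codons; the bacterial codon table
# permits a few more, so extend this set if needed.
BACTERIAL_STARTS = {"ATG", "GTG", "TTG"}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def first_codon(genome, start, end, strand):
    """Return the first codon of a feature, read in its own orientation."""
    if strand == "+":
        return genome[start - 1:start + 2].upper()
    # On the minus strand the feature begins at its highest coordinate.
    return reverse_complement(genome[end - 3:end]).upper()


def check_start_codons(genome, gtf_lines):
    """Yield (start, end, strand, codon, is_start) for every CDS line."""
    for line in gtf_lines:
        fields = line.split()
        # Skip header lines, short lines, and non-CDS features.
        if len(fields) < 8 or fields[2] != "CDS":
            continue
        start, end, strand = int(fields[3]), int(fields[4]), fields[6]
        codon = first_codon(genome, start, end, strand)
        yield start, end, strand, codon, codon in BACTERIAL_STARTS
```

Passing the genome string and the lines of the GTF file to check_start_codons lists any CDS whose first codon is not in the set, which at least confirms or rules out a coordinate or strand problem in the annotation.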