Hello BioStars Community,
I have been encountering an issue with SnpEff where it assumes circular chromosomes during the annotation of a eukaryotic genome (Savannah Sparrow), which should have linear chromosomes. The genome I'm working with is a lift-over from a closely related species, and I'm unsure if this contributes to the problem.
When I run the snpEff.jar dump command to inspect the database contents, the output shows SnpEff creating "mirror genes" because it assumes the chromosomes are circular. Here is an example of the output:
Gene 'GENE_exon-XM_058815638.1-4' spans across coordinate zero: Assuming circular chromosome, creating mirror gene at the end.
Gene : Gene_JAKOOL010000202.1:-60081-58974
New gene : Gene_JAKOOL010000202.1:186076-305131
Chrsomosome : Chromosome_JAKOOL010000202.1:1-246157
...
Total: 4 added as circular mirrored genes (appended '_circ' to IDs)
Here's an example of the dump command used:
java -jar /spack/apps/linux-centos7-x86_64/gcc-8.3.0/snpeff-2017-11-24-ouvgtdabu7an6qtjqs47fzrrbx5jxq6b/bin/snpEff.jar dump -v bPasSan1 -c snpEff.config
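Since the message says the gene "spans across coordinate zero", my working assumption (not yet confirmed) is that the lifted-over GFF contains features whose coordinates fall outside the scaffold. Below is a minimal sketch of how I could check for that. It assumes a standard 9-column, tab-separated GFF3 file and only knows scaffold lengths if the file has `##sequence-region` lines; the function name `find_suspect_features` is just something I made up for illustration.

```python
def find_suspect_features(gff_lines):
    """Return GFF3 features whose coordinates fall outside their sequence.

    A feature is flagged if its start is below 1, or if its end exceeds
    the length declared in a '##sequence-region' pragma (when present).
    Each hit is (line_number, seqid, feature_type, start, end).
    """
    seq_lengths = {}  # seqid -> declared end coordinate
    suspects = []
    for line_no, line in enumerate(gff_lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##sequence-region"):
            parts = line.split()
            if len(parts) == 4:
                seq_lengths[parts[1]] = int(parts[3])
            continue
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more features
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        try:
            start, end = int(cols[3]), int(cols[4])
        except ValueError:
            continue  # non-numeric coordinates; skip rather than guess
        length = seq_lengths.get(cols[0])
        if start < 1 or (length is not None and end > length):
            suspects.append((line_no, cols[0], cols[2], start, end))
    return suspects
```

Passing an open file handle for the GFF (for example `find_suspect_features(open("genes.gff"))`, where `genes.gff` stands in for my annotation file) would list any offending lines, which I could then fix or drop before rebuilding the database.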
I'm looking for insights on the following:
- Is there a setting in the config file that turns off this circular-chromosome assumption for eukaryotic genomes?
- Any suggestions on how to edit the GFF file or modify the SnpEff configuration to prevent this incorrect circular chromosome assumption?
- If only 4 genes are being added as circular mirrored genes, is this likely to have a meaningful effect on downstream analyses?
Thank you for any guidance or advice you can provide!