7.8 years ago
sanathoi
What approaches can be used to find DEGs for species whose GTF/GFF file is not available yet?
Try gene annotation with Blast2GO or RAST, which are good at gene prediction and functional annotation for non-model species without reference annotations. You can then create your own annotation file for downstream DEG analysis.
What have you tried? Please provide a concrete example of your organism and setup.
I was trying to find DEGs by reanalyzing published reads from SRA on a Galaxy server. Most of the datasets mentioned in tutorials come with their own GTF/GFF file. As an example, take ZIKV, whose genome annotation file is not reported yet. What approaches can be used to find DEGs from the available ZIKV reads?
Good that you provided more information! Assuming we are talking about Zika virus (an ssRNA virus), there is a reference genome available as a GenBank file: https://www.ncbi.nlm.nih.gov/nuccore/NC_012532.1?report=gbwithparts&log$=seqview
If you can't find a GFF/GTF file, you can convert the GenBank file to GFF; search for "GenBank to GFF".
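To illustrate what such a GenBank-to-GFF conversion involves, here is a minimal sketch in plain Python. It is not one of the established converters. The function name `genbank_to_gff3`, the `GenBank` source label, and the choice of feature keys are assumptions made for illustration. It only understands simple locations (`start..end` and `complement(start..end)`), copies GenBank feature keys unchanged into the type column, and assumes phase 0 for every CDS, so for real work a maintained converter is probably the safer choice.

```python
import re

# Handles only simple locations such as "107..500" or "complement(107..500)".
# join(...), fuzzy ends ("<", ">") and remote references are skipped in this sketch.
SIMPLE_LOC = re.compile(r"^(complement\()?(\d+)\.\.(\d+)\)?$")
NAME_QUALIFIER = re.compile(r'^/(gene|product|locus_tag)="([^"]*)"')


def escape_gff3(value):
    """Percent-encode the characters that GFF3 reserves in column 9."""
    for ch in "%;=&,":  # "%" goes first so later escapes are not escaped again
        value = value.replace(ch, "%{:02X}".format(ord(ch)))
    return value


def genbank_to_gff3(genbank_text, wanted=("gene", "CDS", "mat_peptide")):
    """Convert the feature table of one GenBank record into GFF3 text."""
    seqid, in_features, current = "unknown", False, None
    features = []
    for line in genbank_text.splitlines():
        if line.startswith("VERSION"):
            seqid = line.split()[1]
        elif line.startswith("FEATURES"):
            in_features = True
        elif line.startswith("ORIGIN") or line.startswith("//"):
            in_features = False
        elif in_features and line.strip():
            text = line.strip()
            if line.startswith("     ") and not line.startswith("      "):
                # Feature key lines are indented by exactly five spaces.
                parts = text.split(None, 1)
                match = SIMPLE_LOC.match(parts[1]) if len(parts) == 2 else None
                if match and parts[0] in wanted:
                    current = {
                        "type": parts[0],
                        "start": match.group(2),
                        "end": match.group(3),
                        "strand": "-" if match.group(1) else "+",
                        "name": None,
                    }
                    features.append(current)
                else:
                    current = None  # unwanted key or unsupported location
            elif current is not None and current["name"] is None:
                # Use the first gene/product/locus_tag qualifier as the Name.
                qualifier = NAME_QUALIFIER.match(text)
                if qualifier:
                    current["name"] = qualifier.group(2)

    rows = ["##gff-version 3"]
    for number, feat in enumerate(features, start=1):
        attributes = "ID={}_{}".format(feat["type"], number)
        if feat["name"]:
            attributes += ";Name=" + escape_gff3(feat["name"])
        phase = "0" if feat["type"] == "CDS" else "."  # assumption: codon_start=1
        rows.append("\t".join([seqid, "GenBank", feat["type"], feat["start"],
                               feat["end"], ".", feat["strand"], phase, attributes]))
    return "\n".join(rows) + "\n"
```

Usage would be along the lines of `print(genbank_to_gff3(open("record.gb").read()))`, where `record.gb` is a hypothetical file name for the downloaded GenBank record.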
Viral gene expression occurs only inside the host, and because Zika is a positive-sense ssRNA virus with no DNA stage, it does not integrate into the host genome. It is therefore doubtful whether Zika is a good example for studying differential expression.
Assuming Zika follows the canonical flavivirus replication cycle, a single mRNA will be generated (which lacks a poly-A tail). This mRNA will be translated into a single polyprotein precursor, which is then processed into individual viral proteins. I believe that the genome and mRNA are both positive strand. To add to the confusion, negative-strand copies of the mRNA are made to produce more positive-strand copies; however, these negative-strand copies do not contribute to gene expression.
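If the reads come from a strand-specific library, one rough way to see how much signal comes from each strand is to tally alignments to the viral reference by orientation, using the reverse-strand bit (0x10) of the SAM flag. The sketch below assumes single-end reads and plain-text SAM input. The function name `count_reads_by_strand` and the reference name passed to it are hypothetical. Which orientation corresponds to the positive-strand genome depends on the library protocol, and an unstranded library cannot separate the two at all.

```python
def count_reads_by_strand(sam_lines, viral_ref):
    """Count primary alignments to viral_ref by alignment orientation.

    Assumes single-end reads; for paired-end data the mate flags
    (0x40 / 0x80) would also have to be taken into account.
    """
    counts = {"forward": 0, "reverse": 0}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag, rname = int(fields[1]), fields[2]
        if rname != viral_ref:
            continue  # aligned to the host or another sequence
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue  # unmapped, secondary, or supplementary alignment
        counts["reverse" if flag & 0x10 else "forward"] += 1
    return counts
```

With a SAM file on disk this could be called as `count_reads_by_strand(open("aligned.sam"), "NC_012532.1")`, where `aligned.sam` is a placeholder file name.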
In other words, I'm not sure the biology of flaviviruses is compatible with using RNA-Seq to detect differential expression of viral genes.
See this for a review: http://www.pnas.org/content/99/18/11555.long
In general, ssRNA virus replication approaches are a bit of a pain. You have to be able to differentiate the various intermediates from the mRNAs, which isn't always easy. Some ssRNA viruses use nested transcription which I'd wager would make things worse. I always want to do this, but I've never had the patience to dig through so much crap when the host expression is far more important.
If you still want to look at viral gene expression in hosts, I would highly recommend you look at orthopox. They're big dsDNA viruses that almost behave like bacteria in terms of gene expression (although they're still viruses and pull all sorts of rule breaking). Here's a really great paper on the subject: https://www.ncbi.nlm.nih.gov/pubmed/25903347
Thank you #MichaelDondrup