6 months ago
ZuelTech
Hi,
I want to analyze the expression of stress-response-related genes in a non-model organism. How do I select the best transcript for a gene?
Thank you!
There is no such thing as the best transcript. One option is to select the longest transcript per gene from a transcriptome assembly. Why not include all transcripts from the transcriptome and use Salmon or Kallisto for transcript abundance?
Thank you. I actually have a csv file containing TPM values produced by Salmon. How can I use it along the lines of your suggestion?
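To illustrate the "longest transcript per gene" option, here is a minimal Python sketch. It assumes you already have a transcript ID, gene ID and length for every transcript; the IDs and lengths below are made up, and the function name is only for illustration. Ties are resolved by keeping the first transcript seen.

```python
def longest_transcript_per_gene(transcripts):
    """Return {gene_id: transcript_id} for the longest transcript of each gene.

    `transcripts` is an iterable of (transcript_id, gene_id, length) tuples.
    On a tie in length, the transcript seen first is kept.
    """
    best = {}  # gene_id -> (transcript_id, length)
    for tx_id, gene_id, length in transcripts:
        current = best.get(gene_id)
        if current is None or length > current[1]:
            best[gene_id] = (tx_id, length)
    return {gene_id: tx_id for gene_id, (tx_id, _) in best.items()}


# Hypothetical IDs and lengths, for illustration only.
records = [
    ("g1_i1", "g1", 1200),
    ("g1_i2", "g1", 2350),
    ("g2_i1", "g2", 800),
]
representatives = longest_transcript_per_gene(records)
```

Keep in mind that the longest transcript is only a convenient representative; it is not necessarily the most highly expressed or the biologically most relevant isoform.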
What Michael said. You have a few options: you can pick the "canonical" (= longest) transcript, or you can use something like MANE to pick the most widely used transcript (in papers, etc.).
For a way to pick the "best" transcript per gene (the algorithm that VEP uses), see: https://useast.ensembl.org/info/docs/tools/vep/script/vep_other.html#pick_options
The canonical is not always the longest transcript. I support the suggestion to do gene-level analysis, meaning: use something like tximport from Bioconductor to sum transcript counts per gene into a single gene-level value.
Maybe later versions have corrected this, but canonical defaults to the longest transcript, at least in snpEff. VEP seems to handle this better. I always go for MANE/RefSeq-MANE, as that's always accurate. ESR1 and BRCA1 are examples where the canonical transcript chosen by snpEff is definitely wrong. I think older VEP versions used to get this wrong too, but v100+ (at least) get this right.
Thanks! @ATpoint I actually have two csv files containing TPM values produced by Salmon. One csv file has the transcript IDs, and the other has the gene IDs. Given your suggestion to do gene-level analysis, how can I use these to create a heatmap of the expression of stress-related genes?
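The idea of collapsing transcripts to genes can be sketched in plain Python. This is not tximport itself, only the summing step it is built around: TPM values of all transcripts belonging to a gene are added up. The transcript-to-gene mapping and the TPM values are hypothetical, and the function name is made up for this example.

```python
from collections import defaultdict


def sum_tpm_per_gene(tx_tpm, tx2gene):
    """Sum transcript-level TPM values into one value per gene.

    tx_tpm:  {transcript_id: TPM} for a single sample
    tx2gene: {transcript_id: gene_id}
    """
    gene_tpm = defaultdict(float)
    for tx_id, tpm in tx_tpm.items():
        gene_tpm[tx2gene[tx_id]] += tpm  # KeyError if a transcript has no gene
    return dict(gene_tpm)


# Hypothetical values, for illustration only.
tx_tpm = {"g1_i1": 10.0, "g1_i2": 5.5, "g2_i1": 3.0}
tx2gene = {"g1_i1": "g1", "g1_i2": "g1", "g2_i1": "g2"}
gene_level = sum_tpm_per_gene(tx_tpm, tx2gene)
```

For differential expression you would still want tximport (or an equivalent) rather than this sketch, since it also handles estimated counts and transcript lengths.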
Those are two different goals. To analyze salmon output (say for differential expression analysis), you can import it using tximport and use DESeq2. You can create heatmaps of any metrics using ComplexHeatmap, but comparing across samples is not straightforward.
How about the gene level analysis?
tximport + DESeq2. The former has options that let you choose between transcript-level and gene-level summarization.
Thanks. Can I use the gene-level analysis as the basis for creating a heatmap for expression analysis?
Heatmaps are not useful for any kind of robust analysis. They can provide a good visual. Please don't use heatmaps for anything but eyeballing to spot potential areas of investigation.
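If you do want a heatmap for eyeballing, the values are usually transformed first. A common convention (my assumption here, not something prescribed in this thread) is to take log2(TPM + 1) and then scale each gene to a z-score across samples, so that colors show relative change per gene. A minimal sketch with made-up gene names and values:

```python
import math
import statistics


def row_zscores(tpm_by_gene):
    """Return per-gene z-scores of log2(TPM + 1) across samples.

    tpm_by_gene: {gene_id: [TPM in sample 1, TPM in sample 2, ...]}
    Genes with no variation across samples get all zeros.
    Uses the population standard deviation (statistics.pstdev).
    """
    scaled = {}
    for gene, values in tpm_by_gene.items():
        logged = [math.log2(v + 1.0) for v in values]
        mean = statistics.fmean(logged)
        sd = statistics.pstdev(logged)
        if sd == 0:
            scaled[gene] = [0.0] * len(logged)
        else:
            scaled[gene] = [(x - mean) / sd for x in logged]
    return scaled


# Hypothetical gene-level TPM values for two samples.
stress_genes = {"geneA": [0.0, 3.0], "geneB": [5.0, 5.0]}
heatmap_matrix = row_zscores(stress_genes)
```

The resulting matrix can be passed to any plotting tool. It is only for visual inspection; statistical claims should come from the DESeq2 analysis.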
Maybe I misled you with my question. I just wanted to know how I can use gene-level analysis instead of transcript-level analysis, especially for expression analysis. Is this a way for me to choose a representative (or best) transcript?
@Ram Thanks. Is VEP applicable to non-model organisms? I only have a de novo assembly. Also, I would appreciate it if you could share a video tutorial on this.
You should find tutorials if you google "VEP tutorial video". I don't think it can work with non-model organisms; check out their website though, as they have an extensive list of organisms.