Dear all,
I am currently working with a non-model organism (Acropora hemprichii). We have some Illumina short reads and some Nanopore direct RNA reads. We were thinking of predicting a genome using both the short and long reads. In the end I need gene annotations, because I want to quantify isoform abundance. We are going to study differential isoform usage between two conditions.
What is the best approach to studying isoforms? Should I do a de novo transcriptome assembly or a reference-guided assembly?
Should I do the transcriptome assembly only with the long reads or include the Illumina reads too?
How can I assess the quality of a transcriptome assembly?
I appreciate any guidance you can give me. Thank you.
With this data you would assemble a transcriptome from the short reads and then confirm some of those transcripts with your Nanopore long reads (which may not always represent complete transcripts). You cannot predict a genome from expression data alone.
If you have independent genome sequence data available, you could align your short and long reads directly to it, use it for reference-guided assembly, or both.
If the genome is poorly characterized, that will limit how well you can achieve your aim of differential isoform analysis.
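To make the end goal concrete, here is a minimal Python sketch of what "isoform usage" means once you have per-transcript abundances: each transcript's share of its gene's total expression, compared between two conditions. Everything in it is an assumption for illustration: the function name `isoform_usage`, the transcript and gene IDs, and the abundance values are all made up, and a real analysis would need replicates and a proper statistical test rather than a simple difference.

```python
from collections import defaultdict


def isoform_usage(counts, tx2gene):
    """Return each transcript's fraction of its gene's total abundance.

    counts  : dict of transcript ID -> abundance (e.g. counts or TPM)
    tx2gene : dict of transcript ID -> gene ID
    """
    # Sum the abundance of all transcripts belonging to each gene.
    gene_totals = defaultdict(float)
    for tx, value in counts.items():
        gene_totals[tx2gene[tx]] += value

    # Divide each transcript by its gene total; unexpressed genes give NaN.
    usage = {}
    for tx, value in counts.items():
        total = gene_totals[tx2gene[tx]]
        usage[tx] = value / total if total > 0 else float("nan")
    return usage


# Hypothetical toy data: two isoforms of geneA, one isoform of geneB.
tx2gene = {"geneA.t1": "geneA", "geneA.t2": "geneA", "geneB.t1": "geneB"}
control = {"geneA.t1": 80.0, "geneA.t2": 20.0, "geneB.t1": 50.0}
treated = {"geneA.t1": 30.0, "geneA.t2": 70.0, "geneB.t1": 55.0}

usage_control = isoform_usage(control, tx2gene)
usage_treated = isoform_usage(treated, tx2gene)

# Print usage per condition and the change in usage between conditions.
for tx in sorted(tx2gene):
    delta = usage_treated[tx] - usage_control[tx]
    print(f"{tx}\t{usage_control[tx]:.2f}\t{usage_treated[tx]:.2f}\t{delta:+.2f}")
```

In this toy example geneA switches its dominant isoform between conditions, while geneB changes in overall expression but not in usage, since it has only one isoform. This is also why the quality of the annotation matters: a transcript that is missing or wrongly assigned to a gene changes the denominator for every other isoform of that gene.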