Question

Transcriptome analysis of two closely related plant species

4.4 years ago

aces • 0

Hi,

I am sorry for a basic question; I am a wet lab biologist trying to perform RNA-Seq with only limited experience. I hope this kind community can help me with some guidance.

I have two plant species whose response to an insect I want to study. There is a reference genome (with an annotation) for one of the species. I mapped reads from the second species to the genome of the first one using HISAT2. About 65% of the reads map uniquely to the genome and only 2% map to more than one location (76.76% overall alignment rate).

I would like to generate a master set of transcripts for differential expression analysis of both species. Given the decent mapping rate, I am unsure whether it is OK to perform de novo assembly of the unmapped reads and append the resulting contigs as transcripts specific to the second species. Alternatively, I could try a genome-guided assembly of the second species using its raw reads and later find ortholog groups with the first species. If I go this route, I am afraid that some of the annotated genes from the first species may be clustered into the same homolog groups and interfere with the differential expression analysis.

Thanks in advance for all comments!

rna-seq • 1.4k views

ADD COMMENT • link 4.4 years ago by aces • 0

Thanks a lot for the advice! I will try looking at TIN as you suggested. :)

Just in case the TIN doesn't look good, would you recommend de novo assembly and OrthoFinder?

ADD REPLY • link 4.4 years ago by aces • 0

If the TIN numbers don't look good, you should perform a gene-level analysis. If the transcripts are not covered evenly, the data won't be able to reliably distinguish between different transcripts.
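
As a rough illustration of what a gene-level analysis means in practice, the sketch below sums per-transcript read counts into per-gene counts. The transcript IDs, gene IDs, counts, and the `collapse_to_genes` helper are all hypothetical; a real workflow would more likely rely on a dedicated counting or import tool.

```python
from collections import defaultdict


def collapse_to_genes(transcript_counts, tx2gene):
    """Sum transcript-level counts into gene-level counts.

    transcript_counts: dict mapping transcript ID -> read count
    tx2gene: dict mapping transcript ID -> gene ID
    Transcripts without a gene assignment are skipped.
    """
    gene_counts = defaultdict(int)
    for tx_id, count in transcript_counts.items():
        gene_id = tx2gene.get(tx_id)
        if gene_id is None:
            continue  # no gene assignment for this transcript
        gene_counts[gene_id] += count
    return dict(gene_counts)


# Hypothetical example data: two isoforms of geneA, one of geneB.
transcript_counts = {"tx1.1": 10, "tx1.2": 5, "tx2.1": 7}
tx2gene = {"tx1.1": "geneA", "tx1.2": "geneA", "tx2.1": "geneB"}

gene_counts = collapse_to_genes(transcript_counts, tx2gene)
print(gene_counts)
```

Summing over isoforms sidesteps the question of which transcript a read came from, which is the part that uneven coverage makes unreliable.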

ADD REPLY • link 4.4 years ago by Istvan Albert 101k

score 1 · Answer 1 · 2020-07-06

4.4 years ago

Istvan Albert 101k

I would look into the concept called the transcript integrity number (TIN) to evaluate how well the reads conform to the transcripts across both species. If the TIN numbers look good and most transcripts are covered, you could get meaningful results without assembling anything.

You can compute TIN with the tin.py tool from the RSeQC package:

http://rseqc.sourceforge.net/
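
Once tin.py has run, one way to compare the two species is to summarize the per-transcript TIN values for each sample. The sketch below is only an illustration: it assumes a tab-separated table with a header row containing a `TIN` column (check this against your actual tin.py output), the example rows are made up, and the cutoff of 50 is an arbitrary placeholder rather than a recommended threshold.

```python
import csv
import io
import statistics


def summarize_tin(handle, min_tin=50.0):
    """Summarize per-transcript TIN values from a tab-separated table.

    Assumes a header row with a 'TIN' column (an assumption about the
    table layout). min_tin is an arbitrary illustrative cutoff.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    values = [float(row["TIN"]) for row in reader]
    covered = [v for v in values if v > 0]  # transcripts with a non-zero TIN
    return {
        "transcripts": len(values),
        "covered": len(covered),
        "median_tin": statistics.median(covered) if covered else 0.0,
        "frac_above_min": (
            sum(v >= min_tin for v in values) / len(values) if values else 0.0
        ),
    }


# Made-up example table; real input would be opened with open(path).
example = (
    "geneID\tchrom\ttx_start\ttx_end\tTIN\n"
    "tx1\tchr1\t100\t900\t80.0\n"
    "tx2\tchr1\t1500\t2400\t60.0\n"
    "tx3\tchr2\t50\t700\t0.0\n"
    "tx4\tchr2\t1000\t1800\t20.0\n"
)

summary = summarize_tin(io.StringIO(example))
print(summary)
```

Running the same summary on samples from both species gives a quick side-by-side view of how many annotated transcripts are covered and how evenly.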