Strategy for mapping RNA-seq reads to a de novo transcriptome assembly?
5.4 years ago

Hello,

I am new to bioinformatics (and Biostars). I recently built de novo transcriptome assemblies for the two species I am working on (no genome or transcriptome data are publicly available for either). I am using these assemblies to get expression data for both species. The goal is to compare the expression of certain orthogroups across species and look for differences (I know this is not the most accurate approach or best practice, but we are giving it a shot).

It was recommended to me that I:

  1. Trim my reads (originally paired-end, 150 bp each) down to 50 bp and map them in single-end mode

  2. Use HISAT2 to map and htseq-count to get counts (rather than the automated Bowtie/RSEM pipeline that most people use after running Trinity)

  3. NOT map the reads to the longest_orfs.cds file I got from TransDecoder (though I do have to use the GFF3 file output by TransDecoder when running HISAT2), and instead map them to my transcriptome assembly FASTA file (which has been filtered somewhat)

I do not understand the reasoning behind #1 and #3. Can anyone explain why I should do things this way?
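
To make #1 concrete, here is a minimal Python sketch of what I understand the trimming step to mean: keep only the first 50 bases (and the matching quality values) of each read, then treat one file of the pair as single-end input. The function names `read_fastq`, `trim_record` and `trim_fastq` are my own, and the sketch assumes plain, uncompressed 4-line FASTQ records; a dedicated read trimmer would normally do this job.

```python
from typing import Iterable, Iterator, Tuple

# One FASTQ record: header line, sequence, separator ("+") line, quality string.
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from an iterable of lines in 4-line FASTQ format."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip stray blank lines between records
        seq = next(it).rstrip("\n")
        sep = next(it).rstrip("\n")
        qual = next(it).rstrip("\n")
        yield header, seq, sep, qual


def trim_record(record: FastqRecord, length: int = 50) -> FastqRecord:
    """Keep the first `length` bases and the matching quality values."""
    header, seq, sep, qual = record
    return header, seq[:length], sep, qual[:length]


def trim_fastq(lines: Iterable[str], length: int = 50) -> Iterator[FastqRecord]:
    """Trim every read in a FASTQ stream to at most `length` bases."""
    for record in read_fastq(lines):
        yield trim_record(record, length)
```

Passing an open file handle for one mate file to `trim_fastq` and writing each returned record back out as four lines would give the 50 bp single-end reads described in #1. Reads already shorter than 50 bp pass through unchanged.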

Does anyone know why I might use HISAT2/htseq-count rather than Bowtie/RSEM for a transcriptome assembly? I have not been able to find examples of HISAT2/htseq-count being used with de novo assemblies, which concerns me.

Thank you in advance for your help!

rna-seq transcriptome assembly mapping • 1.1k views