Question

Hisat2 build index for genome + transcriptome files together

7.5 years ago

bioinfo17 ▴ 30

Hi,

Could anyone please explain how to use HISAT2 when both the genome and the transcriptome are available for the same reference strain? The genome was annotated using MAKER, so the gene models are in GFF format. The transcriptome is also available, but only as an assembled FASTA file. Is it OK to merge the genome and transcriptome files and build a single index?

Thanks for your time.

rna-seq • 3.1k views

updated 7.5 years ago by Devon Ryan 105k • written 7.5 years ago by bioinfo17 ▴ 30

score 1 · Answer 1 · 2017-07-05

7.5 years ago

Devon Ryan 105k

Merging the genome and transcriptome would result in largely useless output (you'd have mostly multimappers). It would be better to align against either the transcriptome or the genome (having created the index with the GFF file).
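For the genome route, a minimal sketch of building an annotation-aware index might look like the following. It assumes gffread is installed to convert the MAKER GFF to GTF, since the HISAT2 helper scripts read GTF; the function name and all file names are placeholders, not anything from this thread.

```shell
#!/usr/bin/env bash
# Sketch only: build a HISAT2 index that knows the annotated splice sites and exons.
# Arguments (all hypothetical names): genome FASTA, MAKER GFF, index prefix.
build_annotated_index() {
    local genome_fa="$1" gff="$2" prefix="$3"

    # Convert GFF to GTF (assumes gffread is on PATH).
    gffread "$gff" -T -o "${prefix}.gtf" || return 1

    # Extract splice sites and exons with the scripts shipped with HISAT2.
    hisat2_extract_splice_sites.py "${prefix}.gtf" > "${prefix}.ss" || return 1
    hisat2_extract_exons.py "${prefix}.gtf" > "${prefix}.exon" || return 1

    # Build the index from the genome FASTA plus the extracted annotation.
    hisat2-build --ss "${prefix}.ss" --exon "${prefix}.exon" "$genome_fa" "$prefix"
}
```

It could then be called as `build_annotated_index genome.fa maker.gff genome_tran`, and the resulting prefix passed to `hisat2 -x genome_tran`. Note that building with `--exon` can need considerably more memory than a plain genome index.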


Thanks, Devon, for your quick reply. The alignment rate was very low (~2-20%) when the sample reads were mapped against the genome FASTA file alone, but it rose considerably (~60-80%) when the reads were mapped against the transcriptome alone. I am unsure which one to use for downstream differential expression analysis with StringTie etc. P.S. I did not use the GFF when building the index; I built it from either the genome or the transcriptome FASTA file only.

Any suggestions please? Many thanks

7.5 years ago by bioinfo17 ▴ 30

It sounds like the genome assembly isn't very good. Align against the transcriptome and use salmon rather than StringTie.
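A rough sketch of that salmon route, assuming paired-end reads and the assembled transcriptome FASTA, could be something like this. The function name, paths, and the choice of `-l A` (automatic library type detection) are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: quantify one paired-end sample against the assembled transcriptome.
# Arguments (all hypothetical names): transcriptome FASTA, index directory,
# read 1 file, read 2 file, output directory.
quantify_with_salmon() {
    local transcripts_fa="$1" index_dir="$2" reads_1="$3" reads_2="$4" out_dir="$5"

    # Build the salmon index once; reuse it for every later sample.
    if [ ! -d "$index_dir" ]; then
        salmon index -t "$transcripts_fa" -i "$index_dir" || return 1
    fi

    # -l A asks salmon to infer the library type from the reads.
    salmon quant -i "$index_dir" -l A -1 "$reads_1" -2 "$reads_2" -o "$out_dir"
}
```

For example: `quantify_with_salmon transcripts.fa salmon_index s1_R1.fq.gz s1_R2.fq.gz quant/s1`. Each output directory gets a `quant.sf` file with per-transcript estimates, which is what downstream differential expression tools would read.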