Question

Identify jumbled transcripts (scrambled exons) using HISAT2, StringTie, and GffCompare?

5.6 years ago

wolfgang.rumpf ▴ 30

I have some PacBio RNA-seq data that should contain a jumbled gene, i.e. one whose exons are not in the canonical order but instead run something like 1, 2, 3, 5, 6, 4, 5, 7, 8 (scrambled exons). I expected that mapping my FASTQ with HISAT2, then assembling the resulting BAM against the reference GTF with StringTie, would reveal this jumbling event in the assembled GTF. It does not: when I run GffCompare, the class codes for this gene are all "=". If I open the BAM in IGV I can see the jumbling event, but what I'm looking for is a way to find other jumbling events that I don't already know about. Any suggestions?
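To make the target pattern concrete, here is a minimal Python sketch of what "jumbled" means in terms of exon order. It assumes each read has already been reduced to the list of reference exon numbers it covers, in read order; the function name `find_order_breaks` and that input format are hypothetical, not something HISAT2, StringTie, or GffCompare produce directly.

```python
def find_order_breaks(exon_order):
    """Return (index, previous_exon, exon) for each step where the exon
    number fails to increase along the read.

    exon_order is assumed to be the reference exon numbers hit by one
    read, listed in the order they occur along the read.
    """
    breaks = []
    for i in range(1, len(exon_order)):
        if exon_order[i] <= exon_order[i - 1]:
            breaks.append((i, exon_order[i - 1], exon_order[i]))
    return breaks


# The order described in the question.
observed = [1, 2, 3, 5, 6, 4, 5, 7, 8]
print(find_order_breaks(observed))  # [(5, 6, 4)]
```

Note that the jump from exon 3 to exon 5 is not flagged, since skipping an exon still preserves the canonical order; only the step back from exon 6 to exon 4 is reported.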

RNA-SEQ hisat2 stringtie gffcompare • 1.3k views

Updated 5.4 years ago by Biostar • written 5.6 years ago by wolfgang.rumpf

If you have PacBio data, why are you using HISAT2? A proper long-read aligner such as minimap2, which can accommodate both the error profile and the read length, would be a much better choice.
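As a rough sketch of what that could look like, the Python snippet below builds a minimap2 command line using its spliced long-read preset (`-ax splice`). The file names `ref.fa` and `reads.fastq`, the thread count, and the helper name `minimap2_splice_command` are placeholders chosen for illustration; whether this preset is the best fit depends on the data.

```python
import shlex
import subprocess


def minimap2_splice_command(reference_fasta, reads_fastq, threads=4):
    """Build a minimap2 command for spliced alignment of long RNA reads."""
    return [
        "minimap2",
        "-ax", "splice",      # spliced alignment preset, SAM output
        "-t", str(threads),   # number of threads
        reference_fasta,
        reads_fastq,
    ]


def run_to_sam(cmd, sam_path):
    """Run the command and write its stdout (the SAM stream) to sam_path."""
    with open(sam_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)


cmd = minimap2_splice_command("ref.fa", "reads.fastq")
print(shlex.join(cmd))  # minimap2 -ax splice -t 4 ref.fa reads.fastq
```

Calling `run_to_sam(cmd, "aln.sam")` would run the alignment, assuming minimap2 is installed and on the PATH.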

Reply by GenoMax, 5.6 years ago

Okay, that makes sense. But what about the downstream pipeline after minimap2? Was I right in assuming that StringTie and GffCompare should show me jumbled/scrambled exons?

Reply by wolfgang.rumpf, 5.6 years ago

Are your individual reads long enough to span these shuffled exons, and do you have enough read depth (number of reads aligned) to be confident in the alignments? You will have to examine the alignments carefully to see how minimap2 aligns the reads.

Just to clarify: if the HISAT2 pipeline has produced results that make sense to you, then great. I am only saying that it would be useful to also examine what minimap2 does.
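One way to examine the alignments systematically rather than by eye is sketched below in Python. It assumes each read has already been parsed into alignment segments (for example from a primary record plus its supplementary records, if the aligner splits the read that way), each given as `(read_start, ref_start, strand)` with `read_start` measured in the original read orientation. The data layout, the gene labels, and the `min_support` threshold are all hypothetical.

```python
from collections import Counter


def is_colinear(segments):
    """True if reference positions advance monotonically along the read.

    segments: list of (read_start, ref_start, strand) for one read on one
    chromosome. For '-' strand reads, moving along the read is assumed to
    correspond to decreasing reference coordinates.
    """
    ordered = sorted(segments)  # order segments by position in the read
    strands = {seg[2] for seg in ordered}
    if len(strands) != 1:
        return False  # mixed strands: not a simple colinear transcript
    ref_starts = [seg[1] for seg in ordered]
    if strands == {"-"}:
        ref_starts.reverse()
    return all(a < b for a, b in zip(ref_starts, ref_starts[1:]))


def noncolinear_support(reads, min_support=2):
    """Count non-colinear reads per gene; keep genes with enough support."""
    counts = Counter(
        gene for gene, segments in reads.values() if not is_colinear(segments)
    )
    return {gene: n for gene, n in counts.items() if n >= min_support}


# Toy input: read name -> (gene label, segments). All values are made up.
reads = {
    "read1": ("geneA", [(0, 1000, "+"), (400, 5000, "+"), (900, 3000, "+")]),
    "read2": ("geneA", [(0, 1000, "+"), (500, 3000, "+"), (800, 5000, "+")]),
    "read3": ("geneA", [(0, 1000, "+"), (450, 5000, "+"), (950, 3000, "+")]),
    "read4": ("geneB", [(0, 200, "-"), (300, 100, "-")]),
}
support = noncolinear_support(reads)
print(support)  # {'geneA': 2}
```

Requiring several independent reads per gene addresses the read-depth point above: a single out-of-order read could be an alignment or library artifact, whereas repeated support for the same gene is more convincing.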

Reply by GenoMax, 5.6 years ago