Question

How to fix 5'UTR annotation with RNA-seq data?

0


7.4 years ago

I0110 ▴ 160

Hi,

I am studying the Arabidopsis transcriptome. Interestingly, the annotated 5' UTRs are often too long.
Here is an example. You can see that the RNA-seq reads cover a much smaller region than the annotated 5' UTR. Is there a way to fix that? I hope to get a GTF of the shorter isoforms from my RNA-seq data.

Thanks!

RNA-Seq • 2.7k views

updated 7.4 years ago by rajeev.vikram ▴ 40 • written 7.4 years ago by I0110 ▴ 160

score 0 · Answer 1 · 2018-02-25

0


7.4 years ago

rajeev.vikram ▴ 40

Hello,

How many samples are you using? Do all of them consistently show hits to the same region of the annotated transcript (UTR)? Did you verify the reported isoforms against your alignment hits? There is no way to change the pattern of hits that an alignment reports. You can include or exclude some hits by tuning the parameters of the aligner you used. Alternatively, you can run two different aligners and compare their aligned hits against the annotated transcripts. If the results still look consistent, you may be looking at an isoform, although you would need to confirm it.
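The consistency check described above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the thread: the per-base depths are taken to be already extracted for one annotated transcript (for example from samtools depth output) and ordered 5' to 3', and the function names, the sample names, the min_depth threshold and the tolerance are all hypothetical choices.

```python
from statistics import median


def first_covered_offset(coverage, min_depth=5):
    """Return the index of the first position with depth >= min_depth, or None."""
    for offset, depth in enumerate(coverage):
        if depth >= min_depth:
            return offset
    return None


def summarize_five_prime_support(per_sample_coverage, min_depth=5, tolerance=10):
    """Compare the first well-covered position across samples.

    per_sample_coverage maps a sample name to a list of per-base depths
    ordered 5' -> 3' along the annotated transcript (assumed input layout).
    Returns (offset per sample, median offset, consistent flag).
    """
    offsets = {
        sample: first_covered_offset(depths, min_depth)
        for sample, depths in per_sample_coverage.items()
    }
    observed = [o for o in offsets.values() if o is not None]
    if not observed:
        return offsets, None, False
    mid = median(observed)
    # Consistent only if every sample has support and all starts lie
    # within `tolerance` bases of the median start.
    consistent = len(observed) == len(offsets) and all(
        abs(o - mid) <= tolerance for o in observed
    )
    return offsets, mid, consistent


if __name__ == "__main__":
    # Made-up depths for three hypothetical replicates.
    example = {
        "rep1": [0] * 120 + [8] * 80,
        "rep2": [0] * 118 + [9] * 82,
        "rep3": [1] * 125 + [7] * 75,
    }
    print(summarize_five_prime_support(example))
```

A large, consistent offset across samples suggests that the reads support a transcript start downstream of the annotated one. It does not by itself prove a distinct isoform, since low coverage at 5' ends can also be a library-preparation effect.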

Here's a more descriptive paper on identifying variants: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3791257/

Hope it helped a bit.

Cheers.

7.4 years ago by rajeev.vikram ▴ 40

0


Yes, this has been verified in many data sets, using many different aligners. Is there a tool or method to systematically fix this type of annotation issue (i.e., one that gives me a new GTF with updated 5' UTRs for all genes)?

7.4 years ago by I0110 ▴ 160

1


It seems that you want to generate a consensus transcriptome for your samples. You can use the StringTie transcript assembler for this purpose, specifically stringtie --merge, which generates a merged (consensus) GTF file, including consensus isoforms, from the per-sample assemblies. You should then use this GTF file for downstream differential analysis.

Here's the link for the workflow: https://ccb.jhu.edu/software/stringtie/index.shtml?t=manual
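As a rough sketch of that workflow, the Python helper below only builds and prints the StringTie command lines: one assembly per sample, then one merge. The file names (rep1.bam, ref.gtf, merged.gtf), the thread count and the helper names are hypothetical. It assumes coordinate-sorted BAM files and a stringtie executable on the PATH, and uses only the -G, -o, -p and --merge options.

```python
import shlex
import subprocess


def assemble_cmd(bam, reference_gtf, out_gtf, threads=4):
    """Reference-guided assembly of one sample."""
    return ["stringtie", bam, "-G", reference_gtf, "-o", out_gtf, "-p", str(threads)]


def merge_cmd(sample_gtfs, reference_gtf, merged_gtf):
    """Merge the per-sample assemblies into one consensus GTF."""
    return ["stringtie", "--merge", "-G", reference_gtf, "-o", merged_gtf, *sample_gtfs]


def plan(bams, reference_gtf, merged_gtf="merged.gtf"):
    """Return the full list of commands: one assembly per BAM, then the merge."""
    # Derive each output name by swapping the final extension for .gtf.
    sample_gtfs = [bam.rsplit(".", 1)[0] + ".gtf" for bam in bams]
    cmds = [assemble_cmd(b, reference_gtf, g) for b, g in zip(bams, sample_gtfs)]
    cmds.append(merge_cmd(sample_gtfs, reference_gtf, merged_gtf))
    return cmds


def run(cmds, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in cmds:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical input files; dry run, so nothing is executed.
    run(plan(["rep1.bam", "rep2.bam"], "ref.gtf"))
```

One caveat worth checking on your own data: because -G passes the reference annotation into the merge, the long annotated models may still appear in merged.gtf next to any shorter assembled transcripts, so the 5' ends in the merged file should be compared against the original annotation rather than taken as corrected.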

7.4 years ago by rajeev.vikram ▴ 40