Hi there,
I was wondering whether there are any tools that can identify the 3' and 5' ends of mRNAs from bulk RNA-seq data.
My goal is to determine the transcription start site and the transcription termination site.
I think extracting mapping information from the BAM file may be one approach, but I'm not sure how to go about it.
Could you give me some advice?
Thanks in advance
Hi, if your organism of interest has a reference available (a genome sequence and gene information in GTF format), then mapping the reads, followed by a transcriptome assembly, should give you genomic coordinates of the 5' and 3' ends _as per your data, when seen through prior knowledge of the available annotation_. My experience is with mammalian genomes: an aligner such as STAR or HISAT2, followed by StringTie, would get you the answer.
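Once the assembly has produced a GTF, the end coordinates can be read off the "transcript" rows. Here is a minimal Python sketch of that step, assuming a standard 9-column GTF; the function name `transcript_ends` and the example records are made up for illustration. Treat the resulting positions as approximate, since they are only as good as the read coverage at the transcript ends.

```python
def transcript_ends(gtf_text):
    """Map transcript_id -> (chrom, strand, five_prime_end, three_prime_end).

    Coordinates are 1-based and inclusive, as in GTF.
    """
    ends = {}
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only transcript-level rows carry the full span
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        if strand not in ("+", "-"):
            continue  # without a strand, 5' and 3' cannot be told apart
        # Parse the attribute column: key "value"; key "value"; ...
        attrs = {}
        for item in fields[8].split(";"):
            key, _, value = item.strip().partition(" ")
            if key:
                attrs[key] = value.strip().strip('"')
        tid = attrs.get("transcript_id")
        if tid is None:
            continue
        if strand == "+":
            ends[tid] = (chrom, strand, start, end)
        else:
            # On the minus strand the 5' end is the larger coordinate.
            ends[tid] = (chrom, strand, end, start)
    return ends


# Hypothetical records in the style of an assembler's GTF output.
example_gtf = "\n".join([
    "\t".join(["chr1", "StringTie", "transcript", "1000", "5000", "1000",
               "+", ".", 'gene_id "STRG.1"; transcript_id "STRG.1.1";']),
    "\t".join(["chr1", "StringTie", "exon", "1000", "2000", "1000",
               "+", ".", 'gene_id "STRG.1"; transcript_id "STRG.1.1";']),
    "\t".join(["chr1", "StringTie", "transcript", "8000", "9500", "1000",
               "-", ".", 'gene_id "STRG.2"; transcript_id "STRG.2.1";']),
])

for tid, info in transcript_ends(example_gtf).items():
    print(tid, info)
```

Transcripts without a strand are skipped here on purpose, because their 5' and 3' ends are ambiguous.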
Thank you for your reply. Sadly, there is no reference annotation for the 5' and 3' UTRs; there is only CDS information. :(
StringTie should still work for the assembly.
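With a CDS-only annotation, one rough way to estimate the UTRs is to compare the span of each assembled transcript with the span of the CDS it contains. Below is a minimal Python sketch of that comparison; the function name `utr_extents` and the coordinates are hypothetical, and it assumes you have already matched each assembled transcript to its CDS. The values are genomic distances, so any intron inside a UTR is counted as well.

```python
def utr_extents(strand, tx_start, tx_end, cds_start, cds_end):
    """Genomic distance by which a transcript extends beyond its CDS.

    All coordinates are 1-based and inclusive, with start <= end,
    regardless of strand (the GTF convention).
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if not (tx_start <= cds_start <= cds_end <= tx_end):
        raise ValueError("CDS must lie within the transcript span")
    left = cds_start - tx_start   # bases before the CDS on the genome
    right = tx_end - cds_end      # bases after the CDS on the genome
    if strand == "+":
        return {"five_prime_utr": left, "three_prime_utr": right}
    # On the minus strand, the genomic right side is the 5' side.
    return {"five_prime_utr": right, "three_prime_utr": left}


# Hypothetical transcript spanning 1000-5000 with a CDS at 1200-4500.
print(utr_extents("+", 1000, 5000, 1200, 4500))
print(utr_extents("-", 1000, 5000, 1200, 4500))
```

Whether the stop codon is counted as part of the CDS depends on the annotation format, so the 3' value may be off by three bases.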