Align RNA-seq data to a custom list of exons?
6.1 years ago
Cumol ▴ 40

A recent publication investigated splice variants of a gene I am interested in (using SMRT sequencing) and described different and additional exons compared to what I find in NCBI or Ensembl.

I want to analyze splice variants and exon counts of this gene in my RNA-seq data, using the exons described in this publication. I have a file with each exon's number and sequence.

How do I align my RNA-seq data to this list of exons? I thought about taking the standard .gtf file from Ensembl and editing it to accommodate the exon changes. Is that the recommended way of doing it? And if so, how do I do it?

Thank you for your help!

RNA-Seq alignment exon • 1.5k views
6.1 years ago

Yes, you can edit the GTF file directly to add the exons of interest. Then use featureCounts to count the number of reads per exon, using -t exon -g exon_id (if there is an exon_id attribute in the GTF file).
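
For reference, the counting step suggested above could be scripted roughly as follows. This is only a sketch: the file names (custom.gtf, sample.bam, exon_counts.txt) and the helper name build_featurecounts_cmd are placeholders, not anything from this thread, and it assumes featureCounts is installed and on your PATH. It uses only the -a, -o, -t and -g options.

```python
import shlex


def build_featurecounts_cmd(gtf, bam, out):
    """Return the featureCounts argument list for per-exon counting.

    All three paths are hypothetical placeholders supplied by the caller.
    """
    return [
        "featureCounts",
        "-a", gtf,        # annotation file (the edited GTF)
        "-o", out,        # output count table
        "-t", "exon",     # only use rows whose feature type is 'exon'
        "-g", "exon_id",  # group counts by the exon_id attribute
        bam,              # aligned reads
    ]


cmd = build_featurecounts_cmd("custom.gtf", "sample.bam", "exon_counts.txt")
print(shlex.join(cmd))
# To actually run it:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the command as a list keeps it easy to loop over several BAM files; paired-end or strand-specific libraries would need additional options that are not shown here.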


What is the best way to edit the GTF file?

Would I remove the lines associated with the gene I am interested in and then add my custom lines?

I will probably need the exact start and end of each exon on the genome I am using for indexing, right?


awk?
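
The same filter-and-append edit can also be sketched in Python, if that is easier to read than an awk one-liner. This is a minimal sketch under several assumptions: the gene ID, chromosome, strand and coordinates are made-up placeholders, the function name replace_gene_exons is hypothetical, coordinates are 1-based and inclusive as in GTF, and the new exons are simply numbered by genomic start. It only swaps the exon rows of one gene and leaves gene/transcript rows untouched.

```python
def replace_gene_exons(gtf_lines, gene_id, chrom, strand, exons, source="custom"):
    """Drop the exon rows of `gene_id` and append custom exon rows.

    exons: list of (start, end) genome coordinates, 1-based inclusive.
    Returns the new list of GTF lines (without trailing newlines).
    """
    tag = f'gene_id "{gene_id}";'
    kept = []
    for line in gtf_lines:
        line = line.rstrip("\n")
        fields = line.split("\t")
        # A GTF data row has 9 tab-separated columns; column 3 is the feature type.
        is_exon = len(fields) == 9 and fields[2] == "exon"
        if is_exon and tag in fields[8]:
            continue  # remove the old exon annotation of this gene
        kept.append(line)
    # Append the custom exons, numbered in order of genomic start.
    for i, (start, end) in enumerate(sorted(exons), 1):
        attrs = (
            f'gene_id "{gene_id}"; transcript_id "{gene_id}_custom"; '
            f'exon_number "{i}"; exon_id "{gene_id}_custom_exon{i}";'
        )
        kept.append("\t".join(
            [chrom, source, "exon", str(start), str(end), ".", strand, ".", attrs]
        ))
    return kept


# Hypothetical usage with placeholder values:
# with open("annotation.gtf") as fh:
#     new_lines = replace_gene_exons(fh, "GENE1", "chr1", "+", [(100, 250), (500, 600)])
# with open("custom.gtf", "w") as out:
#     out.write("\n".join(new_lines) + "\n")
```

The appended rows end up at the bottom of the file, so the result is no longer coordinate-sorted; sort it afterwards if a downstream tool requires that.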


I agree with Nicolas. Adding custom transcript annotations to the GTF is the correct way forward. One thing to remember: if the custom exon sequences differ from the reference, add their FASTA sequence at the corresponding start positions on the chromosome/scaffold of interest in the genome .fa file.

6.0 years ago
Cumol ▴ 40

How do I turn my exon sequences into FASTA format? The only idea I had was to BLAST them against the target genome (using Ensembl) and then convert the output into GTF. But it doesn't seem to be that straightforward.

Is there a better option?
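
On the FASTA part specifically, the conversion is mechanical: each record is a ">" header line followed by the sequence. A minimal sketch, assuming the exon list can be read as (name, sequence) pairs; the function name exons_to_fasta, the exon names and the sequences below are made-up placeholders. The genomic coordinates for the GTF would still have to come from aligning these sequences to the genome, as described above.

```python
def exons_to_fasta(rows, width=60):
    """Convert (exon_name, sequence) pairs into FASTA text.

    Whitespace inside sequences is removed and bases are upper-cased;
    sequence lines are wrapped at `width` characters.
    """
    records = []
    for name, seq in rows:
        seq = "".join(seq.split()).upper()
        chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
        records.append("\n".join([f">{name}"] + chunks))
    return "\n".join(records) + "\n"


# Placeholder exons, wrapped at 4 bases per line for illustration.
fasta = exons_to_fasta([("exon1", "acgtacgt"), ("exon2", "GG CC")], width=4)
print(fasta, end="")
# >exon1
# ACGT
# ACGT
# >exon2
# GGCC
```

If the exon file is tab-separated with the exon number in the first column and the sequence in the second, the pairs could be read with line.rstrip("\n").split("\t") on each line.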
