7.1 years ago
mail2steff
I am trying to analyze alternative splicing events in five different samples. I've aligned the samples to the reference genome using TopHat and assembled the transcripts using Cufflinks, which gave me 5 different GTF files. I have to identify and count multi-exon genes in each sample. How can I get the number of multi-exon genes?
Have you looked at featureCounts? If you have 5 separate GTF files you can use them individually to get gene level counts.
No, I have not. Will it give me the number of multi-exon genes for each sample?
It is possible outside R. Btw, what is the cut-off for multi-exon genes? The following code is for genes with more than 1 exon.
Example gtf:
Output (you need to add gene symbol and exon number as the header yourself):
To get genes with one exon (2872 genes):
To get genes with more than one exon (25155 genes):
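For anyone who prefers a script, here is a minimal Python sketch of the same kind of counting. It is not the commands used above, just one way to do it under some assumptions: the GTF is tab-separated with `exon` in column 3 and a `gene_id "..."` attribute in column 9 (as Cufflinks writes it), and a gene counts as multi-exon when it has more than one distinct exon interval across all of its transcripts. The function names are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: attributes look like  gene_id "CUFF.1"; transcript_id "CUFF.1.1";
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def exons_per_gene(gtf_lines):
    """Map each gene_id to its number of distinct exon intervals."""
    exons = defaultdict(set)
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Keep only well-formed exon records.
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = GENE_ID_RE.search(fields[8])
        if match is None:
            continue
        # An exon shared by several transcripts is stored once,
        # because the set is keyed on (chrom, start, end, strand).
        interval = (fields[0], int(fields[3]), int(fields[4]), fields[6])
        exons[match.group(1)].add(interval)
    return {gene: len(intervals) for gene, intervals in exons.items()}


def split_by_exon_count(counts):
    """Return (single_exon_genes, multi_exon_genes) as sorted lists."""
    single = sorted(gene for gene, n in counts.items() if n == 1)
    multi = sorted(gene for gene, n in counts.items() if n > 1)
    return single, multi


def count_file(path):
    """Hypothetical helper: run the two steps on one GTF file."""
    with open(path) as handle:
        return split_by_exon_count(exons_per_gene(handle))
```

Calling `count_file` once per sample GTF and taking `len()` of each returned list gives the single-exon and multi-exon gene counts for that sample. If you would rather call a gene multi-exon only when a single transcript has more than one exon, group by `transcript_id` instead of `gene_id` before counting.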
Thank you so much for the answer. I'll try this. And there is no cut-off for the number of exons in a single gene.
No problem. Next time, please post example data.