I have RNA-seq data (Homo sapiens, single-end, 75 bp, Illumina) and want to look for general mis-splicing, specifically intron retention (IR), in a given condition vs. control. For that I tried MISO with the supplied GFF3 files. However, these contain only annotated IR events, whereas I want to look at all introns in the genome. The problem is that I am struggling to create an appropriate GFF3 file for MISO. The supplied IR events GFF3 looks like this:
head -20 RI.hg19.revised.gff3
chr1 RI gene 17233 17742 . - . ID=chr1:17601:17742:-@chr1:17233:17364:-;Name=chr1:17601:17742:-@chr1:17233:17364:-
chr1 RI mRNA 17233 17742 . - . ID=chr1:17601:17742:-@chr1:17233:17364:-.A;Parent=chr1:17601:17742:-@chr1:17233:17364:-
chr1 RI mRNA 17233 17742 . - . ID=chr1:17601:17742:-@chr1:17233:17364:-.B;Parent=chr1:17601:17742:-@chr1:17233:17364:-
chr1 RI exon 17233 17742 . - . ID=chr1:17601:17742:-@chr1:17233:17364:-.A.withRI;Parent=chr1:17601:17742:-@chr1:17233:17364:-.A
chr1 RI exon 17601 17742 . - . ID=chr1:17601:17742:-@chr1:17233:17364:-.B.up;Parent=chr1:17601:17742:-@chr1:17233:17364:-.B
chr1 RI exon 17233 17364 . - . ID=chr1:17601:17742:-@chr1:17233:17364:-.B.dn;Parent=chr1:17601:17742:-@chr1:17233:17364:-.B
chr1 RI gene 17233 17742 . - . ID=chr1:17606:17742:-@chr1:17233:17368:-;Name=chr1:17606:17742:-@chr1:17233:17368:-
chr1 RI mRNA 17233 17742 . - . ID=chr1:17606:17742:-@chr1:17233:17368:-.A;Parent=chr1:17606:17742:-@chr1:17233:17368:-
chr1 RI mRNA 17233 17742 . - . ID=chr1:17606:17742:-@chr1:17233:17368:-.B;Parent=chr1:17606:17742:-@chr1:17233:17368:-
chr1 RI exon 17233 17742 . - . ID=chr1:17606:17742:-@chr1:17233:17368:-.A.withRI;Parent=chr1:17606:17742:-@chr1:17233:17368:-.A
chr1 RI exon 17606 17742 . - . ID=chr1:17606:17742:-@chr1:17233:17368:-.B.up;Parent=chr1:17606:17742:-@chr1:17233:17368:-.B
chr1 RI exon 17233 17368 . - . ID=chr1:17606:17742:-@chr1:17233:17368:-.B.dn;Parent=chr1:17606:17742:-@chr1:17233:17368:-.B
chr1 RI gene 17606 18061 . - . ID=chr1:17915:18061:-@chr1:17606:17742:-;Name=chr1:17915:18061:-@chr1:17606:17742:-
chr1 RI mRNA 17606 18061 . - . ID=chr1:17915:18061:-@chr1:17606:17742:-.A;Parent=chr1:17915:18061:-@chr1:17606:17742:-
chr1 RI mRNA 17606 18061 . - . ID=chr1:17915:18061:-@chr1:17606:17742:-.B;Parent=chr1:17915:18061:-@chr1:17606:17742:-
chr1 RI exon 17606 18061 . - . ID=chr1:17915:18061:-@chr1:17606:17742:-.A.withRI;Parent=chr1:17915:18061:-@chr1:17606:17742:-.A
chr1 RI exon 17915 18061 . - . ID=chr1:17915:18061:-@chr1:17606:17742:-.B.up;Parent=chr1:17915:18061:-@chr1:17606:17742:-.B
chr1 RI exon 17606 17742 . - . ID=chr1:17915:18061:-@chr1:17606:17742:-.B.dn;Parent=chr1:17915:18061:-@chr1:17606:17742:-.B
chr1 RI gene 14407 16765 . - . ID=chr1:14970:16765:-@chr1:14407:14829:-;Name=chr1:14970:16765:-@chr1:14407:14829:-
chr1 RI mRNA 14407 16765 . - . ID=chr1:14970:16765:-@chr1:14407:14829:-.A;Parent=chr1:14970:16765:-@chr1:14407:14829:-
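To make the target layout explicit: judging from the sample above, each event is six lines, namely one gene, two mRNAs (.A with the retained intron, .B spliced), one exon spanning both exons plus the intron (.A.withRI), and the two flanking exons (.B.up and .B.dn). Below is a minimal Python sketch of that layout for a single pair of adjacent exons. The function name `ri_event_gff3` is made up for illustration, the plus-strand ID order (upstream exon first) is my assumption since the sample only shows minus-strand events, and I have not confirmed that MISO accepts events built this way. Columns are tab-separated, as GFF3 requires.

```python
def ri_event_gff3(chrom, strand, exon_a, exon_b, source="RI"):
    """Return the six GFF3 lines describing one retained-intron event.

    exon_a, exon_b: (start, end) tuples, 1-based inclusive, for two
    adjacent exons of the same transcript, given in any order.
    Hypothetical helper; layout inferred from the sample records only.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    left, right = sorted([tuple(exon_a), tuple(exon_b)])
    if left[1] >= right[0]:
        raise ValueError("exons overlap or touch; there is no intron between them")

    # Upstream/downstream are relative to the direction of transcription.
    up, dn = (left, right) if strand == "+" else (right, left)

    # Event ID: upstream exon, '@', downstream exon (as in the sample).
    event_id = "{c}:{u0}:{u1}:{s}@{c}:{d0}:{d1}:{s}".format(
        c=chrom, s=strand, u0=up[0], u1=up[1], d0=dn[0], d1=dn[1]
    )
    start, end = left[0], right[1]  # full span of the event

    def row(feature, s, e, attrs):
        fields = [chrom, source, feature, str(s), str(e), ".", strand, ".", attrs]
        return "\t".join(fields)

    return [
        row("gene", start, end, "ID={0};Name={0}".format(event_id)),
        row("mRNA", start, end, "ID={0}.A;Parent={0}".format(event_id)),
        row("mRNA", start, end, "ID={0}.B;Parent={0}".format(event_id)),
        row("exon", start, end, "ID={0}.A.withRI;Parent={0}.A".format(event_id)),
        row("exon", up[0], up[1], "ID={0}.B.up;Parent={0}.B".format(event_id)),
        row("exon", dn[0], dn[1], "ID={0}.B.dn;Parent={0}.B".format(event_id)),
    ]


if __name__ == "__main__":
    # Exon coordinates taken from the first event in the sample above.
    for line in ri_event_gff3("chr1", "-", (17233, 17364), (17601, 17742)):
        print(line)
```

Called on the exon coordinates of the first sample event, this gives the same six records as the supplied file. What I am missing is the genome-wide part: getting every adjacent exon pair per transcript out of the annotation and feeding it through something like this.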
1) Any suggestions on how to create something like this for all (RefSeq) UCSC-annotated exons? I've tried a few things, but nothing has worked so far.
2) Is there any other software or R package that could be used to analyse IR?
Cheers.