7.9 years ago
BehMah
Hi, I have mapped my RNA-seq data with TopHat and then TopHat-Fusion to identify circRNAs, and I am now looking for an R/Perl/Python script to calculate RPM (circRNA reads per million mapped reads). The mapped-read count should be the mean of the TopHat and TopHat-Fusion mapped reads.
I have the identified circRNAs (a BED file) for each sample. Sorry, I am new to bioinformatics, and your help is appreciated. :)
What is your goal? To compare circRNAs between samples? RNA vs. circRNA in the same sample? Something else?
I am doing differential expression analysis of circRNAs across some samples, and I am trying to normalise the circRNA counts to the mapped reads from TopHat and TopHat-Fusion.
Are you comparing the circRNAs to their native form?
Hi, there is a simple Perl script I wrote for calculating RPKM. You can adjust the code for RPM by removing the 'transcript_length' variable ($len_col=$ARGV[2]).
And also remove it from the final calculation,
Replace,
With
Usage with test data after editing:
Hi EagleEye,
Thank you sooooo much for your help, I will run it and see how it will go.
Since my TopHat and TopHat-Fusion outputs and the output of the circ_finder tool (which gives the circRNA list) are separate files, I assume I should put the mean number of TopHat and TopHat-Fusion mapped reads in $libarray[$i]? Thanks :)
If you can post a few lines of your data, I will be able to tell.
I have the alignment information for both TopHat and TopHat-Fusion, so I can get the mean number of mapped reads for each sample. I also have the circRNA read count for each sample (see below). I need this RPM: circRNA reads (from the circRNA file) per million of the mean mapped reads (from TopHat and TopHat-Fusion), for 40 samples, for expression analysis.
The circRNA file (below) is a BED file (chromosome, genome coordinates, circRNA name, read number, host gene) that I got by running the TopHat-Fusion output through a circRNA finder script:
chrm  start    end      circ_name       read num  Host gene
8     2134780  2159644  circular_RNA_1  15        DHR
5     2134780  2293345  circular_RNA_2  30        CSH
12    2821949  2829687  circular_RNA_3  29        ZFY
6     4924929  4925500  circular_RNA_4  21        PCD
11    6863844  6911166  circular_RNA_5  10        TBL
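For a file laid out like this, the RPM calculation described above could be sketched in Python roughly as follows. This is only a minimal sketch, not the Perl script mentioned earlier in the thread: the function names `rpm` and `rpm_from_bed` are made up for illustration, and the two mapped-read totals (20,000,000 for TopHat and 10,000,000 for TopHat-Fusion) are hypothetical placeholders that would have to be replaced by the real totals for each sample.

```python
import io


def rpm(circ_reads, tophat_mapped, fusion_mapped):
    """Reads per million, scaled by the mean of the two mapped-read totals."""
    mean_mapped = (tophat_mapped + fusion_mapped) / 2
    if mean_mapped <= 0:
        raise ValueError("mapped-read totals must be positive")
    return circ_reads * 1_000_000 / mean_mapped


def rpm_from_bed(handle, tophat_mapped, fusion_mapped):
    """Yield (circ_name, host_gene, rpm) for each data row.

    Assumes whitespace-separated columns in the order shown in the post:
    chrm, start, end, circ_name, read number, host gene.
    """
    for line in handle:
        fields = line.split()
        # Skip blank lines and the header (its second field is not a number).
        if len(fields) < 6 or not fields[1].isdigit():
            continue
        _chrom, _start, _end, name, reads, gene = fields[:6]
        yield name, gene, rpm(int(reads), tophat_mapped, fusion_mapped)


# Two rows copied from the post; in practice, open the sample's BED file instead.
sample_bed = io.StringIO(
    "chrm start end circ_name read num Host gene\n"
    "8 2134780 2159644 circular_RNA_1 15 DHR\n"
    "5 2134780 2293345 circular_RNA_2 30 CSH\n"
)

# HYPOTHETICAL mapped-read totals for one sample (not from the thread).
TOPHAT_MAPPED = 20_000_000
FUSION_MAPPED = 10_000_000

results = list(rpm_from_bed(sample_bed, TOPHAT_MAPPED, FUSION_MAPPED))
for name, gene, value in results:
    print(f"{name}\t{gene}\t{value:.3f}")
```

For 40 samples, the same function could be called once per sample with that sample's own pair of mapped-read totals, writing one RPM column per sample.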