If you used a GTF file as the reference annotation:
1) You can just convert the annotation into table format.
Example: How do I get the gene annotation for the latest version (GRCh38)?
2) Import your GTF-converted table (columns: Geneid GeneSymbol Chromosome Start End Class Strand Length) and your matrix from featureCounts (columns: Geneid sample1expr Sample2expr Sample3expr) into R and use 'merge' on the 'Geneid' column.
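# featureCounts matrix: Geneid plus one count column per sample
x <- read.table("featurecounts.matrix", header=TRUE, sep="\t")
# GTF-derived annotation table, including the 'Class' column
annotation <- read.table("annotation.txt", header=TRUE, sep="\t")
# Attach the annotation columns to each gene's counts
featurecounts_annotated <- merge(annotation, x, by='Geneid')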
3) Then sum the counts in each sample column by the RNA class you are interested in.
Two-step:
### Two-step 1) sum the reads by column class
sample1_countSum <- aggregate(cbind(featurecounts_annotated$sample1expr) ~ Class, data = featurecounts_annotated, sum)
### Two-step 2) calculate percentage
sample1_countSum[,"percentage"] <- ( sample1_countSum$V1/sum( sample1_countSum$V1))*100
Single-step:
sample1_result <- aggregate((cbind(featurecounts_annotated$sample1expr)/sum(featurecounts_annotated$sample1expr))*100 ~ Class, data = featurecounts_annotated, sum)
The final output lists each RNA class with the corresponding percentage of mapped reads from sample1.
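If you want the percentages for every sample at once instead of repeating step 3 per column, something like the following should work. This is only a rough sketch: the toy data frames and their values are made up for illustration, and the column names simply follow the ones used above. It uses base R's aggregate and prop.table rather than the formula interface shown earlier.

```r
# Toy stand-ins for the two tables (all values are invented for illustration).
annotation <- data.frame(
  Geneid = c("g1", "g2", "g3", "g4"),
  Class  = c("protein_coding", "protein_coding", "lncRNA", "miRNA"),
  stringsAsFactors = FALSE
)
x <- data.frame(
  Geneid      = c("g1", "g2", "g3", "g4"),
  sample1expr = c(50, 30, 15, 5),
  Sample2expr = c(10, 10, 60, 20),
  Sample3expr = c(0, 40, 40, 20),
  stringsAsFactors = FALSE
)

# Same merge as in step 2.
featurecounts_annotated <- merge(annotation, x, by = "Geneid")

# Sum every sample column per Class in a single call.
sample_cols <- c("sample1expr", "Sample2expr", "Sample3expr")
class_sums <- aggregate(featurecounts_annotated[sample_cols],
                        by = list(Class = featurecounts_annotated$Class),
                        FUN = sum)

# Turn each column into a percentage of that sample's total counts.
pct <- prop.table(as.matrix(class_sums[sample_cols]), margin = 2) * 100
class_pct <- data.frame(Class = class_sums$Class, pct)
print(class_pct)
```

Each sample column in class_pct should add up to 100, which is a quick sanity check that no class was dropped along the way.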
Hi EagleEye, I've already generated a table in the terminal that looks like the following:
saved as a .txt file. Would I still have to carry out 2), or could I go straight to 3)?
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment belongs under @EagleEye's answer.
You mean you got a matrix like this,
Yes, exactly, that's the matrix I've got!
Treat this matrix as 'featurecounts.matrix' in my example and follow the other steps I mentioned.