Hi, I'm a beginner at analyzing genomic data with R, and I have a question I don't know how to solve. I'm using Rsamtools to analyze mRNA-seq data, and now that I have the reads covering each gene, I would like to add each gene's chromosome, start and stop. I downloaded the gene information from the UCSC browser as a RangeList using these commands:
library(GenomicFeatures)
txdb = makeTranscriptDbFromUCSC(genome='hg19',tablename='ensGene')
I have tried to coerce the RangeList into a data frame to get this "extra data", but the number of observations changes, so I don't know how to extract the information I need for each gene from the RangeList.
Right now I have this data frame:
Gene Reads
AGR37373.F06 5
AGR37273.F04 0
And I would like to have this:
Gene Chr Start Stop Reads
AGR37373.F06 1 20 150 5
AGR37273.F04 1 300 410 0
I'd be very grateful if someone could give me any advice about how to solve this problem.
Thanks!
Did you try merging the two data frames? Read ?merge and merge on the gene ID.
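To make the merge suggestion concrete, here is a minimal sketch in base R. The `coords` table is hypothetical: its values are copied from the desired output in the question, and the column names `Chr`, `Start` and `Stop` are assumptions. With a real annotation you would first need a data frame with one row per gene, for example (untested here) something like `as.data.frame(genes(txdb))`.

```r
# Read counts per gene, as shown in the question.
counts <- data.frame(
  Gene  = c("AGR37373.F06", "AGR37273.F04"),
  Reads = c(5, 0),
  stringsAsFactors = FALSE
)

# Hypothetical coordinates table (one row per gene); values taken from
# the desired output in the question, not from a real annotation.
coords <- data.frame(
  Gene  = c("AGR37373.F06", "AGR37273.F04"),
  Chr   = c("1", "1"),
  Start = c(20, 300),
  Stop  = c(150, 410),
  stringsAsFactors = FALSE
)

# Join on the shared gene identifier. all.y = TRUE keeps every gene that
# has a read count, even if no coordinates were found for it (NA then).
annotated <- merge(coords, counts, by = "Gene", all.y = TRUE)

# Put the columns in the order asked for in the question.
annotated <- annotated[, c("Gene", "Chr", "Start", "Stop", "Reads")]
print(annotated)
```

Note that merge() sorts the result by the key column by default, so the row order may differ from the input. Also, if the coordinates table has several rows per gene (for example one per transcript), the merged result will have several rows per gene too, which may be why the number of observations changed for you.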