coverage for unique transcript and df process
7.5 years ago
Lila M ★ 1.3k

Hi guys, I would like to process a data frame in R

Chr start   end strand  transcript  Length  number_bp_overlap
chr1    879583  882140  -   uc031pkq    2858    297
chr1    1571100 1647617 -   uc001ags    76818   270
chr1    33117259    33151812    +   uc010ohk    34854   200
chr1    33117259    33151812    +   uc010ohk    34854   200
chr1    33117259    33151812    +   uc010ohk    34854   211
chr1    39670723    39748740    +   uc010oit    78318   386

What I want to do is calculate the % coverage for each transcript. For each unique transcript (e.g. transcript uc010ohk), I need to sum the number_bp_overlap values (200+200+211) and create a new data frame that stores each unique transcript with its total number_bp_overlap.

What I am trying is:

library(plyr)  # provides ddply

coverage <- ddply(df, "transcript", transform, coverage=sum(number_bp_overlap))
coverage <- subset(coverage, !duplicated(transcript))

but it is not working at all. As I am new to R, any clues about how I can do this quickly?

Thanks!

R data frame coverage • 1.7k views

I think you could use the dplyr library.


Yes, I've just edited my question, but the code is not working at all because it removes the duplicates.


The !duplicated(transcript) line is what actually removes the duplicates: it keeps only the first row for each transcript.


OK, and the function also sorts the df alphabetically by transcript (I thought I had lost some data, but it is just the way it is sorted). Do you know of any option to keep the initial order?

Thanks!
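
One possible way to keep the original order, sketched in base R: fix the factor levels to the order in which each transcript first appears, then sum per level. The data frame below is typed in from the example in the question (only the two columns needed), and the names `groups`, `totals` and `total_bp_overlap` are made up for illustration.

```r
# Example data copied from the question (transcript and overlap columns only)
df <- data.frame(
  transcript = c("uc031pkq", "uc001ags", "uc010ohk",
                 "uc010ohk", "uc010ohk", "uc010oit"),
  number_bp_overlap = c(297, 270, 200, 200, 211, 386),
  stringsAsFactors = FALSE
)

# Levels follow first appearance in df, not alphabetical order
groups <- factor(df$transcript, levels = unique(df$transcript))

# Sum the overlaps within each transcript; results follow the level order
totals <- tapply(df$number_bp_overlap, groups, sum)

coverage <- data.frame(
  transcript = names(totals),
  total_bp_overlap = as.vector(totals),
  stringsAsFactors = FALSE
)
print(coverage)
```

Here uc031pkq stays first, whereas an alphabetical sort would put uc001ags first.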

7.5 years ago

You might be looking for this; if not, please add a reproducible input and the expected output. Thanks!

library(dplyr)

# Toy data: group "b" appears twice
df = data.frame(x = c("a","b","b"), y = 1:3)

# One row per group, with y summed within each group
df %>% group_by(x) %>% summarise(y = sum(y))
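
Applied to the columns from the question, the same pattern might look like the sketch below. It assumes that "% coverage" means 100 * (summed number_bp_overlap) / Length, and that the overlaps of one transcript do not cover the same bases twice (otherwise the sum overcounts). The names `transcript_length`, `total_bp_overlap` and `pct_coverage` are made up for illustration.

```r
library(dplyr)

# Example data copied from the question (only the columns used here)
df <- data.frame(
  transcript = c("uc031pkq", "uc001ags", "uc010ohk",
                 "uc010ohk", "uc010ohk", "uc010oit"),
  Length = c(2858, 76818, 34854, 34854, 34854, 78318),
  number_bp_overlap = c(297, 270, 200, 200, 211, 386),
  stringsAsFactors = FALSE
)

coverage <- df %>%
  group_by(transcript) %>%
  summarise(
    transcript_length = first(Length),          # Length is constant within a transcript
    total_bp_overlap  = sum(number_bp_overlap), # e.g. 200 + 200 + 211 for uc010ohk
    pct_coverage      = 100 * total_bp_overlap / transcript_length
  )

print(coverage)
```

Note that summarise() returns the groups sorted by transcript, so the row order differs from the input.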
