Entering edit mode
2.6 years ago
msening
•
0
Hi, I am using multiple Bam files to get the coverage for each of them.
file1 <- BamFile("bamfile.sorted.bam")
x<-readGAlignments(file1)
xcov<-coverage(x)
And then I wanted to draw the coverage plots:
lapply(xcov, function(x) {
plot(as.numeric(x), type="l")
})
But I have up to 3000 segments in xcov
(the RleList object), so I would like to filter them and keep the ones that have more than 1 run. For example, I don't want these in my coverage plots:
RleList of length 3138
$seg1
integer-Rle of length 10731 with 1 run
Lengths: 10731
Values : 0
$seg2
integer-Rle of length 2789 with 1 run
Lengths: 2789
Values : 0
I tried to do this:
lapply(xcov, function(x) {
plot(as.numeric(x>0), type="l")
})
But it didn't do what I wanted it to do, is there any other way to fix this?
Hi, I tried it but it gave me an error like this:
I tried your xcov and it works, but my xcov is different, which is probably because I read it from a bam file. I just don't really understand the error and wonder if you could help me with it?
True. It is a package specific derivative data type. But actually it is even easier with this than a generic RLE list.
I am using
sapply
to apply a function to each element ofxcov.
I am converting the RLE it back to the numeric vector (notice that the function used here isas.numeric
, which only works because it is redefined by the package, usually you would need to useinverse.rle
for this). Then I calculate the sum, which is 0 for uncovered segments (having no runs) and a positive number for all others. This information needs to be translated to a logical TRUE (keep) and FALSE (delete) pattern.as.logical
conveniently does exactly this, but asum(as.numeric(x)) > 0
would do exactly the same. The output of this is then directly used to subset the original dataxcov[...]
.