Question

Diffbind dba.count doesn't give FRiP score

2.4 years ago

gyspace • 0

Hi, I am completely new to DiffBind, R, and programming in general. I want to use DiffBind to analyze peaks called with MACS2. When I run dba.count(), I don't get a FRiP score at all. I searched online for this problem, but none of the suggestions I found worked.

Here is the format of sample.csv:

SampleID Replicate bamReads Peaks PeakCaller

Here is the code:

library(DiffBind)

sample1 <- dba(sampleSheet="Z:/sample.csv")
sample1

10 Samples, 931 sites in matrix (15048 total):
ID Replicate Intervals
1  cfDNA21         1      2425
2  cfDNA22         1      2769
3  cfDNA23         1      1887
4  cfDNA24         1       587
5  cfDNA25         1      1269
6  cfDNA26         1      1082
7  cfDNA27         1       994
8  cfDNA28         1      2529
9  cfDNA29         1      2427
10 cfDNA30         1      1637

tam.counts <- dba.count(sample1)
(Sample: Z:/cfDNA30.bam125    Reads will be counted as Paired-end.
 Warning messages:
 1: In serialize(data, node$con) : 'package:stats' may not be available when loading)

tam.counts

10 Samples, 443 sites in matrix:

ID Replicate     Reads
1  cfDNA21         1 1267699.0
2  cfDNA22         1  751270.0
3  cfDNA23         1  410182.5
4  cfDNA24         1  264162.5
5  cfDNA25         1  239933.0
6  cfDNA26         1  172226.0
7  cfDNA27         1  196008.0
8  cfDNA28         1  753888.5
9  cfDNA29         1 1561326.5
10 cfDNA30         1  340597.0

There is no FRiP column, and I don't know why. Someone else ran into the same problem, but it was never resolved. Can anyone help?

Diffbind FRiP • 3.5k views

ADD COMMENT • link updated 2.3 years ago by Rory Stark ★ 2.1k • written 2.4 years ago by gyspace • 0

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.

ADD REPLY • link 2.4 years ago by Ram 45k

Thank you very much! I will do that next time.

ADD REPLY • link 2.4 years ago by gyspace • 0

score 2 · Answer 1 · 2023-04-04

I've had a look at some sample data and I can see what is happening.

The issue is that only a tiny fraction of reads overlap any consensus peak. All of the calculated FRiP scores round down to 0.00, so they are not reported.
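To make the rounding effect concrete, here is a minimal base-R sketch. FRiP is the number of reads overlapping consensus peaks divided by the total number of reads. The read counts below are made-up illustrative numbers, not values from the poster's data, and `frip()` is a hypothetical helper, not a DiffBind function.

```r
# Hypothetical helper: fraction of reads in peaks (FRiP).
frip <- function(reads_in_peaks, total_reads) {
  reads_in_peaks / total_reads
}

# Illustrative numbers only (assumed, not taken from the thread's data).
total_reads    <- c(s1 = 4.0e7, s2 = 2.5e7)
reads_in_peaks <- c(s1 = 3000,  s2 = 1200)

raw     <- frip(reads_in_peaks, total_reads)
rounded <- round(raw, 2)

print(raw)      # tiny but non-zero fractions
print(rounded)  # both become 0 at two decimal places
```

Inspecting the unrounded values is what distinguishes "no reads in peaks at all" from "very few reads in peaks".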

In your original post, we can see that while there are more than 15,000 peaks, fewer than 1,000 of them overlap in at least 2 of the 10 samples, indicating that there is not much consistent signal in these samples. Only a few reads overlap the peaks in any one sample; more than 99.99% of the reads in each BAM file do not overlap any of the consensus peaks.
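As a quick sanity check on those numbers, the share of peaks that made it into the matrix can be computed from the two figures printed in the original post ("931 sites in matrix (15048 total)"). This is plain arithmetic in base R; the variable names are only for illustration.

```r
total_peaks  <- 15048  # "15048 total" in the printed DBA object
shared_peaks <- 931    # "931 sites in matrix"

# Fraction of all peaks that are shared between samples.
shared_fraction <- shared_peaks / total_peaks
print(round(100 * shared_fraction, 1))  # percentage of peaks that are shared
```

So only about 6% of the called peaks are found in more than one sample.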

A number of things could be causing this. I ran a quick ChIPQC report on the sample data, and it showed that a high proportion of the reads in the BAM file have very low MAPQ scores: about 70% of the scores are zero (usually indicating multi-mapped reads, but possibly also low-quality sequencing), and 75% overall are filtered out given the default mapQCth=15. Would you expect a very high rate of non-unique mapping in this dataset? Also, no reads are marked as duplicates, which means either that no duplicate marking was done or that duplicates have been removed in some way (as there are usually at least some duplicates).
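The effect of the mapping-quality threshold can be sketched in base R. The MAPQ vector below is synthetic, built only to mirror the proportions quoted above (70% zeros, 75% below the threshold); it is not read from any BAM file, and the sketch assumes reads with MAPQ below the threshold are the ones dropped.

```r
# Synthetic MAPQ scores mirroring the proportions described above:
# 70% zeros, 5% between 1 and 14, 25% at or above the threshold.
mapq <- c(rep(0L, 70), rep(10L, 5), rep(30L, 25))

mapq_threshold <- 15L  # the default mapQCth value mentioned above

# Assumption: reads with MAPQ below the threshold are filtered out.
kept <- mapq >= mapq_threshold

frac_zero     <- mean(mapq == 0)
frac_filtered <- mean(!kept)

print(frac_zero)      # share of reads with MAPQ 0
print(frac_filtered)  # share of reads removed by the threshold
```

As far as I know, DiffBind takes this threshold from the `config$mapQCth` field of the DBA object, so it can be lowered before calling dba.count() if multi-mapped reads are expected; check the documentation for your version before relying on that.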

There could also be an issue upstream in the peak calling phase, or it could just be that there is no real enrichment in these samples and they are essentially reporting a "background" (like we would expect to see in an Input control).

Hope this helps!

score 1 · Answer 2 · 2023-03-30

2.4 years ago

Rory Stark ★ 2.1k

It would be helpful to know which version of DiffBind you are running (from the output of sessionInfo()). There have been some fixes in this area in more recent versions.
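If the full sessionInfo() output is too long to paste, a shorter way to report just the package version is sketched below. It uses only `requireNamespace()` and `utils::packageVersion()`; the variable name `diffbind_version` is arbitrary.

```r
# Report the installed DiffBind version, or NA if it is not installed.
if (requireNamespace("DiffBind", quietly = TRUE)) {
  diffbind_version <- as.character(utils::packageVersion("DiffBind"))
} else {
  diffbind_version <- NA_character_
}
print(diffbind_version)
```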

ADD COMMENT • link 2.4 years ago by Rory Stark ★ 2.1k

DiffBind 3.8.4. It should be the latest version.

ADD REPLY • link 2.4 years ago by gyspace • 0

I can't reproduce this in the current version.

One possibility is that all of the FRiP values are equal to 1, meaning all of your reads overlap at least one consensus peak. In this case, the FRiP values are not reported when you print out the DBA object. You can check for this as follows:

dba.show(tam.counts)$FRiP

The value should either be NULL or a vector of length 10 containing all 1 values. If it is all 1s, that is the issue there. If it is NULL, I can give you some ways to calculate the FRiP yourself, and if you can give me access to some of your data (samplesheet and peak/bam files for 2-3 samples), I can track down what is happening.
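The check described above can be wrapped in a small function. This is only a sketch: `interpret_frip()` is a hypothetical helper, not part of DiffBind, and the inputs at the bottom are stand-ins so that it runs without any data.

```r
# Hypothetical helper: classify what dba.show(...)$FRiP returned.
interpret_frip <- function(frip) {
  if (is.null(frip)) {
    "FRiP missing (NULL)"
  } else if (isTRUE(all(frip == 1))) {
    "all FRiP values equal 1"
  } else {
    "FRiP values present"
  }
}

# With a real counted object this would be called as:
#   interpret_frip(dba.show(tam.counts)$FRiP)
# Stand-in inputs so the sketch runs without DiffBind:
print(interpret_frip(NULL))
print(interpret_frip(rep(1, 10)))
print(interpret_frip(c(0.12, 0.08)))
```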

ADD REPLY • link 2.4 years ago by Rory Stark ★ 2.1k

Thank you! I tried the command and it returned NULL. How can I send you my sample data?

ADD REPLY • link 2.4 years ago by gyspace • 0

You can make it available on a server (such as Dropbox, Google Drive, or iCloud) and email me a link.

ADD REPLY • link 2.4 years ago by Rory Stark ★ 2.1k

I found your email address at https://bioinfotraining.bio.cam.ac.uk/staff/rory-stark-phd. Is that the right one? I have sent you a Google Drive link. If the link is unavailable, please let me know.

ADD REPLY • link 2.4 years ago by gyspace • 0