Bedtools intersect returns 0, intersections present on IGV
2
0
Entering edit mode
7.0 years ago
kerrypop • 0

Hello,

I am trying to use bedtools to find intersections between H3K4me1 peaks and my regions of interest. Though I can see overlap on IGV, intersect returns 0 without any error messages.

Here is command I'm using and screenshot from IGV:

bedtools intersect -a gene.ROI -b wgEncodeBroadHistoneHuvecH3k4me1StdPk.broadPeak -u | wc -l

IGV screenshot: https://imgur.com/a/JYUzp

Is there something I'm missing?

Thank you, Kerry

bedtools ChIP-Seq • 3.0k views
ADD COMMENT
1
Entering edit mode

do file use the same chromosome notation ? are the files sorted ? are both files BED files ?

ADD REPLY
0
Entering edit mode

Yes, same chromosome notation. Files are not sorted - I've done this analysis before and didn't sort them. Do you think sorting would overcome this? The ROI file is just a document, created using nano in terminal. The other is a broad peak file.

ADD REPLY
0
Entering edit mode

The ROI file is just a document, created using nano in terminal.

Check columns are tab separated and there are no funny, non-printable characters. If you do something like:

cat -vet gene.ROI | head

You should see columns separated by ^I, e.g.:

chr1^I67208778^I67216822^INM_032291_utr3_24_0_chr1_67208779_f^I0^I+$
chr1^I67208778^I67216822^INM_001308203_utr3_21_0_chr1_67208779_f^I0^I+$
chr1^I33585783^I33586132^INM_001301825_utr3_8_0_chr1_33585784_f^I0^I+$
chr1^I8404073^I8404227^INM_001080397_utr3_8_0_chr1_8404074_f^I0^I+$
ADD REPLY
0
Entering edit mode

I only used chromosome coordinates but don't see any extraneous characters

cat -vet gene.ROI | head
Chr6^I11711888^I11781280$
Chr6^I35702809^I35718690$
Chr1^I162349520^I162358608$
ADD REPLY
0
Entering edit mode

It looks like the dollar sign at the end of each line might be causing trouble. You can use this command to remove the last character from each line:

sed 's/.$//' gene.ROI > geneROI.nolast
ADD REPLY
1
Entering edit mode

That is because of options (-vet) used for cat command.

-e     equivalent to -vE
-E, --show-ends
              display $ at end of each line
-t     equivalent to -vT

-T, --show-tabs
          display TAB characters as ^I
-v, --show-nonprinting
              use ^ and M- notation, except for LFD and TAB
ADD REPLY
1
Entering edit mode

Your IGV screenshot seems to show the entire genome. It may be that once you zoom in into one of the ROI genes there is actually no intersection.

ADD REPLY
0
Entering edit mode

Here's zoomed in screenshot of the first ROI. There should be 4 intersections. https://imgur.com/a/lsiEQ

ADD REPLY
2
Entering edit mode
7.0 years ago

It looks like your chromosome notation is NOT identical. Your IGV screenshot says 'chr6' but your gene.ROI says 'Chr6'. Linux is case-sensitive.

ADD COMMENT
0
Entering edit mode

That did it - thank you!

ADD REPLY
0
Entering edit mode
7.0 years ago

I see you have Chr6, Chr1 instead of chr6, chr1 (lowercase c). That may be the problem.

ADD COMMENT

Login before adding your answer.

Traffic: 1739 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6