Hi y'all, I'm new to BEDtools and this type of data analysis in general, and I'm having a lot of trouble getting going.
I have two Excel files that I saved as .txt files and then changed to .bed files and put in the bedtools folder I'm using. They each have a column for chromosome number, start site, and end site. These are listed at the top but are commented out with #.
I can use bedtools and follow along with the tutorial and intersect will work. However, when I try to use my own files ("bedtools intersect -a EuarchCons6.bed -b PbxEmx6.bed | head -5") the terminal takes a minute to load and then displays nothing, as if there's no overlap or something, but I'm sure there is. I even created a test file with made up sequences that I knew overlapped with the first several regions in one of the files and it returned nothing.
I'm sure I'm probably missing something simple here, but if anyone could help me understand why intersect doesn't appear to be working that'd be really helpful. Thanks so much.
Hi, a couple of checks: 1) Are the chromosome names in both BEDs in the same format? Sometimes different genome releases of the same organism use different chromosome name formats. 2) I am not sure if it's necessary, but are the BEDs coordinate-sorted? 3) BED files are tab-delimited. I reckon you are not using a text editor. Ensure that when saving the Excel file as .txt, you choose the tab-delimited .txt option.
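Checks 1 and 3 can be run from the terminal. This is only a rough sketch: the demo file names are made up, so substitute your own .bed files, and the two tiny files are built on purpose with mismatched names ("chr1" vs "1") to show what a failure looks like.

```shell
# Build two tiny demo files (hypothetical names); substitute your own .bed files.
printf 'chr1\t100\t200\nchr2\t300\t400\n' > demo_a.bed
printf '1\t150\t250\n2\t350\t450\n' > demo_b.bed

# Check 1: list the distinct chromosome names in each file.
# They must match exactly; "chr1" and "1" are treated as different chromosomes.
cut -f1 demo_a.bed | sort -u > names_a.txt
cut -f1 demo_b.bed | sort -u > names_b.txt

# comm -12 prints only the names present in both lists.
# Empty output means the two files share no chromosome names at all.
comm -12 names_a.txt names_b.txt > shared_names.txt
cat shared_names.txt

# Check 3: count lines with fewer than 3 tab-separated fields.
# Anything above 0 suggests the file is not properly tab-delimited.
awk -F'\t' 'NF < 3 { bad++ } END { print bad + 0 }' demo_a.bed > bad_lines.txt
cat bad_lines.txt
```

If the list of shared names comes back empty for your real files, no interval can overlap, whatever the coordinates are.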
Thanks so much for your reply!
1) I think the chromosome name is in the same format? It's just listed as chr1, chr2, chr3....etc in both files.
2) I'm not actually sure what coord. sorted means, so I'm not sure. Is this something I should do?
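"Coordinate sorted" means the lines are ordered by chromosome name first and then by start position. A minimal sketch of the usual sort command follows, using a made-up demo file. As far as I know, bedtools intersect only requires sorted input when you pass its -sorted option, so an unsorted file alone should not produce empty output.

```shell
# Demo file with lines deliberately out of order (hypothetical name).
printf 'chr2\t300\t400\nchr1\t500\t600\nchr1\t100\t200\n' > demo_unsorted.bed

# Sort by chromosome name (column 1), then numerically by start (column 2).
sort -k1,1 -k2,2n demo_unsorted.bed > demo_sorted.bed
cat demo_sorted.bed
```

Note that chromosome names are sorted as text, so chr10 comes before chr2. That is fine as long as both files are sorted the same way.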
3) I did save as the tab delimited .txt option, then once it was saved I changed from .txt to .bed.
If you have any further input or guidance that'd be really helpful...
That is the problem with renaming .txt to .bed directly. You have to change the file's line endings, not just its extension. Are you working on Mac OS X, Windows, or Linux? If you are on a Mac, first open the file in a text editor and change the format from Classic Mac (CR) to Unix (LF). Then rename the file to .bed in the terminal and sort it. Finally, run your command. It should work.
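The same conversion can also be done in the terminal without a text editor. Here is a small sketch, assuming the file really has Classic Mac (CR-only) line endings; the demo file names are made up for illustration.

```shell
# Demo file with Classic Mac (CR-only) line endings (hypothetical name).
printf 'chr1\t100\t200\rchr1\t300\t400\rchr2\t500\t600\r' > demo_cr.bed

# wc -l counts newline (LF) characters, so a CR-only file reports 0 lines.
wc -l < demo_cr.bed

# Replace every CR with LF and write the result to a new file.
tr '\r' '\n' < demo_cr.bed > demo_unix.bed

# The converted file now reports one line per interval.
wc -l < demo_unix.bed
```

If the file came from Windows instead, the line endings are CR followed by LF, and deleting the CR with `tr -d '\r'` is the appropriate variant.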
Try printing the first 5 lines of both files in the question so we can see what the problem is. It might be that the formatting is not correct for a BED file, or that the files are not sorted.
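Pasting lines into a forum post hides tabs and line endings, so it can help to dump the invisible characters as well. A rough sketch with a made-up demo file; replace the name with each of your own files.

```shell
# Demo file with CR-only line endings (hypothetical name).
printf 'chr1\t100\t200\rchr1\t300\t400\r' > demo_check.bed

# od -c prints every character, showing tabs as \t, carriage returns as \r
# and newlines as \n, so the real delimiters and line endings become visible.
head -n 5 demo_check.bed | od -c | head -n 5
```

In a healthy BED file you would expect `\t` between columns and `\n` at the end of each line. Seeing `\r` with no `\n` points to Classic Mac line endings, and spaces where `\t` should be means the file is not tab-delimited.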
Hi, thanks for your help.
Here is the format of my two files:
chr1 5483534 5483544 emx2-pbx1 220 -
chr1 5665040 5665050 emx2-pbx1 226 -
chr1 5693479 5693489 emx2-pbx1 203 -
chr1 8264531 8264541 emx2-pbx1 216 +
chr1 10019964 10019974 emx2-pbx1 220 +
And:
chr1 3000305 3002480 chr1.1
chr1 3002511 3004262 chr1.2
chr1 3004282 3004535 chr1.3
chr1 3017203 3017692 chr1.4
chr1 3017906 3019013 chr1.5
I have them open in Textedit but I'm not quite sure how to change the format or sort like you mentioned.
Sorry these are such basic issues....really don't have much experience with anything like this.
This should be fine. bedtools only needs the first 3 columns, and the files seem to be sorted as well.
There should be TextWrangler for Mac. Open the files in TextWrangler, change the format in the bottom bar from Classic Mac (CR) to Unix (LF), and save. Then run the bedtools command in the terminal on both files. If the file was ever saved as .txt on the Mac, the format is Classic Mac, so bedtools sees everything as one line, and that is why you need to save it in Unix format.