Do two files need to contain exactly the same columns for bedtools to intersect them? I'm trying to intersect two files formatted like the examples below:
Example of file 1:
chrom exon_start exon_end strand isoform exon_numer gene coding_length total_mutations_reported total_exonic_mutations exonic_splicing_mutations total_splice_site_mutations 3_ss_mutations 5_ss_mutations
chr17 7125985 7126184 + NM_000018 10 ACADVL 199 15 11 0 4 2 2
chr17 7126962 7127049 + NM_000018 12 ACADVL 87 7 4 0 3 1 2
chr11 108016928 108017086 + NM_000019 11 ACAT1 158 10 7 1 3 2 1
chr12 52307342 52307554 + NM_000020 4 ACVRL1 212 10 7 0 3 1 2
Example of file 2:
chr1 957580 957842 NM_198576 2 + AGRN 262 0 0 0 0 0 0 exon GTTCGGGTCTGGCGGTACTTGAAGGGCAAAGACCTGGTGGCCCGGGAGAGCCTGCTGGACGGCGGCAACAAGGTGGTGATCAGCGGCTTTGGAGACCCCCTCATCTGTGACAACCAGGTGTCCACTGGGGACACCAGGATCTTCTTTGTGAACCCTGCACCCCCATACCTGTGGCCAGCCCACAAGAACGAGCTGATGCTCAACTCCAGCCTCATGCGGATCACCCTGCGGAACCTGGAGGAGGTGGAGTTCTGTGTGGAAG 0.72692929292929
chr1 989132 989357 NM_198576 34 + AGRN 225 0 0 0 0 0 0 exon CGAGAAGGCACTGCAGAGCAACCACTTTGAACTGAGCCTGCGCACTGAGGCCACGCAGGGGCTGGTGCTCTGGAGTGGCAAGGCCACGGAGCGGGCAGACTATGTGGCACTGGCCATTGTGGACGGGCACCTGCAACTGAGCTACAACCTGGGCTCCCAGCCCGTGGTGCTGCGTTCCACCGTGCCCGTCAACACCAACCGCTGGTTGCGGGTCGTGGCACATAG 0.72252380952381
I want the output of the intersect to be the information from file 1. However, when I run
bedtools intersect -a file1.bed -b file2.bed -wa
I get this error:
Error: unable to open file or unable to determine types for file total_splice_site_mut_greater3.bed
I just want to compare the first three columns of my files to find overlaps, but I'm assuming that's not possible with bedtools if the files don't contain the same information. Am I correct? Any ideas on how I can achieve my desired result?
Edit: total_splice_site_mut_greater3.bed is file1.bed.
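Your file1 doesn't seem to follow the proper BED format. The two files don't need matching extra columns, since bedtools only uses the first three (chrom, start, end) to find overlaps, but each file must be tab-delimited, and bedtools only skips header lines that begin with #, track, or browser. The column-name line at the top of file1 is a likely culprit, as is space-separated rather than tab-separated data.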
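One possible cleanup is sketched below in Python. It assumes file1 is whitespace-delimited, has a column-name line at the top, and contains no empty fields or fields with embedded spaces. The function names to_bed_line and convert, and the output name file1.fixed.bed, are made up for illustration. The sketch drops any line whose second and third fields are not integers (which catches the header) and rewrites the remaining lines with tab separators.

```python
def to_bed_line(line):
    """Return a tab-delimited BED line, or None if the line is not a data row.

    Assumption: fields are separated by whitespace and none of them is empty
    or contains spaces, so str.split() recovers the columns correctly.
    """
    fields = line.split()
    if len(fields) < 3:
        return None  # blank or truncated line
    start, end = fields[1], fields[2]
    # A header row such as "chrom exon_start exon_end ..." has non-numeric
    # coordinates, so it is skipped rather than written to the output.
    if not (start.isdigit() and end.isdigit()):
        return None
    return "\t".join(fields)


def convert(src_path, dst_path):
    """Write a cleaned, tab-delimited copy of src_path to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            bed = to_bed_line(line)
            if bed is not None:
                dst.write(bed + "\n")
```

After running convert("file1.bed", "file1.fixed.bed"), and the same for file2 if it is also space-delimited, the original command can be retried on the cleaned files. With -wa, the columns after the first three in file1 should be reported unchanged.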