I have two files, file A and file B, and I need to map the names from file B onto file A.
File A looks like this:
bl_bl/Mir7_O-E
bowl_O-E
btd_Ss-Bg
cad_+14_construct
CG9571_O-E
cnc_+5_construct
eve_stripe1
X 6014154 6015890
X 6023769 6025039
X 6022460 6023762
X 6018273 6020650
File B looks like this:
X 5987411 5987911 Unspecified_STARR-S2-5230
X 5997666 5998166 Unspecified_STARR-S2-4940
X 6000535 6001035 Unspecified_STARR-OSC-2712
X 6002496 6002996 Unspecified_STARR-S2-4953
X 6027445 6027945 Unspecified_STARR-S2-1989
X 6069973 6072234 Unspecified_VT57592
X 6074286 6074786 Unspecified_STARR-OSC-3266
X 6075128 6075628 Unspecified_STARR-S2-2715
X 6108152 6108652 Unspecified_STARR-OSC-4388
X 6132403 6132903 Unspecified_STARR-OSC-2588
X 6132527 6133027 Unspecified_STARR-S2-1212
How can I map file A to file B to find the common regions, especially given that the first few entries of file A have no genomic coordinates?
I'm going to go out on a limb and say this is impossible to do in a robust way. The two files seem to have nothing in common. Perhaps you can give more details as to the nature of their contents, but I'm still doubtful there is a solution. Maybe if you can describe in more detail what you want to do and why, someone will have an alternative.
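To check that impression against the rows actually posted, here is a minimal Python sketch that tests every coordinate row of file A against every row of file B. It is only an illustration, not a replacement for a proper interval tool: the names `file_a`, `file_b`, and `overlaps` are made up for this example, and it assumes BED-style half-open intervals (start included, end excluded), which the question does not confirm.

```python
# Coordinate rows copied from the file A excerpt: (chrom, start, end).
file_a = [
    ("X", 6014154, 6015890),
    ("X", 6023769, 6025039),
    ("X", 6022460, 6023762),
    ("X", 6018273, 6020650),
]

# Rows copied from the file B excerpt: (chrom, start, end, name).
file_b = [
    ("X", 5987411, 5987911, "Unspecified_STARR-S2-5230"),
    ("X", 5997666, 5998166, "Unspecified_STARR-S2-4940"),
    ("X", 6000535, 6001035, "Unspecified_STARR-OSC-2712"),
    ("X", 6002496, 6002996, "Unspecified_STARR-S2-4953"),
    ("X", 6027445, 6027945, "Unspecified_STARR-S2-1989"),
    ("X", 6069973, 6072234, "Unspecified_VT57592"),
    ("X", 6074286, 6074786, "Unspecified_STARR-OSC-3266"),
    ("X", 6075128, 6075628, "Unspecified_STARR-S2-2715"),
    ("X", 6108152, 6108652, "Unspecified_STARR-OSC-4388"),
    ("X", 6132403, 6132903, "Unspecified_STARR-OSC-2588"),
    ("X", 6132527, 6133027, "Unspecified_STARR-S2-1212"),
]


def overlaps(a, b):
    """Return True if two intervals on the same chromosome share a base.

    Assumes half-open [start, end) coordinates, as in BED.
    """
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


# Collect (file A interval, file B name) pairs for every overlap found.
hits = [(a, b[3]) for a in file_a for b in file_b if overlaps(a, b)]
print(hits)  # prints [] for the rows shown in the question
```

For the rows posted, the list is empty: the file A intervals fall between 6014154 and 6025039, while the file B intervals shown end at 6002996 or start at 6027445 and beyond. The full files may of course contain overlaps that the excerpts do not.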
File A has two columns in the first few lines and three columns in the following lines. Sorting it threw an error:
bedtools sort -i a.bed
It looks as though you have less than 3 columns at line: 1. Are you sure your files are tab-delimited?
Also, can you give some insight into the bedmap function? I have not used it before.
You're going to need to do some work to fix your inputs. As Brian noted, you haven't provided enough information for anyone to really do this for you. Read up on the UCSC BED format specification. Also, if you click on the link in my answer, you'll see a link to the documentation page that describes bedmap and sort-bed.
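As one possible first step in fixing the input, the sketch below separates the lines of a file like file A into rows that have at least three fields with numeric start and end values, and everything else. It is a rough illustration under stated assumptions: the function name `split_bed_rows` and the `sample` list are hypothetical, the lines are assumed to be whitespace-delimited, and the name-only lines still need coordinates from some other source before they can be used as BED.

```python
def split_bed_rows(lines):
    """Split lines into tab-delimited BED-like rows and leftover lines.

    A line counts as BED-like if it has at least three whitespace-separated
    fields and the second and third fields are non-negative integers.
    """
    bed_rows, other = [], []
    for line in lines:
        fields = line.split()
        if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
            # Re-join with tabs so tools expecting tab-delimited BED accept it.
            bed_rows.append("\t".join(fields))
        elif fields:
            # Keep non-empty lines without coordinates for manual follow-up.
            other.append(line.strip())
    return bed_rows, other


# A few lines taken from the file A excerpt in the question.
sample = [
    "bowl_O-E",
    "eve_stripe1",
    "X 6014154 6015890",
    "X 6023769 6025039",
]

bed_rows, other = split_bed_rows(sample)
print(bed_rows)
print(other)
```

Only the rows in `bed_rows` are candidates for sorting and interval operations; the names in `other` have no coordinates in this file, so they cannot be placed on the genome from this input alone.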