Bedtools accepts files in formats like 'chr start end name score strand'.
However, is it possible to take the extra columns into account (e.g. the name column) when using bedtools intersect or bedtools merge?
For example, I have a BED file like so, with a 'name' column for transposable elements:
FileA:
scaffold_1 913038 914038 DNA_transposon
scaffold_1 925018 926018 LTR
FileB:
scaffold_1 1026586 1027586 Mariner
scaffold_1 925211 935660 DNA_transposon
scaffold_1 925880 926300 LTR
I would like bedtools to intersect only the 2nd line of FileA with the 3rd line of FileB, ignoring the 2nd line of FileB because it has "DNA_transposon" rather than "LTR" in column 4.
I can't find anything like this in the bedtools manual. Knowing how to do it in bedtools would be helpful because it can compare one BED file against many others (options -a and -b), in addition to its many other options.
True, it would just be a lot of "grepped" files to make, one for each unique name in column 4. The example I gave is just a simplification.
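One way to avoid per-name files altogether might be to let bedtools report every overlap and then keep only the pairs whose names agree. A minimal sketch, assuming both inputs are tab-separated 4-column BED files, so that in the `-wa -wb` output column 4 is the name from -a and column 8 is the name from -b. The function names and the file names FileA.bed and FileB.bed are made up for illustration.

```shell
# Read `bedtools intersect -wa -wb` output on stdin and keep only the
# overlaps where the A name (column 4) equals the B name (column 8).
# Assumption: both inputs are tab-separated BED files with exactly 4 columns.
same_name_hits() {
    awk -F'\t' '$4 == $8'
}

# Report all A/B overlaps, then drop the pairs with different names.
# Requires bedtools on PATH; $1 is the -a file, $2 is the -b file.
intersect_same_name() {
    bedtools intersect -a "$1" -b "$2" -wa -wb | same_name_hits
}
```

Usage would be something like `intersect_same_name FileA.bed FileB.bed`. If the files have more columns, or if several -b files are given (which, as far as I know, adds a file-identifier column to the output), the column numbers in the awk test would need adjusting, so it is worth checking the raw output first.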
EDIT: (-F sets the field delimiter, but it's not needed for tab-separated input; see below.) While this doesn't avoid making lots of files, you could just do
awk -F '{print > "file_"$4}' file.bed
and that'll make you a new file for each unique string in that column, so you don't have to split each one out individually.
By the way, I need some clarification: does the awk command you mentioned actually work?
Oops, I just realised I removed the delimiter but not the flag (you don't need -F if the file is tab-separated). But other than that, yeah, it works like this:
awk '{print > "file_"$4}' file.bed
Great, awesome!
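For anyone landing here later, the split-then-intersect idea from this thread could be sketched roughly as below. This is only an illustration, assuming tab-separated BED files whose names in column 4 are safe to use in file names (no slashes or spaces). The function names and the A_ and B_ prefixes are invented for the example. The redirection target is parenthesised because some awk implementations parse `print > "file_"$4` differently.

```shell
# Split a tab-separated BED file into one file per name (column 4).
# Usage: split_by_name input.bed prefix   (writes files named prefix<name>)
split_by_name() {
    awk -F'\t' -v prefix="$2" '{ print > (prefix $4) }' "$1"
}

# Intersect only those per-name files that exist for both inputs.
# Requires bedtools on PATH; A_ and B_ are hypothetical prefixes and the
# per-name files are written to the current directory.
intersect_per_name() {
    split_by_name "$1" A_
    split_by_name "$2" B_
    for a in A_*; do
        name=${a#A_}
        if [ -e "B_$name" ]; then
            bedtools intersect -a "$a" -b "B_$name" -wa -wb
        fi
    done
}
```

Calling `intersect_per_name FileA.bed FileB.bed` would then print the overlaps for each name the two files share. With a very large number of distinct names, awk may run into a limit on simultaneously open files, in which case sorting by column 4 and closing each file after writing would be one workaround.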