7.1 years ago
mittu1602
I have two BED files:
file1 (since it is a compilation of 50 files, the header names are also needed for reference):
==> /home/blade/1-processedData/sample.coverageExon.csv <==
space start end V4 avgCoverage
chr10 133525740 133525741 CYP2E1 475
chr10 133527062 133527065 CYP2E1 402
chr10 133527323 133527328 CYP2E1 441
chr10 133531206 133531209 CYP2E1 104
chr10 133534752 133534755 CYP2E1 395
chr10 133535862 133535865 CYP2E1 278
chr10 133537632 133537635 CYP2E1 0
chr10 43572708 43572779 RET 789
file2:
chr10 37868205 37868205
chr10 37880220 37880220
chr10 37880220 37880233
chr10 37880261 37880261
chr10 37880261 37880261
chr10 37881000 37881000
chr10 37881003 37881011
chr10 37881332 37881332
chr10 37881616 37881616
chr6 152415537 152415537
I want to intersect file2 with file1; below is my expected output:
==> /home/blade/1-processedData/sample.coverageExon.csv <==
chr10 37868205 37868205 CYP2E1 402
chr10 37880220 37880220 CYP2E1 441
chr10 37880220 37880233 CYP2E1 104
chr10 37880261 37880261 CYP2E1 395
chr10 37880261 37880261 CYP2E1 278
chr10 37881000 37881000 CYP2E1 0
chr10 37881003 37881011 RET 789
I have already tried bedtools intersect, but because the headers are included in the file, it cannot read the file. Is there any other way of doing this?
Remove the headers...
Sorry, I need those headers, as mentioned in the question (since it is a compilation of 50 files, the header names are also needed for reference).
This is basic Linux; you can always add the headers back later (!)
I am aware of extracting the headers and adding them back later, but there are 50 headers in the middle of the file, something like this:
and so on! It will be difficult to remove headers for 50-100 files.
If you do not remove the headers, these files are not in a format that any existing tool can use out of the box. You can always write custom scripts to process the data any way you like.
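As one possible shape for such a custom script, here is a minimal Python sketch. It assumes the `==> path <==` lines are file banners and that any row whose start/end fields are not integers is a column header; the function name `to_bed_rows` is made up for illustration. Rather than discarding the header information, it carries the source file along as an extra trailing column, so the reference survives and the result is plain tab-separated rows.

```python
import re
from typing import Iterable, Iterator, List, Optional

# Assumption: banner lines look like "==> /path/to/file.csv <==".
BANNER = re.compile(r"^==>\s*(.+?)\s*<==$")


def to_bed_rows(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield data rows only, each tagged with the banner it appeared under."""
    source: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = BANNER.match(line)
        if match:
            # Remember which file the following rows belong to.
            source = match.group(1)
            continue
        fields = line.split()
        # Column header rows such as "space start end V4 avgCoverage"
        # have non-numeric start/end fields, so they are skipped.
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        # "." is used as a placeholder when no banner has been seen yet.
        yield fields + [source or "."]


if __name__ == "__main__":
    demo = [
        "==> /home/blade/1-processedData/sample.coverageExon.csv <==",
        "space start end V4 avgCoverage",
        "chr10 133525740 133525741 CYP2E1 475",
        "chr10 43572708 43572779 RET 789",
    ]
    for row in to_bed_rows(demo):
        print("\t".join(row))
```

Writing these rows to a file should give something bedtools intersect can read (for example with `-a`, `-b`, `-wa` and `-wb`), with the originating file name still attached to every record.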
I don't understand where the headers are in your example. It looks like the output of a head command run on multiple files.
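If the banners really do come from running head over the separate files, another option is to read the original files directly instead of the concatenated output. The sketch below is hypothetical (the names `rows_with_source` and `write_bed` are invented here) and assumes each file holds whitespace-separated rows with one column header line, as in the example above.

```python
from pathlib import Path
from typing import Iterable, Iterator, List


def rows_with_source(paths: Iterable[str]) -> Iterator[List[str]]:
    """Yield data rows from each file, tagged with the file they came from."""
    for path in paths:
        with open(path) as handle:
            for raw in handle:
                fields = raw.split()
                # Column header rows (e.g. "space start end V4 avgCoverage")
                # have non-numeric start/end fields, so they are skipped.
                if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
                    continue
                # Assumption: the bare file name is enough as a reference label.
                yield fields + [Path(path).name]


def write_bed(paths: Iterable[str], out_path: str) -> int:
    """Write all tagged rows as tab-separated text; return the row count."""
    count = 0
    with open(out_path, "w") as out:
        for row in rows_with_source(paths):
            out.write("\t".join(row) + "\n")
            count += 1
    return count
```

The file name ends up as the last column of every row, so no header has to be stripped and restored by hand across the 50-100 files.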