I have 2 files as below:
File 1:
SpoScf_15890 12 2376
SpoScf_00032 299634 316568
SpoScf_15890 656772 669220
SpoScf_00032 667632 674746
SpoScf_07684 10 4075
SpoScf_07684 64 4276
SpoScf_00032 820227 826573
Super_scaffold_60 74732 78743
Super_scaffold_60 1101694 1102317
Super_scaffold_60 74955 77543
File 2:
SpoScf_15890 1 2976
SpoScf_23593 1 2413
SpoScf_51672 1 1782
SpoScf_07684 91 4078
SpoScf_03142 8164 12518
SpoScf_04517 8723 11547
SpoScf_02476 10671 14488
SpoScf_01270 63995 66773
SpoScf_00853 73199 75746
Super_scaffold_60 74936 77943
I would like to compare these files using awk, Perl, or Python. Specifically, a line of file 1 should be extracted if column 1 of file 1 equals column 1 of file 2 (condition 1) and either column 2 of file 1 is less than column 2 of file 2, or column 3 of file 1 is less than column 3 of file 2, or both (condition 2). If the conditions are satisfied, extract the entire line of file 1. The files are not the same size, and there is no one-to-one correspondence between the lines of the two files.
Expected result:
SpoScf_15890 12 2376
SpoScf_07684 10 4075
SpoScf_07684 64 4276
Super_scaffold_60 74732 78743
Super_scaffold_60 74955 77543
Any help appreciated.
Thanks
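One possible reading of the conditions above, sketched in Python. It assumes whitespace-separated columns, integer coordinates, strict "less than", and that a line of file 1 is kept if it satisfies the test against any file 2 line with the same name in column 1. The function names `load_intervals` and `filter_lines` are made up for this illustration.

```python
from collections import defaultdict


def load_intervals(lines):
    """Group the (column 2, column 3) pairs of file 2 by the name in column 1."""
    intervals = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        intervals[fields[0]].append((int(fields[1]), int(fields[2])))
    return intervals


def filter_lines(file1_lines, file2_lines):
    """Return the file 1 lines whose column 2 OR column 3 is smaller than
    the matching column of a file 2 line with the same column 1."""
    intervals = load_intervals(file2_lines)
    kept = []
    for line in file1_lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        name, start, end = fields[0], int(fields[1]), int(fields[2])
        # Assumption: one satisfying file 2 line is enough to keep the line.
        if any(start < s2 or end < e2 for s2, e2 in intervals.get(name, [])):
            kept.append(line.rstrip("\n"))
    return kept


# A few lines taken from the question's two files.
file1 = [
    "SpoScf_15890 12 2376",
    "SpoScf_00032 299634 316568",
    "SpoScf_15890 656772 669220",
    "SpoScf_07684 64 4276",
]
file2 = [
    "SpoScf_15890 1 2976",
    "SpoScf_07684 91 4078",
]

for kept_line in filter_lines(file1, file2):
    print(kept_line)
# prints:
# SpoScf_15890 12 2376
# SpoScf_07684 64 4276
```

For real files, the same function can be called on open file handles, for example `filter_lines(open("file1.txt"), open("file2.txt"))`, where the file names are placeholders. Indexing file 2 by name first avoids comparing every line of file 1 against every line of file 2.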
So a match between the files is:
What if one column entry is larger and the other smaller? Is this possible?
Only less than, or less than or equal to?
Thank you! Both scripts work well.
I would like to add one more condition, if you are able to edit the script: column 2 of file 1 is less than column 2 of file 2, but by no more than 1000 (1K), and likewise column 3 of file 1 is less than column 3 of file 2, but by no more than 1000 (1K).
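A rough Python sketch of one way to read this extra condition. It assumes that both columns must satisfy it at once, and that "not more than 1000" means the file 2 value exceeds the file 1 value by at least 1 and at most 1000. The names `within_limit` and `filter_within` and the `limit` parameter are invented for this illustration.

```python
from collections import defaultdict


def within_limit(value1, value2, limit=1000):
    """True if value1 is less than value2 by no more than `limit`."""
    return 0 < value2 - value1 <= limit


def filter_within(file1_lines, file2_lines, limit=1000):
    """Keep file 1 lines where BOTH column 2 and column 3 are smaller than
    the matching file 2 columns, each by at most `limit`."""
    # Index file 2 by the name in column 1.
    intervals = defaultdict(list)
    for line in file2_lines:
        fields = line.split()
        if len(fields) >= 3:
            intervals[fields[0]].append((int(fields[1]), int(fields[2])))

    kept = []
    for line in file1_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, start, end = fields[0], int(fields[1]), int(fields[2])
        # Assumption: both columns must pass against the same file 2 line.
        if any(
            within_limit(start, s2, limit) and within_limit(end, e2, limit)
            for s2, e2 in intervals.get(name, [])
        ):
            kept.append(line.rstrip("\n"))
    return kept
```

Under this reading, of the file 1 lines in the question only `SpoScf_07684 10 4075` would be kept, since it is the only line where both differences (81 and 3) are positive and at most 1000. If the limit should apply to either column rather than both, the `and` would become `or`.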
Please refrain from adding new questions as answers - they should be comments. I have moved them this time.
It is also generally bad practice/forum etiquette to continually ask for modifications and updates to the original question as it muddles the information for future users who might need to learn from this thread.