Hi All
I am a young scientist. I am trying to compare several tab-delimited files, each of which contains two columns (the first column is a list of ids and the second is a list of numeric values assigned to those ids), to find the entries that match across them. I need to return only the ids that are present in all files with similar values, where a value may differ by up to 20 in either direction (lower or higher).
Example input:
file 1                   file 2                   file 3
AYJT01000009.1  6703     AYJT01000009.1  6703     AYJT01000009.1  6713
AYJT01000020.1  3082     AYJT01000020.1  3082     AYJT01000020.1  3082
AYJT01000020.1  10479    AYJT01000045.1  4861     AYJT01000114.1  4191
AYJT01000045.1  4861     AYJT01000120.1  1003     AYJT01000118.1  2213
AYJT01000118.1  2209     AYJT01000123.1  3453     AYJT01000120.1  1003
AYJT01000120.1  1003     AYJT01000123.1  3453     AYJT01000123.1  1039
AYJT01000123.1  3453     AYJT01000127.1  4084     AYJT01000123.1  3453
AYJT01000127.1  4405     AYJT01000146.1  121      AYJT01000127.1  4084
AYJT01000305.1  7736     AYJT01000305.1  7736     AYJT01000146.1  209
AYJT01000372.1  8646     AYJT01000372.1  8638     AYJT01000305.1  7736
Desired output:
file 1                   file 2                   file 3
AYJT01000009.1  6703     AYJT01000009.1  6703     AYJT01000009.1  6713
AYJT01000020.1  3082     AYJT01000020.1  3082     AYJT01000020.1  3082
AYJT01000120.1  1003     AYJT01000120.1  1003     AYJT01000120.1  1003
AYJT01000123.1  3453     AYJT01000123.1  3453     AYJT01000123.1  3453
AYJT01000305.1  7736     AYJT01000305.1  7736     AYJT01000305.1  7736
- The value of AYJT01000009.1 is 10 higher in file 3 (6713 instead of 6703), which is still within the allowed difference of 20.
I would greatly appreciate it if anyone could write me a script in Perl or Python. I look forward to your comments.
Which file has the value to compare the others to +/- 20?
All files contain ids and values. Each file contains more than 10,000 lines.
This question is so hard to understand and we already have two answers. Perhaps I am in the minority ...
While we're nitpicking, I don't know what a verdant scientist is. Please fix this oversight immediately.
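One possible reading of the request, sketched in Python: keep an id only if it occurs in every file, and for each such id report the combinations of values (one per file) whose largest and smallest members differ by at most 20. This is a minimal sketch under stated assumptions, not a definitive solution. It assumes the files are tab-delimited, have no header line, and hold integer values in the second column. The names `read_table`, `shared_entries`, and `tolerance` are made up for illustration. Since no single file was named as the reference, the tolerance is applied to the spread across all files, which is a guess at what the +/- 20 means.

```python
import csv
import sys
from collections import defaultdict
from itertools import product


def read_table(path):
    """Map each id to the sorted, de-duplicated integer values listed for it."""
    values = defaultdict(set)
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 2:
                continue  # skip blank or incomplete lines
            values[row[0]].add(int(row[1]))
    return {seq_id: sorted(vals) for seq_id, vals in values.items()}


def shared_entries(tables, tolerance=20):
    """Return (id, values) pairs for ids found in every table.

    `values` holds one value per table, and a combination is kept only
    if max(values) - min(values) <= tolerance (assumed meaning of +/- 20).
    """
    common_ids = set(tables[0]).intersection(*tables[1:])
    matches = []
    for seq_id in sorted(common_ids):
        # An id may carry several values per file, so try every combination.
        for combo in product(*(table[seq_id] for table in tables)):
            if max(combo) - min(combo) <= tolerance:
                matches.append((seq_id, combo))
    return matches


if __name__ == "__main__" and len(sys.argv) > 2:
    tables = [read_table(path) for path in sys.argv[1:]]
    for seq_id, combo in shared_entries(tables):
        # One line per match: the id followed by its value in each file.
        print(seq_id, *combo, sep="\t")
```

It could be run as `python compare.py file1.txt file2.txt file3.txt` (script and file names are placeholders). Trying every combination is cheap as long as each id has only a handful of values per file; ids repeated many times per file would call for a sorted sliding-window approach instead.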