I have a file with two columns: the first column holds a read ID from a FASTQ file and the second column holds an NCBI taxonomy ID (taxid), like this:
ST-E00243:601:HWJFHCCXY:8:1101:19431:10468 9606
ST-E00243:601:HWJFHCCXY:8:1101:19451:10468 14230
ST-E00243:601:HWJFHCCXY:8:1101:19471:10468 10468
ST-E00243:601:HWJFHCCXY:8:1101:19492:10468 1512
ST-E00243:601:HWJFHCCXY:8:1101:19512:10468 512
ST-E00243:601:HWJFHCCXY:8:1101:19532:10468 1421067
I also have a second file with the list of taxids for a specific taxon, like this:
2485233
2485231
2059665
2029516
2022430
2022429
1987726
1980986
1738445
1737346
I want to extract the lines from the first file whose taxid is present in the second file (the taxid list). However, some of the taxids also appear as numbers inside the read IDs, so a plain search matches the wrong lines. I tried to extract only the correct lines with fgrep, but fgrep cannot restrict the match to a single column (here, column 2).
Can anyone help me? In short, I need to extract whole lines by matching column 2 exactly against a file of patterns.
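To make the goal concrete, here is a minimal sketch of the behaviour I am after, written with awk. The file names reads_taxids.txt, taxids.txt and matched.txt are hypothetical. The read lines are copied from the example above, and the two-entry taxid list is invented only for this demo. I am assuming both files are whitespace-separated with one record per line and that the taxid list is not empty.

```shell
# Hypothetical demo files: read lines copied from the post,
# taxid list invented for illustration only.
cat > reads_taxids.txt <<'EOF'
ST-E00243:601:HWJFHCCXY:8:1101:19431:10468 9606
ST-E00243:601:HWJFHCCXY:8:1101:19451:10468 14230
ST-E00243:601:HWJFHCCXY:8:1101:19471:10468 10468
ST-E00243:601:HWJFHCCXY:8:1101:19492:10468 1512
EOF

cat > taxids.txt <<'EOF'
10468
512
EOF

# While reading the first file (NR==FNR), store each taxid as an array key.
# While reading the second file, print a line only when its column 2
# is exactly equal to one of the stored keys.
awk 'NR==FNR { wanted[$1]; next } $2 in wanted' taxids.txt reads_taxids.txt > matched.txt

cat matched.txt
```

With these demo files, only the line whose second column is 10468 should be kept. The other lines contain 10468 only inside the read ID, and 1512 is not an exact match for 512, which is where a substring search like fgrep -f goes wrong.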
Thanks!