11.9 years ago
macmath
Need to extract the common list of organisms from different files, based on the second tab-separated column
Example:
Below is the query.
File A:
- YP_001173296 Pseudomonas stutzeri A1501
- ZP_10847128 Pseudomonas fragi A22
- YP_006325640 Pseudomonas fluorescens A506
- ZP_08518919 Aeromonas caviae Ae398
- ZP_10474222 Rickettsiella grylli
- EKB21198 Aeromonas veronii AMC34
File B:
- P_02062827 Rickettsiella grylli
- YP_004473601 Pseudomonas fulva 12-X
- ZP_10438680 Pseudomonas extremaustralis 14-3 substr. 14-3b
- YP_528004 Aeromonas caviae Ae398
- ZP_11138829 Gallaecimonas xiamenensis 3-C-1
- YP_52800475 Pseudomonas stutzeri A1501
File C:
- P_02062827 Pseudomonas extremaustralis
- YP_004473601 Pseudomonas fulva 12-X
- ZP_10438680 Pseudomonas extremaustralis 14-3 substr. 14-3b
- YP_528004 Aeromonas caviae Ae398
- ZP_11138829 Rickettsiella grylli
- YP_52800475 Pseudomonas stutzeri A1501
Expected result file (common organisms):
- Aeromonas caviae Ae398
- Pseudomonas stutzeri A1501
- Rickettsiella grylli
Actually, I have 1000 files and I want to get the common set of organisms using the tab-separated 2nd column. I checked this command. Could you explain it, in case I have it wrong: does the 3 stand for the three files? Within each file, the same organism can appear more than once.
Yes, the "3" is for 3 files. Try:
(for F in *.tsv; do cut -d ' ' -f2 $F | sort | uniq: done )| sort | uniq -c | grep -E '^ 1000 ' | cut -c 9-
I received: syntax error near unexpected token `)'
I'll let you solve that syntax error as an exercise. :-)
hehe :) thanks dude
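For anyone landing here later, a possible cleaned-up version of the one-liner above, as a sketch rather than the original poster's exact command. The syntax error comes from the colon after `uniq`, which needs to be a semicolon before `done`. The sketch also compares the count numerically instead of matching a fixed number of leading spaces, since that padding depends on how `uniq -c` formats its output. The function name `common_organisms` is made up for illustration, and it assumes the files really are tab-separated (the default delimiter for `cut`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the organisms (tab-separated column 2)
# that occur in every one of the files passed as arguments.
common_organisms() {
    local n=$#   # number of input files; an organism must be seen n times
    local f
    for f in "$@"; do
        # Column 2 of each file, de-duplicated so that each file
        # contributes a given organism at most once.
        cut -f2 -- "$f" | sort -u
    done |
        sort | uniq -c |
        # uniq -c prints "<padding><count> <organism>"; keep the lines whose
        # count equals the number of files, then strip the count prefix.
        awk -v n="$n" '$1 == n { sub(/^[ \t]*[0-9]+[ \t]/, ""); print }'
}

# Example call, assuming the inputs are the *.tsv files in the current directory:
# common_organisms *.tsv > common_organisms.txt
```

Because each file is de-duplicated before counting, duplicates of an organism inside one file do not inflate its count, and the same function works for 3 files or 1000 without editing a hard-coded number.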