6.9 years ago
mittu1602
I have two tab-separated files, as follows:
file1:
rs8192678 GG VO2max PPARGC1A Higher
rs2282679 AC Vitamin D deficiency
file2:
rs8192678 GG
rs2282679 AC
rs8192678 CC
rs2282679 AG
Result, where a file2 genotype should also match if it is the complement of the file1 genotype:
rs8192678 GG GG VO2max PPARGC1A Higher
rs8192678 GG CC VO2max PPARGC1A Higher
rs2282679 AC AC Vitamin D deficiency
rs2282679 AC AG Vitamin D deficiency
Thank you
Hi Kevin, thanks for the reply. My aim is to look for complementary genotypes, and there is a high chance that file2 will also contain genotypes that are not complementary to the file1 genotypes. Is there a way to handle that? Nonetheless, thank you so much.
Please explain better with some sample input and output
File1:
File2:
Result:
Are those asterisks in the actual data or did you just put them there for this example?
for this example only, wanted to bold them to stand out!
I see. Yes, the formatting doesn't always come out right on these forums.
You can try this:
Here, the first part of the pipe finds the complementary bases with
paste <(cut -f2 file1 | tr 'ATGC' 'TACG') file1
and its output is then piped into awk as stdin. If there are more columns to print out, just add more var[2]"\t" to the final print command. Hope that this works!
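A minimal sketch of such a pipeline, under some assumptions: both files are tab-separated with the rsID in column 1 and the genotype in column 2, and a file2 genotype counts as a match if it equals the file1 genotype or its base-wise complement (A<->T, G<->C). The function name match_genotypes and the array name info are made up for illustration, and paste - is used in place of process substitution so the same lines also run under plain sh.

```shell
#!/usr/bin/env bash
# Sketch: join file2 genotypes to file1 annotations, accepting either the
# file1 genotype itself or its base-wise complement.
# Assumption: tab-separated input, rsID in column 1, genotype in column 2.

match_genotypes() {
    annot="$1"   # annotation file (file1 in the question)
    calls="$2"   # genotype calls (file2 in the question)

    # Complement column 2 of the annotation file and prepend it to each line.
    cut -f2 "$annot" | tr 'ATGC' 'TACG' | paste - "$annot" |
    awk -F'\t' -v OFS='\t' '
        NR == FNR {
            # Pasted stream: $1 = complement, $2 = rsID, $3 = genotype,
            # $4 onwards = annotation columns.
            rest = ""
            for (i = 4; i <= NF; i++) rest = rest OFS $i
            info[$2, $3] = $3 SUBSEP rest   # key on the exact genotype
            info[$2, $1] = $3 SUBSEP rest   # key on the complement
            next
        }
        ($1, $2) in info {
            # Calls file: $1 = rsID, $2 = genotype.
            split(info[$1, $2], parts, SUBSEP)
            print $1, parts[1], $2 parts[2]
        }
    ' - "$calls"
}
```

It can be called as match_genotypes file1 file2 > result. Rows are printed in file2 order as rsID, file1 genotype, file2 genotype, then the annotation columns. Note that with a strict base-wise complement AC pairs with TG, so a call such as AG for rs2282679 would not be reported by this sketch.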
Thank you so much @Kevin, it really worked and helped me. Your time on this is much appreciated! Can you also tell me whether I can use it in a shell script? Happy New Year!
Yes, this should be fine within a shell script (bash).