Hi,
I don't have much scripting experience, but I have a problem and would really appreciate a hint.
I have two pretty large tables:
Table 1:
0 rs4345758 0 28663 2 1
0 rs10399793 0 39161 2 1
0 rs2462492 0 44539 2 1
0 rs3107975 0 45189 1 2
0 rs4420028 0 63065 0 1
...
Table 2:
rs10399793-128_B_F_1501305891 rs10399793 39161 A G
rs2462492-128_T_R_1551017801 rs2462492 44539 A G
rs3107975-128_B_F_1551017858 rs3107975 45189 A G
rs4420028-128_T_R_1551213529 rs4420028 63065 A G
rs2462495-128_B_F_1551083146 rs2462495 68896 A G
rs3878915-128_B_F_1551017866 rs3878915 70249 A C
...
What I need to do is the following:
- Look up value $2 (rs4345758, rs10399793 and so on) in table 1 and find the matching one in table 2 ($2).
- Find the corresponding variants ($4, $5) in table 2 for a SNP and insert them in table 1, replacing the 0, 1 or 2 ($5, $6).
- Create a third table with the SNPs that had no match in table 2 for later exclusion.
Could someone help me with this?
Thanks
Markus
A Python or Perl script will be handier for this, but you can also use a combination of 'cut', 'fgrep' and 'paste', or a simple 'awk' one-liner, in Unix. There should be plenty of examples available online.
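For what it's worth, here is a minimal Python sketch of the three steps from the question. It rests on a few assumptions that are not stated in the original post, so adjust as needed: both files are whitespace-separated; in table 1 a code of 1 stands for the first variant ($4 of table 2), 2 for the second variant ($5 of table 2), and 0 is left as it is; rows without a match are kept unchanged in the output and their SNP ids are also written to a separate file. The function names and the four file arguments are made up for illustration.

```python
import sys


def load_variants(lines):
    """Map SNP id (column 2 of table 2) to its two variants (columns 4 and 5)."""
    variants = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 5:
            variants[fields[1]] = (fields[3], fields[4])
    return variants


def annotate(table1_lines, variants):
    """Return (updated rows, unmatched SNP ids) for table 1."""
    updated, unmatched = [], []
    for line in table1_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        snp = fields[1]
        if snp not in variants:
            unmatched.append(snp)
            updated.append(fields)  # assumption: keep the unmatched row unchanged
            continue
        first, second = variants[snp]
        # Assumption: code 1 -> first variant, code 2 -> second variant, 0 stays 0.
        code = {"1": first, "2": second}
        fields[4] = code.get(fields[4], fields[4])
        fields[5] = code.get(fields[5], fields[5])
        updated.append(fields)
    return updated, unmatched


def main(table1_path, table2_path, out_path, unmatched_path):
    with open(table2_path) as handle:
        variants = load_variants(handle)
    with open(table1_path) as handle:
        updated, unmatched = annotate(handle, variants)
    with open(out_path, "w") as out:
        for fields in updated:
            out.write(" ".join(fields) + "\n")
    with open(unmatched_path, "w") as out:
        for snp in unmatched:
            out.write(snp + "\n")


# Only run when called with exactly four file arguments.
if __name__ == "__main__" and len(sys.argv) == 5:
    main(*sys.argv[1:5])
```

Saved under a name of your choice (say annotate_snps.py), it could be run as: python annotate_snps.py table1.txt table2.txt table1_annotated.txt unmatched_snps.txt. Table 2 is held in a dictionary, so each lookup is constant time and table 1 is read only once; that should be fine for large tables as long as table 2 fits in memory.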