I have two small RNA matrix files with almost 87% overlap. I want to extract the columns that are specific to file 1 and those that are specific to file 2. Here is an example of my data:
File 1 (Sample 1):
AAAAAAACAAGGATCAACAAGACT 0.0835 0 0.2743 0.197 0.069 0.44 0.195 0.31
AAAAAAACACTCGGCAAAGAACCC 0.3343 0.0 1.641 2.170 1.82 0.88 0.758
AAAAAAACCCTCTGACGCAGCACC 0.167 0 1.455 0.096 0.487 0 0 0 1.55
AAAAAAACCGCCACTAGAAATCGT 0.0835 0.0843 0.557 0.888 1.35 0.88 0.66
AAAAAAACGTACTTCGTGCCGACT 0.0835 0.599 0 0 0 0 0.351 0 0 0
AAAAAAACTCGGAACCCTAATCTG 0.083 0.2569 0.364 0.260 0.286 0.10 0.35
File 2 (Sample 2):
AAAAAACACTCGGCAAAGAAGGCT 0.167 0 0.674 1.0531 0.3878 0.61838 0.08543 0.387
AAAAAACACTCGGCAAAGGCTTTG 0.51 0.22 1.82 0.888 0.87699 1.6497 0.17659
AAAAAACAGACTTTGTATCGACT 2.846 0.0300 0.1824 0.39 0.94 0.4692 0.31817
AAAAAACAGATGCCGAAGATGT 1.8389 0.4282 4.0117 2.562 0.54 1.649477
AAAAAACAGTATTCGAAACGGGAC 0.1677 0.08511 1.55052 0.6997 0.58733 1.75284
File 3 (Overlap):
AAAAAAACGTACTTCGTGCCGACT 0.0835 0.599 0 0 0 0 0.351 0 0 0
AAAAAAACTCGGAACCCTAATCTG 0.083 0.2569 0.364 0.260 0.286 0.10 0.35
AAAAAACACTCGGCAAAGAAGGCT 0.167 0 0.674 1.0531 0.3878 0.61838 0.08543 0.387
AAAAAACACTCGGCAAAGGCTTTG 0.51 0.22 1.82 0.888 0.87699 1.6497 0.17659
These are the three files: file 1 is sample 1, file 2 is sample 2, and file 3 is the overlap (the sequences common to files 1 and 2, based on column 1). I want to extract the sequences that are specific to each file, along with their matrix values. I have tried several commands, including some I found by searching Biostars:
cat sorted_b73matrix.txt sorted_mo17matrix.txt|sort |uniq -u |awk '$1==1' > 123.txt
grep -vxFf sorted_b73matrix.txt sorted_mo17matrix.txt > B73_specific_martix.txt
grep -vxFf sorted_mo17matrix.txt sorted_b73matrix.txt > M017_specific_matrix.txt
cat file1.tx file2.txt |sort |uniq -c |awk '$1==1'
But the results are not correct; maybe my parameters are wrong. Please tell me how I can get the matrix specific to each file, without the overlap.
From your example, it seems you want to extract specific rows, not columns, right?
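If rows are what you mean, here is a minimal sketch with awk. It assumes both files are whitespace-separated with the sequence in column 1, and that "specific" means the sequence does not occur in the other file regardless of the values. The file names and the tiny demo data are hypothetical stand-ins for your matrices. One possible reason your attempts fail: `grep -vxFf` and `sort | uniq` compare whole lines, so a shared sequence with different values still counts as unique, and `uniq -u` prints no count column, so `awk '$1==1'` after it matches nothing.

```shell
# Hypothetical demo files in a temp directory; substitute your own matrices.
tmp=$(mktemp -d)
cat > "$tmp/file1.txt" <<'EOF'
AAAAAAACAAGGATCAACAAGACT 0.0835 0 0.2743
AAAAAAACGTACTTCGTGCCGACT 0.0835 0.599 0
EOF
cat > "$tmp/file2.txt" <<'EOF'
AAAAAAACGTACTTCGTGCCGACT 0.1 0.2 0.3
AAAAAACAGACTTTGTATCGACT 2.846 0.0300 0.1824
EOF

# While reading the first file named (NR==FNR), remember each sequence (column 1).
# While reading the second file named, print rows whose sequence was not seen.
# Note: this relies on the first file named being non-empty.

# Rows specific to file 1: sequences absent from file 2.
awk 'NR==FNR { seen[$1]; next } !($1 in seen)' \
    "$tmp/file2.txt" "$tmp/file1.txt" > "$tmp/file1_specific.txt"

# Rows specific to file 2: sequences absent from file 1.
awk 'NR==FNR { seen[$1]; next } !($1 in seen)' \
    "$tmp/file1.txt" "$tmp/file2.txt" > "$tmp/file2_specific.txt"
```

Because only column 1 is used as the key, the files do not need to be sorted, and each printed row keeps all of its matrix values unchanged.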