6.0 years ago
Siavash Salek Ardestani
How can I extract the individuals "A", "B" and "C" and their genotypes from this dataset? Is there a Linux command for this?
A 11112121112121121102111121110
B 11211112112110211111211121110
C 20222222202020220202222222220
D 11111112112110211111211121110
E 20222222202020220202222222220
F 11112121112121121102111121110
G 11211112112110211111211121110
H 11211112112110211111211121110
Thanks!
Hello siyavash_damdar,
please explain the data format a bit more. Also show an example of how the output should look.
Thanks!
fin swimmer
Dear finswimmer, this is actually a small example; my real data is much bigger than this dataset. The format is plain text (txt). The first column contains the individuals and the second column contains their genotypes. I want to extract the "A", "B" and "C" genotypes into a text file like this:
You can use the grep command with the -f option, providing a file of the IDs which you want to extract. For e.g.
Dear siyavash_damdar,
you are just repeating the things you've already said in your first post. Unfortunately this doesn't help me to understand what you are trying to do. So please rephrase.
Do you just want to have the second column? Do you want one file per sample? ...
fin swimmer
Dear finswimmer, in this data the first column contains the samples (A, B, C, etc.) and the second column contains the genotypes (one number per SNP). So I need to know, for example, how I can extract the A, B and C samples together with their genotypes (all the data in the A, B and C rows) into a text file.
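For what it's worth, the grep -f suggestion above could look roughly like the sketch below. The file names genotypes.txt, ids.txt, extracted_grep.txt and extracted_awk.txt are made up for illustration, and the sketch assumes the columns are separated by whitespace. The -w and -F flags are an addition to the original suggestion, so that an ID is matched as a whole literal word rather than as a substring. The awk line is an alternative, not something proposed in the thread, which compares the IDs against the first column only.

```shell
# Hypothetical input file: the example data from the question.
cat > genotypes.txt <<'EOF'
A 11112121112121121102111121110
B 11211112112110211111211121110
C 20222222202020220202222222220
D 11111112112110211111211121110
E 20222222202020220202222222220
F 11112121112121121102111121110
G 11211112112110211111211121110
H 11211112112110211111211121110
EOF

# Hypothetical ID list: one wanted sample ID per line.
printf '%s\n' A B C > ids.txt

# -f reads the patterns from ids.txt, -F treats them as fixed strings,
# -w only accepts whole-word matches (so "A" does not match "A10").
grep -wFf ids.txt genotypes.txt > extracted_grep.txt

# Alternative: load the IDs first, then print rows whose first column is a wanted ID.
awk 'NR == FNR { ids[$1]; next } $1 in ids' ids.txt genotypes.txt > extracted_awk.txt

cat extracted_grep.txt
```

On the example data both commands should leave only the A, B and C rows, in the order they appear in the input. For real data where an ID might also occur somewhere in the second column, the awk variant is the safer choice because it never looks beyond the first field.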