I want to transpose a file like the one below, where from the 2nd column onwards each column header (B4, B3, E0) appears twice, so each marker takes two values per row. I want all the values for B4 in one row, all the values for B3 in another, and so on, which means B4, B3 and E0 will be separate rows. How can this be done with awk, sed or Python? I can do a simple transpose in Python but don't understand how to solve this particular problem. I'll appreciate any help.
Input: the 2nd and 3rd columns have the same column name (B4), the 4th and 5th columns have the same column name (B3), and so on. When we transpose, both values corresponding to B4 should move together as a unit, e.g. 12 13 13 14 13 13 12 13 13 13 12 13, all on one line.
EDIT: file consists of over 20 columns and 2000 rows
ID B4 B4 B3 B3
1 12 13 19 21
2 13 14 19 21
3 13 13 19 21
4 12 13 19 19
5 13 13 18 19
6 12 13 19 21
Desired Output
ID 1 1 2 2 3 3 4 4 5 5 6 6
B4 12 13 13 14 13 13 12 13 13 13 12 13
B3 19 21 19 21 19 21 19 19 18 19 19 21
What is the relation to bioinformatics?
It is microsatellite data that I am trying to merge with SNP data.
I would:
Loop over the file, create a dictionary with key = marker and value = list containing genotypes. Then write out the dictionary to a file. A defaultdict will be useful (see collections.defaultdict).
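A minimal sketch of that approach, in case it helps. It assumes the file is whitespace-delimited, the first column is the ID, and every marker has the same number of columns (two in the example). The function name `transpose_paired` is made up for illustration.

```python
from collections import defaultdict


def transpose_paired(lines):
    """Return the transposed table as a list of output lines.

    Assumptions (not confirmed by the question): whitespace-delimited
    input, first column is the ID, and every marker name occurs the
    same number of times in the header.
    """
    rows = [line.split() for line in lines if line.strip()]
    header, records = rows[0], rows[1:]
    id_name, markers = header[0], header[1:]

    # How many columns each marker occupies (2 for B4 B4 B3 B3).
    copies = markers.count(markers[0])

    ids = []
    # key = marker, value = list of genotype values in row order.
    # Dicts keep insertion order (Python 3.7+), so markers come out
    # in the same order as in the header.
    genotypes = defaultdict(list)

    for fields in records:
        sample_id, values = fields[0], fields[1:]
        # Repeat the ID once per value so it lines up with the genotypes.
        ids.extend([sample_id] * copies)
        for marker, value in zip(markers, values):
            genotypes[marker].append(value)

    out = [" ".join([id_name] + ids)]
    for marker, values in genotypes.items():
        out.append(" ".join([marker] + values))
    return out


if __name__ == "__main__":
    sample = [
        "ID B4 B4 B3 B3",
        "1 12 13 19 21",
        "2 13 14 19 21",
        "3 13 13 19 21",
    ]
    print("\n".join(transpose_paired(sample)))
```

To run it on a real file, pass the open file handle instead of the list, e.g. `transpose_paired(open("input.txt"))`, where `input.txt` is a placeholder name. With 20+ columns and 2000 rows everything fits comfortably in memory.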
Perhaps you could provide simpler examples of input and output, or alternatively describe your problem more clearly.
To improve your example data, you should do the following:
I edited the question to make the example clearer. I have also added headers to the columns without a header and removed the > from the beginning. Now the input can really be used for testing.
Thanks! That sure does make it easy to understand.
I can also see now that you have incomplete IDs in the output: you need 12 IDs but have 6, so you need to replicate the IDs as well.
You are right! I changed that in the desired output section.
Thanks everyone! With all your help I got it done!
Please use ADD COMMENT or ADD REPLY to respond to previous reactions, so that this thread remains logically structured and easy to follow. I have now moved your post but as you can see it's not optimal. Adding an answer should only be used for providing a solution to the question asked. If an answer was helpful you should upvote it; if the answer resolved your question you should mark it as accepted.