Hi, I have a VCF file that I send as-is to my colleagues. They open it in Excel and specifically need to extract one field, AB, from the INFO column. My INFO field looks like this:
DP=623;AB=0.058;AO=36;RO=472;QR=10544;QA=561;SRF=472;SRR=0;SAF=36;SAR=0;TYPE=snp;CIGAR=1M1X1M;LEN=1;MQM=60;MQMR=60;PARSER=FREEBAYES
Could you please help me find an Excel function to extract a particular value from it into a separate cell? Thanks.
Although this may seem obvious: if you open the file specifying a semicolon (;) as the field separator, each INFO field will appear in its own column, and it should be easy to select only the AB column. Is this what you are looking for?
Well, this will probably do the trick. The problem is that if a record is missing a field, for example DP, then AB is not guaranteed to end up in the same column for all records.
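To illustrate the concern about missing fields, here is a minimal sketch in Python (not Excel) that looks AB up by its key rather than by its position, so the result does not depend on which other fields are present. The function name parse_info is hypothetical, and the shortened INFO strings are for illustration only.

```python
def parse_info(info):
    """Split a VCF INFO string into a key -> value mapping."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue  # tolerate empty entries such as a trailing ';'
        key, sep, value = item.partition("=")
        # Flag-style entries without '=' are stored as True
        fields[key] = value if sep else True
    return fields


# AB is found by name whether or not DP precedes it
with_dp = parse_info("DP=623;AB=0.058;AO=36")
without_dp = parse_info("AB=0.058;AO=36")
print(with_dp.get("AB"), without_dp.get("AB"))
```

If AB is absent from a record, .get("AB") returns None instead of silently picking up a neighbouring field.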
I can't resist citing that tweet:
I did know it was a provocative question! The answer to my question seems to be here: http://stackoverflow.com/questions/21674222/extract-characters-after-certain-other-characters-excel
If your colleagues are biologists with no command-line experience, you can build the table yourself: since a VCF is a tab-delimited file, the relevant field can be pulled out with an appropriate awk command, and you can then send them just those values.
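The answer above suggests awk; as a rough equivalent, the same table could be sketched in Python. This assumes the standard VCF column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO), and the function names ab_rows and write_table, as well as the demo records, are made up for illustration.

```python
import csv
import sys


def ab_rows(vcf_lines):
    """Yield (CHROM, POS, REF, ALT, AB) for each VCF data line."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta-information, header, and blank lines
        cols = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt, info = cols[0], cols[1], cols[3], cols[4], cols[7]
        ab = ""  # left empty when the record has no AB entry
        for item in info.split(";"):
            if item.startswith("AB="):
                ab = item[len("AB="):]
                break
        yield chrom, pos, ref, alt, ab


def write_table(vcf_lines, out):
    """Write a tab-separated table that opens cleanly in Excel."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["CHROM", "POS", "REF", "ALT", "AB"])
    writer.writerows(ab_rows(vcf_lines))


# Hypothetical records, for illustration only
demo = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\tDP=623;AB=0.058;AO=36\n",
    "chr1\t200\t.\tC\tT\t50\tPASS\tAO=12;RO=40\n",
]
write_table(demo, sys.stdout)
```

For a real file, an open file handle can be passed in place of demo. At multi-allelic sites the AB value may be a comma-separated list; this sketch passes it through unchanged.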