4.0 years ago
markgodek
I've got a file that looks like this:
[1]CHROM [2]POS [3]REF [4]ALT [5]AF [6]GNOMAD_AF [7]FUNCOTATION [8]DP [9]AC [10]AN [11]HITS622847:GT [12]HITS622847:AD [13]HITS622849:GT [14]HITS622849:AD [15]HITS622851:GT [16]HITS622851:AD [17]HITS622853:GT [18]HITS622853:AD [19]HITS622855:GT [20]HITS622855:AD [21]HITS622856:GT [22]HITS622856:AD [23]HITS622858:GT [24]HITS622858:AD [25]HITS622860:GT [26]HITS622860:AD [27]HITS622862:GT [28]HITS622862:AD [29]HITS622864:GT [30]HITS622864:AD [31]HITS622866:GT [32]HITS622866:AD [33]HITS622868:GT [34]HITS622868:AD [35]HITS622870:GT [36]HITS622870:AD [37]HITS622872:GT [38]HITS622872:AD [39]HITS622875:GT [40]HITS622875:AD [41]HITS622877:GT [42]HITS622877:AD
1 715348 T G 1 1 [RP11-206L10.2|hg19|chr1|715348|715348|FIVE_PRIME_FLANK||SNP|T|T|G|g.chr1:715348T>G|ENST00000428504.1|-||||||0.4937655860349127|GTGGAACCCTTTCTCTACAAA||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||true|false|0_%2C_1|false|false|0|false|false|false|LOC100288069:100288069|true|false|false|true|true|false|false|false|false|false|false|false|false|false|false|false|false|false|true|false|3131984|715348|true|false|0|true|0|false|0.000137372_%2C_0.999863|false|false|false|SNV|true|0x050100020005040136000100|1|false|103|rs3131984|] 8 4 4 ./. 1,0 ./. 0,0 ./. 1,0 ./. 1,0 ./. 1,0 ./. 0,0 1/1 0,2 ./. 0,0 ./. 0,0 ./. 0,0 ./. 0,0 ./. 0,0 ./. 0,0 1/1 0,2 ./. 0,0 ./. 0,0
I was using
awk 'BEGIN { OFS = "\t" } { gsub("\\|.*\\||\\[|\\]" ,"", $7); print $0 }'
to cut the Funcotation down to "RP11-206L10.2", but I realized I also need to pull out the 6th value for each variant, and my regex powers just aren't there yet. Basically, I want to trim the leading and trailing brackets and keep only the string before the first bar and the string between the 5th and 6th bars.
Any help getting entries like that into the format below is appreciated.
1 715348 T G 1 1 RP11-206L10.2 FIVE_PRIME_FLANK 8 4 4 ./. 1,0 ./. 0,0 ./. 1,0 ./. 1,0 ./. 1,0 ./. 0,0 1/1 0,2 ./. 0,0 ./. 0,0 ./. 0,0 ./. 0,0 ./. 0,0 ./. 0,0 1/1 0,2 ./. 0,0 ./. 0,0
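One possible approach is to skip the regex and split column 7 on the bar character instead, so the 1st and 6th pieces can be picked out by position. The following is only a minimal sketch, not a tested solution for the full file. It assumes that the columns are whitespace-separated with no spaces inside any field (as with the default field splitting in the command above), that the Funcotation is always column 7, and that each variant carries a single bracketed annotation like the one in the example. The function name `trim_funcotation` is made up for illustration.

```shell
#!/bin/sh
# trim_funcotation: hypothetical helper name; reads the table on stdin.
# Assumes the Funcotation string is column 7 and no field contains spaces.
trim_funcotation() {
  awk 'BEGIN { OFS = "\t" }
  NR == 1 { print; next }        # pass the header line through untouched
  {
    gsub(/^\[|\]$/, "", $7)      # drop the leading "[" and the trailing "]"
    split($7, f, "|")            # f[1] = text before the first bar, f[6] = text between bars 5 and 6
    $7 = f[1] OFS f[6]           # replace column 7 with those two values
    print
  }'
}
```

It could be run as `trim_funcotation < input.tsv > output.tsv`, where both filenames are placeholders. Because column 7 becomes two columns, the untouched header ends up one name short of the data rows and would need adjusting separately.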