Replacing GT:AD:DP:GQ:PL column values with 0 and 1 using a pattern
4.8 years ago
S AR ▴ 80

I'm trying to replace the GT:AD:DP:GQ:PL values for all samples in a multi-sample VCF file with 1.

My file looks like this:

Pos 6620    6734    6737    6742    6750    7151    7212    7362
Alt A   A   A   G   GC  A   T   G
ERR038741   0   1:1,213:214:99:7391,0   0   0   0   0   1:2,205:207:99:7078,0   1:0,191:191:99:7383,0
ERR040140   0   1:1,336:337:99:11415,0  0   0   0   0   0   0
ERR046796   0   1:4,180:184:99:5672,0   0   0   1:3,182:185:99:6609,0   0   0   0
ERR046903   0   1:1,170:171:99:6142,0   0   0   1:1,158:159:99:5954,0   0   0   0
ERR067581   0   1:0,86:86:99:3037,0 0   0   0   0   0   0
ERR067593   0   1:0,90:90:99:3229,0 0   0   0   0   0   0
ERR067606   0   1:0,65:65:99:2267,0 0   0   0   0   1:0,58:58:99:1971,0 1:0,62:62:99:2353,0
ERR067607   0   1:0,73:73:99:2593,0 0   0   0   0   0   0
ERR067608   0   1:0,70:70:99:2390,0 0   0   0   0   0   0
ERR067609   0   1:0,80:80:99:2826,0 0   0   0   0   1:0,75:75:99:2574,0 1:1,82:83:99:3049,0

I want text of the form "1:0,73:73:99:2311,0" to be replaced with 1. I am able to match the pattern, but how do I replace it with 1? The grep command I use for pattern matching is:

grep "1:0,.*" susc_ml_combine.vcf

Now, how do I replace it? Can anyone help, please?

Grep linux python vcf machineLearning • 1.4k views

Did you try using awk?


I tried gsub in R, but it didn't work.

I also tried:

ex -s -c '%1:1,.*/1/g|x' file.vcf
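
A possible reason these attempts misbehave, sketched in Python as an illustration rather than a diagnosis of the exact commands: `.*` is greedy and matches through to the end of the line, so a substitution built on `1:0,.*` or `1:1,.*` would swallow every column after the first hit. (The ex command above also appears to be missing the `s/` that starts a substitute command.) Limiting the match to digits, colons and commas makes it stop at the next whitespace. The variable names below are chosen for the example only.

```python
import re

# One data row copied from the question.
line = "ERR067606   0   1:0,65:65:99:2267,0 0   0   0   0   1:0,58:58:99:1971,0 1:0,62:62:99:2353,0"

# Greedy `.*` runs to the end of the line, so all later columns are lost.
greedy = re.sub(r"1:0,.*", "1", line)

# A character class of digits, colons and commas stops at the next space.
bounded = re.sub(r"1:0,[0-9:,]*", "1", line)

print(repr(greedy))
print(repr(bounded))
```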
4.8 years ago

Just use sed on your command line:

cat temp 
Pos 6620    6734    6737    6742    6750    7151    7212    7362
Alt A   A   A   G   GC  A   T   G
ERR038741   0   1:1,213:214:99:7391,0   0   0   0   0   1:2,205:207:99:7078,0   1:0,191:191:99:7383,0
ERR040140   0   1:1,336:337:99:11415,0  0   0   0   0   0   0
ERR046796   0   1:4,180:184:99:5672,0   0   0   1:3,182:185:99:6609,0   0   0   0
ERR046903   0   1:1,170:171:99:6142,0   0   0   1:1,158:159:99:5954,0   0   0   0
ERR067581   0   1:0,86:86:99:3037,0 0   0   0   0   0   0
ERR067593   0   1:0,90:90:99:3229,0 0   0   0   0   0   0
ERR067606   0   1:0,65:65:99:2267,0 0   0   0   0   1:0,58:58:99:1971,0 1:0,62:62:99:2353,0
ERR067607   0   1:0,73:73:99:2593,0 0   0   0   0   0   0
ERR067608   0   1:0,70:70:99:2390,0 0   0   0   0   0   0
ERR067609   0   1:0,80:80:99:2826,0 0   0   0   0   1:0,75:75:99:2574,0 1:1,82:83:99:3049,0


sed 's/ \+/\t/g' temp | sed 's/\t1:[0-9:,]*/\t1/g'
Pos 6620    6734    6737    6742    6750    7151    7212    7362
Alt A   A   A   G   GC  A   T   G
ERR038741   0   1   0   0   0   0   1   1
ERR040140   0   1   0   0   0   0   0   0
ERR046796   0   1   0   0   1   0   0   0
ERR046903   0   1   0   0   1   0   0   0
ERR067581   0   1   0   0   0   0   0   0
ERR067593   0   1   0   0   0   0   0   0
ERR067606   0   1   0   0   0   0   1   1
ERR067607   0   1   0   0   0   0   0   0
ERR067608   0   1   0   0   0   0   0   0
ERR067609   0   1   0   0   0   0   1   1
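
Since the question is also tagged python, here is a minimal sketch of the same idea without a regular expression. It assumes the columns are separated by runs of spaces or tabs and that the genotype (GT) is always the text before the first colon. The function name `collapse_fields` is made up for this example.

```python
def collapse_fields(line: str) -> str:
    """Reduce every GT:AD:DP:GQ:PL field on a line to its GT value."""
    fields = line.split()  # any run of spaces or tabs separates columns
    # Keep the text before the first colon; fields without a colon are unchanged.
    return "\t".join(field.split(":", 1)[0] for field in fields)


# One data row copied from the question.
sample = "ERR067609   0   1:0,80:80:99:2826,0 0   0   0   0   1:0,75:75:99:2574,0 1:1,82:83:99:3049,0"
print(collapse_fields(sample))
```

Applied line by line to the whole file, this would leave the Pos and Alt header rows unchanged apart from the separators becoming tabs, and it would also keep a GT other than 1 if one ever appeared.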