4.0 years ago
waqasnayab
Hi,
I have a line in a VCF file:
chr1 11785112 . T G,T . str10 AC=6,0;ADP=72;AN=8;SF=0f,1f;STATUS=. GT:ADR:ABQ:RDR:FREQ:RDF:ADF:PVAL:AD:SDP:RBQ:DP:RD:GQ 0/0:0:0:0:0%:0:0:1E0:0:46:0:46:0:0 1/1:5:47:0:100%:0:82:6.9142E-52:87:87:0:87:0:255 1/1:4:50:0:100%:0:51:1.0149E-32:55:84:0:84:0:255 1/1:3:49:0:100%:0:31:3.5146E-20:34:71:0:70:0:194
and there might be many more like this. I want to remove lines where the REF base is repeated among the ALT bases. I tried:
awk '!($4~$5)' FA-MO-D1B-D1S.mpileup.output.snps.indel_srt_smplrnme_d1b_d1s.vcf > family.vcf
but no luck.
Any help is appreciated.
Waqas.
There are dedicated tools such as bcftools, GATK or VCFtools, as suggested in "Remove positions that are non-variant in a subset of samples from a vcf file".
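As for why the awk attempt prints everything: `$4~$5` asks whether REF matches ALT used as a regular expression, i.e. whether "T" contains "G,T". It never does, so the negation is always true. If an awk-only filter is still wanted, the following is a minimal sketch rather than a replacement for the dedicated tools. It assumes a tab-delimited VCF with REF in column 4 and ALT in column 5, and it compares each comma-separated ALT allele to REF by exact equality, so an indel such as REF=A, ALT=AT is kept. The function name `filter_ref_in_alt` and the file name `input.vcf` are placeholders, not anything from the original post.

```shell
# Drop records whose REF allele is also listed among the comma-separated
# ALT alleles. Header lines (starting with '#') are always kept.
# Assumption: fields are tab-delimited, REF is column 4, ALT is column 5.
filter_ref_in_alt() {
    awk -F'\t' '
        /^#/ { print; next }                 # pass header lines through
        {
            n = split($5, alt, ",")          # ALT may hold several alleles
            drop = 0
            for (i = 1; i <= n; i++)
                if (alt[i] == $4) { drop = 1; break }
            if (!drop) print
        }
    ' "$@"
}

# Usage (placeholder file names):
# filter_ref_in_alt input.vcf > family.vcf
```

Exact comparison per allele is used instead of a regex match on the whole ALT field so that multi-base alleles that merely contain the REF base are not removed by accident.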
Thanks Pierre, it worked like a charm!
Regards,
Waqas.