Hi,
Is there a way to filter out structural variants called on chrUn, random contigs, and other non-main chromosomes? I am using the hg38 assembly.
Will 'vcftools' --chr filtering work here?
I followed 'https://www.biostars.org/p/201603/#273150' and tried the code below, but it did not remove the non-main chromosomes from my VCF.
grep -w '^#\|chr[1-9]\|chr[1-2][0-9]\|chr[X]\|chr[Y]' my.vcf > my_filtered.vcf
Example of Manta vcf:
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Normal1 Tumor1
chr1 30405827 MantaBND:0:70561:70576:1:0:0:0 T ]chr1_KI270760v1_alt:58532]T . PASS SVTYPE=BND;MATEID=MantaBND:0:70561:70576:1:0:0:1;IMPRECISE;CIPOS=-568,568;SOMATIC;SOMATICSCORE=41;BND_DEPTH=54;MATE_BND_DEPTH=24 PR 16,0 26,7
Structural variant calls include translocation events, so the random contigs also have to be removed where they appear in the ALT column. The goal is to keep only chr1-22, X, and Y in the Manta structural variant VCF so that I can make a Circos plot.
Thanks for the help!
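For the requirement described in the question (keep a record only if both CHROM and the mate contig named in a breakend ALT are chr1-22, chrX, or chrY), one possible approach is sketched below. It is only a sketch under assumptions: the function name filter_main_chroms is made up for illustration, the VCF is assumed to be uncompressed and tab-separated, and each record is assumed to carry a single ALT allele, as in the Manta example above. It is not the code from the linked post or from any reply in this thread.

```shell
# Hypothetical helper: keep header lines plus records whose CHROM (column 1)
# and breakend mate contig in ALT (column 5) are both chr1-22, chrX or chrY.
filter_main_chroms() {
  awk -F'\t' '
    /^#/ { print; next }                               # pass header lines through
    $1 !~ /^chr([1-9]|1[0-9]|2[0-2]|X|Y)$/ { next }    # drop records on other contigs
    {
      # A breakend ALT looks like ]chr1_KI270760v1_alt:58532]T or A[chr5:300[
      # so the mate contig sits between a square bracket and the first colon.
      if (match($5, /(\[|\])[^:]+:/)) {
        mate = substr($5, RSTART + 1, RLENGTH - 2)
        if (mate !~ /^chr([1-9]|1[0-9]|2[0-2]|X|Y)$/) next
      }
      print
    }
  ' "$@"
}

# Usage (file names taken from the question):
#   filter_main_chroms my.vcf > my_filtered.vcf
```

Symbolic ALT alleles such as <DEL> contain no square brackets, so they are judged on CHROM alone. Note that dropping one side of a breakend pair can leave its mate record (referenced by MATEID) in the file, or remove it if it sits on a non-main contig, so it is worth checking the pairs before plotting.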
Thanks for the reply. This way the ALT column was not modified. Sorry, I forgot to mention that; I have updated my question.
Updated the code; the entire row is now modified. Please check the output carefully, as it is a generic regex.
Great! This one worked. Thank you!