The following script for VcfFilterJdk (http://lindenb.github.io/jvarkit/VcfFilterJdk.html) seems to work. It swaps REF and ALT when the ALT allele matches the ancestral allele (AA):
// keep the variant unchanged unless it is biallelic and has an AA (ancestral allele) attribute
if(variant.getNAlleles()!=2 || !variant.hasAttribute("AA")) return true;
final String aa = variant.getAttributeAsString("AA","");
// keep the variant unchanged unless ALT is the ancestral allele (case-insensitive)
if(!variant.getAlleles().get(1).getDisplayString().equalsIgnoreCase(aa)) return true;
VariantContextBuilder vb=new VariantContextBuilder(variant);
Allele oldalt = variant.getAlleles().get(1);
Allele oldref = variant.getAlleles().get(0);
// the old ALT becomes the new REF and the old REF becomes the new ALT
Allele ref= Allele.create(oldalt.getDisplayString(),true);
Allele alt= Allele.create(oldref.getDisplayString(),false);
vb.alleles(Arrays.asList(ref,alt));
List<Genotype> genotypes= new ArrayList<>();
for(Genotype g: variant.getGenotypes())
    {
    // no-calls are copied as-is
    if(!g.isCalled()) { genotypes.add(g); continue;}
    GenotypeBuilder gb = new GenotypeBuilder(g);
    List<Allele> alleles = new ArrayList<>();
    for(Allele a:g.getAlleles())
        {
        // map each called allele to its counterpart in the new REF/ALT pair
        if(a.equals(oldalt)) { a=ref;}
        else if(a.equals(oldref)) { a=alt;}
        alleles.add(a);
        }
    genotypes.add(gb.alleles(alleles).make());
    }
vb.genotypes(genotypes);
// only the alleles and the genotype calls are changed; INFO and other FORMAT values are copied unchanged
return vb.make();
usage:
java -jar dist/vcffilterjdk.jar -f script.js input.vcf
1 10575 . C G 67.27 PASS AA=.;AC=0;AF=0.00;AN=18;DP=85;set=Intersection GT:AD:DP:GQ:PL 0/0:1,0:1:3:0,3,28 0/0:11,0:11:30:0,30,450 0/0:7,0:7:21:0,21,226 0/0:9,0:9:24:0,24,360 0/0:6,0:6:18:0,18,168 0/0:26,0:26:60:0,60,900 0/0:7,0:7:21:0,21,218 0/0:10,0:10:27:0,27,405 0/0:8,0:8:24:0,24,246
1 12882 . C G 660.23 PASS AA=c;AC=0;AF=0.00;AN=18;DP=143;set=Intersection GT:AD:DP:GQ:PL 0/0:7,0:7:21:0,21,253 0/0:11,0:11:33:0,33,377 0/0:12,0:12:36:0,36,417 0/0:23,0:23:66:0,66,990 0/0:16,0:16:13:0,13,470 0/0:29,0:29:81:0,81,839 0/0:24,0:24:63:0,63,945 0/0:15,0:15:45:0,45,481 0/0:6,0:6:18:0,18,193
1 13079 . C G 1594.47 PASS AA=c;AC=3;AF=0.167;AN=18;DP=340;set=Intersection GT:AD:DP:GQ:PL 0/1:10,4:14:78:78,0,226 0/1:32,4:36:9:9,0,744 0/0:29,3:32:4:0,4,710 0/0:37,0:37:99:0,109,1206 0/0:56,4:60:61:0,61,1501 0/0:33,0:33:99:0,99,1155 0/1:51,8:59:41:41,0,1276 0/0:55,0:55:0:0,0,1269 0/0:14,0:14:42:0,42,425
1 17730 . C A 9050.45 PASS AA=-;AC=2;AF=0.111;AN=18;DP=230;set=Intersection GT:AD:DP:GQ:PGT:PID:PL 0/0:28,0:28:48:.:.:0,48,696 0/0:31,0:31:93:.:.:0,93,959 0/0:28,2:30:1:0|1:17722_A_G:0,1,1171 0/0:46,0:46:99:.:.:0,113,1434 0/0:35,0:35:69:.:.:0,69,1054 0/0:27,0:27:46:.:.:0,46,884 0/1:9,4:13:99:0|1:17722_A_G:141,0,617 0/1:7,2:9:63:0|1:17722_A_G:63,0,320 0/0:11,0:11:14:.:.:0,14,274
1 49272 rs370116346 G A 397.31 PASS AA=N;AC=4;AF=0.286;AN=14;DP=180;set=Intersection GT:AD:DP:GQ:PL 1/1:0,3:3:9:95,9,0 0/0:25,2:27:28:0,28,742 0/0:31,0:31:90:0,90,1350 1/1:0,4:4:12:125,12,0 0/0:36,4:40:1:0,1,939 0/0:36,0:36:81:0,81,1215 ./.:0,0:0:.:0,0,0 0/0:39,0:39:83:0,83,1138 ./.:0,0:0:.:0,0,0
1 936210 rs3121569 A C 124552 PASS AA=A;AC=17;AF=0.944;AN=18;DP=224;set=Intersection GT:AD:DP:GQ:PGT:PID:PL 0/0:0,37:37:99:.:.:1098,108,0 0/0:0,23:23:69:1|1:936194_A_G:987,69,0 0/0:0,29:29:85:.:.:860,85,0 0/0:0,28:28:83:.:.:911,83,0 0/0:0,20:20:60:.:.:673,60,0 0/0:0,24:24:75:1|1:936194_A_G:1071,75,0 0/0:0,29:29:87:.:.:1011,87,0 1/0:7,7:14:99:.:.:206,0,144 0/0:1,19:20:34:.:.:514,34,0
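Note that the script only changes the alleles and the genotype calls. In the output above, INFO fields such as AC and AF and FORMAT fields such as AD and PL still follow the old REF/ALT order (for example AC=17 at 1:936210, although only one ALT allele is now called). Below is a minimal sketch in plain Java, without htsjdk, of how those values could be recomputed for a biallelic, diploid site. The class and method names are hypothetical, and PGT or other annotations are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of jvarkit or htsjdk.
final class RefAltSwapSketch {

    // Reverse a comma-separated list. For a biallelic diploid site this maps
    // AD "ref,alt" to "alt,ref" and PL "0/0,0/1,1/1" to "1/1,0/1,0/0".
    static String reverseList(String csv) {
        List<String> items = new ArrayList<>(Arrays.asList(csv.split(",")));
        Collections.reverse(items);
        return String.join(",", items);
    }

    // After the swap, the ALT count is the number of called alleles (AN)
    // minus the old ALT count (AC).
    static int swappedAlleleCount(int ac, int an) {
        return an - ac;
    }

    public static void main(String[] args) {
        // Values taken from the 1:936210 record above (AC=17, AN=18).
        int an = 18;
        int ac = swappedAlleleCount(17, an);
        String af = String.format(Locale.ROOT, "%.3f", ac / (double) an);
        System.out.println("AC=" + ac + ";AF=" + af + ";AN=" + an);
        // AD and PL of the first sample of the same record.
        System.out.println(reverseList("0,37") + " " + reverseList("1098,108,0"));
    }
}
```

Inside the VcfFilterJdk script, the same arithmetic would have to be applied through the builders (attributes on the variant, AD and PL on each genotype) before calling make().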
Please post a gist (http://gist.github.com/) with a full VCF: the header plus a few variants, some that should be changed and some that should not.
Hi @Pierre, I have shared the sample VCF file (header plus a few variants covering the possible conditions). Please have a look (two files): Sample.vcf at gist
Thanks Pierre for these tools! I was looking for this kind of solution a while back. I have some problems with the code, though. Here's what I'm getting:
Do you have any idea what's happening here? And once it's working, will it work with VCFs from the 1000G project, where the AA field has multiple values separated by the "|" character?
Thanks a lot!
JC
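On the 1000G question: with the script as written, an AA value such as C||| would never equal the ALT allele, so every variant would be returned unchanged. Below is a minimal sketch that keeps only the first "|"-separated field and treats ".", "-" and "N" as unknown, the placeholder values seen in the sample output above. It assumes the AA value follows the layout AA|REF|ALT|IndelType; that layout is an assumption based on the 1000 Genomes phase 3 header, so check the ##INFO=<ID=AA line of your own file. The class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of jvarkit or htsjdk.
final class AncestralAlleleSketch {

    // Return the ancestral allele in upper case, or empty if it is unknown.
    // Assumption: the first '|'-separated field holds the ancestral allele.
    static Optional<String> parse(String raw) {
        if (raw == null) return Optional.empty();
        int bar = raw.indexOf('|');
        String first = (bar < 0 ? raw : raw.substring(0, bar)).trim();
        if (first.isEmpty() || first.equals(".") || first.equals("-")
                || first.equalsIgnoreCase("N")) {
            return Optional.empty();
        }
        return Optional.of(first.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String raw : new String[] {"C|||", "c", ".", "-", "N", ".|||"}) {
            System.out.println(raw + " -> " + parse(raw).orElse("(unknown)"));
        }
    }
}
```

In the script, the parsed value could then be compared with the ALT allele in place of the raw attribute string.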
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to the original question.