Filter variants having no AF information
5.6 years ago

I have more than 1000 samples and I want to filter out variants based on minor allele frequency. My input dataset is a VCF file in this format:

#CHROM   POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  88  108 139 159 265 350

1   55  .   C   T   40  PASS    DP=6720;EFF=intergenic_region(MODIFIER||||||||||1)  GT:GQ:DP    ./.:.:. 0|0:36:4    0|0:32:9    0|0:30:4    ./.:.:. ./.:.:.

1   56  .   T   A   40  PASS    DP=6785;EFF=intergenic_region(MODIFIER||||||||||1)  GT:GQ:DP    ./.:.:. ./.:.:. 0|0:32:9    0|0:30:4    ./.:.:. ./.:.:.

1   63  .   T   C   40  PASS    DP=7053;EFF=intergenic_region(MODIFIER||||||||||1)  GT:GQ:DP    ./.:.:. 0|0:40:5    0|0:32:9    0|0:38:5    ./.:.:. ./.:.:.

1   73  .   C   A   40  PASS    DP=8169;EFF=intergenic_region(MODIFIER||||||||||1)  GT:GQ:DP    ./.:.:. 0|0:40:5    0|0:40:9    0|0:38:6    ./.:.:. ./.:.:.

How can I keep only SNPs with a minor allele frequency >= 0.05?
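When INFO carries no AF, AC, or AN tags, one way to approach this is to count alleles directly from the GT fields. The following is a minimal Python sketch, not a tested pipeline. The function names `minor_allele_frequency` and `keep_common` are hypothetical. It assumes tab-separated records and treats every non-reference allele as one class, so it is meaningful mainly for biallelic sites.

```python
def minor_allele_frequency(vcf_line):
    """Return the MAF computed from GT fields, or None if no genotype is called."""
    fields = vcf_line.rstrip("\n").split("\t")
    gt_index = fields[8].split(":").index("GT")  # position of GT within FORMAT
    alt_count = 0
    called = 0
    for sample in fields[9:]:
        gt = sample.split(":")[gt_index]
        # Handle phased (|) and unphased (/) genotypes the same way.
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue  # missing allele: not counted in the denominator
            called += 1
            if allele != "0":
                alt_count += 1
    if called == 0:
        return None
    af = alt_count / called
    return min(af, 1.0 - af)


def keep_common(lines, threshold=0.05):
    """Yield header lines plus records whose MAF is at least `threshold`."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        maf = minor_allele_frequency(line)
        if maf is not None and maf >= threshold:
            yield line
```

With more than 1000 samples a dedicated tool will likely be much faster; the sketch only makes the counting rule explicit (missing genotypes such as `./.` are excluded from the denominator).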

snp minor allele frequency vcf • 1.3k views

I am trying to compile vcffilterjdk but I am getting this error:

Task :vcffilterjdk FAILED
Downloading http://central.maven.org/maven2/com/github/samtools/htsjdk/2.19.0/htsjdk-2.19.0.jar to /home/waqas/jvarkit/lib/com/github/samtools/htsjdk/2.19.0/htsjdk-2.19.0.jar

FAILURE: Build failed with an exception.

BUILD FAILED in 0s
1 actionable task: 1 executed


I have fixed the proxy settings, and VcfFilterJdk took around 3 hours to complete, but it only updated the header; it didn't update the INFO field. Do I need to add the AN/AC fields first and then apply this script? As you can see from the INFO field (DP=6720;EFF=intergenic_region(MODIFIER||||||||||1)), I don't have AN/AC tags in it. The current output of VcfFilterJdk is:

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##FILTER=<ID=q25,Description="Quality below 25">
##FILTER=<ID=q30,Description="Quality below 30">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=FT,Number=.,Type=String,Description="Genotype-level filter">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=EFF,Number=.,Type=String,Description="Predicted effects for this variant.Format: 'Effect ( Effect_Impact | Functional_Class | Codon_Change | Amino_Acid_Change| Amino_Acid_length | Gene_Name | Transcript_BioType | Gene_Coding | Transcript_ID | Exon_Rank  | Genotype_Number [ | ERRORS | WARNINGS ] )'">
##INFO=<ID=LOF,Number=.,Type=String,Description="Predicted loss of function effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected'">
##INFO=<ID=MAF,Number=1,Type=Float,Description="Min Allele Frequency">
##INFO=<ID=NMD,Number=.,Type=String,Description="Predicted nonsense mediated decay effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected'">
##vcffilterjdk.meta=compilation:20190507213457 githash:ca6efffb htsjdk:2.19.0 date:20190507225208 cmd:-e VariantContextBuilder vcb = new VariantContextBuilder(variant); float ac = variant.getAttributeAsInt( AN ,0); if(ac>0) { List<Float> af = variant.getAttributeAsIntList( AC ,0).stream().map(N->N/ac).collect(Collectors.toList());vcb.attribute( AF ,af);vcb.attribute( MAF ,af.stream().mapToDouble(X->X.floatValue()).min().orElse(-1.0) );} return vcb.make();
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  88  108 139
1   55  .   C   T   40  PASS    DP=6720;EFF=intergenic_region(MODIFIER||||||||||1)  GT:DP:GQ    ./. 0|0:4:36    0|0:9:32
1   56  .   T   A   40  PASS    DP=6785;EFF=intergenic_region(MODIFIER||||||||||1)  GT:DP:GQ    ./. ./. 0|0:9:32
1   63  .   T   C   40  PASS    DP=7053;EFF=intergenic_region(MODIFIER||||||||||1)  GT:DP:GQ    ./. 0|0:5:40    0|0:9:32
1   73  .   C   A   40  PASS    DP=8169;EFF=intergenic_region(MODIFIER||||||||||1)  GT:DP:GQ    ./. 0|0:5:40    0|0:9:40
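
The `cmd:` line in the header above suggests why the records came back unchanged: the expression reads `AN` with a default of 0 and only sets AF and MAF when that value is positive, so with no AN/AC in INFO every variant is returned as-is. A hedged sketch of deriving both tags from the genotypes first is shown here in Python. `add_ac_an` is a hypothetical helper, records are assumed to be tab-separated, and the matching ##INFO header lines for AC and AN would still have to be added separately.

```python
def add_ac_an(vcf_line):
    """Return the record with AC (one count per ALT allele) and AN appended to INFO."""
    fields = vcf_line.rstrip("\n").split("\t")
    n_alt = len(fields[4].split(","))            # number of ALT alleles
    gt_index = fields[8].split(":").index("GT")  # position of GT within FORMAT
    ac = [0] * n_alt
    an = 0
    for sample in fields[9:]:
        gt = sample.split(":")[gt_index]
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue  # missing alleles contribute to neither AC nor AN
            an += 1
            if allele != "0":
                ac[int(allele) - 1] += 1  # allele k refers to the k-th ALT
    tags = "AC=" + ",".join(str(c) for c in ac) + ";AN=" + str(an)
    fields[7] = tags if fields[7] in (".", "") else fields[7] + ";" + tags
    return "\t".join(fields)
```

Once AC and AN are present, an expression like the one recorded in the header has non-zero input to work with. Existing tools such as the bcftools `fill-tags` plugin perform this kind of annotation and are probably a better fit for a file of this size; the snippet only illustrates the counting.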