bcftools: DP filter based on chromosome and sex
1
1
Entering edit mode
6.2 years ago
bsmith030465 ▴ 240

Hi,

I want to filter my bcf file such that:

  1. Autosomal chromosomes are filtered with DP > 20
  2. X chromosome, for females is filtered with DP > 20
  3. X chromosome, for males, with DP > 10 (i.e relax this criteria for males)

My current plan is to split the data into autosomal (and chrY) and X chromosome using something like (any way to specify chr1-chr22?):

bcftools view -i 'DP>20' input.bcf --regions chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrY -Ob -o autosomalY.bcf


## sample ids as they appear in input.bcf file
bcftools query -l input.bcf > allsampleids.txt

I can process the X chromosome using:

## assuming femaleIDs.txt contains female sample ids
bcftools view -i 'DP>20' input.bcf --regions chrX --samples-file femaleIDs.txt -Ob -o Xfemales.bcf
bcftools view -i 'DP>10' input.bcf --regions chrX --samples-file ^femaleIDs.txt -Ob -o Xmales.bcf

My questions:

  1. How should I recombine the outputs, such that I again have one bcf dataset? i.e. how do I combine autosomalY.bcf, Xfemales.bcf and Xmales.bcf? Not sure if this is possible or even a good idea....
  2. Is there a cleaner way to do this?

Thanks!

bcftools wgs samtools vcftools pyvcf • 3.8k views
ADD COMMENT
0
Entering edit mode
6.2 years ago

using vcffilterjdk

with the code:

if(variant.getContig().equals("Y")) {
    return false;
    }
else if(variant.getContig().equals("X")) {
final Set<String> setM = CollectionUtil.makeSet("S1","S2");
final Set<String> setF = CollectionUtil.makeSet("S2","S4","S5");

boolean ok_m = variant.getGenotypes().stream().filter(G->G.hasDP() && setM.contains(G.getSampleName())).mapToInt(G->G.getDP()).sum() > 10;
boolean ok_f = variant.getGenotypes().stream().filter(G->G.hasDP() && setF.contains(G.getSampleName())).mapToInt(G->G.getDP()).sum() > 20;

return ok_m && ok_f;
}
else
{
return variant.getAttributeAsInt("DP",0)>20;
}

exemple:

java -jar dist/vcffilterjdk.jar -f input.code input.vcf

ADD COMMENT
0
Entering edit mode

Hi Pierre,

Thanks!!!

It would be extremely helpful if there was some documentation with this (where is it reading the male/female ids?). Also, what if I wanted to to do 'DP>10 & GQ >10', how do I incorporate multiple conditions?

ADD REPLY
0
Entering edit mode

t would be extremely helpful if there was some documentation

my bad ! I forgot to add the link : http://lindenb.github.io/jvarkit/VcfFilterJdk.html

where is it reading the male/female ids

it's hard coded:

final Set<String> setM = CollectionUtil.makeSet("S1","S2");

how do I incorporate multiple conditions?

basically it requires to a knowledge of java and the library for HTS. But there are many examples or links to previous biostar post in the manual : http://lindenb.github.io/jvarkit/VcfFilterJdk.html

ADD REPLY

Login before adding your answer.

Traffic: 2932 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6