Extracting unique SNPs from a multi-sample VCF file
5 months ago
sam ▴ 30

Hi,

I have a multi-sample VCF file. This VCF contains around 20 samples (Sample1 - Sample20).

I want to extract the SNPs that are unique to Sample1 compared to all the other samples (Sample2 - Sample20).

BCFtools has a contrast plugin (bcftools +contrast) that finds genotypes in one group of samples that are not observed in another. Here I want to compare my sample of interest against all the rest.

What is the best way to extract these unique SNPs? Or should I split my multi-sample VCF file into individual VCF files and do the comparison?

Thank you!

VCF • 437 views
5 months ago

Using vcffilterjdk (https://jvarkit.readthedocs.io/en/latest/VcfFilterJdk/), keep the variants where no other sample has the same genotype as your sample of interest:

java -jar jvarkit.jar vcffilterjdk -e 'final String sn1="sampleName"; final Genotype g1=variant.getGenotype(sn1); return variant.getGenotypes().stream().filter(G->!G.getSampleName().equals(sn1)).allMatch(G->!G.sameGenotype(g1));' input.vcf.gz

Thanks Pierre! I shall try this solution.


Hi Pierre,

I tried to install the software and I get the following error:

* What went wrong:
  Execution failed for jvarkit com.github.lindenb.jvarkit.tools.jvarkit.JvarkitCentral.
  > Compile failed; see the compiler error output for details

Sorry, I am not good at Java. I am posting here again because I guess you may already know the solution to this error.

I could write a Python script, but I would like to know if there is a package that can already do this.

Thank you!
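
For anyone who prefers the Python route mentioned above, here is a minimal sketch of the same idea as the vcffilterjdk expression: keep a record only when no other sample shares the genotype of the sample of interest. It uses only the standard library and reads the VCF as plain text. The function names (normalize_gt, is_snp, unique_sites) are made up for this example. It also makes a few assumptions that are not in the original answer: phasing is ignored, only single-base REF/ALT records are kept, and sites where the target sample is missing or homozygous reference are skipped. Treat it as a starting point rather than a tested tool.

```python
import gzip
import re
import sys


def normalize_gt(sample_field, gt_index):
    """Return the GT alleles as a sorted tuple so 0/1, 1/0 and 0|1 compare equal."""
    parts = sample_field.split(":")
    gt = parts[gt_index] if gt_index < len(parts) else "."
    return tuple(sorted(re.split(r"[/|]", gt)))


def is_snp(ref, alt):
    """True when REF and every ALT allele are single bases (assumption: no indels, no '*')."""
    return len(ref) == 1 and all(len(a) == 1 and a in "ACGT" for a in alt.split(","))


def unique_sites(lines, target):
    """Yield header lines plus SNP records where no other sample shares target's genotype."""
    target_col = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            yield line
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample columns start at index 9; raises ValueError if the sample is absent.
            target_col = fields.index(target, 9)
            yield line
            continue
        ref, alt, fmt_keys = fields[3], fields[4], fields[8].split(":")
        if "GT" not in fmt_keys or not is_snp(ref, alt):
            continue
        gt_index = fmt_keys.index("GT")
        target_gt = normalize_gt(fields[target_col], gt_index)
        # Assumption: a missing or homozygous-reference call is not a "unique SNP".
        if "." in target_gt or set(target_gt) == {"0"}:
            continue
        others = (
            normalize_gt(field, gt_index)
            for i, field in enumerate(fields[9:], start=9)
            if i != target_col
        )
        if all(gt != target_gt for gt in others):
            yield line


# Usage: python unique_snps.py input.vcf.gz Sample1 > sample1_unique.vcf
if __name__ == "__main__" and len(sys.argv) == 3:
    path, sample = sys.argv[1], sys.argv[2]
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for record in unique_sites(handle, sample):
            print(record)
```

Note that "different genotype" is a loose definition of unique: a site where Sample1 is 1/1 and another sample is 0/1 still passes, even though both carry the alternate allele. If you need the allele itself to be private to Sample1, compare sets of non-reference alleles instead of whole genotypes.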
