Hello,
I was given a set of VCF files for a variant comparison. The README gives the following command:
bcftools isec -p dir -n-1 -c all ref.vcf.gz S1.vcf.gz S2.vcf.gz S3.vcf.gz S4.vcf.gz S5.vcf.gz S5.vcf.gz
First, if I understand the doc correctly, -n-1 means "look for SNPs found in at most a single file". Is that correct? But then why doesn't it write the following (which is what I saw years ago, when I used it a bit):
dir/0000.vcf for records private to ref.vcf.gz
dir/0001.vcf for records private to S1.vcf.gz
etc.
Instead, it writes:
dir/0000.vcf for stripped ref.vcf.gz
dir/0001.vcf for stripped S1.vcf.gz
etc.
I have two questions. The first one: is my interpretation of the command correct? I find the doc difficult. The second one: what does "stripped" mean? I am not a native English speaker, and the translations I find in dictionaries leave me puzzled. Is it the new word for "private" since some update? The translations I find mean "without attribute" or even "naked", and I don't see what that would mean in this context.
I would appreciate any insight.
PS: For this kind of task, I prefer the plugin +contrast, which gives clearer answers. Some of you will probably remember my questions about that a few days ago.
Thanks a lot, really.
EDIT: According to this thread, "Explaining bcftools isec output",
and to what I find in "sites.txt", where the pattern is always like 000100 (always one 1 and five 0s), it seems my interpretation is correct. I still wonder why they ditched "private" for "stripped". It's confusing for non-native speakers (at least for us).
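In case it helps someone check the same thing, here is a minimal sketch of how the patterns in "sites.txt" could be tallied. It assumes the file is tab-separated with the presence string (one 0/1 per input file) in the last column, which is what I see in mine. The function names and the example lines are made up for illustration; they are not part of bcftools and not taken from my data.

```python
from collections import Counter


def count_presence_patterns(lines):
    """Tally the presence strings found in the last column of sites.txt."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and any comment lines
        # Assumed layout: CHROM, POS, REF, ALT, presence string (tab-separated).
        pattern = line.split("\t")[-1]
        counts[pattern] += 1
    return counts


def files_per_site(pattern_counts):
    """Map 'number of files containing the site' to the number of such sites."""
    per_n = Counter()
    for pattern, n_sites in pattern_counts.items():
        per_n[pattern.count("1")] += n_sites
    return per_n


# Made-up example lines for six input files (ref, S1 ... S5).
example = [
    "chr1\t100\tA\tG\t000100",
    "chr1\t250\tC\tT\t100000",
    "chr2\t75\tG\tA\t000100",
]

patterns = count_presence_patterns(example)
print(patterns)
print(files_per_site(patterns))
```

On a real run, the list can be replaced by an open file handle, for example `with open("dir/sites.txt") as fh: patterns = count_presence_patterns(fh)`. If my reading of -n-1 is right, the only key returned by `files_per_site` should be 1.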
Thank you. Yes, for some reason I found it only AFTER making my post. Feel free to lock or delete it if you think it is too much of a useless duplicate. I still find it confusing: not Ram's explanation, but the way they chose to present the results.