5 months ago
majd.abdulghani
Dear Biostars community,
Where can I find individual human VCF files, as you would get them from a sequencing facility, for example? I tried the 1000 Genomes International Genome Sample Resource, but it only offers separate VCFs for each chromosome, or VCFs of multiple individuals merged together. I just want 4-5 files to practice variant filtering on, but they need to be real data and as close as possible to what a sequencing facility would send.
You can easily merge or split these files with bcftools as part of your exercise. A single VCF file per individual is not the norm; why would you want them? For example, you cannot filter by allele frequency if each file contains only a single sample. If you want to be sure the files resemble the ones your facility generates, you could ask the facility for a sample file.
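As a rough sketch of the merge-and-split step, the Python helpers below only build the bcftools command lines; they do not run anything. The file names and the sample ID are placeholders, not real paths, and the choice of `--min-ac 1` (drop sites where the extracted individual carries no non-reference allele) is an assumption about what a single-patient file should contain.

```python
import shlex


def concat_cmd(chrom_vcfs, merged_vcf):
    """Build a command that joins per-chromosome VCFs into one file.

    Assumes all inputs contain the same samples in the same order and are
    listed in chromosome order, as `bcftools concat` expects.
    """
    return ["bcftools", "concat", "-Oz", "-o", merged_vcf, *chrom_vcfs]


def single_sample_cmd(multi_vcf, sample, out_vcf):
    """Build a command that extracts one individual from a multi-sample VCF.

    `-s` selects the sample; `--min-ac 1` is an assumption made here so that
    sites where this individual is homozygous reference are dropped.
    """
    return [
        "bcftools", "view",
        "-s", sample,
        "--min-ac", "1",
        "-Oz", "-o", out_vcf,
        multi_vcf,
    ]


if __name__ == "__main__":
    # Placeholder file names and sample ID, for illustration only.
    chroms = ["chr21.vcf.gz", "chr22.vcf.gz"]
    print(shlex.join(concat_cmd(chroms, "all.vcf.gz")))
    print(shlex.join(single_sample_cmd("all.vcf.gz", "SAMPLE1", "SAMPLE1.vcf.gz")))
```

To execute a command, pass the returned list to `subprocess.run(cmd, check=True)` on a machine with bcftools installed. `bcftools query -l all.vcf.gz` lists the sample IDs available in a file.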
Thanks, Michael! The thing is, I want to simulate what happens in a clinical setting, so in this case, a VCF is only for one patient. The point is to try and build a small database of variants from different patients.
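For the small variant database, one possible starting point is a single SQLite table keyed by patient. This is only a sketch under assumptions: the table layout, the function names `load_vcf_lines` and `carriers`, and the demo records are all made up for illustration, and it expects single-sample VCF text with exactly one genotype column. Multi-allelic ALT fields are stored as written, not split.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS variants (
    patient_id TEXT, chrom TEXT, pos INTEGER,
    ref TEXT, alt TEXT, genotype TEXT
)
"""


def load_vcf_lines(conn, patient_id, lines):
    """Insert the records of one single-sample VCF; return the row count."""
    conn.execute(SCHEMA)
    rows = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        # Column 9 (FORMAT) names the per-sample subfields; find GT in it.
        keys = fields[8].split(":")
        values = fields[9].split(":")
        genotype = values[keys.index("GT")] if "GT" in keys else None
        rows.append((patient_id, chrom, pos, ref, alt, genotype))
    conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def carriers(conn, chrom, pos, ref, alt):
    """Return the sorted patient IDs that have a record for this variant."""
    cur = conn.execute(
        "SELECT DISTINCT patient_id FROM variants "
        "WHERE chrom = ? AND pos = ? AND ref = ? AND alt = ? "
        "ORDER BY patient_id",
        (chrom, pos, ref, alt),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    # Invented records, not real patient data.
    header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n"
    p1 = [header, "chr1\t1000\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:30\n"]
    p2 = [header, "chr1\t1000\t.\tA\tG\t60\tPASS\t.\tGT:DP\t1/1:25\n",
          "chr2\t500\t.\tC\tT\t40\tPASS\t.\tGT:DP\t0/1:18\n"]
    conn = sqlite3.connect(":memory:")
    load_vcf_lines(conn, "patient1", p1)
    load_vcf_lines(conn, "patient2", p2)
    print(carriers(conn, "chr1", 1000, "A", "G"))
```

For real files you would open each VCF (with `gzip.open(path, "rt")` if compressed) and pass the file object as `lines`. Normalizing the variants first, for example with `bcftools norm`, makes the same variant compare equal across patients.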
I see. To prepare well for what is coming, you first need to know their pipeline. Ask the core facility for example data from an earlier run. If they cannot provide this, ask which software and versions they are going to use.