How do I generate a VCF file from the Practical Haplotype Graph that includes the haplotypes? Isaak has a VCF file for cassava that was generated by the Buckler Lab, but he doesn't have the code that created the file.
I think you're looking for the PathsToVCFHaplotypesPlugin. This plugin exports diploid or haploid paths to a VCF file with haplotype allele values (not SNPs).
The VCF file is created by first calling HaplotypeGraphBuilderPlugin to create a graph that includes haplotypes based on the user-specified methods. This graph is passed, along with a path method name and an optional list of taxa, to the ImportDiploidPathPlugin. The ImportDiploidPathPlugin returns the graph along with a map of haplotype paths. Finally, the output of the ImportDiploidPathPlugin is sent as input to the PathsToVCFHaplotypesPlugin.
Note that running the ImportDiploidPathPlugin is optional. If this step isn't run, the paths are created from the haplotypes in the graph.
When running the PathsToVCFPlugin or PathsToVCFHaplotypesPlugin, we recommend using a positions list to limit the number of entries in the output VCF file to something manageable. The positions list can be specified as a genotype file (e.g. VCF or Hapmap), a BED file, or a JSON file containing the requested positions.
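As a rough illustration of the BED option, a small positions file could be put together like this. The chromosome names and coordinates are made up, and the sketch assumes the plugin reads standard three-column, tab-separated BED (0-based start, exclusive end); it is worth checking the PHG documentation for how the intervals are actually interpreted.

```shell
# Hypothetical positions; replace with your own chromosome names and coordinates.
# Standard BED columns: chrom, start (0-based), end (exclusive), tab-separated.
printf 'chr1\t999\t1000\n'  >  positions.bed
printf 'chr1\t4999\t5000\n' >> positions.bed
printf 'chr2\t249\t250\n'   >> positions.bed

# Sanity check: every line should have exactly three tab-separated fields.
awk -F'\t' 'NF != 3 { bad = 1 } END { exit bad }' positions.bed \
  && echo "positions.bed looks OK"
```

The resulting file would then be the value passed to the positions parameter.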
An example of chaining these plugins together to get a VCF with haplotypes is below. Replace the placeholders surrounded by < > with your own values. These plugins have other optional parameters, but this is a basic command:
docker run --name pipeline_container --rm -v <baseDir>:/phg/ -t <dockerImageName> \
/tassel-5-standalone/run_pipeline.pl -Xmx200G -debug -configParameters <configFile> \
-HaplotypeGraphBuilderPlugin -configFile <configFile> -methods <haplotypeMethod1> \
-includeVariantContexts true -includeSequences false -taxa <taxa1,taxa2> -endPlugin \
-ImportDiploidPathPlugin -pathMethodName <pathMethod1> -endPlugin \
-PathsToVCFHaplotypesPlugin -outputFile <vcfOutputFile> -referenceFasta <referenceFasta.fa> \
-positions <positions> -endPlugin
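Since the ImportDiploidPathPlugin step is described as optional, a shorter chain without it might look like the sketch below. Every value assigned to a variable here is a hypothetical placeholder, not something from a real setup, and the script only prints the command (a dry run) so it can be reviewed before anything is executed.

```shell
# All values below are hypothetical placeholders; substitute your own.
BASE_DIR="/path/to/baseDir"
DOCKER_IMAGE="dockerImageName"
CONFIG_FILE="/phg/config.txt"
HAPLOTYPE_METHOD="haplotypeMethod1"
TAXA="taxa1,taxa2"                      # comma-separated, no spaces
OUTPUT_VCF="/phg/haplotypes.vcf"
REFERENCE_FASTA="/phg/referenceFasta.fa"
POSITIONS="/phg/positions.bed"

# Same chain as the basic command, minus the ImportDiploidPathPlugin step.
CMD="docker run --name pipeline_container --rm -v ${BASE_DIR}:/phg/ -t ${DOCKER_IMAGE} \
/tassel-5-standalone/run_pipeline.pl -Xmx200G -debug -configParameters ${CONFIG_FILE} \
-HaplotypeGraphBuilderPlugin -configFile ${CONFIG_FILE} -methods ${HAPLOTYPE_METHOD} \
-includeVariantContexts true -includeSequences false -taxa ${TAXA} -endPlugin \
-PathsToVCFHaplotypesPlugin -outputFile ${OUTPUT_VCF} -referenceFasta ${REFERENCE_FASTA} \
-positions ${POSITIONS} -endPlugin"

# Dry run: print the assembled command for review instead of executing it.
echo "$CMD"
```

Once the printed command looks right, it can be pasted into the terminal or run by replacing the final line with `eval "$CMD"`.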
Thanks, I have the PathsToVCFHaplotypesPlugin working. Now I have to decide whether Breedbase is the best way to display that data. It might be better to present the data in JBrowse or an R Shiny app.