Question

Creating a VCF file after Giraffe alignment

24 months ago

Michal • 0

I am using the HPRC human pangenome as a reference for aligning my whole-genome sequencing data. My focus is on the HLA region, where I expect the pangenome reference to improve alignment. I ran Giraffe with the provided HPRC index files (.gbwt, .gg, .dist, and .min) and now have a GAM file.

I would like to generate a VCF file containing all the variants, with coordinates relative to the hg38 genome (which, as I understand it, should be one of the paths in the graph). Can I use the existing files directly with vg call, or do they need additional preprocessing? What parameters should I use for the vg call command?

giraffe vg vcf • 1.1k views

score 0 · Answer 1 · 2023-08-10

24 months ago

colindaven 7.7k

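I use something like this (in Nextflow):

# Export the variable so vg sees it and prints full stack traces on a crash
export VG_FULL_TRACEBACK=1
# Compute read support from the GAM, ignoring reads with mapping quality below 5
vg pack -t $task.cpus -x $gbz -g $gam -Q5 -o ${prefix}.aln.pack
# Genotype every snarl (-a) using precomputed snarls (-r), restricted to haplotypes in the GBZ (-z)
vg call -t $task.cpus $gbz -C 100 -k ${prefix}.aln.pack --min-support $params.min_support -a -r $snarls -z -s $sample_name > ${prefix}.vcf
# Compress and index the VCF
bgzip ${prefix}.vcf
tabix -p vcf ${prefix}.vcf.gz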
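
The commands above do not name a reference path explicitly. As a rough sketch only, and assuming a vg version where `vg paths -L` lists path names and `vg call -p` restricts calling to one named path, the two commands could be assembled as below. The file names, the sample name, and the path name `GRCh38#0#chr6` are placeholders; the real path name should be copied from the `vg paths` output for your graph. The script only prints the commands so they can be checked before running.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the vg commands instead of executing them.
# All file names and the reference path name are hypothetical placeholders.
set -euo pipefail

gbz="graph.gbz"            # placeholder: the GBZ graph used for alignment
pack="sample.aln.pack"     # placeholder: output of vg pack
ref_path="GRCh38#0#chr6"   # placeholder: copy the exact name from `vg paths -L`
sample="SAMPLE1"           # placeholder: sample name written to the VCF header

# Print the command that lists every path name stored in the graph.
list_paths_cmd() {
    printf 'vg paths -x %s -L\n' "$1"
}

# Print a vg call command that reports variants on a single reference path.
call_cmd() {
    printf 'vg call %s -k %s -p %s -s %s\n' "$1" "$2" "$3" "$4"
}

list_paths_cmd "$gbz"
call_cmd "$gbz" "$pack" "$ref_path" "$sample"
```

Running the printed `vg paths` command first shows which names are available; the chosen name then goes into `-p`. The other options from the answer (`-a`, `-r`, `-z`, `--min-support`) can be appended to the printed `vg call` line as needed.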

Thank you, this is very helpful! Where in your commands do you define GRCh38 as the reference? Is it in -r $snarls? How can I find out the path name? Should I run vg snarls first?

24 months ago by Michal