Find "phased segments" from trio VCF?
0
0
Entering edit mode
29 days ago
cmdcolin ★ 4.1k

I was trying to find interesting patterns in trio VCF and was wondering if there are any tools that can:

  • input: phased trio VCF
  • output: genome regions where the child matches the parent haplotypes

I am new to this area of bioinfo, and so I was trying to find the keywords to search for, and I found "local ancestry" and "identity by descent" and maybe "chromosome painting" to be possible keywords

local ancestry seems to identify segments that are similar to a 'population', while identity by descent for closer relatedness e.g. families and trios. chromosome painting is often related to the local ancestry concept also.

One program i found was hapIBD, which I tried running on a trio VCF from 1000 genomes but the segments that it outputs are relatively short, and my feeling is that, for trio VCF, that the genome should be much more completely covered by segments

https://github.com/browning-lab/hap-ibd

here is how i invoked hap-ibd

for i in {1..22}; do
    java -jar hap-ibd.jar gt=HG02024_VN049_KHVTrio.chr$i.vcf.gz map=maps/plink.chr$i.GRCh38.map out=out$i;
done;

enter image description here

fig 1. screenshot showing the output of hapIBD in jbrowse 2. there are very few and short "segments" (orange blocks), where my expectation would be to more completely cover the genome

enter image description here

fig 2. showing a zoom in on a block. it does seem to have some valid output, but it also only matches one haplotype there, while i'd want to also get the other haplotype

phased vcf genotyping variants phasing • 515 views
ADD COMMENT
0
Entering edit mode
ADD REPLY
0
Entering edit mode

I tried out the KING tool with the "IBD segments" function, which seemed like a good lead, but it did not output anything

 plink --vcf trio.vcf --make-bed --out ex
 king -b ex.bed --ibdseg --prefix ex2

this did not output any "segments" just a couple files

ex2.kin0 ex2X.kin ex2X.kin0

ADD REPLY
0
Entering edit mode

If you do not phase your input VCF to hap-ibd, I don't know what output you get, in fact I would expect the program to crash. Is there not an option to hap-ibd that tells it you're giving it a parent parent child trio?

ADD REPLY
0
Entering edit mode

thanks for the feedback, I could not figure out how to create the .fam files or something like that to provide the proper 'family structure' to "king" or "plink", which i think do allow providing these params. hap-ibd does not appear have a parameter for 'family structure'. https://github.com/browning-lab/hap-ibd

note that the VCF is phased though (has e.g. 0|1 type calls), i am testing out using data from https://hgdownload.soe.ucsc.edu/gbdb/hg38/1000Genomes/trio/HG02024_VN049_KHV/

subtext: I am a tool developer trying to find a way of visualizing these large haplotyp blocks in JBrowse 2 and just wondering what tools, if any, are used in the real world for this. I recently made JBrowse able to split the different phase into multiple rows, as seen in the screenshot.

ADD REPLY
0
Entering edit mode

I see. I have used hap-ibd in the past with success, from a trio you should get plenty of segments longer than 1 Mbp.

I would expect that people mostly use PLINK for these matters, however. I have not used it myself for this, yet, so can't help you with what options are correct. There should be tutorials all over the interwebs though.

ADD REPLY
0
Entering edit mode

If you have a tutorial that might help, please link it as an answer. even better if it works on the specific dataset that I linked. if I find an answer, I will update this thread but for now I am still looking.

ADD REPLY

Login before adding your answer.

Traffic: 1902 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6