Hi, one of the reasons to build a pangenome is to capture genes that may be absent from the reference genome of interest. When following the PHG documentation, you reach a step where you need to "Create bedfile of intervals for PHG reference ranges".
My question is how to add non-reference genes to such a file. Should I add genomic ranges in the reference genome that roughly match the equivalent regions in the other assemblies that contain those non-reference genes?
Consider the following region from chr3H in barley, adapted from barley_pangenes:
chr | start | end | reference gene | assembly2 gene |
---|---|---|---|---|
3H | 331491 | 334552 | HORVU.MOREX.r3.3HG0218270 | Horvu_10350_3H01G001500 |
3H | NA | NA | NA | Horvu_10350_3H01G001600 |
3H | 414358 | 417904 | HORVU.MOREX.r3.3HG0218320 | Horvu_10350_3H01G001700 |
- Should I merge these gene models and create a custom range such as 3H:331491-417904 to make sure reads matching non-reference gene Horvu_10350_3H01G001600 are used to build haplotypes?
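
To make the idea concrete, here is a minimal Python sketch of how such a merged range could be derived from the table above. It is only an illustration under several assumptions: the table coordinates are taken to be 1-based inclusive (GFF-style) and are converted to 0-based, half-open BED coordinates, every non-reference gene is assumed to have a reference gene on each side, and the function name `merged_bed_intervals` and the fourth (name) column are hypothetical choices of mine, not something taken from the PHG documentation.

```python
# Rows copied from the table above:
# (chr, start, end, reference gene, assembly2 gene); None stands for NA.
rows = [
    ("3H", 331491, 334552, "HORVU.MOREX.r3.3HG0218270", "Horvu_10350_3H01G001500"),
    ("3H", None, None, None, "Horvu_10350_3H01G001600"),
    ("3H", 414358, 417904, "HORVU.MOREX.r3.3HG0218320", "Horvu_10350_3H01G001700"),
]


def merged_bed_intervals(rows):
    """Return one BED-style interval per run of non-reference genes.

    Each interval spans from the start of the reference gene before the run
    to the end of the reference gene after it. Assumes rows are sorted by
    position and that input coordinates are 1-based inclusive.
    """
    intervals = []
    prev_ref = None  # last row that had reference coordinates
    pending = []     # non-reference genes seen since prev_ref
    for chrom, start, end, ref_gene, asm_gene in rows:
        if start is None:
            pending.append(asm_gene)
            continue
        if pending and prev_ref is not None and prev_ref[0] == chrom:
            # BED is 0-based, half-open: shift only the start by one.
            intervals.append((chrom, prev_ref[1] - 1, end, ",".join(pending)))
        pending = []
        prev_ref = (chrom, start, end)
    return intervals


# Print tab-separated lines: chr, start, end, non-reference gene(s) covered.
for chrom, start, end, name in merged_bed_intervals(rows):
    print(f"{chrom}\t{start}\t{end}\t{name}")
```

For the three rows above this yields a single interval covering both flanking reference genes and the gap between them, which is the custom range I am asking about. Whether the PHG expects or uses a name column, and whether the chromosome should be written as `3H` or `chr3H`, are things I have not confirmed.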
Thanks for your help, Bruno