Chris - thanks for a very timely question.
N.B.: I will frame this answer less in terms of the HPRC than in terms of its predecessor, the T2T consortium. Generally, both the strengths and the weaknesses I mention below for T2T CHM13v2 are amplified for the HPRC at present: it is even newer and even fewer tools and annotations exist for it, but in the abstract it is even better than T2T.
At present, I'd argue that whether or not you use the pangenome assembly (or CHM13 v2) depends on the specific use case(s) that motivated generation of the datasets to be assembled. This is because of a key trade-off: generally, T2T and HPRC assemblies both increase the power to detect certain genomic elements and reduce false positives; i.e., they are "better" for base-level genomics.
However, these theoretical advantages have to be considered against practical barriers that still exist at present. I'll try to provide a more granular view, below:
1) The use of gapless assemblies has already been shown to improve performance on the core genomic tasks of assembly, alignment, variant ascertainment, etc., as presented in a nice T2T companion paper, Aganezov et al. It's worth reading through this in toto, but as stated in the abstract:
We identify hundreds of thousands of variants per sample in previously
unresolved regions, showcasing the promise of the T2T-CHM13 reference
for evolutionary and biomedical discovery. Simultaneously, this
reference eliminates tens of thousands of spurious variants per
sample, including reduction of false positives in 269 medically
relevant genes by up to a factor of 12. Because of these improvements
in variant discovery coupled with population and functional genomic
resources, T2T-CHM13 is positioned to replace GRCh38 as the prevailing
reference for human genetics.
While this was written about T2T-CHM13, similar things are true of the pangenome reference assembly, with the added bonus that, if you are studying individuals from ancestral populations that have not had much sequencing performed to date, these gains might be amplified a bit.
The abstract mentions false positive reduction in medically relevant genes by "up to a factor of 12". Part of knowing whether it is "worth it" to go to the new assemblies is knowing where these regions are. You can find where the "new" sequences are (those that were gaps, flagged, or misrepresented in prior builds) by reading the documentation for GRCh38, through resources like the CHM13 unique track, by reading the core T2T/HPRC papers (e.g. Nurk et al. 2022), etc. If you realize you are studying genes in such regions, switching might be relatively more important for you. This is in effect what motivated Evan Eichler to start the process of recommending that a CHM be used as a genomic resource more than 20 years ago (he is interested in autism).
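To make the "are my genes in such regions?" check concrete, here is a minimal Python sketch. It assumes you have already downloaded a BED-style list of newly resolved regions and have gene coordinates on the same assembly. The function names, gene names, and coordinates below are all made up for illustration; they are not real annotations and not part of any T2T/HPRC tool.

```python
from typing import Dict, Iterable, List, Tuple

# (chrom, start, end), 0-based half-open, as in BED files
Interval = Tuple[str, int, int]


def parse_bed(lines: Iterable[str]) -> List[Interval]:
    """Parse the first three columns of tab-separated BED lines."""
    intervals = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and common BED header lines
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def overlapping_genes(
    genes: Dict[str, Interval], regions: List[Interval]
) -> Dict[str, int]:
    """Return {gene: overlapping bases} for genes that touch any region.

    Assumes the regions do not overlap one another; otherwise bases
    would be counted more than once.
    """
    hits = {}
    for name, (gchrom, gstart, gend) in genes.items():
        total = 0
        for rchrom, rstart, rend in regions:
            if rchrom != gchrom:
                continue
            overlap = min(gend, rend) - max(gstart, rstart)
            if overlap > 0:
                total += overlap
        if total:
            hits[name] = total
    return hits


# Toy data (hypothetical): stand-ins for a "newly resolved regions" BED
# file and for your genes of interest, both in the same coordinates.
regions_bed = ["chr1\t100\t200", "chr1\t500\t650", "chr2\t0\t50"]
genes = {
    "GENE_A": ("chr1", 150, 300),
    "GENE_B": ("chr1", 300, 400),
    "GENE_C": ("chr2", 40, 90),
}

hits = overlapping_genes(genes, parse_bed(regions_bed))
print(hits)  # {'GENE_A': 50, 'GENE_C': 10}
```

The nested loop is fine for a handful of genes. For genome-wide comparisons you would probably want a dedicated interval tool instead. The important caveat is that both inputs must be on the same assembly; comparing GRCh38 gene coordinates against CHM13 regions directly would be meaningless.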
2) Tradeoff: different levels of annotation
There are some very practical reasons not to switch yet, which relate to the amount and variety of data resources and companion sets annotated for T2T/HPRC versus GRCh38. To get a sense for this, scroll through the annotations presented on the CHM13 GitHub page: https://github.com/marbl/CHM13. As you can see, key resources like ClinVar have been lifted over, and these efforts are impressive. Nevertheless, there are FAR more annotations for the GRC builds at present. You can see the same thing visually by going to the UCSC genome browser URL for each build: check out the number of tracks available at the bottom of the page for GRCh38 (https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38) in comparison to T2T CHM13v2 (https://genome.ucsc.edu/cgi-bin/hgTracks?db=hub_3671779_hs1).
3) Structural variant detection, epistasis, haplotyping, and whether you have 3rd gen (nanopore, SMRT) sequences
Generally, if you are making a serious study of SVs, long(er)-range epistasis, or haplotype effects, I think it is worth at least exploring the newer assemblies, in particular if you have 3rd gen sequencing data. This is because ascertainment of these phenomena is particularly improved; for instance, we now know we were missing a large percentage of (especially large) inversion variants. While these gains are predicated upon the use of newer technologies, the choice of build also plays a role. So, for questions relating to the above, I might actually use both and draw on the strengths of one or the other depending on the goal.
Summary:
I'd recommend carefully considering what the goals of your study are. If you think you are likely to care more about whether a region you are interested in has abundant epigenomic / enhancer annotations, GRCh38 is probably going to prove easier. If you are studying a rare disease you think could be driven by an SV or by haplotype effects, I'd recommend both 3rd gen sequencing and use of T2T/HPRC resources.
Data Availability: You also asked about data availability. Both T2T and HPRC are open consortia. To get the most up-to-date data (and you are going to want this if you are seriously considering using HPRC), I'd join the consortia and stay apprised of new and upcoming data releases. The number of genomes in the HPRC is steadily increasing, while major databases are likely to lag behind slightly.
Wow. Thanks LauferVA for a very detailed comment!
Chris, it was an answer, not a comment. Please see: https://www.nature.com/articles/s41586-023-05895-y/figures/3
Yes, I mean I need time to read your answer. I appreciate it!
no worries...
SO?! what will you choose?
I am moving to another problem so I need more time to digest your information. I know it is an answer. Thanks Laufer!