Hello all,
Sorry if this question has been asked before.
I have to do human genome alignments. I would like to use GRCh38, but I don't know exactly which version of it to use. On the NCBI site there are several analysis sets:
- Full analysis set
- Full plus
- no alt analysis
- no alt plus
From what I understand, aligners like bwa mem are ALT-aware, so I think I could use the full analysis set or the full plus set, but I want to be sure that everything will work with the downstream steps (I am following the GATK Best Practices for germline variants).
When I run bwa mem, should I give it the path to the folder containing all the index files of the genome? I ask because the folder with the bwa mem indexes contains no .fa file.
Thanks a lot in advance :-)
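On the index question: as far as I know, bwa mem takes the index prefix (the part of the file name shared by the .amb, .ann, .bwt, .pac and .sa files) rather than the folder, and it does not need the FASTA itself to align. ALT-aware mapping additionally relies on a `<prefix>.alt` file sitting next to the index. Below is a minimal Python sketch for finding a usable prefix in a downloaded index folder. The function names are made up for illustration, and it assumes an index built by classic `bwa index` (not bwa-mem2).

```python
from pathlib import Path

# Suffixes written by classic `bwa index`; bwa mem is given the shared
# prefix in front of these suffixes, not the folder that holds them.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def find_bwa_prefixes(index_dir):
    """Return every prefix in index_dir for which all five index files exist."""
    prefixes = []
    for bwt in sorted(Path(index_dir).glob("*.bwt")):
        prefix = str(bwt)[: -len(".bwt")]
        if all(Path(prefix + suffix).exists() for suffix in BWA_INDEX_SUFFIXES):
            prefixes.append(prefix)
    return prefixes


def has_alt_file(prefix):
    """True if a `<prefix>.alt` file exists (needed for ALT-aware mapping)."""
    return Path(prefix + ".alt").exists()
```

The prefix it returns is what would go on the command line, for example `bwa mem <prefix> reads_1.fq reads_2.fq`. If `has_alt_file` is false, bwa mem should simply treat the alt contigs like any other sequence.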
Which human reference genome should I use?
Thanks a lot, but I don't really understand the article, as it seems to do some unnecessary things. Or maybe it is too old? For example, on the Genome Reference Consortium website, the description of the no_alt_analysis reference says: "The two PAR regions on chromosome Y, and duplicate copies of centromeric arrays and WGS on chromosomes 5, 14, 19, 21 & 22, have been hard-masked with Ns" "The full_analysis_set contains the alternate locus scaffolds in addition to all the sequences present in the no_alt_analysis_set.
The full_plus_hs38d1_analysis_set contains the human decoy sequences from hs38d1 (GCA_000786075.2) in addition to all the sequences present in the full_analysis set."
So I think the best option is to use full_plus_hs38d1? I will try it.
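One quick way to see what a given analysis set actually contains is to count its contigs by name. Here is a small Python sketch that reads the lines of a `.fai` index (as written by `samtools faidx`) and groups the sequence names. It assumes the UCSC-style names used in the analysis sets (`chr1`, `chr1_..._random`, `chrUn_...`, `..._alt`, `..._decoy`); the category labels and function names are my own.

```python
import re
from collections import Counter

# Primary assembly units: chr1..chr22, chrX, chrY, chrM.
PRIMARY = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y|M)$")


def classify_contig(name):
    """Assign a sequence name to a rough category based on its naming pattern."""
    if PRIMARY.match(name):
        return "primary"
    if name.endswith("_alt"):
        return "alt"
    if name.endswith("_decoy"):
        return "decoy"
    if name.endswith("_random"):
        return "unlocalized"
    if name.startswith("chrUn_"):
        return "unplaced"
    return "other"  # e.g. chrEBV


def summarize_fai(lines):
    """Count contigs per category from the lines of a .fai file."""
    counts = Counter()
    for line in lines:
        if line.strip():
            name = line.split("\t")[0].strip()
            counts[classify_contig(name)] += 1
    return counts
```

Running `summarize_fai(open("genome.fna.fai"))` on each candidate (the file name is a placeholder) should show zero `alt` entries for the no_alt sets and `decoy` entries only for the "plus" sets.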
Also read Heng Li's post (https://lh3.github.io/2017/11/13/which-human-reference-genome-to-use). He is probably the definitive source on these issues, so it is best to follow his advice.
Thanks. But again, that is an article from 2017. He recommends the no-alt version because the tools are not ALT-aware, but now tools like bwa mem are ALT-aware... It is really difficult to find a recent source.
Even the GATK article is not really up to date, as it uses GATK 3... Since I don't know which genome to use, maybe it is better to use the no-alt version to be safe, even if I will miss some interesting features of the alt contigs.