I am looking at the human genome available through the Bioconductor packages for NCBI GRCh38 and UCSC.hg19, and I do not understand all the different sequence names I see. Could you help, please?
In UCSC.hg19, I have chromosomes 1 to 22 plus X and Y, but I also have chrM, chr1_gl000191_random, chr4_ctg9_hap1, chrUn_gl000212... and so on.
In NCBI GRCh38, I can see sequences called MT, HSCHR1_CTG2_UNLOCALIZED, HSCHR3UN_CTG2, HSCHR2_RANDOM_CTG1...
What do _random, _unlocalized, chrUn, _ctg9_hap1... mean, please?
Should all of these sequences be used when aligning NGS reads to the genome, for instance, or only a subset?
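To make the "subset" idea concrete, here is a minimal sketch in Python that splits the hg19-style names listed above into the main chromosomes and everything else. The regular expression and the helper name `split_names` are assumptions made for illustration, not an official naming rule, and counting chrM in the main set is itself an assumption. Whether the second group should be kept for alignment is exactly the open question.

```python
import re

# Assumed pattern for "main" UCSC-style names: chr1..chr22, chrX, chrY, chrM.
# This is an illustrative guess, not an official definition.
PRIMARY = re.compile(r"chr(?:[1-9]|1[0-9]|2[0-2]|X|Y|M)")

def split_names(names):
    """Return (primary, other) lists, preserving the input order."""
    primary = [n for n in names if PRIMARY.fullmatch(n)]
    other = [n for n in names if not PRIMARY.fullmatch(n)]
    return primary, other

# Example names taken from the question above.
names = [
    "chr1", "chr22", "chrX", "chrY", "chrM",
    "chr1_gl000191_random", "chr4_ctg9_hap1", "chrUn_gl000212",
]

primary, other = split_names(names)
print("primary:", primary)
print("other:  ", other)
```

The NCBI GRCh38 names (MT, HSCHR1_CTG2_UNLOCALIZED, ...) follow a different convention and would need their own pattern.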
Many thanks.
Thanks a lot!