Hi,
When creating a reference genome for human NGS studies, do people generally just use the major chromosomal contigs (chr1-22, chrX, chrY, chrM), or do they also include the unplaced (chrUn) and random (chr*_random) contigs?
I had initially assumed I should just go with the main contigs, but I have now begun to question my original reasoning.
Is there a particular reason you're questioning your original reasoning? Usually, I use chromosomal contigs for alignment, but now your question is leaving me wondering if I'm missing something...
I question my original decision because it was based on a whim, and I have since noted that there are known polymorphisms associated with the unplaced and random contigs. Since these sequences are not part of any of the assembled reference chromosomes, reads originating from them have nowhere correct to align if they are left out. I therefore believe it is probably best to include them in the reference genome whilst excluding the haplotype files.
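In case it helps anyone else, here is a rough sketch of how that selection could be done on a combined FASTA file. It assumes UCSC hg19-style contig names, where the alternate haplotypes end in _hapN (for example chr6_apd_hap1) and the unplaced and random contigs look like chrUn_gl000211 and chr1_gl000191_random. The function names keep_contig and filter_fasta are just my own illustrative helpers, not part of any tool, and the pattern would need adjusting for other assemblies or naming schemes.

```python
import re

# Assumption: UCSC hg19-style names, where alternate haplotype contigs
# end in "_hap" plus a number (e.g. chr6_apd_hap1, chr6_cox_hap2).
HAPLOTYPE = re.compile(r"_hap\d+$")


def keep_contig(name):
    """Return True for every contig except the alternate haplotypes.

    This keeps chr1-22, chrX, chrY, chrM, chrUn_* and chr*_random.
    """
    return HAPLOTYPE.search(name) is None


def filter_fasta(lines):
    """Yield only the FASTA lines that belong to contigs we keep."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            # The contig name is the first token after '>'.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            keep = keep_contig(name)
        if keep:
            yield line
```

Since filter_fasta takes any iterable of lines, it can be fed an open file handle and the result written out with writelines, after which the filtered FASTA would be indexed with whichever aligner is in use.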