Hi everyone,
I have a list of human genomic regions (a BED file of genomic coordinates). From that list I want to extract the regions that are human-specific. What would be an appropriate pipeline for this?
I found one study that used the following pipeline: use the liftOver utility to convert the coordinates to the mouse genome (mm9); take the regions that did not convert and try them against the marmoset genome (calJac3); then take the regions that again did not convert and try them against the chimp genome (panTro2). The regions that fail all three conversions are taken as human-specific.
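To make the steps concrete, here is a rough Python sketch of that successive-liftOver filtering as I understand it. It is only an illustration under assumptions, not the study's actual code: the helper names are made up, the chain files have to be supplied by the user (the human assembly is not stated in the study description above), and `-minMatch=0.1` is my guess at a reasonable cross-species setting rather than a value the study reported. It relies on the standard liftOver argument order (input BED, chain file, mapped output, unmapped output) and on the unmapped output containing `#` comment lines that must be removed before the next round.

```python
import subprocess
from pathlib import Path


def build_liftover_cmd(bed_in, chain, mapped_out, unmapped_out, min_match=0.1):
    """Build the liftOver command line (min_match value is an assumption)."""
    return [
        "liftOver",
        f"-minMatch={min_match}",
        str(bed_in),
        str(chain),
        str(mapped_out),
        str(unmapped_out),
    ]


def strip_unmapped_comments(lines):
    """Drop blank lines and the '#...' reason lines liftOver writes
    into its unmapped file, keeping only the BED records."""
    return [ln for ln in lines if ln.strip() and not ln.startswith("#")]


def successive_unmapped(bed_in, chains, workdir, run=subprocess.run):
    """Run liftOver against each chain in turn, feeding only the regions
    that failed to convert into the next round. Returns the path of the
    BED file holding regions that failed every conversion."""
    workdir = Path(workdir)
    current = Path(bed_in)
    for i, chain in enumerate(chains):
        mapped = workdir / f"step{i}.mapped.bed"
        unmapped = workdir / f"step{i}.unmapped.txt"
        run(build_liftover_cmd(current, chain, mapped, unmapped), check=True)
        kept = strip_unmapped_comments(unmapped.read_text().splitlines())
        current = workdir / f"step{i}.unmapped.bed"
        current.write_text("\n".join(kept) + ("\n" if kept else ""))
        if not kept:
            break  # nothing left to test against the remaining genomes
    return current
```

It would be called with something like `successive_unmapped("regions.bed", [mouse_chain, marmoset_chain, chimp_chain], "work")`, where the three chain paths are whatever human-to-mm9, human-to-calJac3 and human-to-panTro2 chain files match the assembly of the input BED file.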
Should I follow this, or is there an alternative or more efficient way?
Thanks in advance
I think you are looking at mappability to identify unique regions. You can take a look at this recent paper.
Thank you. I just looked at this paper. As I understand it, it is about identifying unique regions within a single genome.
This is a little tricky because of the "orthology" that Alex mentions below. There will always be some sequence similarity between humans and our close relatives (monkeys, mice, etc.), so there may be no true human-specific regions that are coding. By one measure, something like 96% of human and chimpanzee sequence is identical. This may be a very difficult problem without more detail in the question.
Thank you for the guidance. I will look into this in more detail.