I am interested in using repeat sequences as a neutral model to detect signatures of selection in a non-coding dataset in primates. However, I have no clue how to find the repeat sequences that are ancestral to the species in question. Can anyone please help me with this?
You can get ancestral repeats from the UCSC Table Browser. For example, if you want ancestral repeats for human and mouse, referenced to the human hg19 build, you would select genome="human", assembly="Feb 2009 (GRCh37/hg19)", group="Repeats", track="RepeatMasker", then click the "Create" button in the intersection area. On the "Intersect With RepeatMasker" page this takes you to, you would select group="Comparative Genomics", track="Placental Chain/Net", table="Mouse Chain (chainMm10)", select the radio button for "All RepeatMasker records that have at least [XX]% overlap with Placental Chain/Net" and set XX=100 (which means the whole RepeatMasker element must be present in both species' sequence). Click "submit" at the bottom and, back on the main Table Browser page, choose whatever options you want for your output (send to browser, save as file, etc.) and click "Get Output". If you want to do this for only a specific region of the genome, you can define the region in the box provided or upload a set of regions you have saved to a file.
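If you would rather download the RepeatMasker and chain tables separately and do the 100% overlap filter offline, the logic can be sketched roughly as below. This is only a minimal sketch, not what the Table Browser runs internally: it assumes both tables have already been parsed into BED-like, half-open intervals, and the function names `merge_intervals` and `fully_covered_repeats` are made up for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or adjacent half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous block instead of starting a new one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def fully_covered_repeats(repeats, aligned):
    """Keep repeats that lie entirely inside the union of aligned blocks.

    repeats: iterable of (chrom, start, end, name)
    aligned: iterable of (chrom, start, end), e.g. chain blocks (assumed input)
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in aligned:
        by_chrom[chrom].append((start, end))

    merged = {c: merge_intervals(iv) for c, iv in by_chrom.items()}
    starts = {c: [s for s, _ in iv] for c, iv in merged.items()}

    kept = []
    for chrom, start, end, name in repeats:
        if chrom not in merged:
            continue
        # Last merged block that begins at or before the repeat start.
        i = bisect_right(starts[chrom], start) - 1
        # Merged blocks are disjoint, so 100% coverage means one block
        # must contain the whole repeat.
        if i >= 0 and merged[chrom][i][1] >= end:
            kept.append((chrom, start, end, name))
    return kept
```

Merging the aligned blocks first matters, because a repeat spanning two abutting blocks is still fully covered even though neither block contains it alone.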
Actually, the issue is finding the ancestral repeats. I mean, how can I tell which repeats are ancestral to primates? Any help specifically on this would be appreciated!
RepBase allows you to get repeats by taxon (http://www.girinst.org/repbase/update/browse.php), and to download the results in a format of your choice. Selecting "Homo sapiens" under taxon gives you two download options - one with ancestral repeats and one without. It's a pretty simple matter to then filter for those appearing in the ancestral list but not in the other. Replace "Homo sapiens" with the taxonomic level of your choice ...
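The filtering step could look something like the sketch below. It assumes both downloads are in FASTA format and that the repeat name is the first whitespace-delimited token of each header line; check that against your actual files. The function names are hypothetical.

```python
def fasta_names(lines):
    """Collect the first whitespace-delimited token of each FASTA header."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def ancestral_only(with_ancestral, without_ancestral):
    """Names present in the 'with ancestral' download but not in the other."""
    return sorted(fasta_names(with_ancestral) - fasta_names(without_ancestral))
```

Both arguments can be any iterable of lines, so open file handles for the two downloaded files work directly.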
written 10.2 years ago by george.ry, updated 2.9 years ago by Ram
Thank you for your replies, Manu and george.ry! How can we find the conservation scores of repeats?
And is there any way to find ancestral repeats in a genomic location of interest using the UCSC Genome Browser?
Thanks in advance!
written 10.2 years ago by AISHA, updated 2.9 years ago by Ram
Using phastCons scores to filter repeats for a neutral model is not advised, because phastCons scores represent the probability of purifying selection, not positional conservation. If you use repeats that have high phastCons scores to estimate a neutral model, you will significantly underestimate the actual substitution rates. This means you'll have very little power to detect signatures of selection because you've effectively diluted out most of your signal!
written 9.7 years ago by agd27, updated 2.9 years ago by Ram
Upvoted this because it needs to be highlighted as the best answer.