Hi all! How could I download the STR genotypes from the 1000 Genomes Project? (both alleles for each marker in each of the individuals studied)
Hope you can help me! Thanks
Search engines can help you. Funnily enough, when searching for "STRs of the 1000 genomes project", your post is the first hit on DuckDuckGo and Bing, but the second hit on those engines, and the first on Google, is:
Short Tandem Repeats added to the 1000 Genomes Release #ASHG14
These papers are also of interest:
The landscape of human STR variation
Identification of conserved and polymorphic STRs for personal genomes
I am essentially copying over Wouter's response from the Biostar Moderator Slack discussion forum, but he mentioned the following programs (I believe for longer STRs):
eXTRa
To be honest, I don't believe I've tried any of them. However, I wanted to add this because it seemed relevant to the discussion (and it might even be useful to me at some later point).
I also found this program while Googling: STRScan. However, these admittedly don't answer your original question about how to find something already pre-processed for the 1000 Genomes Project.
That said, the first link in h.mon's answer does mention some 1000 Genomes STR call sets (for lobSTR and RepeatSeq).
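If one of those call sets turns out to be distributed as a VCF (that is an assumption on my part; I haven't checked the actual files), then getting "both alleles for each marker in each individual" is mostly a matter of decoding the `GT` field against the REF/ALT alleles. Below is a rough sketch using only the Python standard library. The function name `str_genotypes`, the sample names and the two demo records are all made up for illustration, and a real lobSTR or RepeatSeq file may carry extra fields you would want to use instead.

```python
import io


def str_genotypes(vcf_handle):
    """Yield (chrom, pos, sample, allele_1, allele_2) for every diploid call.

    Assumes a plain-text, tab-separated VCF with a GT entry in FORMAT.
    Missing calls (./.) and non-diploid calls are skipped.
    """
    samples = []
    for line in vcf_handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names start at column 10
            continue
        chrom, pos, ref, alt, fmt = (
            fields[0], fields[1], fields[3], fields[4], fields[8]
        )
        # Allele index 0 is REF; 1, 2, ... are the comma-separated ALT alleles.
        alleles = [ref] + ([] if alt == "." else alt.split(","))
        keys = fmt.split(":")
        if "GT" not in keys:
            continue
        gt_index = keys.index("GT")
        for sample, call in zip(samples, fields[9:]):
            gt = call.split(":")[gt_index].replace("|", "/").split("/")
            if len(gt) != 2 or "." in gt:
                continue  # not a complete diploid genotype
            yield (chrom, int(pos), sample, alleles[int(gt[0])], alleles[int(gt[1])])


# Made-up records, only to show the expected shape of the input.
DEMO_VCF = "\n".join([
    "##fileformat=VCFv4.1",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "SAMPLE_A", "SAMPLE_B"]),
    "\t".join(["chr1", "100", ".", "CACACA", "CACA,CACACACA", ".", "PASS",
               ".", "GT", "0/1", "1/2"]),
    "\t".join(["chr1", "200", ".", "ATAT", ".", ".", "PASS",
               ".", "GT", "0/0", "./."]),
]) + "\n"

rows = list(str_genotypes(io.StringIO(DEMO_VCF)))
for row in rows:
    print(*row, sep="\t")
```

For a real, compressed file you could pass a handle from `gzip.open(path, "rt")` instead of the `io.StringIO` demo. The alleles come out as sequences; converting them to repeat counts would need the repeat unit for each marker, which this sketch does not attempt.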
STR genotypes are not commonly called by the variant callers used for SNVs and other small variants. It is definitely possible that some markers have been called where the variation amounts to a short indel, but you shouldn't count on those calls being correct, as longer expansions are problematic for most tools.
The most reliable approach would be to download all the BAM files for the 1000 Genomes Project and run an appropriate STR-calling tool on them. That's going to be a lot of work, though.