Obtain the 100-way alignment of 100 species for a particular gene
5.4 years ago
qwzhang0601 ▴ 80

I want to get the alignment (amino acids for the protein, nucleotides for the CDS) for a particular gene from the UCSC 100-way alignment of 100 species (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/multiz100way). Is there an easy way to get it?

Thanks

multiple-sequence-alignment 100-way-alignment • 1.6k views

If your focus is a single gene (largely CDS), it would be worth taking a look at Ensembl Compara, for example: https://www.ensembl.org/Homo_sapiens/Gene/Compara_Ortholog?db=core;g=ENSG00000139618;r=13:32315086-32400266. You can download alignments of 100+ species for one-to-one orthologs. The UCSC alignments are typically whole-genome alignments.


Thank you for your suggestion. I tried to get the alignment, but it seems to be generated at the level of the gene's DNA sequence (including the noncoding parts), right? I see many long gaps in the alignment. Is there a way to include only the CDS or the protein sequence? Thanks

5.4 years ago
GenoMax 148k

You can find those alignments here. You will need to download the entire set and then find the gene/exons you need in that file.

It may also be worth looking at NCBI's HomoloGene project, where you can search for one specific gene.
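For the "download the entire set and then find the gene/exons you need" step, a minimal Python sketch follows. It assumes that each FASTA header in the exon files starts with a token of the form `<transcript>_<assembly>_<exonNumber>_<exonCount>` (for example `NM_000059_hg19_1_26`); check this against the headers in your own download. The function names are only illustrative.

```python
import gzip  # only needed for the usage shown in the comment at the bottom


def iter_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def records_for_transcript(handle, transcript):
    """Yield only the records whose header starts with '<transcript>_'.

    Assumption: headers begin with '<transcript>_<assembly>_<exon>_<count>'.
    """
    prefix = transcript + "_"
    for header, seq in iter_fasta(handle):
        if header.startswith(prefix):
            yield header, seq


# Usage (file name as mentioned in this thread; transcript ID is an example):
# with gzip.open("refGene.exonAA.fa.gz", "rt") as fh:
#     for header, seq in records_for_transcript(fh, "NM_000059"):
#         print(header, len(seq))
```

Reading the gzipped file as a stream avoids decompressing the whole set to disk just to pull out one gene.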


Thanks. I got the files (such as refGene.exonAA.fa.gz) from the link you provided, but they seem to contain alignments only at the exon level, so we would need to merge all the exons ourselves. Am I right? I am not sure whether there is a direct way to do this, or whether any tool can merge the exon alignments into a gene-level alignment.

I also looked at NCBI HomoloGene, but only found an alignment for 10 species.

Thank you again for your suggestions.
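On merging the exon-level records into one gene-level alignment: since the per-exon records are already aligned, concatenating each species' exons in exon order should give one aligned sequence per species. A minimal Python sketch, assuming the first header token has the form `<transcript>_<assembly>_<exonNumber>_<exonCount>` (an assumption to verify against your file) and that records arrive as `(header, sequence)` pairs; `merge_exons` is a hypothetical helper, not an existing tool.

```python
from collections import defaultdict


def merge_exons(records, transcript):
    """Concatenate per-exon aligned sequences into one sequence per species.

    records: iterable of (header, sequence) pairs.
    Assumption: the first header token looks like
    '<transcript>_<assembly>_<exonNumber>_<exonCount>'.
    Returns a dict mapping assembly name -> concatenated sequence.
    """
    exons = defaultdict(dict)  # assembly -> {exon number: sequence}
    for header, seq in records:
        fields = header.split()
        if not fields:
            continue
        # Split from the right, because transcript IDs such as NM_000059
        # contain underscores themselves.
        parts = fields[0].rsplit("_", 3)
        if len(parts) != 4:
            continue
        tx, assembly, exon_no, _exon_count = parts
        if tx != transcript or not exon_no.isdigit():
            continue
        exons[assembly][int(exon_no)] = seq
    # Join the exons of each species in ascending exon order.
    return {
        assembly: "".join(by_no[i] for i in sorted(by_no))
        for assembly, by_no in exons.items()
    }
```

It is worth checking that every species ends up with the same merged length and that the human sequence matches the known protein or CDS length, since a missing exon record for one species would silently shorten its row.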
