Question

Entire Alignment Of A Conservation Track In UCSC

11.7 years ago

anuragm ▴ 130

I am interested in looking at the conservation of the Xenopus genome at certain positions, and then extracting the entire alignment of each region (preferably as a FASTA file). Suppose I am looking at the tyr gene in the Genome Browser: if I click on one of the exons, I get to a page that offers a 'CDS Fasta Alignment' option. Clicking on it shows an alignment, but this alignment is much shorter than the tyr gene itself.

Am I doing something wrong? Is there another way of getting the multiple alignment for regions of interest in a genome?

ucsc conservation alignment fasta • 2.8k views

ADD COMMENT • link updated 11.7 years ago by Alex Reynolds 36k • written 11.7 years ago by anuragm ▴ 130

score 0 · Answer 1 · 2013-08-29

11.7 years ago

Alex Reynolds 36k

Have you looked here? http://hgdownload.soe.ucsc.edu/goldenPath/xenTro3/multiz9way/

ADD COMMENT • link 11.7 years ago by Alex Reynolds 36k

I want alignments for specific regions I am interested in, just as I can view positions of interest in the Genome Browser and then click DNA to get the sequence. So the complete files available for download don't exactly help me.

ADD REPLY • link 11.7 years ago by anuragm ▴ 130

You can do this on the command line. Convert the alignments to BED with awk, Perl, or whatever you prefer, sort them with sort-bed, and run them through bedops --element-of -1 to get the elements within a particular genomic range. See the BEDOPS docs for binaries and examples: http://code.google.com/p/bedops/
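To make the range-filtering step concrete, here is a rough sketch in plain Python of the idea behind it: keep every BED element that overlaps a query range by at least one base. This is only a stand-in for illustration, not how bedops works internally. The function names and the example rows (exon_a, exon_b, exon_c) are made up, and coordinates are assumed to be BED-style (0-based start, end not included).

```python
def overlaps(row, chrom, start, end):
    """True if a BED row shares at least one base with chrom:start-end.

    Assumes BED-style half-open coordinates (0-based start, exclusive end).
    """
    r_chrom, r_start, r_end = row[0], int(row[1]), int(row[2])
    return r_chrom == chrom and r_start < end and r_end > start


def filter_bed(rows, chrom, start, end):
    """Return the rows that overlap the query range, in input order."""
    return [row for row in rows if overlaps(row, chrom, start, end)]


# Hypothetical BED rows: (chrom, start, end, name)
rows = [
    ("GL172637", 5210958, 5211069, "exon_a"),
    ("GL172637", 5300000, 5300100, "exon_b"),
    ("GL172638", 5210958, 5211069, "exon_c"),
]

hits = filter_bed(rows, "GL172637", 5211000, 5212000)
for row in hits:
    print("\t".join(str(field) for field in row))
```

For files of any real size, the sorted-input tools mentioned above will be far faster than a linear scan like this one.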

ADD REPLY • link 11.7 years ago by Alex Reynolds 36k

For instance:

$ head -1 ensGene.exonNuc.fa
>ENSXETT00000006166_xenTro3_1_24 111 0 0 GL172637:5210959-5211069+
ATGGCATCTATCATGGAAGGACCTTTGAGCAAATGGACAAACGTGATGAAAGGCTGGCAGTACCGTTGGTTTGTGTTGGATTACAACGCCGGGCTGCTCTCCTATTATACG

All the needed data are there to turn this into a UCSC BED file with a Perl script (or a more convoluted awk script). You can then filter the BED file with grep and run set operations on it with bedops to get elements of the desired genome and genomic range.
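As a rough sketch of that conversion, here it is in Python rather than Perl. The header layout is inferred only from the example line above: name, length, two further fields, then chrom:start-end followed directly by the strand. It also assumes the header coordinates are 1-based and inclusive, which fits the example (5211069 - 5210959 + 1 = 111, the length given in the header), so the start is shifted down by one for BED. The function names are made up for illustration.

```python
import re

# Assumed header layout, inferred from the example line only:
# >name length field3 field4 chrom:start-end<strand>
HEADER_RE = re.compile(
    r"^>(?P<name>\S+)\s+(?P<length>\d+)\s+\S+\s+\S+\s+"
    r"(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)(?P<strand>[+-])\s*$"
)


def header_to_bed(header):
    """Turn one FASTA header line into a 6-column BED tuple."""
    m = HEADER_RE.match(header)
    if m is None:
        raise ValueError("unexpected header layout: " + header)
    # Assumption: header coordinates are 1-based inclusive; BED is 0-based half-open.
    start = int(m.group("start")) - 1
    end = int(m.group("end"))
    return (m.group("chrom"), start, end, m.group("name"), 0, m.group("strand"))


def fasta_to_bed_lines(lines):
    """Yield one tab-separated BED line per FASTA header in an iterable of lines."""
    for line in lines:
        if line.startswith(">"):
            yield "\t".join(str(f) for f in header_to_bed(line.rstrip("\n")))


example = ">ENSXETT00000006166_xenTro3_1_24 111 0 0 GL172637:5210959-5211069+"
bed = header_to_bed(example)
print("\t".join(str(field) for field in bed))
```

The name column keeps the full ENSXETT00000006166_xenTro3_1_24 identifier, so grep on the assembly name (xenTro3 here) still works on the resulting BED file before it goes through sort-bed and bedops.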

ADD REPLY • link 11.7 years ago by Alex Reynolds 36k