5.3 years ago
Palgrave
I have a FASTA assembly for a fish species whose UTRs are not well characterized. Using this assembly, I would like to find putative 3' UTR sequences by aligning against a closely related fish species, zebrafish, using the annotated zebrafish UTRs.
How would you approach this to get a set of 3' UTRs that are conserved with zebrafish?
Go to the UCSC genome browser here
Choose track = "Ensembl genes"
Region = "genome"
Output Format = "bed"
Then click "get output". On the next page you can select 3' UTRs.
As previously suggested here, you can use bedtools to convert BED format to FASTA format using 'bedtools getfasta'.
Hi, I am not analyzing a human sample, but a rare fish species.
Well yes, the first link provided will give you 3' UTRs in zebrafish. My understanding of the question is that you would like to align your poorly annotated species to the 3' UTRs of zebrafish to identify putative 3' UTRs?
Sorry, I did not see that. So can I also get the 3' UTR sequence by choosing Output format = "sequence"?
You have to select Output format = "BED". Selecting other formats will not let you select 3' UTR.
Once you download the BED file, go ahead and download the genome FASTA file for zebrafish. bedtools works by pulling out the sequences in the zebrafish FASTA file that correspond to the coordinates in your BED file. Here is a link on how to run it: bedtools getfasta. I'd appreciate it if you accepted the first comment as an answer, cough cough @ATPoint (who moved my original answer to a comment?)
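To make the coordinate lookup concrete, here is a minimal pure-Python sketch of the idea behind 'bedtools getfasta'. It is not bedtools itself and not its implementation; the function names, the toy genome, and the toy BED records are all made up for illustration. It assumes standard BED conventions (tab-separated, 0-based start, end-exclusive, strand in column 6) and reverse-complements minus-strand intervals, similar in spirit to running bedtools with strand awareness.

```python
# Illustrative sketch only: mimics the idea of extracting BED intervals
# from a FASTA genome. All names and data here are hypothetical.

def parse_fasta(text):
    """Return {sequence_name: sequence} from FASTA-formatted text."""
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A/C/G/T/N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_bed_intervals(genome, bed_text, stranded=True):
    """Return a list of (label, sequence) for each BED record.

    BED coordinates are 0-based and end-exclusive, which maps directly
    onto Python slicing: genome[chrom][start:end].
    """
    records = []
    for line in bed_text.splitlines():
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        strand = fields[5] if len(fields) > 5 else "+"
        seq = genome[chrom][start:end]
        if stranded and strand == "-":
            seq = reverse_complement(seq)
        records.append((f"{chrom}:{start}-{end}({strand})", seq))
    return records


# Toy data standing in for the zebrafish genome and the 3' UTR BED file.
toy_fasta = ">chr1 toy sequence\nACGTACGTAC\nGGGTTTAAAC\n"
toy_bed = "chr1\t2\t6\tutrA\t0\t+\nchr1\t10\t16\tutrB\t0\t-\n"

genome = parse_fasta(toy_fasta)
records = extract_bed_intervals(genome, toy_bed)
for label, seq in records:
    print(f">{label}\n{seq}")
```

For a real genome you would use bedtools rather than this sketch, since it loads every sequence into memory. The resulting zebrafish 3' UTR FASTA is what you would then align against the fish assembly.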
Are you sure?
Hey presto! Looks correct to me.