9.7 years ago
vigneshprbh37
I have compiled nucleotide FASTA sequences for a set of genes I am working on, and I need to pull out the 5' and 3' UTRs from them.
Before you ask: I tried working with a source file containing an entire UTR database, but I could only retrieve UTRs for around half of my gene set. Many entries showed the sequence as unavailable.
From NCBI I was able to obtain the total sequence length and the CDS coordinates for each gene, and from those I can deduce the UTR coordinates.
For example, gene x:
full sequence: 1..6521, CDS: 455..4453
so 5' UTR: 1..454 and 3' UTR: 4454..6521
This is the rationale I am going with.
Can you suggest a program that extracts a given range of nucleotides for each gene?
If you have both files (one with the nucleotide sequences and one with the CDS coordinates), I could help you write a script to extract the UTR sequences for each gene.
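For what it's worth, the rationale in the question (everything before the CDS is the 5' UTR, everything after it is the 3' UTR) can be sketched in a few lines of plain Python. This is only a minimal sketch under some assumptions: the sequences are mRNA/transcript sequences on the sense strand (no introns), the CDS coordinates are 1-based and inclusive as in NCBI records, and the identifiers in the coordinate table match the first word of each FASTA header. The function names and the `cds_coords` dictionary are made up for illustration.

```python
def parse_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Assumption: the identifier is the first word of the header.
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_utrs(seq, cds_start, cds_end):
    """Return (5' UTR, 3' UTR) for 1-based, inclusive CDS coordinates."""
    if not (1 <= cds_start <= cds_end <= len(seq)):
        raise ValueError("CDS coordinates fall outside the sequence")
    five_prime = seq[:cds_start - 1]   # positions 1 .. cds_start-1
    three_prime = seq[cds_end:]        # positions cds_end+1 .. end
    return five_prime, three_prime


def extract_utrs(fasta_lines, cds_coords):
    """Map each identifier found in cds_coords to its (5' UTR, 3' UTR)."""
    utrs = {}
    for name, seq in parse_fasta(fasta_lines):
        if name in cds_coords:  # genes without coordinates are skipped
            start, end = cds_coords[name]
            utrs[name] = split_utrs(seq, start, end)
    return utrs


# Toy input with the same layout as "gene x" (length 6521, CDS 455..4453).
toy_fasta = [">gene_x example", "A" * 454, "C" * 3999, "G" * 2068]
toy_coords = {"gene_x": (455, 4453)}  # hypothetical coordinate table

five, three = extract_utrs(toy_fasta, toy_coords)["gene_x"]
print(len(five), len(three))  # 454 2068
```

For real data you would pass an open file handle instead of `toy_fasta` and build the coordinate dictionary from your CDS table. If the sequences are genomic rather than mRNA, or the CDS is on the reverse strand, this simple slicing will not give correct UTRs.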
Hi OP,
Any updates on your progress? I am facing the same problem. Please let me know if you have found a solution.
I have tried the UCSC Table Browser. Apparently, for my species the UTRs are only annotated arbitrarily, as the 200 bp upstream of the start codon and the 200 bp downstream of the stop codon.