7.2 years ago
Joe.waldron
I have some next-generation sequencing data that I am planning to map to the RefSeq transcriptome. However, the FASTA file that I have downloaded doesn't include the positions of the UTRs/CDSs, unlike the GENCODE FASTA, which does have this information. I have been struggling to find the annotation for these transcripts on the RefSeq website. Can anyone point me in the right direction? Many thanks
RefSeq is not a genome annotation system to start with; it's a collection of sequences that are annotated and linked to other NCBI resources. I would suggest using an annotated reference genome resource such as the one provided by Ensembl (which includes the GENCODE data).
OK, thanks very much for your reply.
Why don't you just use GENCODE?
We're particularly interested in 5'UTRs, and I've read that the annotation of 5' ends is better in RefSeq.
Which organism are we talking about?
humans
GTF files from all three sources (RefSeq, GENCODE, UCSC) have CDS information. Why not use the GTF file to extract the UTRs?
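A rough sketch of that idea in Python, in case it helps: the 5'UTR of a transcript is whatever part of its exons lies upstream of the first CDS base, taking strand into account. This assumes the GTF has `exon` and `CDS` rows carrying a `transcript_id` attribute. The function names and the toy records below are made up for illustration, and coordinates are 1-based inclusive as in GTF. Treat it as a starting point rather than a tested pipeline.

```python
import re
from collections import defaultdict

# Pulls the transcript_id out of the GTF attribute column (column 9).
TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def parse_gtf(lines):
    """Collect exon and CDS intervals per transcript from GTF lines."""
    transcripts = defaultdict(lambda: {"strand": None, "exon": [], "CDS": []})
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] not in ("exon", "CDS"):
            continue
        match = TRANSCRIPT_ID_RE.search(fields[8])
        if match is None:
            continue
        record = transcripts[match.group(1)]
        record["strand"] = fields[6]
        record[fields[2]].append((int(fields[3]), int(fields[4])))
    return transcripts


def five_prime_utr(record):
    """Return the exon pieces upstream of the CDS, sorted by position."""
    if not record["CDS"]:
        return []  # non-coding transcript: no UTR by this definition
    cds_start = min(start for start, _ in record["CDS"])
    cds_end = max(end for _, end in record["CDS"])
    pieces = []
    for start, end in sorted(record["exon"]):
        if record["strand"] == "+":
            # Plus strand: the 5' end is at the lowest coordinates.
            if start < cds_start:
                pieces.append((start, min(end, cds_start - 1)))
        else:
            # Minus strand: the 5' end is at the highest coordinates.
            if end > cds_end:
                pieces.append((max(start, cds_end + 1), end))
    return pieces


# Toy records (hypothetical, not real annotation).
EXAMPLE_GTF = [
    'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "tx1";',
    'chr1\ttoy\texon\t300\t400\t.\t+\t.\tgene_id "g1"; transcript_id "tx1";',
    'chr1\ttoy\tCDS\t150\t200\t.\t+\t0\tgene_id "g1"; transcript_id "tx1";',
    'chr1\ttoy\tCDS\t300\t350\t.\t+\t0\tgene_id "g1"; transcript_id "tx1";',
    'chr1\ttoy\texon\t1000\t1100\t.\t-\t.\tgene_id "g2"; transcript_id "tx2";',
    'chr1\ttoy\texon\t1200\t1300\t.\t-\t.\tgene_id "g2"; transcript_id "tx2";',
    'chr1\ttoy\tCDS\t1050\t1100\t.\t-\t0\tgene_id "g2"; transcript_id "tx2";',
    'chr1\ttoy\tCDS\t1200\t1250\t.\t-\t0\tgene_id "g2"; transcript_id "tx2";',
]

if __name__ == "__main__":
    for tx_id, record in parse_gtf(EXAMPLE_GTF).items():
        print(tx_id, record["strand"], five_prime_utr(record))
```

For a real file you would pass an open file handle to `parse_gtf` instead of the toy list. The 3'UTR can be derived the same way from the other end, but note that GTF normally keeps the stop codon out of the CDS rows, so you would need to account for it there.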