Question

[hg19] Reading genes with CCDS

0


7.2 years ago

mollitz ▴ 90

Hey,

I got a bit lost while trying to access gene sequences in GRCh37. I downloaded CCDS coordinates (I tried datasets from Ensembl and NCBI: ftp://ftp.ensembl.org/pub/release-75/fasta/homo_sapiens/cds/ and http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/ccdsGene.txt.gz, along with the FASTA files provided for each chromosome by both databases), but the CCDS coordinates for the genes never match:

I "flatten" the FASTA file (e.g. for chromosome 1) so that it doesn't contain any newlines or FASTA headers, look up the sequence for a given gene using the coordinates from the CCDS file, and compare it to an online viewer (for example Ensembl). The sequences never match. When I search my chromosome file for the sequence copied from Ensembl, I do find it, but at a notable offset.
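For reference, the lookup described above could be sketched roughly as follows. This is a minimal sketch, not a verified fix: it assumes the UCSC `ccdsGene.txt` dump is in genePred layout with a leading `bin` column, that start coordinates are 0-based and end coordinates are exclusive, and that exons must be joined and minus-strand genes reverse-complemented. The function names are made up for illustration.

```python
# Hypothetical helpers; column positions assume the UCSC genePred layout:
# bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd,
# exonCount, exonStarts, exonEnds, ...

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta_flat(lines):
    """Join the sequence lines of a single-record FASTA, dropping the header."""
    return "".join(line.strip() for line in lines if not line.startswith(">"))


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def cds_from_genepred(row, chrom_seq):
    """Build the coding sequence for one tab-split row of ccdsGene.txt.

    Starts are treated as 0-based and ends as exclusive, so each exon
    maps directly onto a Python slice of the flattened chromosome.
    """
    strand = row[3]
    cds_start, cds_end = int(row[6]), int(row[7])
    exon_starts = [int(x) for x in row[9].rstrip(",").split(",")]
    exon_ends = [int(x) for x in row[10].rstrip(",").split(",")]

    pieces = []
    for start, end in zip(exon_starts, exon_ends):
        # Clip each exon to the coding region before slicing.
        s, e = max(start, cds_start), min(end, cds_end)
        if s < e:
            pieces.append(chrom_seq[s:e])

    seq = "".join(pieces)
    return reverse_complement(seq) if strand == "-" else seq
```

If the online viewer reports 1-based coordinates, a start position there would be one larger than the `cdsStart` value in the table, and a multi-exon gene would not appear as one contiguous stretch of the chromosome.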

Failing at such a simple task shows that I lack experience, so I kindly ask for two pieces of advice.

How to get/download the genome (GRCh37) with matching gene annotations so I can search (on my machine) for gene sequences?
I didn't find a free/good edX/Coursera/... course or any other good tutorial. I have a CS background, so I'm fine with algorithms, Python, etc. Is there a good online resource that gives an overview of the datasets and how to work with them?

Best wishes and thanks for replies

GRCH37 hg19 CCDS coordinates • 2.3k views


0


Can you please give an example of a CCDS where it doesn't seem to match up? We can use this to try to work out what's going wrong.

ADD REPLY • link 7.2 years ago by Emily 24k

score 2 · Accepted Answer · 2017-09-17


7.2 years ago

Satyajeet Khare ★ 1.6k

You can get them from RefSeq, UCSC, Ensembl, etc. Here is the GENCODE link to GRCh37 (hg19). There you can download the genome sequence and the gene GTF files, which match each other in annotation. BTW, GRCh38 (hg38) is also available now; you can get the link to it on the same portal.
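Once a matching genome FASTA and GTF are downloaded, pulling out a coding sequence could look something like the sketch below. It is only an illustration under stated assumptions: GTF coordinates are 1-based and inclusive, the chromosome has already been read into one string, and the chromosome name in the GTF matches the FASTA header (GENCODE uses `chr1`, Ensembl uses `1`). The function names are hypothetical.

```python
import re

_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def cds_intervals_from_gtf(gtf_lines):
    """Map transcript_id -> (chrom, strand, [(start, end), ...]) for CDS rows."""
    records = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "CDS":
            continue  # keep only CDS features
        match = _TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        entry = records.setdefault(match.group(1), (fields[0], fields[6], []))
        entry[2].append((int(fields[3]), int(fields[4])))
    return records


def cds_sequence(intervals, strand, chrom_seq):
    """Join CDS intervals (1-based, inclusive) into one coding sequence."""
    # start - 1 converts the 1-based GTF start to a 0-based Python index;
    # the inclusive end needs no adjustment because slices exclude the end.
    pieces = [chrom_seq[start - 1:end] for start, end in sorted(intervals)]
    seq = "".join(pieces)
    if strand == "-":
        seq = seq.translate(_COMPLEMENT)[::-1]  # reverse complement
    return seq
```

With the GTF opened as a text file, something like `chrom, strand, ivs = cds_intervals_from_gtf(fh)["<transcript id>"]` followed by `cds_sequence(ivs, strand, chrom_seq)` would give a sequence to compare against the online viewer.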
