Working with NCBI downloadable Datasets
12 months ago

Hi all,

I'm a postgraduate student currently working on an assignment in the field of "Analysis Molecular Data".

We've been instructed to examine a polymorphism in the promoter of the human MMP3 gene and how it might affect expression and contribute to genetic disorders. Specifically, we should investigate how transcription factor (TF) binding changes in the region of the sequence where this SNP occurs.

Now, to achieve this, my idea was to download the gDNA sequence of the gene plus 3000 bp upstream of the first exon from NCBI. I would like the different sections of the sequence to be annotated, e.g. the exons, introns, CDS, and any other annotation available from NCBI (TSS, promoter, etc.).

For this I downloaded the gene's dataset, which includes the data folder, gene.fna, etc. But I'm not sure how to work with it.

Is there a graphical interface where the sections of the sequence are annotated or coloured?

Best,

NCBI Genedata Datasets • 1.2k views

GTF files have these annotations. You can get them from NCBI, Ensembl, or GENCODE. Promoters are not annotated; they are usually approximated as the ~500 bp upstream of the TSS. I do not know of any GUI, and I also don't see how one could help here, since you're not really asking a precise question. If you have trouble with the assignment, talk to the lecturer and ask for clarification. They probably have a strategy in mind that you can take inspiration from.
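To make the "approximate the promoter from the TSS" idea concrete, here is a minimal Python sketch. It assumes a standard 9-column, tab-separated GTF record. The helper names (`parse_gtf_line`, `promoter_window`), the 500 bp default, and the example record are made up for illustration and are not real MMP3 coordinates. Note that on the minus strand the TSS is the end coordinate of the gene record, so "upstream" means higher genomic coordinates.

```python
def parse_gtf_line(line):
    """Split one GTF record into the fields needed here."""
    fields = line.rstrip("\n").split("\t")
    return {
        "seqname": fields[0],
        "feature": fields[2],
        "start": int(fields[3]),  # GTF coordinates are 1-based, inclusive
        "end": int(fields[4]),
        "strand": fields[6],
        "attributes": fields[8],
    }


def promoter_window(record, upstream=500):
    """Return (start, end), 1-based inclusive, of the region upstream of the TSS."""
    if record["strand"] == "+":
        tss = record["start"]
        return max(1, tss - upstream), tss - 1
    # Minus strand: the TSS is the record's end coordinate,
    # and upstream means higher genomic coordinates.
    tss = record["end"]
    return tss + 1, tss + upstream


# Made-up record for illustration only; not real coordinates.
example = 'chrX\texample\tgene\t10000\t12000\t.\t-\t.\tgene_id "GENE1";'
rec = parse_gtf_line(example)
window = promoter_window(rec, upstream=500)
print(rec["seqname"], window)
```

For the 3000 bp window mentioned in the question, passing `upstream=3000` would be the only change.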


Maybe I can rephrase my question outside the context of this assignment: how would one usually work with datasets downloaded from NCBI? What are the usual steps after downloading a dataset?

Can I open the JSON files in another program perhaps?

Many thanks for the response!


NCBI has so much data, and people use it in so many ways, that this question has no single answer.


Again, there is no single answer to this. Usually you parse what you need with Unix tools or any programming language. GUIs are uncommon in bioinformatics. Your question is too broad.
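As a minimal sketch of what "parsing with a programming language" can look like for the JSON files: JSON is plain text, so Python's built-in `json` module can read it. This assumes the file is in JSON Lines form (one JSON object per line), which should be checked against the actual download. The field names below are hypothetical, and an in-memory string stands in for the opened file.

```python
import io
import json


def read_json_lines(handle):
    """Yield one parsed object per non-empty line of a JSON Lines stream."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for an opened .jsonl file; the field names are hypothetical.
sample = io.StringIO(
    '{"symbol": "GENE1", "chromosome": "1"}\n'
    '{"symbol": "GENE2", "chromosome": "2"}\n'
)
records = list(read_json_lines(sample))
symbols = [r["symbol"] for r in records]
print(symbols)
```

With a real file, `open(path)` would replace the `io.StringIO` object, and printing the keys of the first record is a quick way to see what fields are available.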


Thanks for letting me know and for still taking the time to answer.

12 months ago

I suggest that you download the human genome (a FASTA file) and its accompanying annotation (a GFF file) and look at them in IGV. I guess you may be able to do this on the web, but it's more flexible if you download IGV. As ATpoint mentioned, promoters are not usually annotated, but you can go to the gene of interest and scroll around the area where the promoter should be. IGV also lets you add extra "tracks", so you can overlay one that shows common mutations from, say, dbSNP.

UCSC also has a genome browser that is commonly used; it may have promoter annotations. Or maybe you should do a literature search to find out where that gene's promoter is thought to be. It's harder to find promoter locations than coding sequence boundaries.
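If you would rather pull a region out of the FASTA file than browse it, the following is a rough Python sketch. It assumes 1-based, inclusive coordinates (as used in GFF/GTF) and reads everything into memory, which is fine for a single-gene file such as gene.fna but wasteful for a whole genome, where an indexed tool such as `samtools faidx` is the usual choice. The function names and the toy sequence are made up for illustration.

```python
import io


def read_fasta(handle):
    """Return {sequence name: sequence} from a FASTA stream (all in memory)."""
    sequences = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                sequences[name] = "".join(chunks)
            name = line[1:].split()[0]  # keep only the ID before any whitespace
            chunks = []
        elif line:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T only; other letters pass through)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def extract_region(sequences, name, start, end, strand="+"):
    """Extract a 1-based, inclusive region; reverse-complement it for '-'."""
    region = sequences[name][start - 1:end]
    return reverse_complement(region) if strand == "-" else region


# Toy sequence for illustration only.
toy = io.StringIO(">toy description\nAAAACCCC\nGGGGTTTT\n")
seqs = read_fasta(toy)
print(extract_region(seqs, "toy", 3, 6))       # AACC
print(extract_region(seqs, "toy", 3, 6, "-"))  # GGTT
```

For a gene on the minus strand, the reverse-complemented output reads in the direction of transcription, which is usually what you want when looking at a promoter.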


Thanks a bunch for the tips

12 months ago
rfran010 ★ 1.3k

I would start with UCSC's genome browser, a nice GUI that includes many resources that are easily loaded, plus the option to load your own data (which can be downloaded from NCBI or ENCODE). A "track" is essentially a loaded dataset. In the pic below, the gene annotation (essentially the GTF file) is loaded as a track at the top and denotes the annotated introns and exons. I also loaded a SNP database and a predicted TF binding site database, so you can start to see how these are laid out around the MMP9 gene promoter. In this case, these databases are available through the options below the browser interface: there are dropdowns for various databases, where you can select "show" or "full" to reveal the database tracks in the browser section.

It seems like this may be the sort of thing you are looking for. UCSC can be a lot to take in, so I recommend playing around and googling as much as possible.

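[Image: UCSC Genome Browser screenshot of the MMP9 gene promoter region with gene annotation, SNP, and predicted TF binding site tracks]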
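On loading your own data: one simple form a custom track can take is a BED file, which is tab-separated text with a 0-based start and an exclusive end. The sketch below converts a 1-based, inclusive interval (as found in GTF/GFF) into a BED line under an optional `track` header line. The function name, track name, and coordinates are placeholders for illustration, not real gene positions.

```python
def to_bed_line(chrom, start, end, name):
    """Convert a 1-based, inclusive interval to a tab-separated BED line."""
    # BED starts are 0-based and ends are exclusive, so only the start shifts.
    return f"{chrom}\t{start - 1}\t{end}\t{name}"


# Placeholder track name and coordinates, for illustration only.
lines = ['track name="my_regions" description="Regions of interest"']
lines.append(to_bed_line("chr1", 1001, 1500, "example_region"))
bed_text = "\n".join(lines) + "\n"
print(bed_text)
```

The resulting text can be saved to a file and uploaded, or pasted, on the browser's custom tracks page.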


Many thanks for the elaborate help :)
