Download GenBank sequences with exon sequences highlighted
6.3 years ago
kspata ▴ 90

Hi,

I wish to download the GenBank sequence for KRAS with all the exonic regions highlighted. GenBank has a 'Highlight Sequence Feature' option, which displays exons one at a time, but I want the exons (and only the exons) highlighted in the downloaded GenBank file.

https://www.ncbi.nlm.nih.gov/nuccore/NG_007524.1?&feature=any#feature_NG_007524.1_exon_0

Is there a way to do this with NCBI while downloading the sequence? Or do I have to do it manually, which would take time and be more prone to errors?

Thanks in advance.

gene Genbank • 1.7k views

If you are after the coding sequences, then the following would work:

esearch -db nuccore -query NG_007524.1 | efetch -format fasta_cds_na

However, as genomax2 mentioned in the comment, it is not possible to 'highlight sequences'.


Sequence files are normally plain text, so any annotation (like the highlighting that you refer to) is applied on top, after the fact.

You may be able to use the UCSC Table Browser, which offers an option to download genomic sequences with exons in upper case and everything else in lower case. That may fit your need to distinguish the exons from the rest of the sequence.
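
If the UCSC export is not convenient, the same upper/lower-case convention can be applied locally once you have the exon coordinates. The following is only a rough sketch: `mask_exons` is a hypothetical helper name, the coordinates are assumed to be 1-based and inclusive (as in a GenBank feature table), and the input is assumed to be a bare sequence string with no FASTA header or line breaks.

```shell
# Hypothetical helper: upper-case the exons, lower-case everything else.
# $1 = bare sequence string
# $2 = comma-separated 1-based inclusive ranges, e.g. "3-5,9-10"
mask_exons() {
  awk -v seq="$1" -v ranges="$2" 'BEGIN {
    out = tolower(seq)                 # start with everything in lower case
    n = split(ranges, r, ",")
    for (i = 1; i <= n; i++) {
      split(r[i], se, "-")
      s = se[1] + 0                    # exon start (1-based)
      e = se[2] + 0                    # exon end (inclusive)
      # Rebuild the string with the exon slice converted to upper case.
      out = substr(out, 1, s - 1) toupper(substr(out, s, e - s + 1)) substr(out, e + 1)
    }
    print out
  }'
}
```

For example, `mask_exons ACGTACGTAC 3-5,9-10` marks positions 3 to 5 and 9 to 10 as exonic. The real exon ranges for NG_007524.1 would have to be taken from the record's feature table.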


Hi genomax,

Thank you for replying. Your suggestion is the closest to what I need. Also, on Ensembl you can download the sequence with exons highlighted in RTF (Rich Text Format).

https://useast.ensembl.org/Homo_sapiens/Gene/Sequence?g=ENSG00000133703;r=12:25204789-25250936

However, as you and Sej mentioned, there is no other way to highlight a sequence in GenBank format.

Thank you for your help!!!

6.3 years ago

You can use efetch from the ncbi-entrez-direct utilities. First, you need the accessions, together with the start and end of the sequences you want to extract. Put this data in one file and use the following command in a script.

efetch -db nuccore -id NG_007524.1 -format fasta -chr_start start -chr_stop stop
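
As a rough sketch of the "one file plus a script" idea, the loop below reads tab-separated lines of accession, start and stop and prints one efetch command per region, so the commands can be inspected before anything is run. The function names and the file layout are assumptions for illustration. Also note that, as far as I understand, `-chr_start` and `-chr_stop` expect 0-based coordinates, while GenBank feature tables are 1-based, so check the offsets before trusting the output.

```shell
# Hypothetical helper: print (do not run) the efetch command for one region.
build_efetch_cmd() {
  printf 'efetch -db nuccore -id %s -format fasta -chr_start %s -chr_stop %s\n' "$1" "$2" "$3"
}

# Read "accession<TAB>start<TAB>stop" lines from stdin, one region per line,
# and emit the matching efetch command for each.
regions_to_cmds() {
  tab="$(printf '\t')"
  while IFS="$tab" read -r acc start stop; do
    [ -n "$acc" ] || continue          # skip blank lines
    build_efetch_cmd "$acc" "$start" "$stop"
  done
}
```

With a hypothetical file `regions.tsv`, `regions_to_cmds < regions.tsv` lists the commands, and piping that output to `sh` would execute them. This still only retrieves the sequences; it does not highlight anything in a GenBank file.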


I know you want to help, but please check the requirements in the original question before you provide answers.

But I want to highlight these sequences (exons only) in the downloaded GenBank file.

Your solution does not satisfy that requirement. It only retrieves the sequence.
