Database To Find Intron Sequences?
12.8 years ago

I'm looking for databases that contain intron sequences. I searched the NCBI Nucleotide database but had little luck.

intron sequence database nucleotide • 12k views

Which organism?


Any model organism would be fine. I'd prefer planarians or Drosophila, but when I go to the RefSeq format on NCBI to see where the introns are, I can't extract them.

12.8 years ago

Years ago I made "GENERECORDS", a FileMaker tool that semi-automatically parses GenBank records and extracts CDSs, introns, and exons by reading the CDS features in each record. The tool also extracts every other feature in the record into separate databases. Please see the paper, the software, and the tutorials for more details. I hope it will be useful, at least as a model for your algorithm.

12.8 years ago
raunakms ★ 1.1k

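If you parse the CDS positions that are readily available in the GenBank format of the sequence (in the NCBI Nucleotide database), you can easily calculate the positions of the introns as well as the exons: the exons are the segments listed in the CDS location, and the introns are the gaps between consecutive segments.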
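As a rough illustration of that idea, here is a minimal Python sketch that reads a CDS location string as it appears in a GenBank feature table (for example `join(3..5,9..12,16..18)`) and derives intron coordinates from the gaps. The function names `parse_cds_location`, `introns_from_exons`, and `extract` are made up for this example, and the sketch assumes a simple location: no remote accessions, no single-base segments, and all segments on one strand.

```python
import re


def parse_cds_location(location):
    """Return (strand, exons) for a simple GenBank CDS location string.

    Exons are (start, end) tuples, 1-based and inclusive, sorted by position.
    Assumption: no remote references such as 'ACC.1:1..100'.
    """
    strand = "-" if "complement(" in location else "+"
    # Each segment looks like 100..200; '<' and '>' mark partial ends.
    pairs = re.findall(r"<?(\d+)\.\.>?(\d+)", location)
    exons = sorted((int(start), int(end)) for start, end in pairs)
    return strand, exons


def introns_from_exons(exons):
    """Introns are the gaps between consecutive exons (1-based, inclusive)."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
        if next_start - prev_end > 1
    ]


def extract(sequence, intervals):
    """Slice 1-based inclusive intervals out of a sequence string."""
    return [sequence[start - 1:end] for start, end in intervals]


if __name__ == "__main__":
    # Toy record: exons in upper case, introns in lower case.
    seq = "AAATGgtaCCCCtagTAAGG"
    strand, exons = parse_cds_location("join(3..5,9..12,16..18)")
    introns = introns_from_exons(exons)
    print(strand, exons, introns, extract(seq, introns))
```

For a CDS on the minus strand (`complement(...)`), the extracted slices would still need to be reverse-complemented. Note also that a CDS location only covers coding exons, so introns inside UTRs are not found this way.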


How would one parse the CDS position?

12.8 years ago

Well, you didn't specify which species you need introns from. Anyway, there is a simple trick in Galaxy: you can use "Extract Features" to convert genes to exon/intron/codon regions. This works only if you have a known gene list in BED format. Alternatively, you can look into the Kent tools.
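To show what such a gene-to-intron conversion does, here is a small Python sketch that computes intron intervals from one BED12 line using its `blockSizes` and `blockStarts` columns. This is not the Galaxy tool or the Kent tools, just an illustration of the arithmetic; the function name `bed12_introns` and the toy gene are invented for the example, and the input is assumed to be a tab-separated 12-column BED record.

```python
def bed12_introns(line):
    """Return intron intervals for one BED12 line.

    Coordinates follow the BED convention: 0-based start, exclusive end.
    Each intron is returned as (chrom, start, end, name, strand).
    """
    fields = line.rstrip("\n").split("\t")
    chrom, chrom_start = fields[0], int(fields[1])
    name, strand = fields[3], fields[5]
    # Columns 11 and 12 are comma-separated lists, often with a trailing comma.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    # Block starts are relative to chromStart; convert to absolute exons.
    exons = [(chrom_start + s, chrom_start + s + size)
             for s, size in zip(starts, sizes)]
    introns = []
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        if next_start > prev_end:  # skip blocks that touch each other
            introns.append((chrom, prev_end, next_start, name, strand))
    return introns


if __name__ == "__main__":
    # Hypothetical three-exon gene used only for illustration.
    bed = "chr1\t100\t500\tgeneA\t0\t+\t100\t500\t0\t3\t50,100,50,\t0,150,350,"
    for intron in bed12_introns(bed):
        print("\t".join(str(v) for v in intron))
```

The resulting intervals could then be passed to any tool that pulls sequence for BED regions from a genome FASTA file.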

Good luck!

12.8 years ago
  • Download this amazing tool.
  • Open it and choose GenBank.
  • Select "type CDS", or type "t=CDS" directly in the query field.
  • Press "extract seqs. to file".
  • Choose your favorite output format (text, FASTA, GenBank).
  • Select "extract feature" -> intron.
  • You won!

PS: You can filter your list by any criterion (size, species, keyword), e.g. t=CDS AND sp=homo sapiens

12.8 years ago
SES 8.6k

For the Drosophila introns, you can go to FlyBase, click on the species of interest, and select the "all-introns" file under the "Fasta" section.

I don't think there is such a direct way to get intron sequences for worms. A pretty simple method would be to download the annotation file in GFF format, and use the Perl example on the Data Mining page to get the intron sequences (you may have to play around with that code, but it is only a few lines of Perl).
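The Perl example mentioned above is not reproduced here. As a rough Python sketch of the same general approach, the code below groups `exon` features from a GFF3 file by their `Parent` attribute and reports the gaps between consecutive exons as introns. The function name `introns_from_gff` and the toy records are hypothetical, and the sketch assumes the annotation has explicit exon features with `Parent=` attributes; files that lack them would need different handling.

```python
from collections import defaultdict


def introns_from_gff(lines):
    """Map each transcript ID to its introns, derived from exon features.

    Introns are (seqid, start, end, strand), 1-based and inclusive as in GFF.
    """
    exons = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        # Column 9 holds key=value pairs separated by ';'.
        attrs = dict(p.split("=", 1) for p in cols[8].split(";") if "=" in p)
        # An exon may belong to several transcripts (comma-separated parents).
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                key = (cols[0], cols[6], parent)
                exons[key].append((int(cols[3]), int(cols[4])))

    introns = {}
    for (seqid, strand, parent), spans in exons.items():
        spans.sort()
        introns[parent] = [
            (seqid, prev_end + 1, next_start - 1, strand)
            for (_, prev_end), (next_start, _) in zip(spans, spans[1:])
            if next_start - prev_end > 1
        ]
    return introns


if __name__ == "__main__":
    # Toy GFF3 records for illustration only.
    gff = [
        "##gff-version 3",
        "I\tsrc\tgene\t1\t100\t.\t+\t.\tID=g1",
        "I\tsrc\texon\t1\t20\t.\t+\t.\tID=e1;Parent=t1",
        "I\tsrc\texon\t61\t100\t.\t+\t.\tID=e3;Parent=t1",
        "I\tsrc\texon\t31\t40\t.\t+\t.\tID=e2;Parent=t1",
    ]
    print(introns_from_gff(gff))
```

With the coordinates in hand, the intron sequences can be sliced from the genome FASTA (reverse-complementing those on the minus strand).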

For Arabidopsis, you can go to the Bulk Data Download page, click on "ftp server", click "TAIR10_blastsets" and then just download the file of introns.


"You can't open the application "GeneRecords" because PowerPC applications are no longer supported."

How do I get around this?
