How To Store Large Genome Sequence In Oracle Table?
12.7 years ago
Mmhs ▴ 30

UPDATE: I have my answer, thanks to Pascal.

Using the CLOB data type solves the problem for sequences up to 2 GB in size (in some cases more). For now, that is all I need.

The original question, for which I opened this thread, follows:

Hello there,

I am trying to store some plant genome sequences on my local server for research.

I retrieve them from the NCBI collection.

I was able to convert the FASTA files to CSV and attempted to load them into a table.

Understandably, the import fails because of the field size limit.

I hope there is some way to store these sequences.

Could anyone please guide me toward a solution?

My system is CentOS 6.2 with Oracle 11gR2 Enterprise Edition.

-thanks

Update:

  1. We are a group of researchers working with such genome data. Our intention is to store the sequences on our local server and make them available to team members.

  2. We were previously working with flat FASTA files, but that is time-consuming because we have to go through the file system. We would rather have our scripts talk to Oracle and fetch sequences as and when required.

  3. If anyone has an answer, please help me!

genome sequence • 4.3k views

Why are you trying to convert fastas to CSV and load them in a database? Why not work directly with the fasta files? Please edit the question to provide more information on what you're trying to do, or we'll close it up as unanswerable.


Well, Mr. Miller, I don't know of any way to import sequence files into Oracle without converting them to CSV. I am adding more information to the question.

12.7 years ago
Neilfws 49k

You may want to look at the BioSQL project. Its aim is "to build a sufficiently generic schema for persistent storage of sequences, features, and annotation in a way that is interoperable between the Bio projects." It has Oracle support and bindings to the Bio projects.

12.7 years ago

I'd suggest using the database to store the metadata as you see fit, and using something like Bio::DB::Fasta or the equivalent from another language to actually access the FASTA files.
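To illustrate the idea behind indexed FASTA access, here is a minimal Python sketch. It is not how Bio::DB::Fasta is implemented; it only shows the general technique of recording where each record sits in the file so that a script can seek straight to it instead of scanning the whole file. The function names `build_fasta_index` and `fetch_sequence` are hypothetical, and the sketch assumes plain ASCII FASTA with `>` header lines. In practice an existing library (for example Biopython's `SeqIO.index` or pyfaidx) would likely be a better choice.

```python
def build_fasta_index(path):
    """Map each record ID to (byte offset, byte length) of its sequence lines."""
    index = {}
    current_id = None
    start = 0
    with open(path, "rb") as handle:
        while True:
            line_start = handle.tell()
            line = handle.readline()
            if not line or line.startswith(b">"):
                # A new header or end of file closes the previous record.
                if current_id is not None:
                    index[current_id] = (start, line_start - start)
                if not line:
                    break
                # The ID is the first whitespace-delimited token after '>'.
                fields = line[1:].split()
                current_id = fields[0].decode("ascii") if fields else ""
                start = handle.tell()
    return index


def fetch_sequence(path, index, record_id):
    """Read one record's sequence by seeking directly to its stored offset."""
    offset, length = index[record_id]
    with open(path, "rb") as handle:
        handle.seek(offset)
        block = handle.read(length)
    # Remove line breaks so the sequence comes back as a single string.
    return b"".join(block.split()).decode("ascii")
```

The index itself is small (one ID and two integers per record), so it could be kept in a database table next to the other metadata while the sequences stay in the FASTA files.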

12.7 years ago
Pascal ★ 1.5k

What data type are you using in your table schema? Have you considered using the CLOB data type, as described here?
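As a rough sketch of what the CLOB approach could look like from a script, the Python below reads a FASTA file directly (no CSV step) and inserts each record through a DB-API connection. Everything specific here is an assumption for illustration: the table `genome_sequence`, its columns `seq_id` and `seq`, and the function names are hypothetical, and the connection is expected to come from an Oracle driver such as python-oracledb or cx_Oracle. Check the driver's documentation for size limits on strings bound directly to a CLOB column; very large sequences may need its LOB streaming interface instead.

```python
# Hypothetical schema; run once in the target Oracle schema.
CREATE_TABLE_SQL = """
CREATE TABLE genome_sequence (
    seq_id VARCHAR2(100) PRIMARY KEY,
    seq    CLOB
)
"""

# Positional bind variables in the style accepted by Oracle's Python drivers.
INSERT_SQL = "INSERT INTO genome_sequence (seq_id, seq) VALUES (:1, :2)"


def read_fasta(path):
    """Yield (record_id, sequence) pairs from a FASTA file."""
    record_id, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if record_id is not None:
                    yield record_id, "".join(chunks)
                # Use the first token of the header as the record ID.
                fields = line[1:].split()
                record_id = fields[0] if fields else ""
                chunks = []
            elif line and record_id is not None:
                chunks.append(line)
    if record_id is not None:
        yield record_id, "".join(chunks)


def store_sequences(connection, records):
    """Insert each (record_id, sequence) pair and return the number stored."""
    cursor = connection.cursor()
    count = 0
    try:
        for record_id, sequence in records:
            cursor.execute(INSERT_SQL, (record_id, sequence))
            count += 1
        connection.commit()
    finally:
        cursor.close()
    return count
```

With a real connection this would be called as `store_sequences(conn, read_fasta("genome.fa"))`, where `genome.fa` is a placeholder file name and `conn` is an open connection from the driver.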


Many thanks! I got my answer for now! Much appreciated.
