Multi-Fasta Sequence Input C++ Library
12.3 years ago
Lexxx233 ▴ 20

Hi, I am new to bioinformatics and I was wondering whether there is a C++ library for efficiently reading and indexing multi-FASTA files.

fasta library • 6.3k views

What do you mean by "indexing"? Do you mean what "samtools faidx" does?

12.3 years ago
Ido Tamir 5.2k

"efficiently ... indexing" is a bit vague, but I guess you will find what you need in SeqAn: http://www.seqan.de/

  1. indexing by sequence name and position (like samtools fai): http://trac.seqan.de/wiki/Tutorial/IndexedFastaIO

  2. other indices (suffix tree...): http://trac.seqan.de/wiki/Tutorial/Indices
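For the first kind of index, the idea can be sketched with the standard library alone. This is not SeqAn's API, just a rough illustration of what a samtools-style .fai index stores (name, length, byte offset, bases per line, bytes per line) and how it allows random access. The names `FaiEntry`, `buildIndex` and `fetch` are made up for this sketch, and it assumes '\n' line endings and equal line lengths within each record.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// One entry per sequence, mirroring the numeric columns of a .fai file.
struct FaiEntry {
    std::uint64_t length = 0;     // total number of bases
    std::uint64_t offset = 0;     // byte offset of the first base
    std::uint64_t lineBases = 0;  // bases on each full line
    std::uint64_t lineWidth = 0;  // bytes on each full line, newline included
};

// Scan the stream once and record where every sequence starts.
// Assumption: '\n' line endings, uniform line length inside a record.
std::map<std::string, FaiEntry> buildIndex(std::istream& in) {
    std::map<std::string, FaiEntry> index;
    FaiEntry* current = nullptr;
    std::uint64_t pos = 0;  // byte offset of the line being read
    std::string line;
    while (std::getline(in, line)) {
        const std::uint64_t lineBytes = line.size() + 1;  // +1 for '\n'
        if (!line.empty() && line[0] == '>') {
            // The sequence name is the first word after '>'.
            std::istringstream header(line.substr(1));
            std::string name;
            header >> name;
            current = &index[name];
            current->offset = pos + lineBytes;
        } else if (current != nullptr && !line.empty()) {
            if (current->lineBases == 0) {
                current->lineBases = line.size();
                current->lineWidth = lineBytes;
            }
            current->length += line.size();
        }
        pos += lineBytes;
    }
    return index;
}

// Return `count` bases starting at 0-based position `start`,
// seeking straight to the right byte instead of rescanning the file.
std::string fetch(std::istream& in, const FaiEntry& e,
                  std::uint64_t start, std::uint64_t count) {
    assert(start + count <= e.length);
    std::string out;
    if (count == 0) return out;
    out.reserve(count);
    const std::uint64_t byte =
        e.offset + (start / e.lineBases) * e.lineWidth + start % e.lineBases;
    in.clear();  // drop any EOF flag left over from indexing
    in.seekg(static_cast<std::streamoff>(byte));
    char c;
    while (out.size() < count && in.get(c)) {
        if (c != '\n' && c != '\r') out.push_back(c);
    }
    return out;
}
```

With a file opened in binary mode as a `std::ifstream`, you would call `buildIndex` once and then `fetch(file, index.at("chr1"), start, count)` for each region. A real .fai file is the same five values per sequence written as tab-separated text, so the index can be saved and reloaded instead of rebuilt.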

12.3 years ago

I've got something like this in my sources. See:

https://code.google.com/p/variationtoolkit/source/browse/trunk/src/fastareader.cpp

https://code.google.com/p/variationtoolkit/source/browse/trunk/src/fastareader.h

To index the sequences, you can use a key/value datastore such as LevelDB or Berkeley DB, or an embedded SQL database such as SQLite.
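To give an idea of how the reader and the index fit together, here is a minimal sketch. It is not the code from the links above: `FastaRecord`, `readRecord`, `OffsetIndex`, `indexOffsets` and `fetchRecord` are hypothetical names, and a `std::unordered_map` stands in for the key/value store so that the example has no external dependency. The key is the sequence name and the value is the byte offset of its header line.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

struct FastaRecord {
    std::string name;      // first word of the header line
    std::string sequence;  // bases with the line breaks removed
};

// Read the next record from a multi-FASTA stream.
// Returns false when there is no further header line.
bool readRecord(std::istream& in, FastaRecord& rec) {
    rec = FastaRecord{};
    std::string line;
    // Skip anything that precedes the next header.
    while (std::getline(in, line)) {
        if (!line.empty() && line[0] == '>') break;
    }
    if (!in) return false;
    std::istringstream header(line.substr(1));
    header >> rec.name;
    // Concatenate sequence lines until the next header or end of input.
    while (in.peek() != '>' && std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        rec.sequence += line;
    }
    return true;
}

// In-memory stand-in for a key/value store:
// sequence name -> byte offset of its header line.
using OffsetIndex = std::unordered_map<std::string, std::streamoff>;

OffsetIndex indexOffsets(std::istream& in) {
    OffsetIndex index;
    FastaRecord rec;
    for (;;) {
        const std::streamoff start = in.tellg();
        if (!readRecord(in, rec)) break;
        index[rec.name] = start;
    }
    return index;
}

// Fetch a single record by name by seeking to its stored offset.
bool fetchRecord(std::istream& in, const OffsetIndex& index,
                 const std::string& name, FastaRecord& rec) {
    const auto it = index.find(name);
    if (it == index.end()) return false;
    assert(it->second >= 0);  // only valid stream positions are stored
    in.clear();               // drop any EOF flag left over from indexing
    in.seekg(it->second);
    return readRecord(in, rec);
}
```

Swapping the map for LevelDB, Berkeley DB or an SQLite table only changes where the name/offset pairs are stored; the advantage is that the index then survives between runs and does not have to fit in memory.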
