Read a fasta file | Java
6.6 years ago
ozdavidd • 0

Hey, I'm working on my final project, which is about finding hidden repeats in a DNA sequence. I have to read a FASTA file and get only the sequence, without the genome's name (the line that starts with '>'), and save it into a string.

I hope you guys can help me.

Thanks

fasta java • 5.6k views

What have you tried?


I know how to read a regular file, but I don't know what tells me where the nucleotides start. The question is: where does the genome name end? So I can't really write anything yet.


Thank you very much.

6.6 years ago

a solution:

6.6 years ago
vmicrobio ▴ 290

Hi ozdavidd,

You may try this:

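import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;

// Method of a class that declares two String fields, header and sequence.
// Assumes the file holds a single record: the first line is the '>' header
// and every following line is part of the sequence.
private void readFastaFile(File fastaFile) {
    String line;
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader buff = new BufferedReader(
            new InputStreamReader(new FileInputStream(fastaFile)))) {
        int lineNb = 0;
        StringBuilder sb = new StringBuilder();
        while ((line = buff.readLine()) != null) {
            if (lineNb == 0) {
                this.header = line;   // first line: header, starts with '>'
            }
            else {
                sb.append(line);      // remaining lines: nucleotides
            }
            lineNb++;
        }
        this.sequence = sb.toString();
    }
    catch (Exception e) {
        e.printStackTrace();
    }
}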
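
If the file may contain more than one record, a variant that keys off the '>' character instead of the line number could look like the sketch below. The class name FastaReader and the method readAll are hypothetical names chosen for illustration, not part of any library, and the sketch assumes header lines are unique within the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library.
final class FastaReader {

    // Returns header -> sequence for every record, in file order.
    // Assumption: headers are unique; a repeated header overwrites the earlier one.
    static Map<String, String> readAll(Reader in) {
        Map<String, String> records = new LinkedHashMap<>();
        String header = null;
        StringBuilder seq = new StringBuilder();
        try (BufferedReader buff = new BufferedReader(in)) {
            String line;
            while ((line = buff.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                if (line.startsWith(">")) {
                    // A new header closes the previous record, if there was one.
                    if (header != null) {
                        records.put(header, seq.toString());
                    }
                    header = line.substring(1); // drop the leading '>'
                    seq.setLength(0);
                } else if (header != null) {
                    // Sequence line; anything before the first header is ignored.
                    seq.append(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Store the last record, which has no following header to close it.
        if (header != null) {
            records.put(header, seq.toString());
        }
        return records;
    }
}
```

For a file on disk, something like FastaReader.readAll(new java.io.FileReader(fastaFile)) should work, where fastaFile is whatever File you already have.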

Thanks for the comment. What should I put in

this.header = line;


You can create a class FastaSequence containing the code above, add 'getHeader' and 'getSequence' methods, and then return only the sequence for your use.
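
A minimal sketch of such a class might look like this. It assumes a single-record FASTA input, and reading from a Reader in the constructor is a choice made here for illustration; only the names FastaSequence, getHeader and getSequence come from the suggestion above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Sketch of the suggested class; assumes the input holds exactly one record.
final class FastaSequence {
    private String header = "";
    private final String sequence;

    // Reads the whole input: first line is the header, the rest is sequence.
    FastaSequence(Reader in) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader buff = new BufferedReader(in)) {
            String line;
            int lineNb = 0;
            while ((line = buff.readLine()) != null) {
                if (lineNb == 0) {
                    header = line;          // '>' followed by the name
                } else {
                    sb.append(line.trim()); // nucleotides, line breaks removed
                }
                lineNb++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        sequence = sb.toString();
    }

    String getHeader() {
        return header;
    }

    String getSequence() {
        return sequence;
    }
}
```

With this, new FastaSequence(new java.io.FileReader(fastaFile)).getSequence() would give the nucleotides as one string, without the header.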


What in this code tells you where the nucleotides start?

6.6 years ago
Hugo ▴ 380

You may have a look at SEDA (http://www.sing-group.org/seda/), which also provides a Java API for easy manipulation of FASTA sequences (https://github.com/sing-group/seda).
