Read a fasta file | Java
6.8 years ago
ozdavidd • 0

Hey, I'm working on my final project, which is about finding hidden repeats in DNA sequences. I have to read a FASTA file and get only the sequence, without the genome's name (the line that starts with '>'), and save it into a string.

I hope you guys can help me.

Thanks

fasta java • 5.7k views

What have you tried?


I know how to read a regular file, but I don't know what should tell me where to start reading the nucleotides. The question is: where does the genome name end? So I can't really write anything yet.


Thank you very much.

6.8 years ago

A solution:

Compile:

$ javac Biostar309193.java

Execute:

$ java Biostar309193 *.fa
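import java.nio.*;
import java.nio.file.*;
import java.io.*;
import java.util.*;
import java.util.function.*;

class Biostar309193
{
    // Calls consumer.accept(name, sequence) once for each record with a non-empty sequence.
    // The sequence buffer is reused between records, so the consumer must use it immediately.
    private static void parse(final Path path, final BiConsumer<String,CharSequence> consumer) throws IOException {
        try (BufferedReader br = new BufferedReader(new InputStreamReader(Files.newInputStream(path)))) {
            final StringBuilder sn = new StringBuilder();         // current sequence name
            final StringBuilder ba = new StringBuilder(100_000);  // current sequence bases
            br.lines().forEach(L->{
                if(L.startsWith(">")) {
                    // A '>' line starts a new record: emit the previous one, then reset.
                    if(ba.length()>0) consumer.accept(sn.toString(),ba);
                    sn.setLength(0);
                    sn.append(L.substring(1));
                    ba.setLength(0);
                }
                else
                {
                    // Any other line is sequence data belonging to the current record.
                    ba.append(L);
                }
            });
            // Emit the last record, which has no following '>' line.
            if(ba.length()>0) consumer.accept(sn.toString(),ba);
        }
    }

    public static void main(final String args[]) throws IOException {
        // Print only the sequence of each record, one per line.
        for(final String sn:args) parse(Paths.get(sn),(S,A)->{System.out.print(A);System.out.println();});
    }
}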

6.8 years ago
vmicrobio ▴ 290

Hi ozdavidd,

You may try this:

// Assumes the enclosing class declares two String fields: header and sequence.
private void readFastaFile(File fastaFile) {
    String line;
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader buff = new BufferedReader(new InputStreamReader(new FileInputStream(fastaFile)))) {
        int lineNb = 0;
        StringBuilder sb = new StringBuilder();
        while ((line = buff.readLine()) != null){
            if (lineNb == 0) {
                // the first line of the file is the header (the '>' line)
                this.header = line;
            }
            else {
                // every later line is sequence; this assumes a single-record file
                sb.append(line);
            }
            lineNb++;
        }
        this.sequence = sb.toString();
    }
    catch(Exception e) {
        e.printStackTrace();
    }
}

Thanks for the comment. What should I put in

this.header = line;


You can create a class FastaSequence containing the code above, add 'getHeader' and 'getSequence' methods, and then return only the sequence for your use.
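
A minimal sketch of what such a class might look like. Only the names FastaSequence, getHeader and getSequence come from the suggestion above; the fromLines and fromFile helpers and the constructor are assumptions made for illustration, and the sketch handles a single-record file only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical holder class for one FASTA record (single-record files only).
class FastaSequence {
    private final String header;
    private final String sequence;

    FastaSequence(String header, String sequence) {
        this.header = header;
        this.sequence = sequence;
    }

    // Assumption: the first line is the '>' header, all later lines are sequence.
    static FastaSequence fromLines(List<String> lines) {
        if (lines.isEmpty() || !lines.get(0).startsWith(">")) {
            throw new IllegalArgumentException("expected a header line starting with '>'");
        }
        StringBuilder sb = new StringBuilder();
        for (String line : lines.subList(1, lines.size())) {
            sb.append(line.trim()); // drop stray whitespace at line ends
        }
        // Store the header without the leading '>'.
        return new FastaSequence(lines.get(0).substring(1), sb.toString());
    }

    // Hypothetical convenience wrapper that reads the whole file into memory first.
    static FastaSequence fromFile(Path path) throws IOException {
        return fromLines(Files.readAllLines(path));
    }

    String getHeader() { return header; }

    String getSequence() { return sequence; }

    public static void main(String[] args) throws IOException {
        // Usage: java FastaSequence my.fa
        System.out.println(fromFile(Paths.get(args[0])).getSequence());
    }
}
```

Calling getSequence() then gives the nucleotides as one string, which is what the original question asks for. Reading all lines at once is fine for small files; for a whole genome a streaming reader would be the safer choice.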


What in this code indicates the start of the nucleotides?
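
For reference, in the FASTA format the header is the single line that begins with '>', and the sequence starts on the very next line and runs until the next '>' line or the end of the file. The sketch below relies on that rule rather than on line numbers, so it should also cope with files holding several records. The class name FastaRecords and the parse method are hypothetical names chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: splits FASTA lines into records using the '>' rule.
class FastaRecords {
    // Maps each header (without '>') to its concatenated sequence, in file order.
    static Map<String, String> parse(List<String> lines) {
        Map<String, StringBuilder> building = new LinkedHashMap<>();
        StringBuilder current = null;
        for (String line : lines) {
            if (line.startsWith(">")) {
                // Header line: the name ends at the end of this line.
                current = new StringBuilder();
                building.put(line.substring(1), current);
            } else if (current != null) {
                // Sequence line: belongs to the most recent header.
                current.append(line.trim());
            }
            // Lines before the first header are ignored.
        }
        Map<String, String> result = new LinkedHashMap<>();
        building.forEach((name, seq) -> result.put(name, seq.toString()));
        return result;
    }
}
```

If two records share the same header, the later one replaces the earlier one in this sketch.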

6.8 years ago
Hugo ▴ 380

You may have a look at SEDA (http://www.sing-group.org/seda/), which also provides a Java API for easy manipulation of FASTA sequences (https://github.com/sing-group/seda).
