Read a fasta file

0

7.0 years ago

ozdavidd • 0

Hey, I'm working on my final project, which is about finding hidden repeats in DNA sequences. I have to read a FASTA file and get only the sequence, without the genome's name (the line that starts with '>'), and save it into a string.

Wish you guys could help me

Thanks

fasta java • 5.9k views

ADD COMMENT • link updated 2.1 years ago by Ram 45k • written 7.0 years ago by ozdavidd • 0

0

What have you tried?

ADD REPLY • link 7.0 years ago by Pierre Lindenbaum 166k

0

I know how to read a regular file, but I don't know what tells me where the nucleotides start. The question is: where does the genome name end? So I can't really write anything yet.

ADD REPLY • link 7.0 years ago by ozdavidd • 0

0

Thank you very much.

ADD REPLY • link 7.0 years ago by ozdavidd • 0

2

7.0 years ago

Pierre Lindenbaum 166k

a solution:

compile:

$ javac Biostar309193.java

execute:

$ java Biostar309193 *.fa

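import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.BiConsumer;

class Biostar309193
{
    /** Reads a FASTA file and passes each record (name, sequence) to the consumer. */
    private static void parse(final Path path, final BiConsumer<String,CharSequence> consumer) throws IOException {
        try (BufferedReader br = new BufferedReader(new InputStreamReader(Files.newInputStream(path)))) {
            final StringBuilder sn = new StringBuilder();        // current sequence name
            final StringBuilder ba = new StringBuilder(100_000); // current sequence
            br.lines().forEach(L -> {
                if (L.startsWith(">")) {
                    // a header line starts a new record: emit the previous one, if any
                    if (ba.length() > 0) consumer.accept(sn.toString(), ba);
                    sn.setLength(0);
                    sn.append(L.substring(1));
                    ba.setLength(0);
                }
                else
                {
                    // any other line is part of the current sequence
                    ba.append(L);
                }
            });
            // emit the last record
            if (ba.length() > 0) consumer.accept(sn.toString(), ba);
        }
    }

    public static void main(final String args[]) throws IOException {
        // print only the sequence of each record, one per line
        for (final String sn : args) parse(Paths.get(sn), (S, A) -> { System.out.print(A); System.out.println(); });
    }
}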

ADD COMMENT • link 7.0 years ago by Pierre Lindenbaum 166k

0

7.0 years ago

vmicrobio ▴ 290

Hi ozdavidd,

You may try this:

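// Needs java.io.BufferedReader, File, FileInputStream, InputStreamReader, IOException.
// 'header' and 'sequence' are String fields of the enclosing class.
// Assumes the file holds a single record, so the first line is the header.
private void readFastaFile(File fastaFile) {
    String line;
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader buff = new BufferedReader(new InputStreamReader(new FileInputStream(fastaFile)))) {
        int lineNb = 0;
        StringBuilder sb = new StringBuilder();
        while ((line = buff.readLine()) != null) {
            if (lineNb == 0) {
                // first line: the '>' header
                this.header = line;
            }
            else {
                // every other line: nucleotides
                sb.append(line);
            }
            lineNb++;
        }
        this.sequence = sb.toString();
    }
    catch (IOException e) {
        e.printStackTrace();
    }
}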

ADD COMMENT • link 7.0 years ago by vmicrobio ▴ 290

0

Thanks for the comment. What should I put in

this.header = line;

ADD REPLY • link 7.0 years ago by ozdavidd • 0

0

You can create a class FastaSequence containing the code above, add 'getHeader' and 'getSequence' methods, and then return only the sequence for your use.

ADD REPLY • link 7.0 years ago by vmicrobio ▴ 290
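
The FastaSequence class suggested above could look roughly like the following. This is only a sketch: the factory methods `fromLines` and `fromFile` are names made up here, and it assumes one record per file (with several records, the last header wins and all sequences are concatenated).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical holder for a single FASTA record; not part of any library.
class FastaSequence {
    private final String header;
    private final String sequence;

    private FastaSequence(String header, String sequence) {
        this.header = header;
        this.sequence = sequence;
    }

    // Builds a record from the lines of a single-record FASTA file.
    static FastaSequence fromLines(List<String> lines) {
        String header = "";
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            if (line.startsWith(">")) {
                header = line.substring(1); // drop the leading '>'
            } else {
                sb.append(line.trim());     // sequence line
            }
        }
        return new FastaSequence(header, sb.toString());
    }

    // Convenience wrapper (assumed name) that reads the whole file first.
    static FastaSequence fromFile(Path path) throws IOException {
        return fromLines(Files.readAllLines(path));
    }

    String getHeader() { return header; }

    String getSequence() { return sequence; }

    public static void main(String[] args) throws IOException {
        FastaSequence fs = FastaSequence.fromFile(Paths.get(args[0]));
        System.out.println(fs.getSequence());
    }
}
```

Calling `getSequence()` then gives the nucleotides as one string, which is what the repeat search needs; `getHeader()` is only there if the name is wanted later.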

0

What in this code indicates the start of the nucleotides?

ADD REPLY • link 7.0 years ago by ozdavidd • 0
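
On the question of where the name ends: in a FASTA file the name is the single line that starts with '>', and it ends at the line break. Every following line, up to the next '>' line or the end of the file, is sequence. The `lineNb == 0` check in the snippet above relies on the header being the first line of a one-record file. A minimal sketch that tests for '>' instead and joins everything else into one string (the class and method names are invented for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SequenceOnly {
    // Joins every non-header, non-empty line into one string.
    // With several records in the input, their sequences are concatenated.
    static String sequenceOnly(Stream<String> lines) {
        return lines
            .map(String::trim)
            .filter(line -> !line.isEmpty() && !line.startsWith(">"))
            .collect(Collectors.joining());
    }

    public static void main(String[] args) throws IOException {
        // Files.lines keeps the file open, so close the stream when done
        try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
            System.out.println(sequenceOnly(lines));
        }
    }
}
```

If the file can contain more than one record and they must stay separate, a parser that emits one record per '>' line is the better fit.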

0

7.0 years ago

Hugo ▴ 380

You may have a look at SEDA (http://www.sing-group.org/seda/), which also provides a Java API for easy manipulation of FASTA sequences (https://github.com/sing-group/seda).

ADD COMMENT • link 7.0 years ago by Hugo ▴ 380
