Hey, I'm working on my final project, which is about finding hidden repeats in DNA sequences. I have to read a FASTA file and get only the sequence, without the genome's name (the line that starts with '>'), and save it into a string.
I hope you can help me.
Thanks
A solution:
compile:
$ javac Biostar309193.java
execute:
$ java Biostar309193 *.fa
import java.nio.*;
import java.nio.file.*;
import java.io.*;
import java.util.*;
import java.util.function.*;
class Biostar309193
{
    // Calls consumer.accept(name, sequence) once per FASTA record in the file.
    private static void parse(final Path path, final BiConsumer<String,CharSequence> consumer) throws IOException {
        try (BufferedReader br = new BufferedReader(new InputStreamReader(Files.newInputStream(path)))) {
            final StringBuilder sn = new StringBuilder();        // current sequence name
            final StringBuilder ba = new StringBuilder(100_000); // current sequence
            br.lines().forEach(L->{
                if(L.startsWith(">")) {
                    // new header line: emit the previous record, then reset both buffers
                    if(ba.length()>0) consumer.accept(sn.toString(),ba);
                    sn.setLength(0);
                    sn.append(L.substring(1));
                    ba.setLength(0);
                }
                else
                {
                    ba.append(L);
                }
            });
            // emit the last record of the file
            if(ba.length()>0) consumer.accept(sn.toString(),ba);
        }
    }
    public static void main(final String args[]) throws IOException {
        // print only the sequence (A) of each record, ignoring its name (S)
        for(final String sn:args) parse(Paths.get(sn),(S,A)->{System.out.print(A);System.out.println();});
    }
}
Hi ozdavidd,
you may try this:
// Needs: import java.io.*;
// Assumes the enclosing class declares the String fields 'header' and 'sequence'.
private void readFastaFile(File fastaFile) {
    String line;
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader buff = new BufferedReader(new InputStreamReader(new FileInputStream(fastaFile)))) {
        int lineNb = 0;
        StringBuilder sb = new StringBuilder();
        while ((line = buff.readLine()) != null) {
            if (lineNb == 0) {
                // the first line is the header (the name, starting with '>')
                this.header = line;
            }
            else {
                // every other line is part of the sequence
                sb.append(line);
            }
            lineNb++;
        }
        this.sequence = sb.toString();
    }
    catch (IOException e) {
        e.printStackTrace();
    }
}
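Note that the method above treats only the first line as a header, so it fits files that hold a single record. If your file contains several records, something along these lines may help. It is only a minimal sketch: the class name FastaRecords and the method readAll are made up for illustration, and it assumes you want each name (without the '>') mapped to its sequence.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library.
class FastaRecords {
    // Returns name -> sequence for every record, in file order.
    // Lines before the first '>' are ignored; a repeated name overwrites the earlier one.
    static Map<String, String> readAll(BufferedReader in) {
        Map<String, String> records = new LinkedHashMap<>();
        String name = null;
        StringBuilder seq = new StringBuilder();
        Iterator<String> it = in.lines().iterator();
        while (it.hasNext()) {
            String line = it.next();
            if (line.startsWith(">")) {
                // a new header closes the previous record
                if (name != null) records.put(name, seq.toString());
                name = line.substring(1).trim();
                seq.setLength(0);
            } else if (name != null) {
                seq.append(line.trim());
            }
        }
        // store the last record of the input
        if (name != null) records.put(name, seq.toString());
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Small in-memory example instead of a real file.
        String demo = ">seq1\nACGT\nAC\n>seq2\nGGTT\n";
        try (BufferedReader in = new BufferedReader(new StringReader(demo))) {
            readAll(in).forEach((n, s) -> System.out.println(n + "\t" + s));
        }
    }
}
```

For a real file you could pass Files.newBufferedReader(path) instead of the StringReader. Keep in mind that this holds every sequence in memory at once.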
You may have a look at SEDA (http://www.sing-group.org/seda/), which also provides a Java API for easy manipulation of FASTA sequences (https://github.com/sing-group/seda).
What have you tried?
I know how to read a regular file, but I don't know what should tell me to start reading the nucleotides. The question is: where does the genome name end? So I can't really write anything yet.
Thank you very much.
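To answer that directly: in a FASTA file the name is exactly one line, the one that starts with '>'. It ends at the line break, and every following line is sequence until the next '>' line or the end of the file. A minimal sketch of that rule is below. The class name FastaSequenceOnly and the method sequenceOnly are made up for illustration, and it assumes the file is small enough to read all lines at once.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of any library.
class FastaSequenceOnly {
    // Concatenates every line that is not a header; header lines start with '>'.
    static String sequenceOnly(List<String> lines) {
        StringBuilder seq = new StringBuilder();
        for (String line : lines) {
            if (line.startsWith(">")) continue; // the name: one whole line, skip it
            seq.append(line.trim());            // nucleotides; trim drops stray whitespace
        }
        return seq.toString();
    }

    public static void main(String[] args) throws IOException {
        // Usage (file name is an example): java FastaSequenceOnly genome.fa
        for (String arg : args) {
            List<String> lines = Files.readAllLines(Paths.get(arg));
            System.out.println(sequenceOnly(lines));
        }
    }
}
```

If the file contains more than one record, this joins all of their sequences into a single string, which may or may not be what you want for repeat finding.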