Hey, I'm working on my final project, which is about finding hidden repeats in DNA sequences. I have to read a FASTA file, keep only the sequence without the genome's name (the line that starts with '>'), and save it into a string.
I hope you guys can help me.
Thanks
Hi ozdavidd,
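You could try this (it assumes the file holds a single record and that the enclosing class has String fields named header and sequence):
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

private void readFastaFile(File fastaFile) {
    StringBuilder sb = new StringBuilder();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader buff = new BufferedReader(new FileReader(fastaFile))) {
        String line;
        while ((line = buff.readLine()) != null) {
            if (line.startsWith(">")) {
                // the header (genome name) is the whole line that starts with '>'
                this.header = line;
            } else {
                // every other line is part of the sequence
                sb.append(line.trim());
            }
        }
        this.sequence = sb.toString();
    } catch (IOException e) {
        e.printStackTrace();
    }
}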
You may have a look at SEDA (http://www.sing-group.org/seda/), which also provides a Java API for easy manipulation of FASTA sequences (https://github.com/sing-group/seda).
What have you tried?
I know how to read a regular file, but I don't know what tells me where to start reading the nucleotides. The question is: where does the genome name end? So I can't really write anything yet.
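On where the name ends: in FASTA the name is the whole line that begins with '>', so it ends at the first line break. Everything on the following lines, up to the next '>' line or the end of the file, is sequence. Here is a minimal sketch of that rule. The class name FastaSequence and the method readFirstSequence are made up for illustration, and it assumes you only want the first record in the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper class, named only for this example.
class FastaSequence {

    // Returns the sequence of the first record, without its '>' header line.
    static String readFirstSequence(Reader source) throws IOException {
        StringBuilder sequence = new StringBuilder();
        boolean headerSeen = false;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                if (line.startsWith(">")) {
                    if (headerSeen) {
                        break; // a second '>' line starts the next record
                    }
                    headerSeen = true; // the name ends where this line ends
                } else if (headerSeen) {
                    sequence.append(line); // sequence lines are joined together
                }
            }
        }
        return sequence.toString();
    }
}
```

For a file on disk you would call something like FastaSequence.readFirstSequence(new FileReader("genome.fa")), where genome.fa is a placeholder file name. If your file contains several genomes, you would need to collect one string per '>' line instead of stopping at the second one.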
Thank you very much.