How to find a pattern for consensus sequences using Java?
I think the simplest approach is to learn regular expressions and apply them to the sequences you have. Here are two tutorials on using regular expressions in Java:
Java Regex Tutorial
Guide to Regular Expressions in Java
A good book that may also help is Java Regular Expressions: Taming the java.util.regex Engine.
After working through one of these tutorials, you can write the regular expression that describes your consensus sequence and apply it to your sequences.
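As a rough illustration of that last step, here is a minimal sketch that turns a consensus written in IUPAC nucleotide codes into a regular expression and reports where it matches. The class name ConsensusRegex, its methods, the motif TATAWAW, and the test sequence are all made up for the example; only java.util.regex comes from the standard library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of BioJava or any other library.
final class ConsensusRegex {

    // Map one IUPAC nucleotide code to the regex character class it stands for.
    static String classFor(char code) {
        switch (Character.toUpperCase(code)) {
            case 'A': return "A";
            case 'C': return "C";
            case 'G': return "G";
            case 'T': return "T";
            case 'R': return "[AG]";
            case 'Y': return "[CT]";
            case 'S': return "[CG]";
            case 'W': return "[AT]";
            case 'K': return "[GT]";
            case 'M': return "[AC]";
            case 'B': return "[CGT]";
            case 'D': return "[AGT]";
            case 'H': return "[ACT]";
            case 'V': return "[ACG]";
            case 'N': return "[ACGT]";
            default:
                throw new IllegalArgumentException("Not an IUPAC nucleotide code: " + code);
        }
    }

    // Build the regex for a whole consensus string, e.g. TATAWAW -> TATA[AT]A[AT].
    static String toRegex(String consensus) {
        StringBuilder regex = new StringBuilder();
        for (char code : consensus.toCharArray()) {
            regex.append(classFor(code));
        }
        return regex.toString();
    }

    // Return the 0-based start of every match, including overlapping ones.
    static List<Integer> findAll(String consensus, String sequence) {
        Pattern pattern = Pattern.compile(toRegex(consensus), Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(sequence);
        List<Integer> starts = new ArrayList<>();
        int from = 0;
        // Restart the search one base after each hit so overlaps are not skipped.
        while (from <= sequence.length() && matcher.find(from)) {
            starts.add(matcher.start());
            from = matcher.start() + 1;
        }
        return starts;
    }

    public static void main(String[] args) {
        String consensus = "TATAWAW";              // example motif, an assumption
        String sequence = "GGTATAAATGCTATATATCC";  // made-up test sequence
        System.out.println(toRegex(consensus));            // prints TATA[AT]A[AT]
        System.out.println(findAll(consensus, sequence));  // prints [2, 11]
    }
}
```

The search restarts one base after each hit because a plain find() loop skips overlapping matches, which is usually not what you want when scanning for short motifs.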
To read a FASTA file with BioJava (this example uses the legacy BioJava 1.x API):
import java.io.*;
import java.util.*;
import org.biojava.bio.*;
import org.biojava.bio.seq.db.*;
import org.biojava.bio.seq.io.*;
import org.biojava.bio.symbol.*;

public class ReadFasta {

    /**
     * The program takes two args: the first is the file name of the Fasta file.
     * The second is the name of the Alphabet. Acceptable names are DNA RNA or PROTEIN.
     */
    public static void main(String[] args) {
        try {
            // set up file input
            String filename = args[0];
            BufferedInputStream is =
                new BufferedInputStream(new FileInputStream(filename));

            // get the appropriate Alphabet
            Alphabet alpha = AlphabetManager.alphabetForName(args[1]);

            // get a SequenceDB of all sequences in the file
            SequenceDB db = SeqIOTools.readFasta(is, alpha);
        } catch (BioException ex) {
            // not in fasta format or wrong alphabet
            ex.printStackTrace();
        } catch (NoSuchElementException ex) {
            // no fasta sequences in the file
            ex.printStackTrace();
        } catch (FileNotFoundException ex) {
            // problem reading file
            ex.printStackTrace();
        }
    }
}
More can be found in the thread Help Me To Read The Fasta File Through Biojava.
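If you would rather not depend on BioJava, a FASTA file can also be read with the standard library alone and then scanned with a regular expression. The following is only a sketch under a few assumptions: the class name FastaMotifScan and its parse method are hypothetical, the records are an inline made-up example instead of a real file, and the motif regex TATA[AT]A[AT] is just a placeholder for your own consensus.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of BioJava or any other library.
final class FastaMotifScan {

    // Collect FASTA records as header -> sequence, keeping file order.
    // Sequence lines belonging to one record are joined and upper-cased.
    static Map<String, String> parse(Iterable<String> lines) {
        Map<String, StringBuilder> buffers = new LinkedHashMap<>();
        StringBuilder current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;                       // ignore blank lines
            }
            if (line.startsWith(">")) {
                current = new StringBuilder();  // a header starts a new record
                buffers.put(line.substring(1).trim(), current);
            } else if (current != null) {
                current.append(line.toUpperCase());
            }                                   // text before the first header is skipped
        }
        Map<String, String> records = new LinkedHashMap<>();
        for (Map.Entry<String, StringBuilder> entry : buffers.entrySet()) {
            records.put(entry.getKey(), entry.getValue().toString());
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up FASTA content standing in for a real file.
        String fasta = ">seq1 demo\nGGTATAAATG\nCTATATATCC\n>seq2 demo\nACGTACGT\n";
        Pattern motif = Pattern.compile("TATA[AT]A[AT]");  // placeholder consensus regex

        Map<String, String> records = parse(Arrays.asList(fasta.split("\n")));
        for (Map.Entry<String, String> record : records.entrySet()) {
            Matcher matcher = motif.matcher(record.getValue());
            // find() reports non-overlapping matches from left to right.
            while (matcher.find()) {
                System.out.println(record.getKey() + "\t" + matcher.start() + "\t" + matcher.group());
            }
        }
    }
}
```

To run this on a real file, pass java.nio.file.Files.readAllLines(java.nio.file.Paths.get(path)) to parse instead of the inline string. Joining the lines of each record before matching matters, because a motif can span a line break in the file.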
Thank you for this helpful answer, sir. I wanted to know whether there is a way to read FASTA files and determine the consensus sequence using regular expressions?