9.1 years ago
mcolic
Which data set on NCBI is the most convenient source of data for running a tool designed for pairwise sequencing?
The tool will be built by a team of three undergraduate students majoring in computer science with a focus on bioinformatics.
What is "pairwise alignment sequencing"? Work on your terminology.
You probably want to benchmark the software you are going to write. For that, you would typically start with a simulation (generating reads from a reference genome), so that you know where each read originally came from. The question then is which genome (ploidy level, complexity, repeat content) and which technology (Illumina, 454, Ion Torrent, PacBio, SOLiD, Oxford Nanopore, and so on).
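To make the simulation idea concrete, here is a minimal sketch of generating reads from a reference while recording each read's true origin. Everything in it is an assumption for illustration: the function name `simulate_reads`, the uniform sampling of start positions, the substitution-only error model, and the random toy reference. Real read simulators model technology-specific error profiles, which this does not attempt.

```python
import random

def simulate_reads(reference, read_length, n_reads, error_rate=0.0, seed=0):
    """Sample reads uniformly from `reference`; return (start, sequence) pairs."""
    rng = random.Random(seed)
    alphabet = "ACGT"
    reads = []
    for _ in range(n_reads):
        # Pick a start so the whole read fits inside the reference.
        start = rng.randrange(len(reference) - read_length + 1)
        bases = list(reference[start:start + read_length])
        # Hypothetical error model: independent substitutions only.
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                bases[i] = rng.choice([b for b in alphabet if b != base])
        reads.append((start, "".join(bases)))
    return reads

# Toy reference generated at random; in practice this would be a real
# sequence loaded from a FASTA file.
toy_rng = random.Random(42)
toy_reference = "".join(toy_rng.choice("ACGT") for _ in range(200))

reads = simulate_reads(toy_reference, read_length=30, n_reads=5, error_rate=0.02)
for start, seq in reads:
    print(start, seq)
```

Because the true start position is stored with each read, a benchmark can compare where the aligner places a read against where it actually came from.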
I don't know how much background you have, but if you want to write a pairwise aligner, it won't work like a standard read mapper such as BWA. It should target shorter reference sequences (not a whole genome but, say, a list of gene sequences). That is actually an underappreciated niche of alignment software, and a tool designed for it can be pretty useful.
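As a rough illustration of what aligning against a list of gene sequences could look like, the sketch below scores a query against each gene with Smith-Waterman local alignment and reports the best hit. The thread does not name an algorithm, so the choice of Smith-Waterman, the linear gap penalty, the scoring values, and the names `local_alignment_score`, `best_gene`, `geneA` and `geneB` are all assumptions. It computes scores only and does not reconstruct the alignment itself.

```python
def local_alignment_score(query, target, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(target) + 1)  # previous row of the DP matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

def best_gene(read, genes):
    """Return (name, score) for the gene with the highest local score."""
    scored = ((name, local_alignment_score(read, seq)) for name, seq in genes.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical toy gene list; real input would come from a FASTA file.
genes = {
    "geneA": "ATGGCGTACGTTAGCCTAGGA",
    "geneB": "ATGTTTCCCGGGAAATTTCCC",
}
read = "GTACGTTAGC"
print(best_gene(read, genes))
```

The full dynamic-programming pass costs time proportional to the product of the two sequence lengths, which is affordable for gene-sized references but is the reason whole-genome mappers such as BWA rely on an index instead.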