Hello all,
We are working with whole-genome sequence data from a Pseudomonas fluorescens strain. The de novo assembly was performed with ABySS, and the resulting contigs were submitted to the RAST server for annotation. RAST identified the species most closely related to our strain. When we aligned our strain against that related strain using Bowtie2, it reported a 71% overall alignment rate. I would like to know whether this alignment rate is good or not.
I feel the alignment rate should be somewhat higher between two strains of the same species. Apologies if this is a naive question; we are new to NGS data analysis.
Also, could someone suggest a tool for finding a reference genome other than BLAST?
Thanks in advance
Regards
Ravisankar
Thanks for the quick reply. Will try it soon.
Also, can someone help with my second question: is there a tool for finding a reference genome other than BLAST?
I repeated the alignment with the -N 1 option and it now gives a 77% overall alignment rate. Can we infer from this result that the two strains are quite different from each other?
Thanks
To check, you can extract the roughly 30% of reads that do not map and BLAST them to see whether contamination is a possibility. You can also explore FastQ Screen and DeconSeq for contamination detection. UCSC BLAT is an alternative to BLAST.
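In case it helps with pulling out the unmapped reads, here is a minimal Python sketch, assuming the aligner output is in uncompressed SAM format. It keeps only records whose FLAG has the 0x4 ("segment unmapped") bit set and rewrites them as FASTQ so they can be fed to BLAST or a contamination screen. The function name and the three example records are made up for illustration; they are not from the poster's data.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_reads_to_fastq(sam_lines):
    """Yield one FASTQ record (as a string) per unmapped read in SAM text."""
    for line in sam_lines:
        if line.startswith("@"):  # SAM header lines start with '@'
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # skip blank or malformed lines
            continue
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if flag & UNMAPPED_FLAG:
            yield f"@{name}\n{seq}\n+\n{qual}\n"


# Hypothetical SAM content: one header line, one mapped read, one unmapped read.
example_sam = [
    "@HD\tVN:1.0\tSO:unsorted\n",
    "read1\t0\tcontig_1\t100\t42\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\n",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCGA\tIIIIIIII\n",
]

if __name__ == "__main__":
    print("".join(unmapped_reads_to_fastq(example_sam)), end="")
```

With a real file you would pass an open file handle instead of the example list, for instance `unmapped_reads_to_fastq(open("aln.sam"))`, where `aln.sam` is a placeholder name.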
Another thing to try is a different aligner, such as BWA-MEM, to see whether the alignment rate improves.
Also, are you getting any multi-hits, i.e. one read mapping to multiple locations? Can you post the alignment statistics you are getting?
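To make the multi-hit question concrete, here is a rough Python sketch that reads the summary Bowtie2 prints at the end of a run and reports how many reads aligned zero times, exactly once, or more than once. It assumes an unpaired (single-end) run; a paired-end summary uses different wording ("aligned concordantly ...") and would need extra patterns. The function names and the numbers in the example summary are invented for illustration and are not the poster's statistics.

```python
import re

# Patterns for the unpaired Bowtie2 summary (assumption: single-end reads).
PATTERNS = {
    "total": r"^\s*(\d+) reads; of these:",
    "unaligned": r"^\s*(\d+) \([\d.]+%\) aligned 0 times",
    "unique": r"^\s*(\d+) \([\d.]+%\) aligned exactly 1 time",
    "multi": r"^\s*(\d+) \([\d.]+%\) aligned >1 times",
}


def parse_bowtie2_summary(text):
    """Return read counts and the overall rate found in a Bowtie2 summary."""
    stats = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text, flags=re.MULTILINE)
        if match:
            stats[key] = int(match.group(1))
    rate = re.search(r"^([\d.]+)% overall alignment rate", text, flags=re.MULTILINE)
    if rate:
        stats["overall_rate_percent"] = float(rate.group(1))
    return stats


def multi_hit_fraction(stats):
    """Fraction of all reads that aligned to more than one location."""
    return stats["multi"] / stats["total"]


# Hypothetical summary with made-up counts.
example_summary = """10000 reads; of these:
  10000 (100.00%) were unpaired; of these:
    2900 (29.00%) aligned 0 times
    6300 (63.00%) aligned exactly 1 time
    800 (8.00%) aligned >1 times
71.00% overall alignment rate
"""

if __name__ == "__main__":
    parsed = parse_bowtie2_summary(example_summary)
    print(parsed)
    print(f"multi-hit fraction: {multi_hit_fraction(parsed):.2%}")
```

A high multi-hit fraction would point towards repeats or duplicated regions, whereas a large "aligned 0 times" count is the part worth checking for divergence or contamination.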
If everything seems fine, you could assemble your own reference genome and realign the reads to it to see what percentage of reads aligns. Then compare that assembled genome with the genome you are comparing against to check whether the two are indeed different.