Hello everyone! I am having trouble mapping reads to an extremely small reference using bowtie/bowtie2. I have a large amount of sequencing data (PE150), and I am only interested in a 20 bp region of each 150 bp read, but I don't know the exact position of that region within each read because of mutations and indels. I also have an expected library containing all possible sequences of the 20 bp region, so I want to build a reference from this library and then map the reads to it. Even though I have tried many combinations of parameters, I could not find the matching 20 bp sequences, as the reference is too small. If you have any suggestions, please let me know. Thanks a lot.
Hi, if I understood correctly, you have 150-base reads and you want to map them against multiple 20-base references? If so, I ran into the same problem with bwa: if I remember correctly, it cannot align reads that are longer than the reference. The solutions that worked for me in this case were:
1 - BLAST the sequences (it is more accurate for short sequences)
2 - Or swap your sequences (use the reads as the reference and the 20-base sequences as the reads; the problem in this case is that the sequencing quality of the reads is not used)
3 - Or add Ns to both sides of your 20-base sequences so that each reference is longer than your reads, or longer than the estimated fragment size before sequencing if you are interested in PE mapping.
I hope this helps,
Best
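A minimal Python sketch of option 3 (padding each 20-base sequence with Ns on both sides). The function names, the `target_1` record, and the flank of 130 are illustrative assumptions, not something taken from a real pipeline; choose the flank so that the padded reference is longer than your reads.

```python
from typing import Dict


def pad_with_n(seq: str, flank: int) -> str:
    """Return seq with `flank` N characters added on each side."""
    return "N" * flank + seq.upper() + "N" * flank


def padded_fasta(targets: Dict[str, str], flank: int) -> str:
    """Build FASTA text in which every target is padded with N flanks."""
    lines = []
    for name, seq in targets.items():
        lines.append(f">{name}")
        lines.append(pad_with_n(seq, flank))
    return "\n".join(lines) + "\n"


# Hypothetical 20-base target; a flank of 130 gives a 280-base reference.
targets = {"target_1": "ACGTACGTACGTACGTACGT"}
fasta_text = padded_fasta(targets, flank=130)
print(len(fasta_text.splitlines()[1]))
```

The resulting text can be written to a file and indexed with the aligner of your choice. Whether the aligner accepts alignments that overlap this many Ns depends on its scoring settings.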
Thanks for your reply. It seems that bowtie2 is not suited to aligning longer reads to a short reference. I have used BLAST to search for the aligned sequences, but this approach does not take the sequence quality into account. Furthermore, I don't understand how to carry out the second strategy; do you mean that it also does not take base quality into account? As for the third approach, I may not get any results if I add more than 130 Ns to both sides of my 20 bp sequence, as the alignment score is too low to reach the minimum score threshold (--score-min).
Best
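To make the scoring argument above concrete, here is a rough back-of-the-envelope check in Python. The default values used (`--score-min L,-0.6,-0.6`, `--np 1` and `--n-ceil L,0,0.15` in end-to-end mode; `--score-min G,20,8` and `--ma 2` in local mode) are what I recall from the bowtie2 manual and should be treated as assumptions; please verify them against your installed version.

```python
import math


def linear(const: float, coeff: float, read_len: int) -> float:
    """bowtie2-style 'L,const,coeff' function: const + coeff * read_len."""
    return const + coeff * read_len


def natural_log(const: float, coeff: float, read_len: int) -> float:
    """bowtie2-style 'G,const,coeff' function: const + coeff * ln(read_len)."""
    return const + coeff * math.log(read_len)


READ_LEN = 150    # PE150 reads
TARGET_LEN = 20   # length of the region of interest

# End-to-end mode (assumed defaults): matches score 0, each N position costs 1.
min_score_e2e = linear(-0.6, -0.6, READ_LEN)
n_positions = READ_LEN - TARGET_LEN
score_over_n = -1 * n_positions  # best case: 20 perfect matches plus 130 Ns

# Assumed default ceiling on ambiguous positions in an alignment.
n_ceil = linear(0, 0.15, READ_LEN)

# Local mode (assumed defaults): each match earns 2, flanks are soft-clipped.
min_score_local = natural_log(20, 8, READ_LEN)
best_local = 2 * TARGET_LEN

print("end-to-end:", score_over_n, "vs threshold", round(min_score_e2e, 2))
print("N positions:", n_positions, "vs ceiling", n_ceil)
print("local:", best_local, "vs threshold", round(min_score_local, 2))
```

Under these assumptions, even a perfect 20 bp hit falls below the threshold in both modes, so the threshold would have to be lowered explicitly before either N padding or local alignment can report anything.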
For the second solution, if you use your 150-base reads as the reference, their base quality will not be used during mapping (quality is only used for the 20-base sequences, if you have sequencing quality for them). Maybe you can filter the reads and keep only good-quality 150-base reads as the reference. As for the third solution, it worked for me with Ns and bwa, but with low-stringency parameters (a low score threshold to capture indels). By the way, if I remember correctly, when I did this kind of alignment I chose bwa rather than bowtie because of the seed length used to start the alignment, which is 20 bases by default for bowtie versus 4 for bwa.
Best
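A minimal Python sketch of the read-filtering idea for the second solution: keep only full-length, good-quality reads and write them out as a FASTA reference. It assumes plain 4-line FASTQ records with Phred+33 quality encoding; the function names and the thresholds (mean quality 30, length 150) are hypothetical and only for illustration.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from 4-line FASTQ records."""
    while True:
        header = handle.readline().strip()
        if not header:
            break
        seq = handle.readline().strip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().strip()
        yield header[1:].split()[0], seq, qual


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def fastq_to_fasta(fastq: TextIO, fasta: TextIO,
                   min_mean_q: float = 30.0, min_len: int = 150) -> int:
    """Write reads passing both filters as FASTA; return how many were kept."""
    kept = 0
    for name, seq, qual in read_fastq(fastq):
        if len(seq) >= min_len and mean_phred(qual) >= min_mean_q:
            fasta.write(f">{name}\n{seq}\n")
            kept += 1
    return kept


# Tiny in-memory demo with 4-base reads, so min_len is lowered to 4 here.
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n####\n")
out = io.StringIO()
print(fastq_to_fasta(demo, out, min_mean_q=30, min_len=4))
```

With real data you would pass open file handles instead of `io.StringIO`, index the resulting FASTA, and map the 20-base library against it as reads. Note that a reference built from a large number of reads can become very big, so some subsampling may be needed.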