15 months ago
cheesefish21
Hey there!
I am currently trying to find similarity between (or align) a .fasta file and a reference .bam file that contains the unaligned reads extracted from whole-genome sequencing data (also a .bam file).
I have tried bowtie2, BWA, and BLAST+, but all of them reported zero matches.
On further investigation, I realized that my FASTA file consists of only one sequence, which may be why the bowtie2 alignment did not work.
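A quick way to confirm how many sequences a FASTA file holds is to count its header lines. The sketch below is only an illustration of that check; the function name and the file name in the usage note are hypothetical and do not come from the original post.

```python
def count_fasta_records(path):
    """Count sequence records in a FASTA file by counting '>' header lines."""
    count = 0
    with open(path) as handle:
        for line in handle:
            # Every FASTA record starts with exactly one header line.
            if line.startswith(">"):
                count += 1
    return count
```

For example, `count_fasta_records("query.fasta")` returning 1 would match the single-sequence file described above.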
Thank you for your kind help and suggestions!
It would help if you shared the commands you are using to run the alignment. Also, what are you using as the database? A .bam file?
I am using a .fasta file as the database, as required. I produce it by converting my .bam file to FASTQ and then to FASTA using bedtools or samtools.
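The FASTQ-to-FASTA step described above can be sketched in plain Python. This is not the poster's actual conversion (that was done with bedtools or samtools). It assumes simple four-line FASTQ records with no wrapped sequences, and the function name is hypothetical.

```python
def fastq_to_fasta(fastq_lines):
    """Yield FASTA lines from an iterable of four-line FASTQ records."""
    record = []
    for line in fastq_lines:
        line = line.rstrip("\n")
        # Ignore blank lines between records (for example a trailing newline).
        if not line and not record:
            continue
        record.append(line)
        if len(record) == 4:
            header, sequence, plus, _quality = record
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            # Keep the read name, swap the '@' marker for '>', drop qualities.
            yield ">" + header[1:]
            yield sequence
            record = []
    if record:
        raise ValueError("truncated FASTQ record at end of input")
```

Because it works on any iterable of lines, it can be fed an open file handle directly, and the output lines can be written out one per line.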
The extraction is done by the simple command:
Then, I index my unalinged_example.bam (using samtools).
Then I create an index for my reference fasta in bowtie2, then I convert my .bam file to a fastq file and run the alignment:
With BLAST+, I do the same but I create a database out of my converted .bam (as mentioned beforehand).
Or with more sensitive specifications:
Thanks for your help
You need to add -task blastn-short when you are searching with short sequences with BLAST+.
Thanks, I have tried it, but it did not work.
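For reference, a BLAST+ search using the suggested -task blastn-short option might be assembled as below. This is a sketch, not the commands actually run in this thread: the file names and database name are hypothetical, and the tabular output format (-outfmt 6) is an assumption chosen for readability. The code only builds and prints the commands; it does not run them.

```python
import shlex


def blast_short_commands(query_fasta, reads_fasta, db_name="reads_db"):
    """Build makeblastdb and blastn argument lists for a short-query search."""
    # Build a nucleotide database from the reads converted to FASTA.
    make_db = ["makeblastdb", "-in", reads_fasta, "-dbtype", "nucl", "-out", db_name]
    # Search the database with the blastn-short task, tabular output.
    search = [
        "blastn", "-task", "blastn-short",
        "-query", query_fasta,
        "-db", db_name,
        "-outfmt", "6",
    ]
    return make_db, search


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for command in blast_short_commands("query.fasta", "unaligned_reads.fasta"):
        print(shlex.join(command))
```

Keeping the commands as argument lists means they can be passed to `subprocess.run` unchanged once the paths are confirmed.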
I think you are on the right path; I don't think aligning your reads with BLAST is a good use of resources. What does the bowtie2 log output look like? You should be seeing something like this:
After the pairing segment, all the values are 0: 0.00% aligned concordantly, etc., and the overall alignment rate is 0.00%.
And it does detect the number of reads correctly?
Yes, it does detect the number of reads correctly, I have checked this.
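The two figures discussed here, the number of reads detected and the overall alignment rate, can be pulled out of a saved bowtie2 log. The sketch below assumes the usual summary that bowtie2 prints to stderr, with a line of the form "N reads; of these:" and a final line of the form "X% overall alignment rate". The function name is hypothetical.

```python
import re


def parse_bowtie2_summary(log_text):
    """Return (total_reads, overall_alignment_rate_percent) from a bowtie2 log."""
    reads = re.search(r"^\s*(\d+) reads; of these:", log_text, re.MULTILINE)
    rate = re.search(r"(\d+(?:\.\d+)?)% overall alignment rate", log_text)
    if reads is None or rate is None:
        raise ValueError("no bowtie2 alignment summary found in the log text")
    return int(reads.group(1)), float(rate.group(1))
```

A result with the expected read count but a rate of 0.0 would match the situation described in this thread: the input is read correctly, yet nothing aligns.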