Question

Single read sequence from a .fasta file cannot be aligned/BLASTN-d to my reference

0

Entering edit mode

15 months ago

cheesefish21 ▴ 10

Hey there!

I am currently trying to discover similarity / align a .fasta file to a reference .bam file, which contains the extracted unaligned reads from whole genome sequenced data (also a .bam file).

I have tried to use bowtie2, BWA, BLAST+ yet all of them gave out 0 as a result.

I have further investigated only to realize that my fasta file indeed is consisting of only one sequence read, so that gives us a possible reason why the bowtie2 alignment method did not work.

Thank you for your kind help and suggestions!

BLAST bowtie2 BWA alignment • 1.3k views

ADD COMMENT • link 15 months ago by cheesefish21 ▴ 10

1

Entering edit mode

It would help if you share the commands that you are trying to use to run the alignment. Also, what are you trying to use as database? a .bam file?

ADD REPLY • link 15 months ago by biofalconch ★ 1.3k

1

Entering edit mode

I am using a .fasta file, as required to be the database. I do this by converting my .bam file into a fastq, then into a fasta file using bedtools/or samtools.

The extraction is done by the simple command:

samtools -b -f 4 example.bam > unaligned_example.bam

Then, I index my unalinged_example.bam (using samtools).

Then I create an index for my reference fasta in bowtie2, then I convert my .bam file to a fastq file and run the alignment:

bowtie2 -x index -U unaligned_example.fastq -S mapped.sam

With BLAST+, I do the same but I create a database out of my converted .bam (as mentioned beforehand).

blastn -query example_ref.fasta -db unaligned_example_db -out alignment_results_sbs.txt -outfmt 6

Or with more sensitive specifications:

blastn -query example_ref.fasta -db unaligned_example_db -out alignment_results.txt -outfmt 6 -num_threads 4 -max_target_seqs 100 -word_size 7 -evalue 1e-7 -reward 1 -penalty -3 -gapopen 5 -gapextend 2

Thanks for your help

ADD REPLY • link updated 15 months ago by GenoMax 147k • written 15 months ago by cheesefish21 ▴ 10

1

Entering edit mode

You need to add -task blastn-short when you are searching with short sequences with blast+.

ADD REPLY • link 15 months ago by GenoMax 147k

0

Entering edit mode

Thanks, I have tried it but it did not work.

ADD REPLY • link 15 months ago by cheesefish21 ▴ 10

0

Entering edit mode

I think you are on the right path, I don't think aligning your reads using blast is a good use of resources. What is the output of bowtie2 in terms of log? you should be seeing something like this:

21404130 reads; of these:
21404130 (100.00%) were paired; of these:

21196512 (99.03%) aligned concordantly 0 times
104527 (0.49%) aligned concordantly exactly 1 time
103091 (0.48%) aligned concordantly >1 times
...

ADD REPLY • link 15 months ago by biofalconch ★ 1.3k

0

Entering edit mode

after the pairing segment all the values are 0, so 0.00% aligned concordantly etc. overall alignment rate 0.00%

ADD REPLY • link 15 months ago by cheesefish21 ▴ 10

0

Entering edit mode

And it does detect the number of reads correctly?

ADD REPLY • link 15 months ago by biofalconch ★ 1.3k

0

Entering edit mode

Yes, it does detect the number of reads correctly, I have checked this.

ADD REPLY • link 15 months ago by cheesefish21 ▴ 10