I am trying to align miRNA sequencing reads (very short reads) to multiple FASTA reference sequences (~150). I want each read to be aligned against every reference sequence separately, with hits reported per reference. To achieve this, I am using the SHRiMP aligner with the following command:
SHRiMP_2_2_3/bin/gmapper-ls -N 2 -o 1 -E input.fasta ../reference.fasta > output.sam
The output contains reads mapped across all the reference sequences, but each read is reported only once. So if read1 maps to my first FASTA reference sequence, no hit is reported for it against any of the other reference sequences.
Is there a way to achieve what I am trying to do?
I have also tried building an index for my reference sequences and aligning with bowtie:
bowtie index input.fastq > output
But this also results in each read aligning only once to the reference.
Is there a parameter I can add to SHRiMP to obtain hits against each individual FASTA reference?
You mean you have one big reference FASTA file containing 150 sequences and you are aligning your reads against it. Because you are using the -o 1 parameter, the aligner reports only the top alignment (the one with the maximum alignment score) for each read. I guess that if a read aligns to several FASTA sequences with equal alignment scores, only one of those alignments is reported because of -o 1.
The best thing to do would be to align the reads to every FASTA sequence individually. You can create 150 reference indices and align the reads to each of them. You can then sort all the reads by query name and write a script that looks up each read in all the SAM files and checks whether it has been mapped to all the reference sequences with equal scores.
I do have 150 reference sequences. I wanted to avoid creating 150 separate references and aligning against each of them. I have tried bowtie with the -a option. I will try aligning separately as well. Thank you.
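For the script step suggested above, here is a minimal Python sketch rather than a tested pipeline. It assumes the alignments are in SAM format and that the aligner writes the standard optional AS:i alignment-score tag; records without that tag are kept but cannot be compared by score. The function names (parse_sam_hits, collect_hits, hits_from_files, reads_with_tied_best_hits) are made up for illustration and are not part of SHRiMP or bowtie. It works the same whether the hits come from one SAM file per reference or from a single SAM file that lists several hits per read.

```python
from collections import defaultdict


def parse_sam_hits(lines):
    """Yield (read_name, reference_name, score) for each mapped SAM record."""
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        read_name, flag, ref_name = fields[0], int(fields[1]), fields[2]
        if flag & 0x4 or ref_name == "*":
            continue  # FLAG bit 0x4 marks an unmapped read
        score = None
        for tag in fields[11:]:
            if tag.startswith("AS:i:"):  # assumption: aligner emits AS:i
                score = int(tag[5:])
                break
        yield read_name, ref_name, score


def collect_hits(records):
    """Map each read name to {reference_name: best score seen}."""
    hits = defaultdict(dict)
    for read, ref, score in records:
        previous = hits[read].get(ref)
        is_better = score is not None and (previous is None or score > previous)
        if ref not in hits[read] or is_better:
            hits[read][ref] = score
    return hits


def hits_from_files(sam_paths):
    """Collect hits from one or more SAM files (e.g. one per reference)."""
    records = []
    for path in sam_paths:
        with open(path) as handle:
            records.extend(parse_sam_hits(handle))
    return collect_hits(records)


def reads_with_tied_best_hits(hits):
    """Return {read: references sharing the top score} for reads with a tie."""
    tied = {}
    for read, per_ref in hits.items():
        scored = {ref: s for ref, s in per_ref.items() if s is not None}
        if len(scored) < 2:
            continue  # fewer than two scored references, so no tie possible
        best = max(scored.values())
        top_refs = sorted(ref for ref, s in scored.items() if s == best)
        if len(top_refs) > 1:
            tied[read] = top_refs
    return tied
```

With per-reference output files, something like hits_from_files(["ref001.sam", "ref002.sam"]) (hypothetical file names) gives a per-read table of references and scores, and reads_with_tied_best_hits on that table lists the reads whose best score is shared by more than one reference.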