Dear all,
I have a FASTA file with a list of 25-mers and I am trying to align it to the reference genome ref.fa using bowtie2.
I ran: bowtie2 -x ref -f ref_25mers.fa -S ref_25mers.sam
but it gives this result:
1 reads; of these:
1 (100.00%) were unpaired; of these:
1 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
This should not happen, because the 25-mers come from the reference genome itself. Does anyone know the correct way to align these 25-mers to the reference genome?
Thank you so much.
ref_25mers.fa:
>ref_25mers
AGCCTCAGGAAGGAGGCAGTGCTGC
GCCTCAGGAAGGAGGCAGTGCTGCC
CCTCAGGAAGGAGGCAGTGCTGCCA
CTCAGGAAGGAGGCAGTGCTGCCAG
TCAGGAAGGAGGCAGTGCTGCCAGC
CAGGAAGGAGGCAGTGCTGCCAGCC
AGGAAGGAGGCAGTGCTGCCAGCCC
GGAAGGAGGCAGTGCTGCCAGCCCT
GAAGGAGGCAGTGCTGCCAGCCCTT
AAGGAGGCAGTGCTGCCAGCCCTTG
AGGAGGCAGTGCTGCCAGCCCTTGG
GGAGGCAGTGCTGCCAGCCCTTGGG
GAGGCAGTGCTGCCAGCCCTTGGGG
AGGCAGTGCTGCCAGCCCTTGGGGA
GGCAGTGCTGCCAGCCCTTGGGGAC
GCAGTGCTGCCAGCCCTTGGGGACA
CAGTGCTGCCAGCCCTTGGGGACAA
AGTGCTGCCAGCCCTTGGGGACAAC
GTGCTGCCAGCCCTTGGGGACAACA
TGCTGCCAGCCCTTGGGGACAACAG
GCTGCCAGCCCTTGGGGACAACAGC
CTGCCAGCCCTTGGGGACAACAGCC
TGCCAGCCCTTGGGGACAACAGCCT
GCCAGCCCTTGGGGACAACAGCCTG
CCAGCCCTTGGGGACAACAGCCTGT
CAGCCCTTGGGGACAACAGCCTGTC
AGCCCTTGGGGACAACAGCCTGTCC
GCCCTTGGGGACAACAGCCTGTCCC
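This is not a multi-FASTA file of 25 bp k-mers; it is just ONE big FASTA record whose sequence is wrapped at 25 characters per line. bowtie2 therefore reads all the lines as a single concatenated sequence, which is why it reports "1 reads". For each k-mer to be aligned separately, each one needs its own ">" header line.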
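As a rough illustration of that fix, here is a minimal Python sketch that turns a one-header file like the one above into a multi-FASTA with one record per line. The function name `split_kmers` and the `kmer_1`, `kmer_2`, ... record names are made up for this example, and it assumes every non-header line is one complete k-mer.

```python
def split_kmers(fasta_text, prefix="kmer"):
    """Give every sequence line its own FASTA header.

    Assumes each non-header line is one complete k-mer
    (hypothetical helper, not part of bowtie2).
    """
    records = []
    for line in fasta_text.splitlines():
        line = line.strip()
        # Skip blank lines and the single original header.
        if not line or line.startswith(">"):
            continue
        records.append(f">{prefix}_{len(records) + 1}\n{line}\n")
    return "".join(records)


example = """>ref_25mers
AGCCTCAGGAAGGAGGCAGTGCTGC
GCCTCAGGAAGGAGGCAGTGCTGCC
"""

multi_fasta = split_kmers(example)
print(multi_fasta, end="")
```

Writing the returned text to a new file and passing that file to the same bowtie2 command should make each 25-mer count as its own read.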
If you just want the locations of perfect matches, you could "just" search for each of those k-mers in the reference using a Boyer-Moore algorithm.
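A minimal sketch of that idea in Python, using the Horspool simplification of Boyer-Moore (bad-character shifts only) rather than the full algorithm. The function name `horspool_find_all` and the toy sequences are assumptions for illustration; it searches the forward strand only, so reverse-complement hits would need a second pass.

```python
def horspool_find_all(text, pattern):
    """Return 0-based start positions of every exact match of pattern in text.

    Boyer-Moore-Horspool: after each comparison the window is shifted
    according to the text character under the window's last position.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Shift for each character = distance from its last occurrence
    # (ignoring the final pattern position) to the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Characters absent from the table allow a full-length shift.
        pos += shift.get(text[pos + m - 1], m)
    return hits


# Toy reference and k-mer, not real genome data.
reference = "ACGTACGTAC"
positions = horspool_find_all(reference, "ACGT")
print(positions)
```

For a whole genome and many k-mers, an indexed aligner is likely to be much faster; this only shows how the exact-match search works.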
You probably want ungapped alignments, so use bowtie v.1.x instead of bowtie v.2.x.