I built an index from the following FASTA file:
INDEX FASTA (example):
>miRNA:mmu-mir-23b MI0000141 Mus musculus miR-23b stem-loop
GGCTGCTTGGGTTCCTGGCATGCTGATTTGTGACTTGAGATTAAAATCACATTGCCAGGG
ATTACCACGCAACC
>miRNA:mmu-mir-27b MI0000142 Mus musculus miR-27b stem-loop
AGGTGCAGAGCTTAGCTGATTGGTGAACAGTGATTGGTTTCCGCTTTGTTCACAGTGGCT
AAGTTCTGCACCT
I then took my FASTQ data (example):
@K00252:57:HGFMMBBXX:1:1101:4320:1209 1:N:0:AGTCAA
NCCCTGTAGATCCGAATTTGTG
+
#AAFFJJJJJJJJJJJJJJJJJ
@K00252:57:HGFMMBBXX:1:1101:5132:1209 1:N:0:AGTCAA
NAACGGAATCCCAAAAGCAGCTG
+
#AAFFJJJJJJJJJJJJJJJJJJ
I used bowtie2 to align the reads, but ended up with a bizarre SAM file that was reported as having duplicated entries.
OUTPUT SAM looks like this:
@HD VN:1.0 SO:unsorted
@SQ SN:CONTAMINATION:ADAPTER:adapters_contam1 LN:100
@SQ SN:miRNA:mmu-let-7g LN:88
@SQ SN:miRNA:mmu-let-7i LN:85
@SQ SN:miRNA:mmu-mir-1a-1 LN:77
@SQ SN:miRNA:mmu-mir-15b LN:64
@SQ SN:miRNA:mmu-mir-23b LN:74
@SQ SN:miRNA:mmu-mir-27b LN:73
@SQ SN:miRNA:mmu-mir-29b-1 LN:71
@SQ SN:miRNA:mmu-mir-30a LN:71
@SQ SN:miRNA:mmu-mir-30b LN:96
Any ideas what I might be doing wrong?
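One thing that may be worth ruling out first is whether the SAM header itself contains the same reference name twice, since a repeated `@SQ` name is one possible source of a "duplicated entries" complaint. The sketch below is only an illustration: the function name `duplicated_sq_names` is made up here, and the repeated `@SQ` line in the sample header is a hypothetical added to show what a hit looks like (the header excerpt posted above shows no repeats).

```python
from collections import Counter


def duplicated_sq_names(sam_lines):
    """Return reference names that appear in more than one @SQ header line."""
    names = []
    for line in sam_lines:
        if not line.startswith("@"):
            break  # the header ends at the first alignment record
        if line.startswith("@SQ"):
            # Split on any whitespace so tab- or space-separated headers both work.
            for field in line.split()[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Toy header; the last line is a hypothetical duplicate for illustration only.
header = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:miRNA:mmu-mir-23b\tLN:74",
    "@SQ\tSN:miRNA:mmu-mir-27b\tLN:73",
    "@SQ\tSN:miRNA:mmu-mir-23b\tLN:74",
]
print(duplicated_sq_names(header))  # ['miRNA:mmu-mir-23b']
```

To run it on a real file, pass the open file handle instead of the toy list, e.g. `duplicated_sq_names(open("out.sam"))`, where `out.sam` stands in for whatever the output file is called. If a name does come back, the FASTA used to build the index probably contains two records whose names are identical up to the first whitespace.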
I wonder if this has to do with mapping small RNAs which are likely to map to multiple regions?
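A rough way to test that hunch is to count how many distinct references each read name is reported against in the SAM body. This is a sketch under assumptions: `reads_with_multiple_hits` is a name invented for this example, and the records below are toy data, not taken from the original output. Note also that bowtie2 in its default mode reports a single alignment per read, so multiple records per read would normally only appear if it was run with `-k` or `-a`.

```python
from collections import defaultdict


def reads_with_multiple_hits(sam_lines):
    """Map read name -> sorted reference names, for reads aligned to >1 reference."""
    hits = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        if flag & 0x4:
            continue  # FLAG bit 0x4 means the read is unmapped
        hits[qname].add(rname)
    return {q: sorted(r) for q, r in hits.items() if len(r) > 1}


# Toy alignment records (hypothetical), tab-separated as in a real SAM file.
records = [
    "read1\t0\tmiRNA:mmu-mir-23b\t10\t1\t22M\t*\t0\t0\t*\t*",
    "read1\t256\tmiRNA:mmu-mir-27b\t12\t1\t22M\t*\t0\t0\t*\t*",
    "read2\t0\tmiRNA:mmu-let-7g\t5\t42\t22M\t*\t0\t0\t*\t*",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
print(reads_with_multiple_hits(records))
# {'read1': ['miRNA:mmu-mir-23b', 'miRNA:mmu-mir-27b']}
```

If this comes back empty on the real file, multi-mapping is unlikely to be what produces the duplicate report.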
You should use bowtie v.1 if you are mapping small RNAs and want ungapped alignments.