Aligning Reads From Small RNA-Seq Onto Human Genome
12.7 years ago
Leszek 4.2k

I'm working with human small RNA-Seq data (50 bp, single end). Surprisingly, I'm able to align only a tiny fraction of the reads (<1% with 3 mismatches, <5% with 6 mismatches). I have tested 2 aligners (bowtie and gem-mapper) and got similar results. Do you have any idea why that is?

next-gen sequencing rna short aligner alignment small • 8.8k views

Have you tried running FastQC on the reads? What do the qualities look like?


Is it possible that you are missing something really simple? What are bowtie's default options for mapping RNA-seq reads onto a reference genome? Are you mapping to a reference genome or an exome? You may want to try TopHat, which runs Bowtie first for short-read alignment but can also align reads across splice junctions.


@DK: I trimmed each read at the first base with quality <20 and discarded reads shorter than 31 bases. @Dan: I'm mapping onto hg18 (bowtie --sam --all -n3 -l21), optionally hard clipping with -5 10 or -3 10.


Show us a sequence you think should have aligned.

12.7 years ago

I'm sure you forgot to trim the 3' adapter! Try trimmomatic, fastx_clipper or cutadapt.


In small RNA-seq, most of your reads come from inserts of around 22-24 nt (the miRNA fraction). Without clipping the adapter sequences, there is no way to map most of the reads in your experiment. I would recommend clipping the whole adapter you used (not just 10 nt) and then looking at your length distribution. You should see a peak at 22-24 nt; if not, there is something wrong with your experiment.
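The length check described above can be sketched in a few lines of Python. This is only an illustration: it assumes the adapter has already been clipped by one of the tools named in this thread, that the input is plain 4-line FASTQ, and the function names `read_lengths` and `print_histogram` are hypothetical helpers, not part of any existing package.

```python
from collections import Counter
from typing import Iterable


def read_lengths(fastq_lines: Iterable[str]) -> Counter:
    """Count read lengths in 4-line FASTQ text (assumed already adapter-clipped)."""
    lengths: Counter = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the second line of every record holds the sequence
            lengths[len(line.strip())] += 1
    return lengths


def print_histogram(lengths: Counter, width: int = 40) -> None:
    """Print one text bar per observed read length, scaled to `width` characters."""
    total = sum(lengths.values())
    for n in sorted(lengths):
        bar = "#" * round(width * lengths[n] / total)
        print(f"{n:3d} nt {lengths[n]:8d} {bar}")


# Toy records for illustration only (not real data).
toy = [
    "@r1", "TGAGGTAGTAGGTTGTATAGTT", "+", "I" * 22,
    "@r2", "A" * 23, "+", "I" * 23,
    "@r3", "C" * 22, "+", "I" * 22,
]
print_histogram(read_lengths(toy))
```

For a real file, pass an open file handle instead of the toy list, for example `read_lengths(open("clipped.fastq"))`, where the file name is a placeholder.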


And if you don't know your adapter sequence, hard clip a longer stretch of the 3' end, at least 25 nt. Then you should be able to map ~80%, I assume.


By the way: The adaptor for the Illumina HiSeq 2000 miRNA protocol is TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC.

Try this: ./fastx_clipper -i SRR324686.fastq -o SRR324686.clipped.fastq -a TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC -l 15 -M 20 -c
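To make the effect of that command easier to follow, here is a rough Python sketch of 3' adapter clipping on FASTQ records. It is loosely modelled on the options used above (keep inserts of at least 15 nt, require at least 20 adapter bases, drop reads with no adapter), but it uses exact string matching only, which is a simplifying assumption and not how fastx_clipper itself aligns the adapter. The function names `find_adapter_start` and `clip_fastq` are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple

ADAPTER = "TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC"  # adapter quoted in this thread


def find_adapter_start(seq: str, adapter: str = ADAPTER,
                       min_adapter: int = 20) -> Optional[int]:
    """Return the index where the 3' adapter starts, or None if it is not found.

    Exact matching only (an assumption of this sketch): either the full
    adapter occurs inside the read, or the read ends with an adapter prefix
    of at least `min_adapter` bases.
    """
    idx = seq.find(adapter)
    if idx != -1:
        return idx
    # Try the longest adapter prefix first, so the earliest start wins.
    for k in range(min(len(adapter), len(seq)), min_adapter - 1, -1):
        if seq.endswith(adapter[:k]):
            return len(seq) - k
    return None


def clip_fastq(lines: Iterable[str], adapter: str = ADAPTER, min_len: int = 15,
               min_adapter: int = 20) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) for reads that keep >= min_len bases.

    Reads without a detectable adapter are dropped, and the quality string is
    cut at the same position as the sequence so the two stay in sync.
    """
    it = iter(lines)
    for header, seq, _plus, qual in zip(it, it, it, it):  # 4 lines per record
        seq, qual = seq.strip(), qual.strip()
        cut = find_adapter_start(seq, adapter, min_adapter)
        if cut is None or cut < min_len:
            continue
        yield header.strip(), seq[:cut], qual[:cut]
```

A 50 nt read made of a 22 nt insert followed by the first 28 adapter bases would come out as the 22 nt insert alone. Because real reads contain sequencing errors in the adapter, a dedicated tool that tolerates mismatches remains the better choice for actual data.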


I ran it as you said, but many of the trimmed miRNA reads are still much longer than 20 nt. What should I do next?


Also, the adaptor for the Illumina HiSeq 2000 miRNA protocol is given as TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC, but when I search for it in miRBase, some mature miRNA accessions come up. How can I tell whether such sequences are adapters or mature miRNA sequences? And why are there different Illumina adaptors, such as TCGTATGCCGTCTTCTGCTTGT (see "A: Problems with analysis of small RNAseq data - Adapter trimming")?


I don't think so. When I hard clip 10 bp from the 3' end, only 11% of reads align; clipping 10 bp from the 5' end gives 3% of reads aligned.



12.7 years ago
Darked89 4.7k

Check the size of the inserts. You may be running into adapter sequences. Mapping with LAST will give you all unique mappings, assuming your inserts are long enough. Also, mapping to miRBase makes more sense when you have 21 bp inserts.

12.7 years ago
Vikas Bansal ★ 2.4k

If the problem really is caused by adapters, then I recommend MicroRazerS (paper: "MicroRazerS: rapid alignment of small RNA reads"). Reads can be of arbitrary length and can contain adapter sequence at the 3' end. You can find the rest of the information here.
