Question

Aligning Reads From Small RNA-Seq Onto Human Genome

2

12.9 years ago

Leszek 4.2k

I'm working with human small RNA-Seq (50 bp, single end). Surprisingly, I'm able to align only a tiny fraction of reads (<1% with 3 mismatches, <5% with 6 mismatches). I have tested 2 aligners (bowtie and gem-mapper) and got similar results. Do you have any idea why that is?

next-gen sequencing rna short aligner alignment small • 8.9k views

ADD COMMENT • link updated 12.9 years ago by Vikas Bansal ★ 2.4k • written 12.9 years ago by Leszek 4.2k

0

Have you tried running FastQC on the reads? What do the qualities look like?

ADD REPLY • link 12.9 years ago by Damian Kao 16k

0

Is it possible that you are missing something really simple? What are bowtie's default options for mapping RNA-seq reads onto a reference genome? Are you mapping to a reference genome or an exome? You may want to try TopHat, which runs Bowtie first for short-read alignment but can also align reads across splice junctions.

ADD REPLY • link 12.9 years ago by DG 7.3k

0

@DK: I trimmed reads at the first base with quality <20 and discarded reads shorter than 31 bases. @Dan: I'm mapping onto hg18 (bowtie --sam --all -n3 -l21), optionally hard clipping with -5 10 or -3 10.
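
For readers following along, the trimming rule described in this comment can be sketched roughly as below. This is only an illustration, not the pipeline actually used: the function name is hypothetical and Phred+33 quality encoding is an assumption.

```python
def trim_at_first_low_quality(seq, qual, min_q=20, min_len=31, offset=33):
    """Cut a read at the first base whose Phred quality is below min_q.

    Returns the trimmed (seq, qual) pair, or None if fewer than min_len
    bases remain. Phred+33 encoding is assumed (offset=33).
    """
    cut = len(seq)
    for i, ch in enumerate(qual):
        if ord(ch) - offset < min_q:
            cut = i  # first low-quality base: keep everything before it
            break
    if cut < min_len:
        return None  # read is too short after trimming, discard it
    return seq[:cut], qual[:cut]
```

Note that a 50-base read with good quality throughout passes this filter untouched, so any adapter sequence following a short insert stays in the read.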

ADD REPLY • link 12.9 years ago by Leszek 4.2k

0

Show us a sequence you think should have aligned.

ADD REPLY • link 12.9 years ago by Jeremy Leipzig 23k

score 2 · Answer 1 · 2012-02-27

2

12.9 years ago

Nicolas Rosewick 11k

I'm sure you forgot to trim the 3' adapter! Try Trimmomatic, fastx_clipper or cutadapt.

ADD COMMENT • link 12.9 years ago by Nicolas Rosewick 11k

1

When doing small RNA-seq, the main fraction of your reads is around 22-24 nt in length (the miRNA fraction). Without clipping the adapter sequences, there is no way to map most of the reads of your experiment. I would recommend clipping the adapter that was actually used (not just 10 nt) and then looking at the length distribution. You should see a peak at 22-24 nt. If not, there is something wrong with your experiment.
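
A minimal sketch of the length-distribution check, assuming the reads have already been adapter-clipped and loaded as plain strings. The function names and the toy reads are made up for illustration.

```python
from collections import Counter


def length_distribution(reads):
    """Count how many reads have each length."""
    return Counter(len(read) for read in reads)


def fraction_in_range(reads, lo=22, hi=24):
    """Fraction of reads whose length is between lo and hi, inclusive."""
    if not reads:
        return 0.0
    hits = sum(1 for read in reads if lo <= len(read) <= hi)
    return hits / len(reads)


# Toy clipped reads (made up): lengths 22, 23, 22 and 30
clipped = ["A" * 22, "C" * 23, "G" * 22, "T" * 30]
dist = length_distribution(clipped)
for length in sorted(dist):
    print(length, "#" * dist[length])  # crude text histogram
```

On real data, a low value from `fraction_in_range` after clipping would be the warning sign mentioned above.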

ADD REPLY • link 12.9 years ago by David Langenberger 11k

1

And if you don't know your adapter sequence, hard clip a longer stretch of the 3' end, at least 25 nt. Then you should be able to map ~80%, I assume.

ADD REPLY • link 12.9 years ago by David Langenberger 11k

1

By the way: The adaptor for the Illumina HiSeq 2000 miRNA protocol is TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC.

Try this: ./fastx_clipper -i SRR324686.fastq -o SRR324686.clipped.fastq -a TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC -l 15 -M 20 -c
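
To make clear what such a clipping step does to a read, here is a rough Python sketch. It is not how fastx_clipper is implemented: it uses exact matching only (real clippers usually tolerate mismatches), and the parameters `min_len`, `min_adapter_match` and `require_adapter` are hypothetical names that only loosely mirror my understanding of `-l 15`, `-M 20` and `-c`.

```python
ADAPTER = "TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC"


def clip_adapter(read, adapter=ADAPTER, min_adapter_match=20,
                 min_len=15, require_adapter=True):
    """Remove a 3' adapter (or a prefix of it running off the read end).

    Returns the clipped insert, or None if the read is discarded.
    Exact matching only; this is a simplification.
    """
    cut = None
    for start in range(len(read)):
        # Number of adapter bases that fit between 'start' and the read end
        overlap = min(len(adapter), len(read) - start)
        if overlap < min_adapter_match:
            break  # the remaining tail is too short to count as adapter
        if read[start:start + overlap] == adapter[:overlap]:
            cut = start
            break
    if cut is None:
        # No adapter found: drop the read or keep it unclipped
        return None if require_adapter else read
    insert = read[:cut]
    return insert if len(insert) >= min_len else None


# A 22-nt toy insert followed by the first 28 adapter bases (50-nt read)
toy_insert = "TGAGGTAGTAGGTTGTATAGTT"
toy_read = toy_insert + ADAPTER[:28]
print(clip_adapter(toy_read))
```

A 50-nt read from a ~22 nt insert carries about 28 nt of adapter, which is why hard clipping only 10 nt leaves most reads unmappable.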

ADD REPLY • link 12.9 years ago by David Langenberger 11k

0

I ran it as you said, but many of the trimmed miRNA reads are still too long, far from 20 nt. What should I do next?

ADD REPLY • link 11.1 years ago by xiaojuhu13 ▴ 150

0

Also, when I search miRBase for the Illumina HiSeq 2000 miRNA protocol adaptor TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC, some mature miRNA accessions come up. How can I tell whether such sequences are adapters or mature miRNAs? And why are there different Illumina adaptors, such as TCGTATGCCGTCTTCTGCTTGT (A: Problems with analysis of small RNAseq data - Adapter trimming)?

ADD REPLY • link 11.1 years ago by xiaojuhu13 ▴ 150

0

I don't think so: when I hard clip 10 bp from the 3' end, only 11% of reads align; clipping 10 bp from the 5' end gives 3% of reads aligned.

ADD REPLY • link 12.9 years ago by Leszek 4.2k

0

And if you don't know your adapter sequence, hard clip a longer stretch of the 3' end, at least 25 nt. Then you should be able to map ~80%, I assume.

ADD REPLY • link 12.9 years ago by David Langenberger 11k

score 0 · Answer 2 · 2012-02-27

0

12.9 years ago

Darked89 4.7k

Check the size of the inserts. You may be running into adapter sequences. Mapping with LAST will give you all unique mappings, assuming you have long enough inserts. Also, mapping to miRBase makes more sense when you have 21 bp inserts.

ADD COMMENT • link 12.9 years ago by Darked89 4.7k

score 0 · Answer 3 · 2012-03-01

0

12.9 years ago

Vikas Bansal ★ 2.4k

If the problem really is caused by adapters, then I would recommend MicroRazerS (paper: "MicroRazerS: rapid alignment of small RNA reads"). Reads can be of arbitrary length and can contain adapter sequence at the 3' end; you can find the rest of the information here.

ADD COMMENT • link 12.9 years ago by Vikas Bansal ★ 2.4k