Question

RNA-seq mapping rate

0

6.1 years ago

afli ▴ 190

Hi, I have a basic question about RNA-seq analysis. If the read alignment rate is only about 40-50% (from bowtie2, hisat2, or other alignment tools), would it be appropriate to increase the sequencing depth to get enough aligned reads for the analysis? Or would this low alignment rate introduce bias, so that we should abandon these samples? Thank you!

The sample is rice, which has a high-quality reference genome. I used bowtie2 for the alignment; the summary is:

82280146 reads; of these:
  82280146 (100.00%) were paired; of these:
    41474464 (50.41%) aligned concordantly 0 times
    12443024 (15.12%) aligned concordantly exactly 1 time
    28362658 (34.47%) aligned concordantly >1 times
    ----
    41474464 pairs aligned concordantly 0 times; of these:
      1444965 (3.48%) aligned discordantly 1 time
    ----
    40029499 pairs aligned 0 times concordantly or discordantly; of these:
      80058998 mates make up the pairs; of these:
        73562998 (91.89%) aligned 0 times
        414858 (0.52%) aligned exactly 1 time
        6081142 (7.60%) aligned >1 times
55.30% overall alignment rate
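
For readers checking the numbers: the overall rate in the summary above can be reproduced from the individual counts. This is a minimal sketch, assuming the usual bowtie2 paired-end accounting, where every aligned pair (concordant or discordant) contributes two mates and the remaining mates are counted individually. The function and argument names are hypothetical; the counts are copied from the summary.

```python
def overall_alignment_rate(total_pairs, conc_once, conc_multi, disc_once,
                           mates_once, mates_multi):
    """Return the overall alignment rate in percent for paired-end data.

    Assumption: aligned pairs count as two mates each, and mates from
    pairs that did not align as a pair are counted one by one.
    """
    aligned_mates = 2 * (conc_once + conc_multi + disc_once) + mates_once + mates_multi
    return 100.0 * aligned_mates / (2 * total_pairs)


# Counts taken from the bowtie2 summary in the question.
rate = overall_alignment_rate(
    total_pairs=82280146,
    conc_once=12443024,
    conc_multi=28362658,
    disc_once=1444965,
    mates_once=414858,
    mates_multi=6081142,
)
print(f"{rate:.2f}% overall alignment rate")
```

Note that only about 15% of pairs align concordantly exactly once, so the number of uniquely placed pairs is considerably lower than the overall rate suggests.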

The rate is low because the samples are contaminated with some bacterium. I just want to know whether reads like these are suitable for downstream analysis.

RNA-seq • 7.7k views

ADD COMMENT • link updated 6.1 years ago by Devon Ryan 105k • written 6.1 years ago by afli ▴ 190

1

Would you mind adding the hisat2 alignment summary here?

ADD REPLY • link 6.1 years ago by lakhujanivijay 5.9k

0

I've added the information above.

ADD REPLY • link 6.1 years ago by afli ▴ 190

1

It would help to know which species you're working with, since a mapping rate of 40% would be very low for human or mouse but not necessarily for a species that is less well annotated. The tissue you are working with also plays into that evaluation.

ADD REPLY • link 6.1 years ago by Wietje ▴ 240

1

Please be as complete as possible and add information such as:

organism
commands used
alignment summary data
read length
library prep method
...

ADD REPLY • link 6.1 years ago by WouterDeCoster 47k

0

Thank you for your suggestion.

ADD REPLY • link 6.1 years ago by afli ▴ 190

0

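I don't think bowtie2 is a suitable aligner for spliced reads, which I assume rice RNA-seq data contain: bowtie2 is not splice-aware, so reads spanning exon-exon junctions will often fail to align to the genome.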

ADD REPLY • link 6.1 years ago by WouterDeCoster 47k

1

In case of bacterial contamination, you can use e.g. BBSplit to separate out the reads originating from the bacterium. When continuing with the "host" reads, you may want to control for the bacterial influence (either directly on gene expression, or indirectly through distortion of the fragment ratios in the library). You can include it as a factor in your DE model and check it, as Devon suggested, with a PCA or an NLDA.
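
To illustrate what "include it as a factor" could look like, here is a rough sketch that adds the per-sample contamination fraction as a covariate next to the condition of interest. It uses ordinary least squares on log-scale expression purely for illustration; a real DE analysis would normally use a dedicated count-based tool, and the function name, argument names, and example values are all hypothetical.

```python
import numpy as np


def fit_with_contamination(log_expr, condition, contam_fraction):
    """Least-squares fit of log expression ~ intercept + condition + contamination.

    log_expr: per-sample log expression (1-D), or a samples x genes matrix.
    condition: 0/1 group label per sample.
    contam_fraction: fraction of bacterial reads per sample (assumed known,
        e.g. from the read-separation step).
    Returns coefficients in the order [intercept, condition, contamination].
    """
    condition = np.asarray(condition, dtype=float)
    contam_fraction = np.asarray(contam_fraction, dtype=float)
    design = np.column_stack([np.ones_like(condition), condition, contam_fraction])
    coef, *_ = np.linalg.lstsq(design, np.asarray(log_expr, dtype=float), rcond=None)
    return coef


# Made-up example: six samples, three per condition.
condition = [0, 0, 0, 1, 1, 1]
contam = [0.10, 0.40, 0.20, 0.50, 0.30, 0.45]
log_expr = [5 + 2 * c + 10 * f for c, f in zip(condition, contam)]
print(fit_with_contamination(log_expr, condition, contam))
```

The condition coefficient is then estimated with the contamination level held fixed, which is the point of adding the covariate. This only works if contamination varies within each condition; if it is confounded with the condition, the two effects cannot be separated.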

ADD REPLY • link 6.1 years ago by michael.ante ★ 3.9k

1

Do the samples have a sufficient read length, i.e. > 50 bp? In my experience with downloaded data, low mapping rates can be primarily due to short reads (e.g. 36 bp or 25 bp).
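
A quick way to check this is to tabulate read lengths directly from the FASTQ file. The sketch below assumes standard four-line FASTQ records and optionally gzip-compressed input; the function name and the `max_reads` cap are illustrative choices, not part of any tool.

```python
import gzip
from collections import Counter


def read_length_counts(fastq_path, max_reads=100000):
    """Count read lengths among the first max_reads records of a FASTQ file.

    Assumes four lines per record (header, sequence, '+', qualities).
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    lengths = Counter()
    with opener(fastq_path, "rt") as handle:
        for i, line in enumerate(handle):
            if i // 4 >= max_reads:
                break
            if i % 4 == 1:  # the sequence line of each record
                lengths[len(line.strip())] += 1
    return lengths
```

Calling `read_length_counts("sample_R1.fastq.gz")` (a hypothetical file name) returns a mapping from read length to number of reads, which shows at a glance whether most reads fall below 50 bp.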

ADD REPLY • link 6.1 years ago by ATpoint 86k

score 2 · Answer 1 · 2018-11-27

2

6.1 years ago

Devon Ryan 105k

As a rule of thumb, if one of your samples has a much lower alignment rate than the others, you're probably going to exclude it from downstream analyses, since it will tend to have other problems. Make a PCA plot and see if it sticks out as an outlier. If so, exclude it. If not, then I guess you can keep it.
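
The PCA check could be sketched roughly as follows. This assumes a genes x samples matrix of raw counts and uses a simple log2(CPM + 1) transform before the decomposition; dedicated packages offer better-behaved transforms, and the function name and the simulated data are only for illustration.

```python
import numpy as np


def pca_scores(counts, n_components=2):
    """Return per-sample PCA scores and explained-variance fractions.

    counts: genes x samples matrix of raw read counts.
    """
    counts = np.asarray(counts, dtype=float)
    # Normalise for library size (counts per million), then log-transform.
    cpm = counts / counts.sum(axis=0, keepdims=True) * 1e6
    logged = np.log2(cpm + 1.0)
    # Centre each gene across samples so the PCs describe between-sample variation.
    centred = logged - logged.mean(axis=1, keepdims=True)
    # SVD of the samples x genes matrix; u * s gives the sample scores.
    u, s, _ = np.linalg.svd(centred.T, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]


# Simulated data: 200 genes, 6 samples, with the last sample strongly distorted.
rng = np.random.default_rng(0)
base = rng.integers(50, 500, size=200)
counts = rng.poisson(base[:, None], size=(200, 6))
counts[:100, 5] *= 8
scores, explained = pca_scores(counts)
print(scores[:, 0])
```

Plotting the first two columns of `scores` against each other gives the usual PCA plot; a sample lying far from all the others on PC1 or PC2 is the kind of outlier to consider excluding.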