Question

STAR unique reads

6.9 years ago

TEman ▴ 10

In STAR, one can set --outFilterMultimapNmax 1 to report only reads that map exactly once to the reference genome. This can be useful when analyzing genes and elements that have high sequence similarity.

However, one thing that is unclear to me is how STAR handles reads that map to multiple positions with different numbers of mismatches or different alignment scores.

If one alignment has 0 mismatches and a perfect score, and another alignment of the same read has 1 mismatch and a lower score, is the read discarded because it maps twice? More generally, how many mismatches, or how low a score, must a secondary alignment have before it is no longer counted? Can this influence the output when using --outFilterMultimapNmax, and is it something that can be tweaked?
If I set --outFilterMismatchNoverLmax 0.06 instead of --outFilterMismatchNoverLmax 0.01, some reads will have more possible alignments, because up to 0.06 mismatches per base are allowed. Does that mean there will be fewer uniquely mapping reads at 0.06, since more alignments per read pass the mismatch filter?
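
One way to check this empirically is to run STAR twice on the same reads, once with each --outFilterMismatchNoverLmax value and without --outFilterMultimapNmax 1, and compare how many reads are reported as unique. The sketch below assumes uncompressed SAM output that carries the NH tag (number of reported alignments for the read), which STAR writes with its default SAM attributes. The function name count_unique_reads is hypothetical and only for illustration; it does not show how STAR decides internally what counts as a multimapper.

```python
def count_unique_reads(sam_lines):
    """Count uniquely mapped and multimapped reads from SAM text lines.

    A read is treated as unique when its NH tag equals 1.
    Returns a (unique, multi) tuple of read-name counts.
    """
    nh_by_read = {}
    for line in sam_lines:
        # Skip empty lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        # Optional tags start at column 12; look for NH:i:<n>.
        for tag in fields[11:]:
            if tag.startswith("NH:i:"):
                nh_by_read[fields[0]] = int(tag[5:])
                break
    unique = sum(1 for nh in nh_by_read.values() if nh == 1)
    multi = len(nh_by_read) - unique
    return unique, multi


# Example usage (file names are assumptions, one per STAR run):
# with open("mismatch_0.01/Aligned.out.sam") as fh:
#     print(count_unique_reads(fh))
# with open("mismatch_0.06/Aligned.out.sam") as fh:
#     print(count_unique_reads(fh))
```

Reads are keyed by name, so both mates of a pair count as one read. If the unique count drops in the 0.06 run, the looser mismatch filter is turning some unique reads into multimappers.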

In general, I would like to know how --outFilterMultimapNmax works, how it defines how many times one read aligns, and how that is connected to the --outFilterMismatch parameters.

Thanks!

STAR RNA-Seq alignment • 4.2k views

6.9 years ago by TEman ▴ 10

Have you considered asking this on the STAR Google group? Alex Dobin would be the best person to answer, since he is the one who actually wrote the algorithms in question.

6.9 years ago by Devon Ryan 104k