Hello All,
I have been trying to work out the best STAR parameters for finding all possible alignments of an RNA read. It seems that STAR somehow limits alignments, or is unable to align in soft-masked regions. When I tried GSNAP, it placed the read at several different positions in the genome. Although the STAR and GSNAP alignments overlap, there are some interesting positions that STAR does not report, even though aligning there would require no mismatches or soft-clipping.
I use equivalent settings for both aligners: a very high maximum number of multimapping loci, and a minimum number of matched bases for an alignment to be output. The other parameters are mostly left at their defaults. I prefer STAR because it finds more alignments than GSNAP overall, but in some cases it does not retain positions that I am interested in.
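As a rough illustration of the kind of STAR settings described above, here is a minimal sketch that assembles a command line in Python. The mapping from "max multimapping" and "minimum match" to `--outFilterMultimapNmax` and `--outFilterMatchNmin` is an assumption, as are the numeric values, the paths `star_index` and `reads.fastq`, and the helper name `build_star_command`. The option `--winAnchorMultimapNmax` is not mentioned in the post; it is included only as something that may be worth checking, since it caps how many loci a seed can map to. Note also that the relative filters `--outFilterMatchNminOverLread` and `--outFilterScoreMinOverLread` apply by default and are left untouched here.

```python
def build_star_command(genome_dir, reads_fastq,
                       max_multimap=10000, min_matched_bases=20):
    """Assemble a STAR argument list that tries to keep many loci per read.

    All values are placeholders for illustration, not recommendations.
    """
    return [
        "STAR",
        "--genomeDir", genome_dir,        # hypothetical index directory
        "--readFilesIn", reads_fastq,     # hypothetical input file
        # Reads mapping to more loci than this are treated as unmapped.
        "--outFilterMultimapNmax", str(max_multimap),
        # Upper limit on the number of loci a seed (anchor) may map to.
        "--winAnchorMultimapNmax", str(max_multimap),
        # -1 writes every alignment that passed the filters.
        "--outSAMmultNmax", "-1",
        # Minimum number of matched bases for an alignment to be output.
        "--outFilterMatchNmin", str(min_matched_bases),
        "--outSAMtype", "SAM",
    ]


cmd = build_star_command("star_index", "reads.fastq")
print(" ".join(cmd))
```

The list could be passed to `subprocess.run(cmd, check=True)` once the paths point at a real index and read file.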
I would be glad to hear your suggestions!
Thanks in advance.
You may find this prior thread of interest: Hard vs soft masked references while aligning with STAR
There is an old thread in which Alex mentions (post #2) that soft-masked regions are not treated any differently by STAR.
Thanks for your reply! So, in conclusion, STAR isn't a good aligner for reads from repetitive regions! I didn't find any explanation of why this happens. I ran some tests with hard- and soft-masked references, and the result was actually the same. It seems that STAR simply doesn't align to repetitive regions of the genome. So I will either have to choose another aligner, or use STAR plus another aligner that handles the repetitive regions.
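For anyone wanting to repeat this kind of comparison (STAR vs GSNAP, or hard- vs soft-masked runs), here is a minimal sketch that lists the positions reported in one SAM output but not in another. The function names and the toy records are made up for illustration and are not results from the tests above. It matches on the exact leftmost position, so alignments that differ only by soft-clipping would count as different.

```python
def alignment_positions(sam_lines):
    """Collect (read, chromosome, position, strand) for each mapped SAM record."""
    positions = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # not a complete alignment record
        read_name, flag = fields[0], int(fields[1])
        chrom, pos = fields[2], int(fields[3])
        if flag & 0x4:
            continue  # read is flagged as unmapped
        strand = "-" if flag & 0x10 else "+"
        positions.add((read_name, chrom, pos, strand))
    return positions


def only_in_second(first_sam_lines, second_sam_lines):
    """Positions present in the second output but missing from the first."""
    first = alignment_positions(first_sam_lines)
    second = alignment_positions(second_sam_lines)
    return sorted(second - first)


# Toy records, invented for the example.
star_lines = [
    "@HD\tVN:1.4",
    "read1\t0\tchr1\t100\t3\t50M\t*\t0\t0\t*\t*",
    "read1\t256\tchr2\t500\t3\t50M\t*\t0\t0\t*\t*",
]
gsnap_lines = [
    "read1\t0\tchr1\t100\t3\t50M\t*\t0\t0\t*\t*",
    "read1\t272\tchr3\t900\t3\t50M\t*\t0\t0\t*\t*",
]
missing_from_star = only_in_second(star_lines, gsnap_lines)
print(missing_from_star)
```

With real data, the lines would come from the two SAM files, for example `open("star.sam")` and `open("gsnap.sam")` (hypothetical file names).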
Well, thanks again for your reply!