6.9 years ago
Sharon
Hi everyone
I am using STAR for alignment, with the goal of variant calling. I feel the default parameters are too loose: too many mismatches, too much soft clipping, etc.
I would like some feedback from your experience about which parameters are more reasonable, since this is my first time using STAR. This is what I use so far:
${STAR}/STAR --runMode genomeGenerate --genomeDir ${WHERE} --genomeFastaFiles ${WHERE}/genome.fa --sjdbFileChrStartEnd ${WHERE}/SJ.out.tab --sjdbOverhang 75 --runThreadN 4 --outFileNamePrefix ${WHERE} --alignEndsType EndToEnd --outFilterMismatchNmax 4
Thanks
Especially when it's the first time I'm using a piece of software, I don't consider myself smarter than the author. I'll go through the manual to see which options are recommended, but I mostly stick to the defaults because those should be sensible.
The GATK best practices for variant calling in RNA-seq also use STAR, so that would be a logical place to look for optimal parameters.
This is what I am already following. The reason for my question is what they say at the top of the pipeline:
At that point you might as well use a different aligner, since you're basically just decreasing the alignment rate.
Like what? Would salmon work for variant calling? And yes, it decreases the alignment rate, but I don't want alignments from reads that have too many mismatches or are heavily soft-clipped, and the default parameters are kind of permissive on both. What do you think?
I mean, in variant calling, would the mismatches be considered SNPs later on?
Mismatches are only considered SNPs if it makes sense to do so given the totality of the data. It's not like variant callers are defining every mismatch in a read as a variant. Given that, while it might make sense to tamp down on soft-clipping (or just trim your adapters), doing much more than that and filtering by alignment score is just going to bias against regions with multiple variants.
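To make that concrete, here is a toy sketch of the idea that evidence is pooled over all reads covering a site rather than taken from any single read. It is not how a real caller works (GATK's HaplotypeCaller, for example, uses local reassembly and likelihood models); the function name `classify_site` and the depth and allele-fraction thresholds are made up purely for illustration.

```shell
# Toy illustration only: pool the bases that all reads show at one site and
# decide from the alternate-allele fraction, not from any single mismatch.
# classify_site and both thresholds are hypothetical, not taken from any caller.
classify_site() {
  # $1 = reference base, $2 = string of bases observed across reads at the site
  awk -v ref="$1" -v bases="$2" -v min_depth=8 -v min_frac=0.2 'BEGIN {
    depth = length(bases)
    alt = 0
    for (i = 1; i <= depth; i++) {
      if (substr(bases, i, 1) != ref) { alt++ }
    }
    frac = (depth > 0) ? alt / depth : 0
    verdict = (depth >= min_depth && frac >= min_frac) ? "candidate_variant" : "no_call"
    printf "depth=%d alt=%d frac=%.2f %s\n", depth, alt, frac, verdict
  }'
}

# One read out of 20 disagrees with the reference: most likely a sequencing error.
classify_site A AAAAAAAAAAAAGAAAAAAA
# Nine reads out of 20 disagree: consistent with a heterozygous SNP.
classify_site A AGAGGAAGAGAGAGGAAGAA
```

The flip side is that a read spanning several real variants carries several mismatches, so a hard cap such as `--outFilterMismatchNmax 4` can discard exactly the reads that support those variants.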
So how much to tamp down was my question. So you think the default parameters of STAR are still fine and not too permissive? Thanks so much, Devon!
Generally I think the defaults for STAR are pretty good, though if the GATK best practices have some different suggestions then definitely follow them.
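For reference, an alignment call that leaves the mismatch and clipping options at STAR's defaults could look like the sketch below. The helper name `star_align`, the index directory, the FASTQ names, and the output prefix are hypothetical placeholders. Paired-end gzipped reads are assumed, and `--twopassMode Basic` is included on the assumption that you want two-pass alignment; check the current GATK best-practices page for the exact settings they recommend.

```shell
# Sketch only: prints the STAR command by default (dry run) and executes it
# when RUN_STAR=1. All paths passed in are placeholders.
star_align() {
  genome_dir=$1; reads_1=$2; reads_2=$3; out_prefix=$4
  # No --outFilterMismatchNmax or --alignEndsType here, so STAR's defaults apply.
  set -- STAR \
    --runMode alignReads \
    --runThreadN 4 \
    --genomeDir "$genome_dir" \
    --readFilesIn "$reads_1" "$reads_2" \
    --readFilesCommand zcat \
    --twopassMode Basic \
    --outSAMtype BAM SortedByCoordinate \
    --outFileNamePrefix "$out_prefix"
  if [ "${RUN_STAR:-0}" = "1" ]; then
    "$@"
  else
    printf '%s ' "$@"
    printf '\n'
  fi
}

# Dry run with hypothetical inputs; review the printed command before running it.
star_align star_index sample_R1.fastq.gz sample_R2.fastq.gz sample_
```

If soft clipping is the main concern, trimming adapters before alignment is the gentler fix, as mentioned above.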
Thanks Devon and WouterDeCoster so much !