4.4 years ago
rezaeir75
I have several siRNA sequences for which I need to find off-targets.
How can I efficiently use NCBI BLAST to find them?
Which parameters should I change from their defaults, and to what values?
Which human genome or transcriptome database should I use?
Remember that siRNAs don't just cause off-target effects via a close match to the complete sequence. They can also act as miRNAs. We don't fully understand how to predict miRNA binding sites, but we do know that matches between the seed sequence of the siRNA/miRNA (bases 2-8 of the siRNA) and the 3' UTR of the target are important. Hexamer and heptamer seed matches are so short that there is not much hope of identifying them using BLAST.
https://rnajournal.cshlp.org/content/12/7/1179.short
https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-11-175
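Since seed matches are too short for BLAST, a plain string search over 3' UTR sequences is one way to look for them. The following is only a minimal sketch of that idea, not a validated off-target predictor: it takes bases 2-8 of the guide strand (or 2-7 for a hexamer), reverse-complements them, and reports where that site occurs in a UTR. The function names and the example sequences are made up for illustration.

```python
# Minimal sketch: locate perfect seed-match sites in a 3' UTR.
# Function names and example sequences are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(guide, start=2, end=8):
    """Return the mRNA site complementary to guide bases start..end (1-based, inclusive)."""
    guide = guide.upper().replace("T", "U")
    seed = guide[start - 1:end]
    # The site on the target is the reverse complement of the seed.
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(guide, utr, start=2, end=8):
    """Return 0-based start positions of every perfect seed match in the UTR."""
    site = seed_site(guide, start, end)
    utr = utr.upper().replace("T", "U")
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)  # step by one so overlapping sites are kept
    return hits


example_guide = "AUGCAUCGGAUUCGAUCGAUU"   # hypothetical 21-nt guide strand
example_utr = "GGGCGAUGCAAAACGAUGCAUU"    # hypothetical 3' UTR fragment

heptamer_hits = find_seed_matches(example_guide, example_utr)             # bases 2-8
hexamer_hits = find_seed_matches(example_guide, example_utr, end=7)       # bases 2-7
```

A perfect seed match alone says little about whether a transcript is actually repressed, so hits from a scan like this are candidates to check, not confirmed off-targets.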
What is the length of your input query sequences? You will likely want to use
blastn-short
since your input is going to be short. You would want to use a complete genome since you are looking to find off-targets.

The query length is 22 bp. Is just changing the overall settings to short-sequence BLAST enough? Should I change anything about mismatch tolerance?
Depending on how things go, you can also turn off low-complexity filtering (-dust no). Reducing the word size (-word_size) to a smaller value would be another option to vary.

I was hoping that someone would give me a benchmarking paper or a specific configuration for BLASTing siRNAs for off-targets. If no one does, I will try different parameters to see what works better!
There is a paper mentioned in another answer (A: blastn no hit found for very short sequences), but that may be for the old blast package.