I copied this line out of a paper.
Firstly, all the known pre-miRNAs in the training set are used as queries to BLAST search against the genome with a sensitive BLAST parameter setting (word-length 7 and E-value cutoff 10). Next, sequence segments of the potential regions are cut from the genome with 70 nt flanking sequences to each end and scanned by a 100 nt-sliding window with a step of 10 nt. The sequences overlapping with repeat sequences are discarded.
What does the statement in bold mean? I can't understand what they mean by this sliding window or how they do it. Can somebody help me understand it?
Also, if you have worked on pre-miRNA analysis: what do you think are the best parameters for a BLAST search of pre-miRNAs against a genome? I used a word length of 8 and an E-value cutoff of 0.01. Kindly suggest. Thank you.
They extracted the genomic sequence of each hit from the pre-miRNA BLAST search, together with the 70 nt immediately upstream and the 70 nt immediately downstream of the hit.
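To make the flanking step concrete, here is a minimal Python sketch of how I read it. The function name, the 0-based end-exclusive coordinates, and the clamping at the sequence ends are my own assumptions, not something the paper states.

```python
def extract_with_flanks(genome_seq, hit_start, hit_end, flank=70):
    """Return (start, end, subsequence) for a hit extended by `flank` nt per side.

    Assumption: coordinates are 0-based and end-exclusive, and the region is
    clamped so it never runs past either end of the sequence.
    """
    start = max(0, hit_start - flank)
    end = min(len(genome_seq), hit_end + flank)
    return start, end, genome_seq[start:end]


# Toy example: a 90 nt hit at positions 200..290 on a 1000 nt sequence.
genome = "ACGT" * 250
start, end, region = extract_with_flanks(genome, 200, 290)
print(start, end, len(region))
```

With a 90 nt hit, the extracted region is 90 + 2 × 70 = 230 nt long. BLAST tabular output reports 1-based inclusive coordinates, so those would need converting before use here.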
Any idea about what a 100 nt sliding window is?
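As far as I can tell from the quoted passage, a 100 nt sliding window with a step of 10 nt means the extracted region is cut into overlapping 100 nt pieces: the first covers positions 1-100, the next 11-110, then 21-120, and so on to the end of the region. Each piece is then examined on its own; the quoted text does not say what is done with each window. A minimal sketch, where the function name and the choice to drop a final window shorter than 100 nt are my assumptions:

```python
def sliding_windows(seq, window=100, step=10):
    """Yield (offset, subsequence) for every full-length window.

    Assumption: a trailing piece shorter than `window` is skipped, and a
    sequence shorter than `window` yields nothing.
    """
    for offset in range(0, len(seq) - window + 1, step):
        yield offset, seq[offset:offset + window]


# Toy example: a 230 nt region (e.g. a 90 nt hit plus 70 nt on each side).
region = ("ACGT" * 58)[:230]
windows = list(sliding_windows(region))
print(len(windows))                    # number of windows
print(windows[0][0], windows[-1][0])   # first and last 0-based offsets
```

For a 230 nt region this gives 14 windows, starting at offsets 0, 10, ..., 130 (0-based), so consecutive windows share 90 nt.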