I have almost 8000 small RNA sequences, and I want to get the top 20 possible locations for each sequence using BLAST or BLAT. Is there a method or script I can use? Kindly enlighten me.
What is your final purpose? I think you could use BLAST and keep only the top 20 hits with the options -max_target_seqs (limits the number of target sequences reported) or -max_hsps (limits the number of HSPs reported per target).
Are you OK with using bash commands?
Start by installing BLAST on your computer (https://www.ncbi.nlm.nih.gov/books/NBK52640/).
I am also not sure whether you want the top 20 possible locations on a draft genome assembly or on a single scaffold. The catch is that BLAST results have two levels: the target sequences and the HSPs (high-scoring segment pairs). A single target sequence can have many HSPs:
scaffold : =========================================================
Hsps : ====== ====== ======
Anyway, you could run a BLAST search:
blastn -query 20ksequence.fasta -db yourgenome.fasta > results_RNA_vs_genome.blastn
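For sequences as short as miRNAs, the command above may need a few extra options. Here is a minimal sketch in Python that only builds the two command lines (it does not run them). The assumptions are mine, not from the thread: the genome is indexed with makeblastdb first, the blastn-short task is used because the queries are very short, and tabular output (-outfmt 6) is requested so the results are easy to parse. The output file name results_RNA_vs_genome.tsv is hypothetical.

```python
import shlex


def build_blast_commands(query_fasta, genome_fasta, out_tsv, max_targets=20):
    """Return (makeblastdb_cmd, blastn_cmd) as argument lists; nothing is executed."""
    # Index the genome. Without -out, the database is named after the input file,
    # so the same path can be passed to blastn -db afterwards.
    makeblastdb_cmd = ["makeblastdb", "-in", genome_fasta, "-dbtype", "nucl"]

    blastn_cmd = [
        "blastn",
        "-task", "blastn-short",               # assumption: queries are miRNA-sized
        "-query", query_fasta,
        "-db", genome_fasta,
        "-outfmt", "6",                        # 12-column tab-separated output
        "-max_target_seqs", str(max_targets),  # caps target sequences, not HSPs
        "-out", out_tsv,
    ]
    return makeblastdb_cmd, blastn_cmd


db_cmd, search_cmd = build_blast_commands(
    "20ksequence.fasta", "yourgenome.fasta", "results_RNA_vs_genome.tsv"
)
print(shlex.join(db_cmd))
print(shlex.join(search_cmd))
```

To actually run them, each list can be passed to subprocess.run(cmd, check=True). Note that on a genome like hg19 one chromosome can carry many HSPs, so the final "top 20 locations" still have to be selected when parsing the output.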
Then you could easily parse the BLAST results according to your criteria (best E-values? best target sequences?) using the Biopython module. I think many people have already asked how to parse results in the same way as you, and the Biopython tutorial and cookbook is very good: http://biopython.org/DIST/docs/tutorial/Tutorial.html
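As a rough illustration of the parsing step, here is a sketch that uses only the Python standard library rather than Biopython. It assumes the search was run with -outfmt 6 (tab-separated, 12 standard columns) and that "top" means highest bit score; both are my assumptions, and the function name top_hits_per_query and the example rows are made up for the demonstration.

```python
import csv
from collections import defaultdict

# Standard column order of BLAST tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def top_hits_per_query(lines, n=20):
    """Group tabular BLAST rows by query and keep the n best by bit score."""
    hits = defaultdict(list)
    # Skip comment lines in case the file was written with -outfmt 7.
    data_lines = (line for line in lines if not line.startswith("#"))
    reader = csv.DictReader(data_lines, fieldnames=BLAST6_COLUMNS, delimiter="\t")
    for row in reader:
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        hits[row["qseqid"]].append(row)
    return {
        query: sorted(rows, key=lambda r: r["bitscore"], reverse=True)[:n]
        for query, rows in hits.items()
    }


# Made-up rows standing in for a real results file.
example = [
    "mir1\tchr1\t100.000\t22\t0\t0\t1\t22\t1000\t1021\t1e-05\t44.1",
    "mir1\tchr2\t95.455\t22\t1\t0\t1\t22\t5000\t5021\t0.002\t36.2",
    "mir2\tchr3\t100.000\t21\t0\t0\t1\t21\t200\t180\t3e-05\t42.1",
]
best = top_hits_per_query(example, n=1)
for query, rows in best.items():
    for r in rows:
        print(query, r["sseqid"], r["sstart"], r["send"], r["bitscore"])
```

With a real file, pass the open file handle instead of the list, for example top_hits_per_query(open("results.tsv"), n=20). Each row is one HSP, so this ranks locations rather than whole target sequences.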
I hope I helped you a bit,
Maxime
Yes, I can use bash commands and BLAST on Linux. Your response is informative, and I have used BLAST before. Actually, I want to use BLAT on miRNA sequences to get the locations of each sequence in hg19. I have a list of miRNA sequences; as far as I know, I have to make a FASTA file from them, but I do not know what the output will look like. My requirement is to get a list of locations for each of my query sequences as a text file or Excel file.
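Turning a plain list of sequences into a FASTA file can be done with a few lines of Python. This is only a sketch under my own assumptions: one sequence per entry, record names generated as query_1, query_2, ... (a hypothetical naming scheme), and U converted to T so the RNA sequences match a DNA genome. The two sequences in the example are just illustrative input.

```python
def sequences_to_fasta(sequences, prefix="query"):
    """Build FASTA text from an iterable of raw sequences, skipping blank entries."""
    # Upper-case and convert RNA (U) to DNA (T) for alignment against a genome.
    cleaned = [s.strip().upper().replace("U", "T") for s in sequences if s.strip()]
    return "".join(
        f">{prefix}_{i}\n{seq}\n" for i, seq in enumerate(cleaned, start=1)
    )


fasta_text = sequences_to_fasta(
    ["UGAGGUAGUAGGUUGUAUAGUU", "", "uagcuuaucagacugauguuga"]
)
print(fasta_text, end="")
```

To write the file, something like open("mirna_queries.fasta", "w").write(fasta_text) would do (the file name is hypothetical). If the search is then run with tabular output, the result is a tab-separated text file that also opens directly in Excel.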
Check here for the difference between BLAST and BLAT, and see which suits your data. I would use standalone BLAST in this case.