4.0 years ago
cddiligent
Hello all
I have two FASTA files containing the same number of sequences. I'd like to BLAST them pairwise (seq1 in file 1 against seq1 in file 2, seq2 in file 1 against seq2 in file 2, and so on) and get the best hit for each pair. For example:
file1.fa
>a1
ATCGATAC
>b1
TAAATGTCGA
...
file2.fa
>a2
TGACGCTG
>b2
AGCGATGA
...
I want to BLAST a1 against a2, b1 against b2, and so on.
Is there a good way to do this?
Thanks a lot.
Your best bet is to split the sequences into individual files (see "How to split a multi fasta file into individual chromosomes") and then run the BLAST searches you need. Otherwise, you could simply do an all-against-all BLAST search and pick out the results you need.
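The split-and-BLAST route can be scripted so that nothing is split by hand. What follows is a minimal sketch, not a tested pipeline. It assumes NCBI BLAST+ is installed with `blastn` on the PATH, that the sequences are nucleotide, and that the two files list their records in matching order. The helper names (`read_fasta`, `pair_records`, `blast_pair`) are made up for illustration. Each pair is written to temporary files and compared with `blastn -query ... -subject ...`, and the hit with the highest bit score is kept.

```python
import subprocess
import tempfile
from pathlib import Path


def read_fasta(path):
    """Return a list of (header, sequence) tuples in file order."""
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:].strip(), []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def pair_records(file1, file2):
    """Pair the n-th record of file1 with the n-th record of file2."""
    first, second = read_fasta(file1), read_fasta(file2)
    if len(first) != len(second):
        raise ValueError("the two FASTA files hold different numbers of records")
    return list(zip(first, second))


def blast_pair(query, subject, workdir):
    """Run blastn on one pair; return the best tabular hit line or None."""
    q_path = Path(workdir) / "query.fa"
    s_path = Path(workdir) / "subject.fa"
    q_path.write_text(f">{query[0]}\n{query[1]}\n")
    s_path.write_text(f">{subject[0]}\n{subject[1]}\n")
    # Assumption: BLAST+ blastn is on the PATH; -outfmt 6 is tab-separated.
    cmd = ["blastn", "-query", str(q_path), "-subject", str(s_path),
           "-outfmt", "6"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    hits = [line for line in out.splitlines() if line.strip()]
    # Column 12 of the default tabular format is the bit score.
    return max(hits, key=lambda h: float(h.split("\t")[11])) if hits else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for query, subject in pair_records("file1.fa", "file2.fa"):
            best = blast_pair(query, subject, tmp)
            print(best if best else f"{query[0]}\t{subject[0]}\tno hit")
```

With around 1000 pairs this launches around 1000 small `blastn` processes, which is usually tolerable because each search involves only two sequences.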
Thank you. I thought about separating them, but there are more than 1000 sequences in each file. I will try the all-against-all BLAST, as long as it doesn't take too long to finish.
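For the all-against-all route, the pairs of interest can be picked out of the results afterwards. This is a rough sketch under stated assumptions: the search was run with something like `blastn -query file1.fa -subject file2.fa -outfmt 6 -out all_vs_all.tsv` (the output file name is made up), the default 12-column tabular layout is used, and sequence IDs are unique within each file. The function names `fasta_ids` and `best_paired_hits` are hypothetical.

```python
import csv


def fasta_ids(path):
    """Return sequence IDs (first word of each header) in file order."""
    ids = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                ids.append(parts[0] if parts else "")
    return ids


def best_paired_hits(blast_tsv, query_ids, subject_ids):
    """Keep, for each query, the top-scoring hit against its positional partner.

    Assumes default -outfmt 6 columns, where column 1 is the query ID,
    column 2 the subject ID and column 12 the bit score.
    """
    expected = dict(zip(query_ids, subject_ids))
    best = {}
    with open(blast_tsv, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 12:
                continue  # skip blank or malformed lines
            qseqid, sseqid, bitscore = row[0], row[1], float(row[11])
            if expected.get(qseqid) != sseqid:
                continue  # hit is not between the n-th query and n-th subject
            if qseqid not in best or bitscore > float(best[qseqid][11]):
                best[qseqid] = row
    return best


if __name__ == "__main__":
    hits = best_paired_hits("all_vs_all.tsv",
                            fasta_ids("file1.fa"), fasta_ids("file2.fa"))
    for row in hits.values():
        print("\t".join(row))
```

Queries with no hit against their partner are simply absent from the returned dictionary, so comparing its keys with the IDs in file 1 shows which pairs found no match.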
If you are not performing a blind BLAST search (e.g. against the entire nr/nt database), why not use a read mapper (e.g. bowtie) to map against reference sequences, treating the two files as paired-end data? That way you can also filter the mappings by the inner distance between each pair.