6.2 years ago
karthic
Hi,
I have a FASTA file with around 1 million sequences. I did a BLAST search and got hits for around 7500 sequences. Now I want to extract the sequences that do not have a hit and take them for further analysis.
So far I am using a custom sed script, which is very slow; judging from its speed, it might take several days to complete. Please help me with a fast and robust solution.
The script I am currently using is below:
cat CG061MR_S20_R1_001_AR_filter_un_ren.fa > CG061MR_S20_R1_001_AR_filter_unblasted.fa
for j in $(cat CG061MR_blastids.txt)
do
sed -i -e '/'$j'/{N;d}' CG061MR_S20_R1_001_AR_filter_unblasted.fa
done
Thank you,
KK
faSomeRecords or GetFaRecords? karthic
Sorry, it's faSomeRecords. Corrected it.
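For comparison, here is a minimal single-pass sketch in awk. The sed loop in the question rewrites the whole FASTA file once per ID (about 7500 passes over 1 million sequences), which is likely why it is so slow. The sketch below reads the ID list once into memory and then streams the FASTA file a single time. It rests on a few assumptions that are not confirmed in the question: the ID file has one ID per line, each ID equals the first word of a header line (without the `>`), and exact matching is what is wanted. The function name `filter_unblasted` is made up for illustration.

```shell
# Sketch: print every FASTA record whose ID is NOT in the ID list.
# Assumption: one ID per line, equal to the first word of the header
# (without the leading '>'). filter_unblasted is a hypothetical name.
filter_unblasted() {
    ids_file=$1
    fasta_file=$2
    awk -v ids="$ids_file" '
        BEGIN {
            # Load the IDs to drop into an associative array.
            while ((getline line < ids) > 0) {
                split(line, f, " ")
                if (f[1] != "") drop[f[1]] = 1
            }
            close(ids)
        }
        # On each header line, decide whether to keep the record.
        /^>/ { id = substr($1, 2); keep = !(id in drop) }
        # Print header and all following sequence lines of kept records.
        keep
    ' "$fasta_file"
}
```

With the file names from the question it could be called as `filter_unblasted CG061MR_blastids.txt CG061MR_S20_R1_001_AR_filter_un_ren.fa > CG061MR_S20_R1_001_AR_filter_unblasted.fa`. Two behaviours differ from the sed loop and are worth checking against your data: IDs are compared exactly rather than as patterns (so `seq1` no longer also removes `seq10`), and sequences wrapped over several lines are dropped whole instead of only the header plus the next line.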