3.4 years ago
yingcraft
I have a contig FASTA file (2.5Gb) and I want to remove the contigs that match the mouse genome database. In other words, I need to provide my FASTA file and the database as input, and get back a screened FASTA file without the mouse-related contigs.
Can I use BLAT or VSEARCH for this? How should I set the parameters? I have read some of the help files but can't find a screening function.
I would be very grateful for any advice.
Thinking aloud, but I doubt that a local aligner such as BLAST is a good choice here: it will always find some kind of local similarity between large contigs and a database. Can you elaborate? How long are these contigs, and what is the database: is it the mouse genome? How much overlap would there need to be for a contig to require removal?
The contigs range from 100 to 10,000,000 bp in length. They were assembled with metaSPAdes. The database is the mouse genome. Contigs with more than roughly 90 percent overlap probably need to be removed. I want to run MetaProdigal after the screening.
Can't you simply use something like minimap to map these contigs onto the mouse genome and then filter for appropriate hits?
I'm not sure whether minimap accepts a contig FASTA file like this as input, but I will try it. Thank you for your help.
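For what it's worth, here is a rough sketch of the "map, then filter" idea in Python. It assumes the contigs were already mapped with minimap2, which takes a FASTA query and writes PAF by default, for example `minimap2 -x asm5 mouse.fa contigs.fa > hits.paf` (the file names and the choice of the `asm5` preset are my assumptions, not something tested on this data). The function names `query_coverage`, `filter_fasta` and `screen` are made up for illustration. The script merges the aligned intervals on each contig, computes the fraction of the contig covered by mouse alignments, and drops contigs at or above the 90 percent threshold mentioned above.

```python
from collections import defaultdict


def query_coverage(paf_lines):
    """Return {contig: fraction of its bases covered by >= 1 alignment}.

    Uses the first four PAF columns: query name, query length,
    query start (0-based) and query end.
    """
    intervals = defaultdict(list)
    lengths = {}
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:  # skip blank or malformed lines
            continue
        name = fields[0]
        lengths[name] = int(fields[1])
        intervals[name].append((int(fields[2]), int(fields[3])))

    coverage = {}
    for name, spans in intervals.items():
        spans.sort()
        covered = 0
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:  # overlapping or adjacent: extend
                cur_end = max(cur_end, end)
            else:  # gap: close the current merged interval
                covered += cur_end - cur_start
                cur_start, cur_end = start, end
        covered += cur_end - cur_start
        coverage[name] = covered / lengths[name]
    return coverage


def filter_fasta(fasta_lines, coverage, threshold=0.9):
    """Yield FASTA lines of contigs whose coverage is below threshold."""
    keep = True
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            # Contigs with no hit at all have coverage 0 and are kept.
            keep = coverage.get(name, 0.0) < threshold
        if keep:
            yield line


def screen(paf_path, fasta_in, fasta_out, threshold=0.9):
    """Write fasta_out without the contigs that look like mouse."""
    with open(paf_path) as paf:
        coverage = query_coverage(paf)
    with open(fasta_in) as src, open(fasta_out, "w") as dst:
        dst.writelines(filter_fasta(src, coverage, threshold))
```

Calling `screen("hits.paf", "contigs.fa", "contigs.screened.fa")` would then write the screened file, reading the FASTA line by line so the 2.5Gb input never has to sit in memory. Note that this counts aligned span on the contig, not sequence identity; if identity matters too, columns 10 and 11 of the PAF (matching residues and alignment block length) could be used as an additional filter.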