I have sequenced and assembled a eukaryotic genome using 10x Genomics technology and their Supernova software package. Now I want to check whether any of my assembled contigs are bacterial.
I know that one way is to BLAST the contigs against NCBI bacterial RefSeq (ftp://ftp.ncbi.nih.gov/refseq/release/bacteria) and remove the contigs whose matches exceed a certain percentage. Is there any other way? Any software package or pipeline?
Thanks!
Can you comment on the size of the assembly and the length and number of sequences? You could do a naive blast search as stated.
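If you go the naive BLAST route, the "certain percentage" filter could look roughly like the sketch below. It assumes the BLAST results are in the default 12-column tabular format (-outfmt 6) and that you have the contig lengths from elsewhere (for example a FASTA index). The function names and the thresholds (min_pident=90, min_cov=0.5) are made up for illustration, not recommended values, so tune them for your data. Flagged contigs are only candidates: a eukaryotic contig can still pick up bacterial hits over part of its length.

```python
from collections import defaultdict


def covered_fraction(intervals, contig_len):
    """Fraction of a contig covered by the union of 1-based inclusive intervals."""
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / contig_len


def flag_bacterial_contigs(blast_lines, contig_lengths,
                           min_pident=90.0, min_cov=0.5):
    """Return {contig: covered fraction} for contigs mostly covered by hits.

    blast_lines: iterable of BLAST -outfmt 6 lines (contigs as the query).
    contig_lengths: dict mapping contig ID to its length in bp.
    min_pident, min_cov: illustrative thresholds (assumptions).
    """
    hits = defaultdict(list)
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        qseqid = fields[0]
        pident = float(fields[2])
        # Columns 7 and 8 are qstart and qend; sort in case they are reversed.
        qstart, qend = sorted((int(fields[6]), int(fields[7])))
        if pident >= min_pident:
            hits[qseqid].append((qstart, qend))

    flagged = {}
    for qseqid, intervals in hits.items():
        cov = covered_fraction(intervals, contig_lengths[qseqid])
        if cov >= min_cov:
            flagged[qseqid] = cov
    return flagged
```

Overlapping hits on the same contig are merged before computing coverage, so a region hit by several bacterial references is only counted once.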
You could try sketch from the BBMap suite.
If you suspected that your data had contamination, it would have been much better to have identified that before assembly using kraken2/centrifuge.
Thanks. My sample is not contaminated. I just want to get the endosymbiont sequences. The genome is 600 Mb, 260 million 150 bp reads. Thanks