Efficient way to align assembled scaffolding to genome sequences from NCBI?
2.6 years ago
Kacper ▴ 10

Hello all,

I am trying to identify a human functional homolog of a specific bacterium. I assembled and scaffolded the bacterial genome using SPAdes. I am now trying to align the assembled scaffolds to the genome of a human-derived Lactobacillus species available on NCBI, in order to identify a human-derived species that is a functional homolog of the organism I assembled.

My question is: is there an efficient way to find the published genome that is closest to my assembly, so that I can use it for the alignment?

I tried running BLAST, but I was told the query was too long. The total length I'm working with is 2.1 million bp.

Thank you in advance!

sequencing genome alignment NCBI scaffolding • 487 views
2.6 years ago

Find all Lactobacillus genomes, put them into a single FASTA file, and use minimap2 to align your contigs against that combined reference.
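As a rough sketch of the "single file" step: the helper below concatenates several downloaded FASTA files and prefixes every sequence name with the name of the file it came from, so that each alignment can later be traced back to a genome. The function name `combine_fastas`, the `|` separator, and the file names in the comments are illustrative assumptions, not part of any tool.

```python
from pathlib import Path


def combine_fastas(fasta_paths, out_path, sep="|"):
    """Concatenate FASTA files, prefixing each header with its file stem.

    Returns the number of sequence records written.
    """
    n_records = 0
    with open(out_path, "w") as out:
        for path in map(Path, fasta_paths):
            label = path.stem  # e.g. "genomeA" for genomeA.fasta
            with open(path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        # ">contig1 desc" becomes ">genomeA|contig1 desc"
                        line = f">{label}{sep}{line[1:]}"
                        n_records += 1
                    # Make sure a missing final newline does not glue
                    # two records together.
                    out.write(line if line.endswith("\n") else line + "\n")
    return n_records


# Hypothetical usage (file names are placeholders):
#   combine_fastas(["genomeA.fasta", "genomeB.fasta"], "combined.fasta")
# The alignment itself would then be run outside Python, for example:
#   minimap2 -a combined.fasta scaffolds.fasta > aln.sam
```

Prefixing the headers is a design choice rather than a requirement; it simply avoids name clashes when two genomes both contain a sequence called, say, `contig1`.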

Finally, summarize the BAM with samtools idxstats; that will tell you which genome matches most often.
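A minimal sketch of the summarizing step, assuming the reference sequence names carry a genome label in front of a `|` separator (for example `genomeA|contig1`); that naming convention and the function name `mapped_per_genome` are assumptions made for illustration. The code parses the tab-separated text printed by `samtools idxstats` (sequence name, length, mapped count, unmapped count) and adds up the mapped counts per genome.

```python
from collections import Counter


def mapped_per_genome(idxstats_text, sep="|"):
    """Sum the mapped counts from `samtools idxstats` output per genome.

    Returns (genome, mapped_count) pairs, highest count first.
    """
    counts = Counter()
    for line in idxstats_text.splitlines():
        if not line.strip():
            continue
        name, _length, mapped, _unmapped = line.split("\t")
        if name == "*":
            # The "*" row holds alignments with no reference; skip it.
            continue
        genome = name.split(sep, 1)[0]
        counts[genome] += int(mapped)
    return counts.most_common()


# The input would typically be produced outside Python, for example:
#   samtools sort -o aln.bam aln.sam
#   samtools index aln.bam
#   samtools idxstats aln.bam > idxstats.tsv
example = (
    "genomeA|c1\t1000\t12\t0\n"
    "genomeA|c2\t500\t3\t0\n"
    "genomeB|c1\t1500\t7\t0\n"
    "*\t0\t0\t2\n"
)
print(mapped_per_genome(example))
# [('genomeA', 15), ('genomeB', 7)]
```

These are counts of alignment records rather than aligned bases, so the ranking is only a rough guide to which genome is closest.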

Some of this can be done from the command line, but you can also download the genomes from the NCBI website; see the FAQ:

https://www.ncbi.nlm.nih.gov/genome/doc/ftpfaq/
