Counting k-mers shared between genome assemblies
3.1 years ago

Hi there,

I am trying to count k-mers shared between genome assemblies. I know this can be done with the BBMap tools, as described in this post, but I still have a few questions.

Can I get what I want from running the following?

kmercountexact.sh in=assembly1.fa,assembly2.fa,assembly3.fa out=shared_kmers.fa mincount=3

If I understand correctly, kmercountexact.sh already counts unique k-mers per file. Or do I need to compress each assembly file with kcompress.sh before running kmercountexact.sh?

Thanks in advance!

Raúl

genomics kmers • 731 views
3.1 years ago
GenoMax 147k

Yes, you should run kcompress.sh on each assembly individually, as shown by @Brian. I would think it will also speed the process up. Depending on the size of your assemblies, you may need a lot of memory if you run kmercountexact.sh directly on the FASTA files. If your assemblies are small, you could try that.
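
To make the goal concrete, here is a minimal Python sketch of what "shared k-mers" means when each assembly is first reduced to its set of distinct k-mers. It is only an illustration, not how the BBMap tools work internally. The function names (`canonical`, `distinct_kmers`, `shared_kmers`) are hypothetical, and treating a k-mer and its reverse complement as the same k-mer is an assumption of this sketch. The point it shows is that a k-mer repeated three times inside one assembly should not be counted as present in three assemblies.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement.

    Assumption of this sketch: both strands count as the same k-mer.
    """
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def distinct_kmers(sequences, k):
    """Collect the set of distinct canonical k-mers in one assembly.

    `sequences` is an iterable of contig strings. K-mers containing
    characters other than A, C, G, T (e.g. N) are skipped.
    """
    kmers = set()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                kmers.add(canonical(kmer))
    return kmers


def shared_kmers(assemblies, k, min_assemblies):
    """Return k-mers present in at least `min_assemblies` assemblies.

    Each assembly contributes at most 1 to a k-mer's count, because it is
    reduced to a set of distinct k-mers before counting.
    """
    presence = Counter()
    for sequences in assemblies:
        presence.update(distinct_kmers(sequences, k))
    return {kmer for kmer, n in presence.items() if n >= min_assemblies}


# Toy data: AAAC occurs three times in assembly1 alone.
assembly1 = ["AAACAAACAAAC"]
assembly2 = ["AAACG"]
assembly3 = ["GTTTC"]  # contains GTTT, the reverse complement of AAAC

in_all_three = shared_kmers([assembly1, assembly2, assembly3], k=4, min_assemblies=3)
in_one_only = shared_kmers([assembly1], k=4, min_assemblies=3)
```

With the toy data, `in_all_three` contains only the k-mer found in every assembly, while `in_one_only` is empty even though AAAC is repeated within `assembly1`. Holding k-mer sets in Python like this would be far too memory-hungry for real assemblies, so this is a way to reason about the result rather than a replacement for the tools discussed above.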


Thanks for that!

