Tool: Jellyfish - Fast, Parallel K-Mer Counting for DNA
12.5 years ago

JELLYFISH is a tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence. JELLYFISH counts k-mers an order of magnitude faster, and with an order of magnitude less memory, than other k-mer counting packages. It does this by using an efficient encoding of a hash table and by exploiting the "compare-and-swap" CPU instruction to increase parallelism.
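
To make the definition concrete, here is a minimal Python sketch of what "counting k-mers" means. It is only an illustration of the idea, not how JELLYFISH is implemented: it uses an ordinary single-threaded dictionary rather than a compact, lock-free hash table, and the function name count_kmers is made up for this example.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every length-k substring (k-mer) of a DNA sequence.

    Illustrative only: a plain in-memory Counter, with none of the
    memory-saving or parallel tricks described above.
    """
    sequence = sequence.upper()
    counts = Counter()
    # Slide a window of width k along the sequence, one base at a time.
    for i in range(len(sequence) - k + 1):
        counts[sequence[i:i + k]] += 1
    return counts


if __name__ == "__main__":
    for kmer, n in sorted(count_kmers("ACGTACGT", 4).items()):
        print(kmer, n)
```

A sequence of length n contains n - k + 1 overlapping k-mers, so the 8-base example above yields five 4-mers, with ACGT seen twice.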

http://www.cbcb.umd.edu/software/jellyfish/

JELLYFISH is a command-line program that reads FASTA and multi-FASTA files containing DNA sequences. It outputs its k-mer counts in a binary format, which can be translated into a human-readable text format using the "jellyfish dump" command.

JELLYFISH runs on 64-bit Intel-compatible processors running Linux or FreeBSD (including Intel Macs). It requires GNU GCC to compile.


Hi Istvan,

I'm trying to use JELLYFISH to calculate 4-mer usage in my assembled contigs, to get a table of tetranucleotide frequencies. I'm using this command line:

jellyfish count -o aftercut.cotig -m 4 -t 2 -s 100000 --both-strands Contigs.fasta

But all I get is a list of warnings saying 'Warn: Bad character in sequence'. I've double-checked my FASTA file and it seems fine. Do you know how to fix this? Or maybe JELLYFISH is not suited to calculating tetranucleotide frequencies. Looking forward to your reply.

Kylie
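
For a value of k as small as 4, a table of tetranucleotide frequencies can also be cross-checked with a short script. The Python sketch below is one way to do that, under some stated assumptions: counting "both strands" is taken to mean merging each k-mer with its reverse complement, any window containing a character other than A, C, G or T (for example N) is skipped, and all function names are hypothetical. It does not diagnose the warning above, whose cause the thread does not establish.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_VALID = set("ACGT")


def read_fasta(lines: Iterable[str]) -> Iterator[str]:
    """Yield one upper-cased sequence string per FASTA record."""
    chunks = []
    for line in lines:
        line = line.strip()  # also drops a trailing '\r' from Windows files
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line.upper())
    if chunks:
        yield "".join(chunks)


def canonical(kmer: str) -> str:
    """Return the smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_frequencies(lines: Iterable[str], k: int = 4) -> Dict[str, float]:
    """Relative frequency of each canonical k-mer over all records."""
    counts = Counter()
    for seq in read_fasta(lines):
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Assumption: windows with non-ACGT characters are ignored.
            if set(kmer) <= _VALID:
                counts[canonical(kmer)] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python tetra_freq.py Contigs.fasta
    with open(sys.argv[1]) as handle:
        for kmer, freq in sorted(kmer_frequencies(handle).items()):
            print(f"{kmer}\t{freq:.6f}")
```

Because the script reads whole records into memory and uses a plain dictionary, it is only practical for small k and modest inputs; for large k or large genomes a dedicated counter is the better choice.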


Please ask this as a new question rather than as a comment on this post. You can find the link named New Post at the top right of the page.
