Question

sequence length distribution

0

5.3 years ago

evashan96 • 0

How can I check the sequence length distribution of trimmed reads (in FASTA format)? Is there a tool to calculate it?

next-gen sequencing • 3.3k views

ADD COMMENT • link updated 5.3 years ago by michael.ante ★ 3.9k • written 5.3 years ago by evashan96 • 0

0

What have you tried so far?

Why not run FastQC (again) on your cleaned reads? Its report includes a length distribution plot.

EDIT: Oh, OK, you're asking about FASTA input, so FastQC might not be the answer, unless you still have the FASTQ files, in which case you can use those as input.

ADD REPLY • link 5.3 years ago by lieven.sterck 15k

score 2 · Answer 1 · 2019-09-10

2

5.3 years ago

jean.elbers ★ 1.7k

BBMap/BBTools has a readlength.sh tool that you can use to make a read-length histogram of the reads after trimming.

ADD COMMENT • link 5.3 years ago by jean.elbers ★ 1.7k

score 0 · Answer 2 · 2019-09-10

0

5.3 years ago

gb ★ 2.2k

It's a bit of an old tool, but you can also use PRINSEQ. There is also a web version, but I have never used it.

website: http://prinseq.sourceforge.net/

ADD COMMENT • link 5.3 years ago by gb ★ 2.2k

score 0 · Answer 3 · 2019-09-10

0

5.3 years ago

vivek.mathema • 0

Try OSTRFPD: Multifunctional tool for genome-wide short tandem repeat analysis for DNA, transcripts and amino acid sequences with integrated primer designer link: https://github.com/vivekmathema/OSTRFPD

ADD COMMENT • link 5.3 years ago by vivek.mathema • 0

score 0 · Answer 4 · 2019-09-10

0

5.3 years ago

michael.ante ★ 3.9k

Lawrence et al. published FAST: FAST Analysis of Sequences Toolbox (paper | github).

It includes a tool called faslen, which annotates each FASTA header with the length of its sequence. To print only the annotated headers, and so only the lengths, you can run:

faslen my_test.fasta | grep \>

Cheers,

Michael

ADD COMMENT • link 5.3 years ago by michael.ante ★ 3.9k
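
If none of the tools above is installed, the length distribution can also be tallied with standard Unix tools. The following is only a minimal sketch, not taken from any of the answers: the function name fasta_length_hist and the demo file are made up for illustration, and it assumes a plain (uncompressed) FASTA file in which a sequence may be wrapped over several lines.

```shell
# Hypothetical helper (not part of any tool mentioned in this thread):
# prints "<length> <count>" for each distinct sequence length in a FASTA file.
fasta_length_hist() {
    awk '
        /^>/ {                      # header line starts a new record
            if (seen) count[len]++  # tally the previous record, if any
            seen = 1; len = 0
            next
        }
        {                           # sequence line (may be one of several)
            gsub(/[ \t\r]/, "")     # drop stray whitespace and CR characters
            len += length($0)
        }
        END {
            if (seen) count[len]++  # tally the last record
            for (l in count) print l, count[l]
        }
    ' "$1" | sort -n
}

# Tiny demo input; r1 is wrapped over two lines.
demo=$(mktemp)
printf '>r1\nACGT\nAC\n>r2\nACGTAC\n>r3\nACG\n' > "$demo"
fasta_length_hist "$demo"
```

For the demo input this prints one line per length, "3 1" and "6 2", i.e. one read of length 3 and two of length 6. On real data you would call the function on your trimmed FASTA file and plot or inspect the two columns.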