DNA sequence complexity with a sliding window
3.7 years ago
User000 ▴ 710

Dear all,

I have a FASTA sequence like this:

CCCCAAAGACACCCCCCACAGTTTATGTAGCTTACCCCTTTT

I would like to calculate the complexity of this sequence using a sliding window of 3-4 nucleotides, and plot a graph of a score or something similar, in order to identify regions like "CCCC", "TTTT", "AAAA". Is there software that does something similar? Or, alternatively, how can I do this?

R python dna • 3.7k views

You need to mention the window size and step size. In addition, please post how you would calculate the complexity. If you want to generate multiple sequences with defined window and step sizes from a single sequence, seqkit's sliding-window function (`seqkit sliding`) would help.
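For illustration only, here is a minimal pure-Python sketch of what "defined window and step sizes" means in practice. It does not call seqkit and is not its implementation; the function name `sliding_windows` is hypothetical, and the window size of 4 with a step of 1 is just an assumed example.

```python
def sliding_windows(seq, window, step=1):
    """Yield (start, subsequence) for every full-length window.

    `start` is a 0-based offset into `seq`. Trailing positions that
    cannot fill a whole window are skipped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive integers")
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


if __name__ == "__main__":
    # Sequence from the question; window/step values are assumptions.
    seq = "CCCCAAAGACACCCCCCACAGTTTATGTAGCTTACCCCTTTT"
    for start, sub in sliding_windows(seq, window=4, step=1):
        print(start, sub)
```

Each window can then be scored with whatever complexity measure is chosen.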


A window size of 3-4 nucleotides and a step of 1 nucleotide. I am not sure how to calculate the complexity; what I want is to find homopolymers in the sequence and plot them on a graph, so that wherever there is a homopolymer I see a peak.
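One possible way to get such a profile, sketched under assumptions: score each window by its Shannon entropy, so that a window made of a single base gets the highest score (1.0) and mixed windows score lower. This is not how EMBOSS complex or any other tool mentioned in this thread works; the function names and the 2-bit normalisation (four equally frequent bases) are choices made for this sketch only.

```python
import math
import re
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (bits) of the base composition of one window."""
    n = len(window)
    counts = Counter(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def homopolymer_profile(seq, window=4, step=1):
    """Return (start, score) pairs with 0-based starts.

    score = 1 - H/2, where H is the window entropy in bits. A window of
    one repeated base scores 1.0; more mixed windows score lower.
    """
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        h = shannon_entropy(seq[start:start + window])
        profile.append((start, 1.0 - h / 2.0))
    return profile


def homopolymer_runs(seq, min_len=4):
    """Return (start, run) for each maximal run of one base >= min_len."""
    pattern = re.compile(r"(.)\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(0)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    seq = "CCCCAAAGACACCCCCCACAGTTTATGTAGCTTACCCCTTTT"
    for start, score in homopolymer_profile(seq, window=4, step=1):
        print(start, round(score, 3))
    print(homopolymer_runs(seq, min_len=4))
```

The (start, score) pairs can be passed to any plotting library, with start on the x-axis and score on the y-axis, to see the peaks.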

3.7 years ago
shelkmike ★ 1.4k

You probably need the tool complex from EMBOSS.


Is it possible to download EMBOSS complex, or is it a web-based tool? I am having trouble understanding it.


Web interfaces for EMBOSS programs are available from various places on the web, but complex itself is a command-line tool. You can download the entire suite of tools from here.


The problem is that I cannot open the link; it does not work.


Which link are you referring to? The link in the original answer simply points to the manual page of the complex program. The link I provided above is for the command-line EMBOSS package, which you can download. It is not a link to a ready-to-run program.


I mean the ftp link to download the program: ftp://emboss.open-bio.org/pub/EMBOSS/


Here is a direct link for the latest EMBOSS.


I do not know why, but I cannot open it.


I will directly paste the link here: ftp://emboss.open-bio.org/pub/EMBOSS/emboss-latest.tar.gz

For some reason the new Biostars code is not parsing that link. We are going to investigate.


OK, I downloaded the file from the link you sent me with wget; otherwise it does not open. Thanks a lot!


Sorry, I cannot find complex in the EMBOSS I downloaded...


Sigh. You are correct. It looks like that tool was removed from the latest EMBOSS. Since the original link suggests that it was in v6.0, you could try downloading an older version from here: ftp://emboss.open-bio.org/pub/EMBOSS/old/

There is this review of similar tools; see if you can find a copy, perhaps to get additional options.


Could you please give me a link to a tar.gz file? I cannot open the link; I can only download with wget.


Thanks, I could make it work. Do you know whether there is a manual for EMBOSS complex, and where I can find it? It is not clear how it calculates the complexity.


The manual was linked by Shelkmike in the original post above: http://emboss.sourceforge.net/apps/release/6.0/emboss/apps/complex.html

You can also just run complex on its own, with no arguments. It should produce some in-line help.


There was a "bug" in the code: ftp links were not allowed to pass. This has been fixed.
