Weighted sequence logos and motifs
5.7 years ago

Dear all,

Most libraries and software for drawing DNA sequence logos (e.g. ggseqlogo) or discovering sequence motifs (e.g. the MEME tools) take as input a FASTA file containing a list of sequences:

>seq1
AGATCATCATCTCAT
>seq2
GTCTAGCTACGTACT
>seq3
TGCATGCATGCATCC

(in the case of motif finding, a list of negative sequences is often used as well)

However, my list of sequences contains an individual score for each input sequence:

>seq1 53.4
AGATCATCATCTCAT
>seq2 21.5
GTCTAGCTACGTACT
>seq3 11.8
TGCATGCATGCATCC

I was wondering if anyone is aware of any tools that would take into account the sequence scores (53.4, 21.5, 11.8) to guide the creation of sequence logos or discovery of motifs.
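For reference, the scored headers above can be read with a few lines of Python. This is only a minimal sketch: the function name `read_scored_fasta` is made up for illustration, and it assumes the score is always the second whitespace-separated field of the header line, as in the example.

```python
from io import StringIO


def read_scored_fasta(handle):
    """Yield (name, score, sequence) tuples from a FASTA-like handle.

    Assumption: every header looks like '>seq1 53.4', i.e. the score is
    the second whitespace-separated field.
    """
    name, score, chunks = None, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, score, "".join(chunks)
            fields = line[1:].split()
            name, score, chunks = fields[0], float(fields[1]), []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, score, "".join(chunks)


example = StringIO(
    ">seq1 53.4\nAGATCATCATCTCAT\n"
    ">seq2 21.5\nGTCTAGCTACGTACT\n"
    ">seq3 11.8\nTGCATGCATGCATCC\n"
)
records = list(read_scored_fasta(example))
```

Each record keeps the score next to its sequence, so it can later be used as a weight.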

Any hints would be quite useful.

logo motif • 1.8k views

Maybe duplicate each sequence in proportion to its weight and use the expanded list as the input?
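A rough sketch of that idea in Python, using the three example sequences from the question. The helper names `duplicate_by_weight` and `to_fasta` and the `scale` parameter are hypothetical, and rounding each score to the nearest integer is just one possible choice.

```python
def duplicate_by_weight(records, scale=1.0):
    """Repeat each sequence round(score * scale) times (at least once).

    `records` is a list of (name, score, sequence) tuples. `scale` is an
    illustrative knob: raise it to keep more of the decimal precision,
    lower it to keep the output small.
    """
    expanded = []
    for name, score, seq in records:
        copies = max(1, round(score * scale))
        for i in range(copies):
            expanded.append((f"{name}_{i + 1}", seq))
    return expanded


def to_fasta(entries):
    """Format (name, sequence) pairs as plain FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in entries)


records = [
    ("seq1", 53.4, "AGATCATCATCTCAT"),
    ("seq2", 21.5, "GTCTAGCTACGTACT"),
    ("seq3", 11.8, "TGCATGCATGCATCC"),
]
expanded = duplicate_by_weight(records)
fasta_text = to_fasta(expanded)
```

The resulting text can be saved and passed to any tool that expects an ordinary FASTA file. Whether a given motif finder treats many identical sequences sensibly is something to check in its documentation.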


That could work! To duplicate sequences, though, I would have to round the decimal scores to integers, and keeping the precision (for example by multiplying the scores by 10 first) could produce a huge number of sequences. That may not be a problem here.
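For logos specifically, the rounding can be avoided by skipping duplication and building a weighted position frequency matrix directly, where each sequence contributes its score instead of a count of 1. The sketch below is one way to do that under stated assumptions: all sequences have the same length, contain only A, C, G and T, and the function names are invented for this example. No small-sample correction is applied to the information content.

```python
import math

ALPHABET = "ACGT"


def weighted_frequency_matrix(seqs, weights):
    """Return one {base: frequency} dict per position.

    Each sequence adds weight / sum(weights) to the base it carries at
    every position, so fractional scores need no rounding.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    total = sum(weights)
    matrix = [dict.fromkeys(ALPHABET, 0.0) for _ in range(length)]
    for seq, weight in zip(seqs, weights):
        for pos, base in enumerate(seq):
            matrix[pos][base] += weight / total
    return matrix


def information_content(column):
    """Information content in bits: log2(4) minus the Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    return math.log2(len(ALPHABET)) - entropy


seqs = ["AGATCATCATCTCAT", "GTCTAGCTACGTACT", "TGCATGCATGCATCC"]
weights = [53.4, 21.5, 11.8]
matrix = weighted_frequency_matrix(seqs, weights)
bits = [information_content(col) for col in matrix]
```

Multiplying each frequency in a column by that column's information content gives the letter heights of a conventional logo. Some logo libraries accept a matrix in place of raw sequences, so the result may be usable for plotting directly; this does not help with motif discovery, which needs a tool that models the weights itself.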

"tools that would take into account the sequence scores"

Neither of the linked tools does that, so I have moved your answer to a comment. It is appreciated that you aim to help, but simply linking content that matches the topic of the top-level question, rather than answering what the OP asked, does not help. Please stop doing that.


Thank you, I had a read through the docs. Even though you can input what they call seeds, I could not find a way to incorporate sequence scores into the motif discovery.
