Tools For ChIP-seq Scale De Novo Motif Finding On Unaligned Sequences?
3
2
13.5 years ago

Following up on this question: http://biostar.stackexchange.com/questions/598/tools-for-chipseq-scale-motif-finding

I have a large number of unaligned eukaryotic regulatory sequences and want to run de novo motif discovery on them. These sequences have already been filtered to exclude reads that have no mapping and reads that would not form a peak.

Most tools I have seen require aligned sequences and/or search only for a list of predefined motifs.

In its simplest form, what I am looking for is a program that reads file.fa, where file.fa contains ~1M regulatory sequences of 50-200 bp, and produces motif predictions without needing to align the sequences to a reference or scan for known motifs.

Does anybody know of a tool that can handle this number of unaligned FASTA sequences and do de novo motif discovery?

chip-seq motif denovo • 4.3k views
3

How large were your ChIP fragments, and how far did you sequence into them? Since ChIP-seq reads are sequenced from the ends of each fragment inwards, do you think the unaligned reads will even contain the potential regulatory motifs?

1

Is this prokaryotic or eukaryotic data?

0

These unaligned regulatory sequences have already been filtered to exclude reads that have no mapping and reads that would not form a peak, so most of the data with no potential is already gone.

0

It's eukaryotic data.

3
13.5 years ago
Amyemilie ▴ 30

Hi,

I'm using GimmeMotifs. It is a de novo motif prediction pipeline that is especially suited to ChIP-seq datasets.

It's free and easy to install and run. In my opinion it is also the most precise tool available online.

Good luck :).

http://www.ncmls.eu/bioinfo/gimmemotifs/

1

This looks interesting, thanks. How robust are its predictions?

1
13.5 years ago

I don't quite see why there would be an issue with unaligned reads, as most de novo motif finding algorithms accept FASTA input.

You could try CisFinder or ChIPMunk. GimmeMotifs, suggested in the other answer, looks good too.

0

In its simplest form, what I am looking for is a program that reads file.fa, where file.fa contains ~1M regulatory sequences of 50-200 bp, and produces motif predictions without needing to align the sequences to a reference or scan for known motifs. Would CisFinder or ChIPMunk work like that?

0

Yes - although 1 million is a lot. The most I have tried was about 100,000 sequences with CisFinder, which worked well.
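
If 1 million sequences turns out to be too many for a given tool, one option is to run it on a random subset first. The sketch below is only an illustration of that idea, not part of CisFinder, ChIPMunk, or GimmeMotifs: it draws a uniform random sample of records from a FASTA file with reservoir sampling, so the full file never has to be held in memory. The function names, the output file name, and the sample size of 100,000 are assumptions chosen for the example.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(path: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from a FASTA file, one record at a time."""
    header = None
    chunks: List[str] = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                # Sequence lines may be wrapped, so collect and join them.
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def reservoir_sample(items: Iterable, k: int, seed: int = 0) -> list:
    """Return up to k items chosen uniformly at random from a stream."""
    rng = random.Random(seed)
    sample: list = []
    for i, item in enumerate(items):
        if i < k:
            sample.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = item
    return sample


def subsample_fasta(in_path: str, out_path: str, k: int, seed: int = 0) -> int:
    """Write a random subset of k records to out_path; return how many were written."""
    sample = reservoir_sample(read_fasta(in_path), k, seed)
    with open(out_path, "w") as out:
        for header, seq in sample:
            out.write(f">{header}\n{seq}\n")
    return len(sample)
```

For example, `subsample_fasta("file.fa", "file.100k.fa", k=100_000)` would write a 100,000-sequence subset (the output name is hypothetical) that can then be passed to the motif finder. Whether motifs found on a subset are representative of the full set is something to check, for instance by repeating the run with a different `seed`.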

1
12.7 years ago
Dataminer ★ 2.8k

Try GimmeMotifs. It is one of the best in the business, and Emilie did an internship working on it. Good luck.
