Search For Mismatches In A Seq
13.7 years ago
Andreia ▴ 20

Dear all,

I want to search a set of short reads for a motif, allowing one mismatch. In another post I found a reference to Bio::Grep::Backend::Vmatch, but I do not understand how to use it. What file format does the generate_database method expect? The code from the package synopsis is below. Thanks for the help.

  # Generate a Vmatch suffix array. You only have to do this once.
  $sbe->generate_database({ 
    file          => 'ATH1.cdna', 
    description   => 'AGI Transcripts',
    datapath      => 'data',
    prefix_length => 3,
  });
bioperl perl • 3.3k views
13.7 years ago

EMBOSS has a few tools that can help with this. If you know which motif differences to expect and want to represent them as a regular expression for searching, use dreg:

http://emboss.sourceforge.net/apps/release/6.3/emboss/apps/dreg.html
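
For the regular-expression route, a one-mismatch motif can be written as an alternation in which each position in turn is replaced by a wildcard base. Below is a minimal Perl sketch of that idea. It is not taken from dreg itself: the helper name one_mismatch_pattern, the example motif and read, and the choice of [ACGTN] as the wildcard are all assumptions for illustration. It covers substitutions only, not insertions or deletions.

```perl
use strict;
use warnings;

# Hypothetical helper: build a regex source string that matches $motif with
# at most one substituted base. Each alternative replaces one position with
# a wildcard class; the exact motif is covered because the class contains
# the original base.
sub one_mismatch_pattern {
    my ($motif) = @_;
    my @alternatives;
    for my $i (0 .. length($motif) - 1) {
        push @alternatives,
            quotemeta(substr($motif, 0, $i))
          . '[ACGTN]'
          . quotemeta(substr($motif, $i + 1));
    }
    return join '|', @alternatives;
}

my $pattern = one_mismatch_pattern('ACGT');   # example motif (made up)
my $re      = qr/$pattern/;
my $read    = 'TTACGATTACGTAA';               # example read (made up)

# The lookahead lets overlapping hits be reported; $-[1] is the 0-based
# start offset of the captured match.
while ($read =~ /(?=($re))/g) {
    printf "%d\t%s\n", $-[1], $1;
}
```

The pattern grows linearly with the motif length, which is fine for short motifs. With more than one allowed mismatch the number of alternatives grows combinatorially, and a mismatch-aware tool is probably the better fit.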

If you want to specify a pattern plus allowed mismatches, use fuzznuc with the -pmismatch parameter:

http://emboss.sourceforge.net/apps/release/6.3/emboss/apps/fuzznuc.html

These can be installed locally and are also available as part of Galaxy:

http://usegalaxy.org

13.7 years ago
Farhat ★ 2.9k

I am not familiar with the Bio::Grep module, but have you looked into using agrep (approximate grep)? http://linux.die.net/man/1/agrep

13.7 years ago
hadasa ★ 1.0k

This sounds like a Hamming distance problem.

Here are Ruby snippets from a similar question on Stack Overflow: http://stackoverflow.com/questions/5322428/finding-a-substring-while-allowing-for-mismatches-with-ruby

Also see this code by Sammi Larbi, which implements the same thing using a suffix array approach: https://github.com/codeodor/miscellany/blob/master/ruby_suffix_array/suffix_array.rb
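
To make the Hamming distance idea concrete in Perl, the language of the question, here is a minimal sketch that slides a window the length of the motif along a read and reports every window within the allowed number of mismatches. The function names hamming_distance and find_with_mismatches and the example sequences are made up for illustration. It handles substitutions only and does a plain scan over every window, so it is meant for short reads rather than large references.

```perl
use strict;
use warnings;

# Number of positions at which two equal-length strings differ.
sub hamming_distance {
    my ($s, $t) = @_;
    die "strings must have equal length\n"
        unless length($s) == length($t);
    my $distance = 0;
    for my $i (0 .. length($s) - 1) {
        $distance++ if substr($s, $i, 1) ne substr($t, $i, 1);
    }
    return $distance;
}

# Return a list of [offset, window, mismatches] for every window of $seq
# whose Hamming distance to $motif is at most $max_mismatches.
# Offsets are 0-based.
sub find_with_mismatches {
    my ($motif, $seq, $max_mismatches) = @_;
    my $m = length $motif;
    my @hits;
    for my $pos (0 .. length($seq) - $m) {
        my $window = substr($seq, $pos, $m);
        my $d = hamming_distance($motif, $window);
        push @hits, [$pos, $window, $d] if $d <= $max_mismatches;
    }
    return @hits;
}

# Example: made-up motif and read, one mismatch allowed.
my @hits = find_with_mismatches('ACGT', 'TTACGATTACGTAA', 1);
printf "%d\t%s\t%d\n", @$_ for @hits;
```

For a set of reads, the same call can be made once per read. A read shorter than the motif simply yields no hits.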

13.6 years ago

The absolute fastest deterministic pattern-scanning tool that allows for a given number of mismatches, insertions, and deletions is scan_for_matches, which can be downloaded here.
