Question

MUMmer show-tiling is out of date?

0

Entering edit mode

9.8 years ago

marcelelaux ▴ 20

Hello guys

I am doing a recruitment analysis looking for metagenomic islands. The inputs are a fasta file of my target species, extracted from the original metagenomic reads file, and the complete reference genome of my target species.

I am using MUMmer Promer utility for the alignment and show-coords utility for data exploration. The only utility that give me the length of the gap between a contig and the next, respecting the synteny of the reference genome is show-tiling, and I have used it to look for gaps larger than 10000 bp. It seems to work weel, when exploring this areas in show-coords output, but I read in a post that show-tiling would be out of dated. This is too bad for me, because I dont know how to look for this low/no alignment regions in the show-coords output.

Can anyone tell me something about show-tiling and, if it really is out of date, how can I get this gap between recruited reads (reads that aligned about 98 identity, similarity and coverage)?

Here is a example of the show-tiling output:

[S1] [S2] [gap between this contig and the next] [length/contig] [align.cov.] [average identity] [orientation] [contig id]
2813632 2813778 15787   147     94.56   85.87   +       HD4RJIW01BV3T7
3618731 3618964 10484   234     96.15   90.67   -       HD4RJIW02GX7DC
3813833 3814001 12862   169     100.00  85.19   -       HD4RJIW02HFIBU

The third field is what I need, and what I am using.

Here is a example of the show-coords output:

[S1] [E1] [S2] [E2] [L1] [L2] [%id] [%sim] [%stp] [cov R] [cov Q] [FRM] [TAGS]

where S=subject, E=query, L=length, stp=stop codon, FRM=orientation/frame

775     903     7       135     129     129     93.02   95.35   1.16    0.00    94.16   1       1       gi|166362741|ref|NC_010296.1|   HD4RJIW01EHAFJ
811     963     169     17      153     153     96.08   100.00  0.00    0.00    90.00   1       -2      gi|166362741|ref|NC_010296.1|   HD4RJIW02IJPAZ
823     1008    200     15      186     186     96.77   100.00  1.61    0.00    92.08   1       -3      gi|166362741|ref|NC_010296.1|   HD4RJIW01DF547

I tried to write a perl script that read a line, split, read the orientation[FRM], then choose between S1 or S2 field, then jump to the next line, split again, read the orientation again, choose the right field and do the subtraction. If larger than 10000, send the two lines to a new file, and so on, but it doesn't worked at all, I couldn't jump to the next line.

If you like, I can show you my frustrated script.... But if is all ok with show-tiling, so I don't need to worry.

EDIT:

The input file:

[marlaux@n4-202-14 recruit_2015]$ cat -A RL5_promer.tiling | head
>gi|166362741|ref|NC_010296.1| 5842795 bases$
99^I306^I-49^I208^I97.60^I92.37^I+^IHD4RJIW01C9YF5$
258^I348^I303^I91^I85.71^I92.31^I+^IHD4RJIW01BYQ91$
652^I751^I17^I100^I88.00^I92.86^I+^IHD4RJIW01ATUR3$
769^I905^I-85^I137^I97.08^I89.77^I+^IHD4RJIW01EHAFJ$
821^I1022^I411^I202^I93.56^I88.89^I-^IHD4RJIW01DF547$
1434^I1611^I-106^I178^I94.38^I78.57^I-^IHD4RJIW02ITKT8$
1506^I1771^I16^I266^I98.50^I61.63^I+^IHD4RJIW02IB4P3$
1788^I2083^I-150^I296^I96.28^I86.32^I+^IHD4RJIW02JGD11$
1934^I2267^I20^I334^I96.41^I88.21^I-^IHD4RJIW01B47F5$

This script I tried to wrote to use the show-coords, instead of show-tiling, because I am not sure if show-tiling is the right utility for this. I hope it is. The script is:

#!/usr/bin/perl
use diagnostics;
use warnings;

print "arquivo 1:\t";
$arq1 = <STDIN>;
open (MYFILE, $arq1);
open (NEW_FILE, '>>coords_teste.tab');
@file = <MYFILE>;
close (MYFILE);
@new_file=();
    foreach $line (@file) {
        @aligns = split (/\t/, $line);    
        $p1=$aligns[0];
        $p2=$aligns[1];
        #print "@aligns $p1 $p2\n";
        if ($aligns[11] > 0)    {
            $a=$p2;
            print "$aligns[11] $a\n";
            next;
            @aligns2 = split (/\t/, $line);
                if ($aligns2[11]>0)    {
                $b=$aligns2[0];
                }    
                else    {
                $b=$aligns2[1];
                }
        }
        else    {
            $a=$p1;
            next;
            @aligns2 = split (/\t/, $line);
                if ($aligns2[11]<0)     {
                            $b=$aligns2[1];
                            }
                else    {
                            $b=$aligns2[0];
                            }
                        }
        $gap=$b-$a;
        print "$a $b $gap\n";
        if ($gap>9000)    {
        push (@new_file, ("@aligns\n@aligns2\n"));
        print NEW_FILE @new_file;
        }
        @aligns=();
        @aligns2=();
        $p1=();
        $p2=();
        $a=();
        $b=();
        $gap=();
    }

Thank you

recruitment MUMmer Show-coords • 3.2k views

ADD COMMENT • link updated 2.7 years ago by Ram 44k • written 9.8 years ago by marcelelaux ▴ 20

0

Entering edit mode

Please paste your script here. It looks like the problem is either with the delimiter you're using or the EOL character, so could you also give us the output of:

cat -A show_tiling_file | head

where show_tiling_file is the file your perl script is trying to parse please?

EDIT: Thanks for adding information, I've moved all the pertinent info to the question.

ADD REPLY • link 9.8 years ago by Ram 44k