How to get the contig details from the Velvet assembly? I have the AMOS output from the velvet i.e. velvet_asm.afg file.
Thanks -Abhi
How to get the contig details from the Velvet assembly? I have the AMOS output from the velvet i.e. velvet_asm.afg file.
Thanks -Abhi
It's basically these 4 steps:
You can find the details on this blog post if you want to get it done quickly but AMOS also has a nice wiki.
Alternatively, I think that hawkeye does tell you the number of reads in a contig as well.
from http://www.cbcb.umd.edu/research/contig_representation.shtml
{CTG
iid:1
eid:1
seq:
CCTCTCCTGTAGAGTTCAACCGA-GCCGGTAGAGTTTTATCA
.
qlt:
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
.
{TLE
src:1027
off:0
clr:618,0
gap:
250 612
.
}
}
we can parse this in perl easily enough without getting into all that bank stuff
open( ASM, "<velvet_asm.afg" );
while (<ASM>) {
if (/\{RED/) { $reads++ }
elsif (/{SCF/) {
#i need to ignore TLEs in scaffolds b/c they are contigs
$inScf=1;
}
elsif (/\{CTG/) { $inScf=0;$getCtgID = 1; }
elsif (/iid:(\d+)/){if($getCtgID){$ctgID=$1;$getCtgID=0; }}
elsif (/\{TLE/) { if(!$inScf){$inTile = 1; $tiles++;}}
elsif (/src/) {
if ($inTile) {
$src++;
if($readSrc{$_}){
warn "seen $_ before"
}
$readSrc{$_} = $ctgID;
$inTile = 0;
}
}
}
foreach my $read(keys %readSrc){
print "read $read is in contig ".$readSrc{$read}."\n";
}
print "looks like I used ".scalar(keys %readSrc)." out of $reads total reads\n";
it should be pretty obvious how you can flip that hash to something more contig-centric
Used PERL AmosLib API to parse velvet_asm.afg file to get the count of reads in each contig.
use AMOS::AmosLib
my $fileName = "velvet_asm.afg";
open(fileHandle, $fileName);
my $totalReads = 0;
my $totalContigs = 0;
while (my $record = getRecord(\*fileHandle)){
my ($rec, $fields, $recs) = parseRecord($record);
if($rec eq "CTG"){
$totalContigs++;
my $nReads = 0;
for(my $r = 0; $r <= $#$recs; $r++){
my ($srec, $sfields, $srecs) = parseRecord($$recs[$r]);
if($srec eq "TLE"){
$nReads++;
$totalReads++;
}
}
print "Number of Reads in contig#", $$fields{iid}, " are ", $nReads, "\n";
}
}
print "Total # of Contigs :", $totalContigs, "\n";
print "Total # of reads in contigs:", $totalReads, "\n";
close(fileHandle);
Hi,
I am trying to figure out how many reads made it into each contig after I run velvet. I used the flag -read_trkg yes and -amos_file yes. I found some advise online on how to process them but I am a bit stuck. So from the amos file I have tried this path:
$ /home/admin/amos-3.1.0/src/Bank/bank-transact -m velvet_asm.afg -b velvet_asm.bnk -c
and then $/home/admin/amos-3.1.0/src/Validation/analyze-read-depth velvet_asm.bnk -i -d -r > output
The problem is that the file output is something like this : 1 1646 1340 1340 108.275 108.275 2 1 97 97 1.03093 1.03093 3 2085 1589 1589 113.904 113.904 4 175 75 75 219.547 219.547 5 187 162 162 105.944 105.944 6 124 76 76 157.171 157.171 7 1023 768 768 112.914 112.914 8 9 99 99 9.09091 9.09091 9 151 168 168 72.8214 72.8214 10 2224 1602 1602 123.306 123.306 11 6582 5312 5312 107.116 107.116 12 2850 2487 2487 98.653 98.653 13 3 105 105 2.62857 2.62857 14 3 103 103 2.91262 2.91262 15 4350 3350 3350 115.28 115.28 16 1788 1268 1268 128.341 128.341 17 1746 1325 1325 116.024 116.024
could anyone tell me what the filed are or if there is a different way to get the information I need??
Thank you,
F.
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Thanks for your quick response. I do have AMOS installed and I can see the bnk in hawkeye and see the number of reads in each contig. But wanted to get that information programmatically. I followed the steps you had mentioned in your blog, but hit the same problem i.e. no {FRG in my .afg file.
I am trying to convert to ACE format and get the details, but 'amos2ace' takes too long. Wondering if there are other quick options.
Can this be done with ABySS as well?
As long as ABySS is able to create a mapping file that AMOS can read (afg, ACE, etc.) it should.