5.1 years ago
Njagi Mwaniki
I have long-read data in a FASTA file (converted from BAM) that I'm trying to build a graph from with vg msga, following the instructions here: https://github.com/vgteam/vg/wiki/Long-read-assemblies-using-vg-msga#long-read-assembly.
One thing that could be the issue is that my reads have >UUIDs as their names rather than >ref, so I left the -b flag out entirely.
What I find even stranger is that it fails silently, even when I add -D.
Could I be going about it all wrong?
EDIT:
The computer I'm using has 128G of RAM.
EDIT:
Works on a different dataset of unaligned reads.
That's interesting that you don't even get the "preparing initial graph" message. Leaving out -b should be OK; it will start with the longest sequence.
The very first thing msga does looks to be loading all the sequences into memory. Is your FASTA file anywhere near 128G?
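To compare the input against available RAM, it can help to count the records and bases in the FASTA and to see which record is longest, since that is the sequence msga would start from when -b is omitted. The following is a minimal sketch in plain Python, not part of vg; the function name fasta_summary is made up for illustration, and the real memory footprint will be larger than the raw base count.

```python
import os
import tempfile


def fasta_summary(path):
    """Return (n_records, total_bases, longest_name, longest_len) for a FASTA file."""
    n_records = 0
    total_bases = 0
    longest_name, longest_len = None, 0
    name, length = None, 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Close the previous record before starting a new one.
                if name is not None and length > longest_len:
                    longest_name, longest_len = name, length
                fields = line[1:].split()
                name, length = (fields[0] if fields else ""), 0
                n_records += 1
            elif name is not None:
                length += len(line)
                total_bases += len(line)
    # Close the final record.
    if name is not None and length > longest_len:
        longest_name, longest_len = name, length
    return n_records, total_bases, longest_name, longest_len


if __name__ == "__main__":
    # Tiny self-contained demo on a temporary file.
    with tempfile.NamedTemporaryFile("w", suffix=".fa", delete=False) as fh:
        fh.write(">read1\nACGTAC\n>read2\nACGTACGTAC\n")
    print(fasta_summary(fh.name))
    os.remove(fh.name)
```

Streaming line by line keeps the check itself cheap even on a FASTA that is many gigabytes in size.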
You should be able to run your problem vg command and then immediately run
echo $?
to get the exit code; a 137 there (128 + 9, meaning the process was killed with SIGKILL) might suggest that it is being killed for running out of memory.
I gave up on that dataset and tried it on the one I thought would work (in edit 2), which is 14G of reads. That failed with "error:[gssw] Could not allocate memory required for alignment traceback matrixes." I'm now going to try it on reads about 7MB in size and get back to you.
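For the exit-code check suggested earlier in the thread, note that a process killed by signal 9 (SIGKILL) shows up in a POSIX shell's $? as 137 (128 + 9), while Python's subprocess module reports it as -9. The sketch below decodes both conventions; describe_exit is a hypothetical helper name, and statuses above 128 are treated as signals by convention only, since a program can also exit with such a value on its own.

```python
import signal
import subprocess
import sys


def describe_exit(code):
    """Describe an exit status from Python's subprocess or from a shell's $?."""
    if code == 0:
        return "success"
    if code < 0:
        sig = -code          # subprocess reports death by signal N as -N
    elif code > 128:
        sig = code - 128     # POSIX shells report death by signal N as 128 + N
    else:
        return f"exited with status {code}"
    try:
        name = signal.Signals(sig).name
    except ValueError:
        name = f"signal {sig}"
    return f"killed by {name}"


if __name__ == "__main__":
    # Usage (assumed): python check_exit.py <failing command and its arguments>
    if len(sys.argv) > 1:
        result = subprocess.run(sys.argv[1:])
        print(describe_exit(result.returncode), file=sys.stderr)
```

On Linux, a SIGKILL with no error message is consistent with the kernel's out-of-memory killer, though other causes (a job scheduler's limits, a manual kill) look the same from the exit status alone.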
Also, what if I wanted to combine reads from different FASTA files into the same graph? Is there a way? For example, closely related reads from a quasispecies being merged into the same graph. I'm guessing --graph will work.
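One tool-agnostic way to combine reads from several files is to merge them into a single FASTA before building the graph, making sure the record names stay unique. The sketch below is plain Python and assumes nothing about vg itself; merge_fastas and the choice to prefix each name with its source file's stem are illustrative assumptions, and whether this or --graph is the better route for msga is not established in this thread.

```python
import os
import tempfile


def merge_fastas(in_paths, out_path):
    """Concatenate FASTA files, prefixing each record name with its file's stem.

    Returns the number of records written. Raises ValueError on a name clash.
    """
    seen = set()
    n_written = 0
    with open(out_path, "w") as out:
        for path in in_paths:
            stem = os.path.splitext(os.path.basename(path))[0]
            with open(path) as handle:
                for line in handle:
                    line = line.strip()
                    if not line:
                        continue
                    if line.startswith(">"):
                        fields = line[1:].split()
                        name = f"{stem}_{fields[0] if fields else 'unnamed'}"
                        if name in seen:
                            raise ValueError(f"duplicate record name: {name}")
                        seen.add(name)
                        out.write(f">{name}\n")
                        n_written += 1
                    else:
                        out.write(line + "\n")
    return n_written


if __name__ == "__main__":
    # Tiny demo with two temporary input files.
    workdir = tempfile.mkdtemp()
    paths = []
    for stem, seq in (("sampleA", "ACGT"), ("sampleB", "ACGA")):
        p = os.path.join(workdir, stem + ".fa")
        with open(p, "w") as fh:
            fh.write(f">read1\n{seq}\n")
        paths.append(p)
    print(merge_fastas(paths, os.path.join(workdir, "merged.fa")))
```

Prefixing with the file stem keeps track of which sample each read came from, which matters when the reads (as here) carry UUID names that say nothing about their origin.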