Hello all,
I cannot find the abyss wiki anymore (it is gone), and the GitHub repository says that abyss does have the "se" option for single-end reads. However, the only example I can find that uses "se" does so in conjunction with multiple libraries (https://github.com/bcgsc/abyss).
I want to know if I can use abyss just to assemble my single-end reads (100bp) only. If I have cleaned/trimmed fastq files for different species, can I just use:
abyss-pe k=50 se="d_r1.fastq"
Would I then run this command for each of my species, one by one?
I honestly cannot find a clear example of abyss commands for a single set of fastq files, so I am asking the community!
The ABySS manual can be found at https://github.com/bcgsc/abyss#readme
If you're trying to assemble a microbial genome, then also try SPAdes, as it can assemble with multiple k-mer sizes in a single run and may provide better assemblies than ABySS.
I am not working on microbial genomes; vertebrate exome data is what I have. I've asked a lot of questions, and it seems that assembling exome data is difficult because of the missing intergenic regions. Unfortunately, a lot of the papers I read in my field have used WGS, but the data I HAVE to work with was produced by another lab that did WES. I can work with it easily to obtain reads for single gene sequences, but I really wanted to figure out a way to assemble everything so I can extract all coding regions. I did not expect it to be so tricky. I have tried de novo assembly and have only obtained N50 values of about 300 (the maximum contig length is about 4000, which I guess makes sense for exon lengths); however, I am not getting nearly enough contigs. When I BLAST my contigs, I get only 300 hits, which is ridiculously low. Unless I ran BLAST wrong somehow... but I don't think so, because I see individual genes/exons come up. But only 300? That's a joke!
DNAngel: While the manual refers to using orphaned mates as single-end input with the se option, I don't think abyss is intended to be an assembler for single-end data alone. You could try running it as you note above, but the results may not be optimal even if the program runs.
Thank you for this response - I apologize that I misunderstood the question.
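For anyone who wants to try the one-species-at-a-time run asked about above, here is a minimal dry-run sketch. It only prints the abyss-pe commands and does not run them. The `<species>_r1.fastq` file naming and the helper name `build_abyss_cmd` are assumptions for illustration, and `name=` is added because the README examples set an output prefix for each assembly. Whether `se=` on its own gives a usable assembly is not confirmed anywhere in this thread, so treat the output as something to test, not a recommendation.

```shell
#!/bin/sh
# Dry-run sketch: print one abyss-pe command per single-end FASTQ file.
# Assumptions (not from the thread): files are named <species>_r1.fastq,
# and name= is set to the species prefix taken from the file name.

# build_abyss_cmd is a hypothetical helper; $1 = FASTQ path, $2 = k (default 50).
build_abyss_cmd() {
    fastq=$1
    k=${2:-50}
    # Strip the directory and the _r1.fastq suffix to get the species prefix.
    species=$(basename "$fastq" _r1.fastq)
    printf 'abyss-pe name=%s k=%s se="%s"\n' "$species" "$k" "$fastq"
}

# Print a command for every matching file in the current directory.
for fq in ./*_r1.fastq; do
    [ -e "$fq" ] || continue   # skip if the glob matched nothing
    build_abyss_cmd "$fq" 50
done
```

Once the printed commands look right, they can be pasted into a terminal or piped to `sh`, ideally with each species in its own working directory so the output files do not collide.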