Assembly Illumina Paired End Reads
12.1 years ago
vijay ★ 1.6k

Which would be the best tool to assemble paired end reads generated by Illumina?

-Vj

next-gen • 9.0k views

People can help you better if you give more information. DNA or RNA? Which species? How much RAM do you have? Which matters more, contig accuracy or contiguity? What are the read length, insert size, and total number of reads? Do you suspect DNA contamination from other species?


This is a metagenome sample, so I can't be sure of the number of species, and I have yet to receive my sequence data. I am only asking in preparation. I would need contiguity since this is a metagenome. Read length would be approx. 150 bp.

12.1 years ago
Josh Herr 5.8k

This is for metagenomics, and we're not assembling the reads yet, correct? Is it amplicon data? You didn't tell us. If you are just looking to mate the paired-end sequences from your library, there's no need to assemble the reads yet. I find the terminology confusing: I prefer "mate" for combining paired-end data, and I reserve "assembly" for the later step, once the paired-end data are matched up and you want to build contig sequences. If you have amplicon data (16S, 18S, ITS, etc.), you can make consensus sequences, but that is not assembly in my opinion.

You didn't give us any information on the technology, but I am assuming from the 150 bp size that this is Illumina data and in FASTQ format?

Here's a previous SEQanswers thread, and the thread "Best Way To Preprocess Barcoded Illumina Paired-End Data", on this topic. There are a couple of options for mating Illumina paired-end data: I have used FastqJoin, PANDAseq, and CLC bio, but I am sure there are many other options out there.

12.1 years ago

There are some papers comparing different assemblers. I'd look at their result tables and choose what fits your data best (hard to tell from here):

Assemblathon

GAGE

and for fun, here's another review: Assembly of large genomes using second-generation sequencing

8.8 years ago
jigarnt ▴ 30

I am getting an error when I run this command:

/Users/lindakohn/Desktop/tools/SPAdes-3.7.1-Darwin/bin/spades.py -k 21,33,55,77 --careful --only-assembler --pe<#>-12 <euro_plasmid_r1_paired.fastq euro_plasmid_r2_paired.fastq> --pe<#>-s1 <euro_plasmid_r1_unpaired.fastq> --pe<#>-s2 <euro_plasmid_r2_unpaired.fastq> -o Euro_plasmid_spades_output

-bash: syntax error near unexpected token `newline'

What is wrong with the command?
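
A likely cause, offered as a sketch rather than a verified fix: in the SPAdes help text the angle brackets only mark placeholders, but bash reads `<` and `>` as input/output redirection, which would explain the syntax error. The `<#>` stands for a library number (1 is assumed here), and `--pe<#>-12` expects a single interlaced file, so two separate paired files would go to `--pe1-1` and `--pe1-2` instead. As far as I know SPAdes has `--pe<#>-s` for unpaired reads and no `-s1`/`-s2` variants; repeating `--pe1-s` once per file is an assumption here, so check `spades.py --help` for your version. The snippet below only builds and prints the command, and it assumes `spades.py` is on your PATH (otherwise substitute the full path from the post).

```shell
# File names are taken from the command in the post above.
r1=euro_plasmid_r1_paired.fastq
r2=euro_plasmid_r2_paired.fastq
u1=euro_plasmid_r1_unpaired.fastq
u2=euro_plasmid_r2_unpaired.fastq
out=Euro_plasmid_spades_output

# Assumption: this is the first (and only) paired-end library, so <#> becomes 1.
lib=1

# No angle brackets anywhere: each flag is followed directly by its file name.
# Assumption: --pe1-s may be given once per unpaired file.
cmd="spades.py -k 21,33,55,77 --careful --only-assembler --pe${lib}-1 $r1 --pe${lib}-2 $r2 --pe${lib}-s $u1 --pe${lib}-s $u2 -o $out"

# Dry run: print the command so it can be inspected before it is executed.
echo "$cmd"
```

If the printed line looks right and the flags match your SPAdes version's help output, paste it into the terminal from the directory that holds the FASTQ files.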
