Using Paired End And Orphaned Singles For De Novo Assembly
11.1 years ago

I have been using FastX to process reads prior to de novo assembly and mapping. What I have discovered, and few have pointed out, is that FastX deletes low-quality reads without removing their mates, which leaves reads unpaired and changes the order of the two paired FASTQ files. While it is difficult to know whether this affects assembly with Trinity, it is definitely a problem for assembly with Velvet/Oases and for mapping with Bowtie or BWA: once low-quality reads have been deleted, the two files are no longer in the same order and the reads will not map as pairs.
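To make the symptom concrete, here is a minimal Python sketch that reports where two mate files first fall out of step. It is not part of FastX or any assembler; the function names `read_ids` and `first_mismatch` are made up for illustration, and it assumes well-formed 4-line FASTQ records whose headers share an ID apart from an optional `/1` or `/2` suffix or a space-separated comment field.

```python
import itertools

def read_ids(lines):
    """Yield the read ID of each 4-line FASTQ record, without the mate suffix."""
    for i, line in enumerate(lines):
        if i % 4 == 0:                      # header line of each record
            name = line.split()[0]          # drop any space-separated comment field
            if name.endswith(("/1", "/2")):
                name = name[:-2]            # drop the old-style mate suffix
            yield name[1:]                  # drop the leading '@'

def first_mismatch(left_lines, right_lines):
    """Return the 0-based record index where the IDs first differ, or None if in sync."""
    pairs = itertools.zip_longest(read_ids(left_lines), read_ids(right_lines))
    for index, (left, right) in enumerate(pairs):
        if left != right:                   # also catches one file ending early
            return index
    return None
```

Both arguments can be open file handles or lists of lines. A result of `None` suggests the files are still paired record by record; any integer points at the first record where a mate appears to be missing.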

There are some workarounds provided by sfg.stanford.edu and others that separate the reads that are still paired and place the orphaned reads in a separate file. But here is the problem: I would like to use paired reads in combination with single reads for de novo assembly. In Trinity, one designates reads as either --left and --right or --single, but you cannot do both.

Question: Can any assembler use both paired and single reads at the same time for de novo assembly?

Q2: Has anyone else run into this problem? Here is a related post: http://seqanswers.com/forums/showthread.php?t=24076

Q3: This issue is going to eliminate FastX from my pipeline of assembly and mapping. It seems like this should be a bigger issue but there is fairly little out there about this. Am I doing something wrong with FastX that is causing this problem?

Thanks.

fastx paired-end • 7.0k views

Hi, have you sorted that out? I'm using Trinity and I'm struggling with the same problem: I want to use my single data (merged paired-end reads) and left/right data (unpaired reads) together for a big assembly that I want to use as a reference; otherwise I lose a lot of data.

11.1 years ago
Chris Fields ★ 2.2k

According to the docs and the Trinity mailing list (via Brian Haas), you can mix single and paired-end data for Trinity:

http://sourceforge.net/mailarchive/message.php?msg_id=31248814

I personally tend to leave the single reads out if they do not make up a significant portion of the data (or if you have tons of reads, >100M), as I have found they make little difference to the actual assembly.

11.1 years ago
Vivek ★ 2.7k

SOAPdenovo allows you to use paired-end and single-end reads at the same time for de novo assembly, but you have to specify them in separate files when creating the config file.

http://soap.genomics.org.cn/soapdenovo.html

I'm not totally sure whether the order of reads within the files is important, since I haven't worked on de novo assembly for a while, but I'd think any error-correction tool that discards low-quality reads should output the resulting singletons into a different file and keep the high-quality pairs in the same order.
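As a rough illustration of that behaviour, the Python sketch below takes two filtered mate files, keeps the surviving pairs in matching order, and collects the orphaned reads separately. It is only a sketch under stated assumptions, not the method used by FastX or the Stanford scripts: the names `parse_fastq` and `split_pairs` are hypothetical, records are assumed to be well-formed 4-line FASTQ, mates are assumed to share an ID apart from an optional `/1` or `/2` suffix, and one whole file is held in memory.

```python
def parse_fastq(lines):
    """Yield (read_id, record_text) for each 4-line FASTQ record."""
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            name = record[0].split()[0][1:]     # drop '@' and any comment field
            if name.endswith(("/1", "/2")):
                name = name[:-2]                # drop the old-style mate suffix
            yield name, "\n".join(record) + "\n"
            record = []

def split_pairs(left_lines, right_lines):
    """Return (paired_left, paired_right, orphans) as lists of record strings."""
    right = dict(parse_fastq(right_lines))      # assumption: this file fits in memory
    paired_left, paired_right, orphans = [], [], []
    for name, record in parse_fastq(left_lines):
        mate = right.pop(name, None)
        if mate is None:
            orphans.append(record)              # left read whose mate was discarded
        else:
            paired_left.append(record)          # both lists stay in left-file order
            paired_right.append(mate)
    orphans.extend(right.values())              # right reads whose mate was discarded
    return paired_left, paired_right, orphans
```

Writing the three lists out with `"".join(...)` gives two synchronized paired files plus one singleton file, which can then be supplied separately to an assembler that accepts both kinds of input.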


Note that the user is running a transcriptome assembly (Trinity), not a genome assembly. There is a SOAPdenovo transcriptome assembler for this purpose, though:

http://soap.genomics.org.cn/SOAPdenovo-Trans.html
