I have a paired-end dataset. After filtering the data with Trimmomatic, I am left with singletons: reads where one mate of a pair survived filtering but the other did not. I have singletons from both the left and the right reads, and I am wondering how to include them in a Trinity assembly.
Thankfully, there are a few places online where I can find suggestions, but they recommend different things and I would like to understand this in more detail.
The Trinity FAQ says:
If you have additional singletons, add them to the .fq file that they correspond to based on the sequencing method used (if they're equivalent to the left.fq entries, add them there, etc).
This seems to suggest that the left singletons go into the left file and the right singletons go into the right file. Does Trinity handle that, so that it will not treat two singletons that have nothing to do with each other as a pair?
The Trinity mailing list says:
For running Trinity, you don't need to separate any unpaired reads from the paired reads. If you want to, you can just cat them all together into a single file, and run trinity as:
Trinity.pl --single all_reads.fastq --run_as_paired <other_opts>
One SEQanswers post from 2012 suggests renaming all right singletons to /1 and adding all singletons to left:
If you have both paired and unpaired data, and the data are NOT strand-specific, you can combine the unpaired data with the left reads of the paired fragments. Be sure that the unpaired reads have a /1 as a suffix to the accession value similarly to the left fragment reads. The right fragment reads should all have /2 as the accession suffix. Then, run Trinity using the --left and --right parameters as if all the data were paired.
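To make that last suggestion concrete, here is a minimal Python sketch of how I understand the renaming step. It is my own illustration, not something taken from Trinity or from the post: the function name `rename_to_left` is made up, and it assumes plain four-line FASTQ records. It also assumes that anything after the first space in the header (such as the newer Illumina `2:N:0:...` comment) can be dropped.

```python
def rename_to_left(fastq_lines):
    """Yield FASTQ lines with every read name forced to end in /1.

    Assumes strict 4-line records: header, sequence, '+', quality.
    """
    for i, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        if i % 4 == 0:  # header line of each record
            if not line.startswith("@"):
                raise ValueError(f"expected a FASTQ header, got: {line!r}")
            # Keep only the read name; drop any comment after the first space
            # (assumption: the comment is not needed downstream).
            name = line.split(" ", 1)[0]
            # Remove an existing /1 or /2 suffix before adding /1.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            line = name + "/1"
        yield line + "\n"
```

The renamed singletons would then be appended to the left file, for example by opening it in append mode and calling `writelines(rename_to_left(handle))` for each singleton file.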
My main question is this. Should I:
- add all of my left singletons to the left file and all of my right singletons to the right file?
- add all of my singletons, regardless of their /1 or /2 suffix, to the left file?
- add all of my singletons, renamed to have a /1 suffix, to the left file?
Does it matter? Or is this one of those questions where I need to try all of the options and compare the results? Or should I just ignore it and run Trinity on a single file containing all reads with --run_as_paired, to make it less of a hassle?