3.3 years ago
arshad1292
I performed the following steps with my data.
- Performed digital normalization on each of the 205 individual samples --- success
- Removed the host (e.g. human) and adapter sequences from the individual samples --- success
- Concatenated the samples into groups --- success
- Performed the assembly with metaSPAdes and MEGAHIT --- success
I then tried to run the binning step, but it fails with the following error:
ERROR [mem_sam_pe] paired reads have different names: "A00534:36:HCHN3DRXX:2:1101:7437:1031", "A00534:36:HCHN3DRXX:2:1101:13693:1031"
Any idea how to fix this issue? Could it be digital normalization messing up the read names?
Would really appreciate your help!
Do you understand what the problem is? Have you figured out which step introduced it?
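One way to find the step that introduced the problem is to check, after each step, whether the two read files still list the same read names in the same order. Below is a minimal sketch of such a check, not a replacement for any tool in the pipeline. It assumes unwrapped 4-line FASTQ records and that a read name is the first whitespace-separated token of the header, minus any trailing /1 or /2. The function names and file paths are made up for illustration.

```python
import gzip
from itertools import zip_longest


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def read_names(path):
    """Yield the read name of every record, assuming 4 lines per record."""
    with open_fastq(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 != 0:
                continue  # only header lines carry the read name
            fields = line[1:].split()  # drop '@' and any comment field
            if not fields:
                continue  # tolerate a blank line at the end of the file
            name = fields[0]
            if name.endswith(("/1", "/2")):
                name = name[:-2]  # old-style mate suffix
            yield name


def first_mismatch(r1_path, r2_path):
    """Return (record_index, name1, name2) for the first out-of-sync pair, or None."""
    pairs = zip_longest(read_names(r1_path), read_names(r2_path))
    for idx, (n1, n2) in enumerate(pairs):
        if n1 != n2:  # also true when one file runs out of reads early
            return idx, n1, n2
    return None


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        print(first_mismatch(sys.argv[1], sys.argv[2]))
```

Running it on the read1/read2 files produced by each step (original, normalized, host-removed) should show the first step at which it stops returning None.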
It could be the digital normalization step, because the reads work just fine without digital normalization. For digital normalization, I needed to interleave read1 and read2 into one file. I did that with bbmap
After performing the digital normalization steps, I needed to convert the interleaved reads back into read1 and read2. Again, I did that with bbmap as below:
However, when I run the QC step (for adapter and host sequence removal), it gives the following error:
Maybe bbmap is not splitting read1 and read2 properly and giving them proper names? Not sure. Again, this error doesn't appear if I perform this step without digital normalization.
But are there any other tools that I can use to interleave and then later split the reads?
I would really appreciate the help
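For reference, interleaving and splitting are simple enough to sketch in plain Python. The version below is only an illustration and is not what bbmap or khmer do internally. It assumes uncompressed, unwrapped 4-line FASTQ records, and all function names are hypothetical. Unlike a naive split, it refuses to continue when mates do not match, which is the situation that arises if a step between interleaving and splitting drops one read of a pair.

```python
from itertools import zip_longest


def fastq_records(handle):
    """Yield FASTQ records as lists of 4 lines (assumes unwrapped records)."""
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[0].strip():
            return  # end of file (or trailing blank line)
        if not record[3]:
            raise ValueError("truncated FASTQ record")
        yield [line.rstrip("\r\n") + "\n" for line in record]


def base_name(header):
    """Read name without '@', comment field, or /1 and /2 suffix."""
    name = header[1:].split()[0]
    return name[:-2] if name.endswith(("/1", "/2")) else name


def interleave(r1_path, r2_path, out_path):
    """Write R1 and R2 records alternately; fail if the inputs are out of sync."""
    n = 0
    with open(r1_path) as f1, open(r2_path) as f2, open(out_path, "w") as out:
        for rec1, rec2 in zip_longest(fastq_records(f1), fastq_records(f2)):
            if rec1 is None or rec2 is None:
                raise ValueError("input files contain different numbers of reads")
            if base_name(rec1[0]) != base_name(rec2[0]):
                raise ValueError(f"pair {n} is out of sync")
            out.writelines(rec1 + rec2)
            n += 1
    return n  # number of pairs written


def deinterleave(in_path, r1_path, r2_path):
    """Split an interleaved file; fail if a record has no matching mate."""
    n = 0
    with open(in_path) as src, open(r1_path, "w") as o1, open(r2_path, "w") as o2:
        records = fastq_records(src)
        for rec1 in records:
            rec2 = next(records, None)  # the mate must follow immediately
            if rec2 is None or base_name(rec1[0]) != base_name(rec2[0]):
                raise ValueError(f"pair {n}: mate missing, file is not cleanly interleaved")
            o1.writelines(rec1)
            o2.writelines(rec2)
            n += 1
    return n  # number of pairs written
```

If deinterleave raises on the normalized file, the file contains orphaned reads, and splitting it by simply alternating records would shift every later pair by one.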
Maybe a couple of solutions can be found here: link
OK, I have fixed the problem. I should not have used bbmap; I should have used the script provided with khmer.
This was done incorrectly, unless you left out a part of the command.
This command should have been
Then use the interleaved.fastq file for whatever you needed to do.
The only way this could have gone wrong is if your original files were out of sync, which I think is the case based on the error you posted above. Perhaps you trimmed the read files independently, which should not be done.
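To make "out of sync" concrete, here is a small sketch of what re-pairing two read files involves: keep only reads whose mate exists in the other file, write them in matching order, and set the leftover singletons aside. It is an illustration under stated assumptions and not the implementation of any particular tool. It assumes uncompressed, unwrapped 4-line FASTQ records, holds both files in memory (so it only suits small test files), and uses hypothetical function names.

```python
def load_records(path):
    """Return {read_name: 4-line record} in file order (whole file in memory)."""
    records = {}
    with open(path) as handle:
        while True:
            rec = [handle.readline() for _ in range(4)]
            if not rec[0].strip():
                break  # end of file (or trailing blank line)
            name = rec[0][1:].split()[0]  # drop '@' and any comment field
            if name.endswith(("/1", "/2")):
                name = name[:-2]  # old-style mate suffix
            records[name] = [line.rstrip("\r\n") + "\n" for line in rec]
    return records


def resync_pairs(r1_in, r2_in, r1_out, r2_out, singles_out):
    """Write matched pairs in R1 order and unmatched reads to a singletons file."""
    r1 = load_records(r1_in)
    r2 = load_records(r2_in)
    paired = singles = 0
    with open(r1_out, "w") as o1, open(r2_out, "w") as o2, \
            open(singles_out, "w") as single:
        for name, rec in r1.items():
            if name in r2:
                o1.writelines(rec)
                o2.writelines(r2[name])
                paired += 1
            else:
                single.writelines(rec)  # R1 read whose mate was removed
                singles += 1
        for name, rec in r2.items():
            if name not in r1:
                single.writelines(rec)  # R2 read whose mate was removed
                singles += 1
    return paired, singles
```

A non-zero singleton count on files that are supposed to be paired suggests that an earlier step (for example, trimming each file on its own) removed reads from one file but not the other.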
Using repair.sh from BBMap would have brought your original files back in sync, removing singletons, before using reformat.sh to interleave the reads.
Hi, I have the same issue. Which script from khmer did you use? Thanks a lot!