9 months ago
Sony
Hello everyone,
I have paired-end whole-genome sequencing reads from several Brassica varieties, plus a reference genome. My objective is to focus on the unmapped reads. Here is my workflow:
- Checked the quality of the raw paired-end reads with FastQC. Removed adapter sequences and low-quality bases with Trimmomatic.
- Mapped the trimmed paired-end reads to the reference genome using BWA-MEM.
- Converted SAM to BAM, sorted BAM file.
- Extracted all unmapped reads and converted them to a FASTQ file:
    samtools view -b -f 4 SRR4289357_mapped.sorted.bam > SRR4289357_unmapped.bam
    samtools sort SRR4289357_unmapped.bam > SRR4289357_unmapped.sorted.bam
    samtools bam2fq SRR4289357_unmapped.sorted.bam > SRR4289357_unmapped.sorted.fastq
- Calculated the average insert size and its standard deviation (with BBMap), and JF_SIZE, for the MaSuRCA configuration file.
- Assembled with MaSuRCA. Here are the stats of the assembled sequence:
- Summarized the assembly statistics using QUAST:
- Validated the assembly by remapping reads back to the sequence assembled from the extracted unmapped reads, following the steps in this tutorial: https://biomedicalhub.github.io/genomics/03-part3-unmapped-assembly.html . When remapping the trimmed paired-end reads to that assembly, I expected that “very few of the reads do not map back to the contigs and the high rate of reads are properly paired which indicate that there are not too many mis-assemblies.” Here are the mapping statistics from remapping the trimmed paired-end reads to the MaSuRCA assembly.
Based on my results, only 1.19% of reads map back to the contigs, and only 0.71% are properly paired. This does not look like the expectation described in the tutorial I mentioned above.
Are my results normal, or did I go wrong somewhere?
Please do not post screenshots of text material. These are hard to see (some of us have old eyes). You can copy and paste the text content and then format it using the 101010 button.
If these unmapped reads are random contamination then they are not likely to assemble into anything meaningful.
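One thing that may be worth checking, offered as a rough sketch and not a definitive recipe: if the full set of trimmed reads was remapped to contigs built only from unmapped reads, the mapped percentage can be at most roughly the share of reads that failed to map to the reference in the first place. The two helper functions below are hypothetical names of my own. The first reads `samtools flagstat` text and prints the unmapped percentage. The second assumes you want pairs where both mates are unmapped (`-f 12`, not `-f 4`), name-sorted and split into R1/R2 files so the assembler and the remapping step see proper pairs.

```shell
# Sketch only: function names and file arguments are hypothetical.

# Print the percentage of alignment records that are unmapped, reading
# `samtools flagstat` text on stdin. Counts include secondary and
# supplementary records, so treat the result as approximate.
unmapped_pct() {
  awk '
    $4 == "in" && $5 == "total" { total = $1 }   # "N + 0 in total (...)"
    $4 == "mapped"              { mapped = $1 }  # "N + 0 mapped (...)"
    END {
      if (total > 0) printf "%.2f\n", 100 * (total - mapped) / total
      else { print "no reads found" > "/dev/stderr"; exit 1 }
    }'
}

# Extract pairs where BOTH mates are unmapped (-f 12), drop secondary
# alignments (-F 256), name-sort so mates are adjacent, and write
# separate R1/R2 FASTQ files.
# Usage: extract_unmapped_pairs in.bam out_prefix
extract_unmapped_pairs() {
  local bam="$1" prefix="$2"
  samtools view -b -f 12 -F 256 "$bam" \
    | samtools sort -n -o "${prefix}.namesorted.bam" -
  samtools fastq -1 "${prefix}_R1.fastq" -2 "${prefix}_R2.fastq" \
    -0 /dev/null -s /dev/null -n "${prefix}.namesorted.bam"
}
```

For example, `samtools flagstat SRR4289357_mapped.sorted.bam | unmapped_pct` gives the unmapped share in the original alignment. If that number is close to the 1.19% seen when remapping all trimmed reads to the contigs, the low rate may simply reflect which reads were used for the remapping, and remapping only the extracted unmapped pairs would be the more comparable check.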