jellyfish.histo file
fastqc file
I generated a kmer count file using jellyfish and then a histogram from it, which gave the attached graph when plotted in R. I am confused about why there is a small peak at coverage 22. I see a similar tiny peak even for kmer values as high as 115.
- How does one interpret this for a genome expected to have 50-60% repeats?
- How can I extract reads pertaining to the tiny peaks?
- I suspect this correlates with higher GC content in some reads, as you can see in the attached file generated by fastqc.
- Can I safely interpret this tiny peak as a non-erroneous peak and retain those kmers for assembly?
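To make the questions above concrete, here is a minimal Python sketch of two steps: locating local maxima in the two-column (coverage, number of kmers) histogram that `jellyfish histo` writes, and testing whether a read contains any kmer from a chosen peak. The function names, the `min_cov` and `window` defaults, and the assumption that kmers were counted in canonical form are all illustrative choices, not something taken from the actual run.

```python
# Sketch only: helper names and default thresholds are hypothetical.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def read_histo(lines):
    """Parse 'coverage frequency' pairs from jellyfish histo output lines."""
    histo = []
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            histo.append((int(parts[0]), int(parts[1])))
    return histo


def find_peaks(histo, min_cov=3, window=2):
    """Return (coverage, frequency) bins that are strictly higher than every
    other bin within +/- `window` positions, ignoring bins below `min_cov`
    (where sequencing-error kmers usually dominate)."""
    peaks = []
    for i, (cov, freq) in enumerate(histo):
        if cov < min_cov:
            continue
        lo = max(0, i - window)
        hi = min(len(histo), i + window + 1)
        neighbours = [histo[j][1] for j in range(lo, hi) if j != i]
        if neighbours and all(freq > f for f in neighbours):
            peaks.append((cov, freq))
    return peaks


def canonical(kmer):
    """Lexicographically smaller of a kmer and its reverse complement."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def read_has_peak_kmer(seq, peak_kmers, k):
    """True if any canonical kmer of length k in `seq` is in `peak_kmers`."""
    seq = seq.upper()
    return any(
        canonical(seq[i:i + k]) in peak_kmers
        for i in range(len(seq) - k + 1)
    )
```

A possible way to use it: call `find_peaks(read_histo(open("jellyfish.histo")))` to list the peaks, dump the kmers whose counts fall around the small peak with something like `jellyfish dump -c -L 20 -U 24` (the bounds here are only an example), load the first column into a set, and keep the reads for which `read_has_peak_kmer` returns true. Canonical form only matters if the counting was done with `-C`.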
Be more careful adding images please: the URL you use must point directly to the image. For example, you used: https://ibb.co/gFwTZc where you should have used: https://image.ibb.co/kNrHSx/per_sequence_gc_content.png
Right click on the image in the page (https://ibb.co/gFwTZc) and select Copy Image Address to get the actual image URL.
A better option is to click on the "embed code" tab at the bottom of the page, then copy the full image HTML link and paste it in the post (like below).