I have human whole-genome sequencing (hWGS) data generated from paired-end 150 bp reads, at about 30X coverage. Each hWGS dataset therefore consists of a pair of files: lane1_read1.FASTq.gz and lane1_read2.FASTq.gz.
I want to subsample each pair of FASTQ files to simulate lower coverage, such as 7X, 15X, or 20X, so that I can determine whether these lower coverages are still sufficient to detect what I am looking for. I will then use the downsampled reads to generate BAM files for downstream analyses.
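To make the goal concrete: going from 30X to a target coverage means keeping roughly target/30 of the read pairs (7/30 is about 23% for 7X, 50% for 15X, about 67% for 20X), and the same pairs have to be kept in both files so that mates stay matched. Below is a minimal Python sketch of that idea, not a recommendation of the best method. The function names (keep_fraction, subsample_pairs) and the seed are hypothetical, and it assumes both files list reads in the same order with exactly four lines per record.

```python
import gzip
import random
from itertools import islice


def keep_fraction(target_cov, full_cov=30.0):
    """Fraction of read pairs to keep to go from full_cov down to target_cov."""
    if not 0 < target_cov <= full_cov:
        raise ValueError("target coverage must be in (0, full_cov]")
    return target_cov / full_cov


def read_records(handle):
    """Yield FASTQ records as lists of 4 lines (assumes unwrapped FASTQ)."""
    while True:
        rec = list(islice(handle, 4))
        if not rec:
            return
        if len(rec) != 4:
            raise ValueError("truncated FASTQ record")
        yield rec


def subsample_pairs(r1_in, r2_in, r1_out, r2_out, fraction, seed=42):
    """Keep each read pair with probability `fraction`.

    One random draw decides the fate of both mates, so read 1 and
    read 2 stay in sync. Returns (pairs kept, pairs seen).
    """
    rng = random.Random(seed)  # fixed seed (arbitrary) for reproducibility
    kept = total = 0
    with gzip.open(r1_in, "rt") as f1, gzip.open(r2_in, "rt") as f2, \
         gzip.open(r1_out, "wt") as o1, gzip.open(r2_out, "wt") as o2:
        # zip stops at the shorter file; inputs are assumed to be equal length
        for rec1, rec2 in zip(read_records(f1), read_records(f2)):
            total += 1
            if rng.random() < fraction:
                o1.writelines(rec1)
                o2.writelines(rec2)
                kept += 1
    return kept, total


if __name__ == "__main__":
    # Print the fraction of pairs to keep for each target coverage.
    for cov in (7, 15, 20):
        print(f"{cov}X: keep fraction {keep_fraction(cov):.3f}")
```

For a 15X simulation, for example, I would call subsample_pairs("lane1_read1.FASTq.gz", "lane1_read2.FASTq.gz", "sub15_read1.fastq.gz", "sub15_read2.fastq.gz", keep_fraction(15)), where the output names are made up. Because each pair is kept independently at random, the resulting coverage is only approximately the target, and pure Python will be slow on a 30X genome.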
Please advise on the best approach and which bioinformatics tools to use.
Thank you for your assistance.