Hi, I'm new to bioinformatics and I'm analyzing some genomes I obtained from BGI. The average read length is 100 bp, but after trimming it is about 80 bp. The coverage is low, so I decided to align with bwa aln followed by bwa sampe, but the resulting SAM files are almost 300 GB. Does anyone have suggestions for adjusting the parameters to avoid such large files?
I'm using BWA sampe with the command:
bwa sampe GCA_000001405.15_GRCh38_no_alt_analysis_set.fna X_1.sai X_2.sai X_1.fq.gz X_2.fq.gz > X_aln.sam
Hello, thank you very much for your response. I was referring more to using these options:
Honestly, use bwa mem like everyone else and call it a day.
See the bwa manual for how to index the reference (bwa index) before running bwa mem.
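For what it's worth, most of that 300 GB probably comes from writing uncompressed SAM rather than from the alignment parameters. Here is a rough sketch of how the bwa mem route could look when piped straight into samtools so that only a sorted BAM ends up on disk. It assumes samtools is installed alongside bwa. The read file names, output name, and thread count are placeholders I picked, and the small run helper is only there so the commands can be previewed before anything is executed.

```shell
#!/bin/sh
# Sketch: align paired-end reads with bwa mem and write a sorted BAM
# instead of an uncompressed SAM. All file names below are placeholders.
set -eu

REF="GCA_000001405.15_GRCh38_no_alt_analysis_set.fna"
R1="X_1.fq.gz"            # assumed name of the forward reads
R2="X_2.fq.gz"            # assumed name of the reverse reads
OUT="X_aln.sorted.bam"    # hypothetical output name
THREADS=4                 # adjust to the machine

# Helper (not part of bwa or samtools): prints each command by default,
# and only executes it when the script is started with RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        sh -c "$1"
    else
        printf '%s\n' "$1"
    fi
}

# Build the index once per reference
run "bwa index $REF"

# Align, then sort and compress on the fly; no intermediate SAM is written
run "bwa mem -t $THREADS $REF $R1 $R2 | samtools sort -@ $THREADS -o $OUT -"

# Index the sorted BAM for downstream tools
run "samtools index $OUT"
```

Running the script as is only prints the three commands; running it with RUN=1 in the environment executes them. If you would rather stay with aln/sampe, the same idea should apply: pipe the bwa sampe output into samtools view -b -o X_aln.bam - instead of redirecting it to X_aln.sam.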