In this tutorial there is a step that shuffles the lines of a file and then splits the file into two. I do not understand why, when splitting the file in two, they add 1 to the number of lines before dividing by two instead of simply dividing by two. Why is that?
That part of the code is:
nlines=$(samtools view merged.bam | wc -l ) # Number of reads in the BAM file
nlines=$(( (nlines + 1) / 2 )) # half that number
samtools view ${tmpDir}/${NAME1}_${NAME2}_merged.bam | shuf - | split -d -l ${nlines} - "${tmpDir}/${EXPT}" # This will shuffle the lines in the file and split it into two SAM files
cat ${tmpDir}/${EXPT}_header.sam ${tmpDir}/${EXPT}00 | samtools view -bS - > ${outputDir}/${EXPT}00.bam
cat ${tmpDir}/${EXPT}_header.sam ${tmpDir}/${EXPT}01 | samtools view -bS - > ${outputDir}/${EXPT}01.bam
You can try it yourself: run the relevant part of the tutorial with
nlines=$(( (nlines + 1) / 2 ))
and with
nlines=$(( (nlines) / 2 ))
and see if it makes a difference. You may also open an issue at the repo with your question, but please try it yourself first and check for differences, if any.
Is there an odd number of lines in the BAM?
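If the count is odd, the difference shows up in how many files split writes. Shell arithmetic uses integer division, so dividing an odd count by two rounds down and leaves one line over, which split puts into a third file. Adding 1 first rounds up instead, so everything fits in two files. Here is a minimal sketch of that, assuming GNU coreutils split (for -d) and using a hypothetical 5-line text file in place of the reads from the BAM; the file names and the workdir variable are made up for the demo:

```shell
#!/usr/bin/env bash
# Hypothetical demo: 5 lines stand in for an odd number of reads.
workdir=$(mktemp -d)
seq 1 5 > "$workdir/reads.txt"
nlines=$(wc -l < "$workdir/reads.txt")

# Rounding down: 5 / 2 = 2 lines per chunk, so split needs three files (2 + 2 + 1).
floor_half=$(( nlines / 2 ))
split -d -l "$floor_half" "$workdir/reads.txt" "$workdir/floor_"

# Rounding up: (5 + 1) / 2 = 3 lines per chunk, so two files are enough (3 + 2).
ceil_half=$(( (nlines + 1) / 2 ))
split -d -l "$ceil_half" "$workdir/reads.txt" "$workdir/ceil_"

# Count the output files produced by each variant.
floor_files=$(ls "$workdir"/floor_* | wc -l)
ceil_files=$(ls "$workdir"/ceil_* | wc -l)
echo "floor: $floor_files files, ceil: $ceil_files files"

# Remove the temporary demo directory.
rm -rf "$workdir"
```

With an even count both formulas give the same number, so the + 1 only matters in the odd case. In the tutorial script the third file would be ${EXPT}02, which the two cat commands never read, so one read would presumably be dropped without any error.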