After performing a transcriptome assembly, I am using BWA to align the reads to my assembled contigs. From the resulting SAM file, I can calculate the number of reads that map to each contig, but I would like to normalize this abundance measure by the length of each alignment.
My naive question: is the length of each alignment equal to the length of the read that is being aligned, or does BWA sometimes align only part of a sequence query to the target?
What sequencing platform are you using? Do your read lengths really vary that much?
These are Illumina short reads. The variation in read length occurs because we used an adaptive algorithm to trim the reads based on quality score prior to assembly.
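On the original question of alignment length versus read length: the SAM file records this per read in the CIGAR column, so it can be measured rather than assumed. Soft-clipped bases (CIGAR operation `S`) belong to the read but are not part of the alignment, and whether they appear depends on the BWA mode used (`bwa mem`, for instance, performs local alignment and can soft-clip read ends). Below is a minimal sketch in Python, assuming a standard tab-delimited SAM file; the function names `aligned_query_length` and `aligned_bases_per_contig` are hypothetical, and secondary or supplementary records are not filtered out here.

```python
import re
from collections import defaultdict

# One CIGAR element: a count followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that consume read bases AND are part of the alignment.
# Soft clips (S) consume read bases but are unaligned; hard clips (H),
# deletions (D), skips (N) and padding (P) consume no read bases.
ALIGNED_QUERY_OPS = {"M", "I", "=", "X"}


def aligned_query_length(cigar):
    """Number of read bases actually aligned, according to the CIGAR string."""
    if cigar == "*":  # CIGAR unavailable (e.g. unmapped read)
        return 0
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar)
               if op in ALIGNED_QUERY_OPS)


def aligned_bases_per_contig(sam_lines):
    """Sum aligned read bases per contig from an iterable of SAM lines."""
    totals = defaultdict(int)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):  # skip blanks and header
            continue
        fields = line.rstrip("\n").split("\t")
        flag, contig, cigar = int(fields[1]), fields[2], fields[5]
        if flag & 0x4 or contig == "*":  # skip unmapped reads
            continue
        totals[contig] += aligned_query_length(cigar)
    return dict(totals)
```

To use it on a file, pass the open file handle directly, e.g. `aligned_bases_per_contig(open("alignments.sam"))`, where `alignments.sam` is a placeholder name. Comparing `aligned_query_length(cigar)` with the length of the SEQ column for a few reads would show whether partial alignments occur in this particular data set.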