I'm trying to work out the average number of mitochondrial genomes per cell in a sample. This is the Drosophila melanogaster genome. I'm using samtools, so I'm running this command:
samtools view -c -q20 mapped-sorted.bam mitochondrion_genome
to get how many reads map to the mitochondrial genome with a mapping quality of at least 20.
I'm assuming the number of reads is equal to the number of mitochondrial genomes. My only concern is whether we need to account for read pairs. And how would we do that if that were the case?
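On the read-pair question, here is one way to think about it, as a sketch rather than a definitive recipe. In paired-end data both mates come from the same DNA fragment, so counting only the first mate of each pair counts fragments instead of reads. The flag bits below are the ones defined in the SAM specification; the helper name `count_fragments` and the example flag values are made up for illustration. A pair whose first mate falls outside the region, or fails a quality filter, would be missed by this simple rule.

```python
# SAM flag bits, as defined in the SAM specification.
PAIRED = 0x1
FIRST_IN_PAIR = 0x40
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_fragments(flags):
    """Count sequenced fragments from an iterable of SAM flag integers.

    A paired read is counted only if it is the first mate; an unpaired
    read counts as one fragment. Secondary and supplementary alignments
    are skipped so that no read is counted twice.
    """
    n = 0
    for flag in flags:
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        if not flag & PAIRED or flag & FIRST_IN_PAIR:
            n += 1
    return n


# Hypothetical flags: one proper pair (99 and 147), one single-end
# read (0), and one secondary alignment of a first mate (355).
example_flags = [99, 147, 0, 355]
print(count_fragments(example_flags))
```

With samtools itself, the `-f` option of `samtools view` selects on the same flag bits, so adding `-f 64` to a count command should restrict it to first mates.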
So the length of my mitochondrial genome is 19,303 bp. Using
I get 370 reads for the reverse strand, and using
I get 359 reads for the forward pair.
That makes 729 reads in total. So what you're saying is to divide the mitochondrial length by the number of reads to get an approximation of how much of the genome is covered? Sorry, I'm not quite at the level you're at!
Forward and reverse alignments are not directly related to copy number variation.
I was specifically talking about reads that come from regions that are unique versus regions that are present in multiple (and an unknown number of) copies. In that case, the unknown number of copies may be inferred from the coverage.
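To make "inferred from the coverage" a bit more concrete, here is a rough sketch. Mean depth is approximately reads times read length divided by region length, and the ratio of mitochondrial depth to the depth over a single-copy nuclear region gives an estimate of mitochondrial genomes per nuclear genome copy. The 729 reads and the 19,303 length come from this thread. The 100 bp read length, the nuclear read count, and the nuclear region length are invented placeholders, not results, and the default ploidy of 2 assumes diploid cells.

```python
def mean_depth(n_reads, read_length, region_length):
    """Approximate mean depth of coverage over a region."""
    return n_reads * read_length / region_length


def mito_copies_per_cell(mito_depth, nuclear_depth, ploidy=2):
    """Estimate mitochondrial genomes per cell from a depth ratio.

    nuclear_depth should come from single-copy nuclear sequence, which
    is present `ploidy` times per cell.
    """
    return ploidy * mito_depth / nuclear_depth


# 729 reads over 19,303 bp are from the thread; 100 bp is an assumed
# read length.
mito = mean_depth(729, 100, 19_303)

# Placeholder nuclear numbers, purely for illustration.
nuclear = mean_depth(50_000, 100, 10_000_000)

print(round(mito, 2))
print(round(mito_copies_per_cell(mito, nuclear), 1))
```

The estimate is only as good as the two depth values, so mapping biases between the mitochondrial and nuclear sequence will carry straight through to the ratio.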
If you are brand new to the field I would suggest learning more about the basics of short-read alignments.
Copy number variation is one of the more difficult subjects because the interpretation of the data is more subjective and may be beset by all kinds of challenges.