4 months ago
esiv
I am trying to determine whether an individual is Rh- using samtools. My plan was to calculate the coverage of the chromosome the gene is on, then the coverage of just the Rh gene region; if there were significantly fewer reads in the Rh region, I could assume the individual was Rh-.
I used the following code for the average:
samtools depth -r 1 *file.bam* | awk '{sum+=$3} END { print "Average = ",sum/NR}'
and for the Rh region:
samtools depth -r 1:25272486-25330445 *file.bam* | awk '{sum+=$3} END { print "Average = ",sum/NR}'
I ran this on over 50 individuals from a population that should be at least 25% Rh-, but I didn't find a significant difference between the two values for any individual. I am hoping someone sees something I am missing.
1) For the Rh- phenotype, is it always a deletion of the gene, or can it be an inactivation? 2) You should normalize by the overall depth. 3) You should use the median instead of the average.
1) Yes, it is almost always a complete deletion. 2/3) Does using coverage instead of depth fix both of these?
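For what it's worth, points 2 and 3 could be sketched roughly as below. This is only an illustration, not code from the thread: `median_depth` and `depth_ratio` are hypothetical helper names, the input is assumed to be the three-column output of `samtools depth` (chromosome, position, depth), and for an even number of positions the lower of the two middle values is reported.

```shell
#!/bin/sh
# Hypothetical helper: median of column 3 (depth) read from stdin.
# Uses a histogram of depth values so memory stays small even for a
# whole chromosome. Reports the lower median when the count is even.
median_depth() {
    awk '
        { d = $3 + 0; hist[d]++; n++; if (d > max) max = d }
        END {
            if (n == 0) { print "NA"; exit }
            rank = int((n + 1) / 2)
            for (d = 0; d <= max; d++) {
                cum += hist[d]
                if (cum >= rank) { print d; exit }
            }
        }'
}

# Hypothetical helper: region median divided by chromosome median.
# usage: depth_ratio REGION_MEDIAN CHROM_MEDIAN
depth_ratio() {
    awk -v r="$1" -v c="$2" \
        'BEGIN { if (c > 0) printf "%.2f\n", r / c; else print "NA" }'
}

# Intended use (not run here; file.bam is the placeholder from the question):
#   chrom=$(samtools depth -a -r 1 file.bam | median_depth)
#   region=$(samtools depth -a -r 1:25272486-25330445 file.bam | median_depth)
#   depth_ratio "$region" "$chrom"
```

If the complete-deletion assumption above holds, a ratio near 1 would be expected with two copies of the gene, near 0.5 with one copy, and near 0 with none. Those are rough expectations rather than validated cutoffs.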
I realized I might need to include the -a option in order to include all positions and used the code:
and
But I am still getting no significant differences. I would appreciate any help!
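One point that may be worth checking is why `-a` matters for the average. Without it, `samtools depth` leaves out positions with zero depth, so `sum/NR` averages only the covered positions and a deleted region can still look normal. A minimal sketch of an alternative is below: it divides by the region length instead of `NR`. `mean_depth_over` is a hypothetical helper name, and the input is assumed to be `samtools depth` output for exactly the region given by START and END.

```shell
#!/bin/sh
# Hypothetical helper: mean depth over a region of known length.
# Positions missing from the input are counted as depth 0 because the
# sum is divided by (END - START + 1), not by the number of lines.
# usage: ... | mean_depth_over START END
mean_depth_over() {
    awk -v s="$1" -v e="$2" '
        { sum += $3 }
        END {
            len = e - s + 1
            if (len > 0) printf "%.2f\n", sum / len
            else print "NA"
        }'
}

# Intended use (not run here; file.bam is the placeholder from the question):
#   samtools depth -r 1:25272486-25330445 file.bam \
#       | mean_depth_over 25272486 25330445
```

This should give a value comparable to running `samtools depth -a` on the same region and taking `sum/NR`, so it can serve as a cross-check on the `-a` results.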