Dear All!
i want to calculate the coverage of a specific fragment from a bam file. i am using this script right now
samtools view -u infile.bam chromosome:76437781-76439092 | samtools pileup - -cf ref.fasta | awk '{if ($2 >= 76437781 || $2 <= 76439092) print }' | awk '{ count++ ; SUM += $8 } END { if (count==0)print SUM "\t" count "\t" "0";else print "\t" SUM "\t" count "\t" SUM/count }' > out.file
out put is something like that:
coverage
16418 1501 10.938
"76437781-76439092" i am giving these coordinates manually in the shell. i know i can have variables in shell but i want to use a bed file to give the coordinates and put it in the loop to get the coverage of all my fragments in one file. i think i have to do this in perl,this looks quite difficult to me to use three files 1)bam file 2)refrence fasta 3)bed file with coordinates can i do this in perl. or any other easier way.
thanks in advance
sorry if i did not conveyed myself properly. i got your point but you are getting the coverage while i want to calculate the average coverage of the whole fragment between the coordinates as you can see in the question i am using awk '{ count++ ; SUM += $8 to do this. that i think you can not do with genomeCoverageBed.
Check this post, might help you, in that case your script is fine, just reformat a bit using loop or try in R. http://www.biostars.org/post/show/5165/tools-to-calculate-average-coverage-for-a-bam-file/