8.6 years ago
PsFeathers
Forgive the beginner question; I'm a bit new to this.
I have what looks like a VCF file, with headers including chr, pos, ref, alt, etc.
I was asked to find the coverage by adding ref and alt and taking the maximum value. But the ref and alt columns aren't numeric, so what do I add?
A table example is:
chr ref alt
7 G A
7 G A
7 G A
7 A G
7 A G
7 GA G
Do I count the number of occurrences of, let's say, 'G' in the ref column, and then add that to the number of occurrences of 'G' in the alt column?
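To make the question concrete, here is a minimal sketch of the tally I have in mind, using the rows from the example table above. It assumes exact string matches (so 'GA' is not counted as 'G'); the function name `allele_tally` is just something made up for illustration, and I am not sure this is what "coverage" is supposed to mean here.

```python
from collections import Counter

# Rows copied from the example table above: (chr, ref, alt)
rows = [
    ("7", "G", "A"),
    ("7", "G", "A"),
    ("7", "G", "A"),
    ("7", "A", "G"),
    ("7", "A", "G"),
    ("7", "GA", "G"),
]


def allele_tally(rows):
    """Count exact allele strings across the ref and alt columns combined.

    Hypothetical helper: this is the literal reading of "add ref and alt",
    not a confirmed definition of coverage.
    """
    counts = Counter()
    for _chrom, ref, alt in rows:
        counts[ref] += 1  # occurrence in the ref column
        counts[alt] += 1  # occurrence in the alt column
    return counts


tally = allele_tally(rows)
print(tally["G"])  # 3 in ref + 3 in alt = 6
```

If the file also has numeric columns holding read counts for the ref and alt alleles, adding those per row may be what was actually meant.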
Sorry, I am new to bioinformatics too. I am having trouble getting mean coverage: I've tried 'bedtools coverage' and 'samtools pileup', but the outputs are per site or per region. How can I get a single value (say 70X) for the average depth across CDS regions?
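One way to collapse per-site output into a single number is to average the depth column over all sites. Below is a minimal sketch, assuming tab-separated lines of chromosome, position, and depth (the layout that per-site depth tools commonly emit) that have already been restricted to the CDS regions. The function name `mean_depth` and the example values are made up for illustration.

```python
def mean_depth(lines):
    """Return the average of the third (depth) column over all sites.

    Assumes each line is "chrom<TAB>pos<TAB>depth"; blank lines and
    lines starting with '#' are skipped.
    """
    total = 0
    sites = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        total += int(fields[2])  # depth at this site
        sites += 1
    # Avoid division by zero on empty input
    return total / sites if sites else 0.0


# Made-up per-site depths, not real data
example = ["7\t100\t60", "7\t101\t70", "7\t102\t80"]
print(mean_depth(example))  # 70.0
```

Note that if the input omits positions with zero coverage, the average will be inflated, so those sites need to be present in the per-site output for the result to reflect the whole CDS.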