What do the "markers" and "quality" fields in bcftools ROH output indicate?
5.3 years ago

Hello all,

I understand this is a very basic question, but I cannot find an explanation for two of the columns in the bcftools ROH output. I went through the manual at https://samtools.github.io/bcftools/howtos/roh-calling.html but could not find a description of the output format.

The header looks like this

# RG    [2]Sample       [3]Chromosome   [4]Start        [5]End  [6]Length (bp)  [7]Number of markers    [8]Quality (average fwd-bwd phred score)

What do RG, Number of markers and Quality indicate?

I presumed the number of markers is the number of SNPs present, but the line below from my output shows more markers than the length of the homozygous region in bp.

RG      sample1     1  66881536        66889308        7773    46939   84.5

I have several positions like this. Any help would be appreciated.
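
A minimal sketch for listing every region with this problem, assuming the output is whitespace-delimited and that RG lines follow the column order given in the header above. The function name `markers_exceed_length` is hypothetical and is not part of bcftools.

```python
def markers_exceed_length(lines):
    """Yield RG records whose marker count is larger than the region length in bp."""
    for line in lines:
        # Skip blank lines and header/comment lines such as "# RG [2]Sample ..."
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # Only region records are of interest here
        if fields[0] != "RG" or len(fields) < 8:
            continue
        start, end, length, n_markers = (int(x) for x in fields[3:7])
        if n_markers > length:
            yield {
                "sample": fields[1],
                "chrom": fields[2],
                "start": start,
                "end": end,
                "length": length,
                "n_markers": n_markers,
                "quality": float(fields[7]),
            }


if __name__ == "__main__":
    example = [
        "RG      sample1     1  66881536        66889308        7773    46939   84.5",
    ]
    for record in markers_exceed_length(example):
        print(record)
```

To run it on a real file, pass an open file handle instead of the `example` list (use `gzip.open(path, "rt")` for compressed output).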

sequencing ROH • 3.3k views
5.3 years ago
XiBon • 0

I have the same question. What is RG, and are the regions contained in the *bcf.txt.gz file the homozygous regions? I cannot find documentation on the output of bcftools roh.


I am not sure this is the correct answer, but I think I figured out what RG and ST can be. When you use a multi-sample VCF file, it calculates ROH over a region (RG), but when you use a single-sample VCF file, it gives you the state (ST; either HW or AZ).
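
Whichever input produces them, the first column of each output line tags the record type, so a quick way to see what a given file contains is to count lines per sample and tag. The sketch below assumes whitespace-delimited output with the sample name in column 2, as in the RG header quoted in the question. The ST example line and its layout (sample, chromosome, position, state, quality) are assumptions, since no ST line is shown in this thread, and `count_record_types` is a hypothetical helper name.

```python
from collections import Counter


def count_record_types(lines):
    """Count output lines per (sample, record type), e.g. ('sample1', 'RG')."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and header/comment lines
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        record_type, sample = fields[0], fields[1]
        counts[(sample, record_type)] += 1
    return counts


if __name__ == "__main__":
    example = [
        "RG      sample1     1  66881536        66889308        7773    46939   84.5",
        # Assumed ST layout: tag, sample, chromosome, position, state, quality
        "ST      sample1     1  66881536        1       84.5",
    ]
    for (sample, record_type), n in sorted(count_record_types(example).items()):
        print(sample, record_type, n)
```

If a file from a single-sample run still contains RG lines, the record type is not determined by the number of samples alone.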


Thank you so much for making it clear. I have one question, though: ROH analysis is an intra-population approach, so what is the logic of using a multi-sample VCF file for the analysis? Could you kindly elaborate on that?

Thank you in advance
