Converting GATK-style intervals (1-based) to BED format
2.1 years ago

I am using a tool (FGTPartitioner) to run a four-gamete test and partition the regions that pass the filter, but it outputs the regions as GATK-style intervals, which are 1-based, in the form <chr>:<start>-<end>

From the original file:

Scaffold_12628_Chr2:1-362964
Scaffold_12628_Chr2:362965-362996
Scaffold_12628_Chr2:362997-363033

I want to use bcftools to filter the VCF based on these regions but it won't accept this input, so I tried to convert them to BED format using the following command:

cat regions.out | sed 's/:\|-/\t/g' | awk '{print $1"\t"($2-1)"\t"$3}'

And it gives me the following output:

Scaffold_12628_Chr2 0   362964
Scaffold_12628_Chr2 362964  362996
Scaffold_12628_Chr2 362996  363033
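
For reference, BED coordinates are 0-based and half-open: the start base is included and the end base is not, so a 1-based closed interval start-end becomes (start-1, end). Under that convention, one region ending at 362964 and the next starting at 362964 are adjacent and share no base. A minimal sketch of the same conversion in a single awk call follows; the function name gatk_to_bed is hypothetical, and it assumes one interval per line with no trailing whitespace.

```shell
# Hypothetical helper (not part of FGTPartitioner or bcftools):
# reads GATK-style intervals (chr:start-end, 1-based, closed) on stdin
# and writes BED lines (0-based, half-open) on stdout.
gatk_to_bed() {
  awk 'BEGIN { OFS = "\t" }
  # Match the trailing ":start-end" so that contig names which
  # themselves contain ":" or "-" are left untouched.
  match($0, /:[0-9]+-[0-9]+$/) {
    chrom = substr($0, 1, RSTART - 1)        # everything before the last ":"
    split(substr($0, RSTART + 1), pos, "-")  # pos[1] = start, pos[2] = end
    print chrom, pos[1] - 1, pos[2]          # shift the start only
  }'
}

# Example with two of the intervals from the post:
printf '%s\n' \
  'Scaffold_12628_Chr2:1-362964' \
  'Scaffold_12628_Chr2:362965-362996' | gatk_to_bed
```

On a real file this would be run as gatk_to_bed < regions.out > regions.bed. Lines that do not end in :start-end are skipped silently, which may or may not be what you want.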

My question is: do these coordinates exclude the potential recombination breakpoint? The end of the first region and the start of the second are the same number (362964), and if I include the recombining region the test would obviously be pointless. Which coordinates would be appropriate here?

Thanks, Miles

BED recombination gamete four coordinates VCF • 405 views