Question

create BED file from whole genome index file

2.2 years ago

Ranold Grijon • 0

Dear altruists,

I found the code below online. It creates a BED file from a whole-genome index (.fai) file, but I don't understand how it works. Can anyone help me understand it?

awk -v FS="\t" -v OFS="\t" '{print $1 FS "0" FS ($2-1)}' GRCh38.primary_assembly.genome.fa.fai > GRCh38.primary_assembly.genome.bed

bed awk • 651 views

updated 20 months ago by Ram 44k • written 2.2 years ago by Ranold Grijon

score 0 · Answer 1 · 2022-09-26

The .fai file is a tab-delimited file whose first two columns are:

chromosome name
chromosome size
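
As a small illustration, here is a made-up index with two sequences. The names chrA and chrB and all the numbers are hypothetical, not taken from GRCh38. An index written by samtools faidx has five columns (name, length, offset, bases per line, bytes per line); only the first two matter for the BED file.

```shell
# Toy .fai with made-up values; columns: name, length, offset, linebases, linewidth
tmpdir=$(mktemp -d)
printf 'chrA\t1000\t6\t60\t61\nchrB\t500\t1029\t60\t61\n' > "$tmpdir/toy.fa.fai"

# Keep only the chromosome name and size (columns 1 and 2)
first_two=$(cut -f1,2 "$tmpdir/toy.fa.fai")
printf '%s\n' "$first_two"
```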

The awk command line:

-v FS="\t" sets the variable FS to a tab. FS is awk's input field separator. The first field (column) of the current line is $1, the second is $2, and so on.
-v OFS="\t" sets the variable OFS to a tab. OFS is awk's output field separator; it is inserted between the arguments of print when they are separated by commas.
{...} applies the enclosed statements to each line of the input file.
print $1 FS "0" FS ($2-1) prints the first column of the current line, then FS (the separator), then "0" (the start coordinate of the BED record), then FS again, and finally the value of the second column minus 1 (the chromosome size minus 1).
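
To see the difference between FS, OFS and plain concatenation, here is a minimal sketch on a one-line made-up input. OFS is set to "|" instead of a tab only to make it visible; the variable names are illustrative.

```shell
# One input line with a tab between "a" and "b c" (the space is part of field 2)
line=$(printf 'a\tb c')

# Comma in print: the arguments are joined with OFS
with_comma=$(printf '%s\n' "$line" | awk -v FS="\t" -v OFS="|" '{print $1, $2}')

# No comma: plain string concatenation, OFS is not used
no_comma=$(printf '%s\n' "$line" | awk -v FS="\t" -v OFS="|" '{print $1 $2}')

printf '%s\n%s\n' "$with_comma" "$no_comma"
```

The first line printed is a|b c and the second is ab c, which shows that FS split the input on the tab only and that OFS appears only when print is given comma-separated arguments.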

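Your code is WRONG. You shouldn't subtract 1 from the size: the end coordinate of a BED record is exclusive, so a whole chromosome runs from 0 to its size. FS and OFS shouldn't be used that way either: the print statement joins the columns with FS, so OFS is never actually used.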
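
A quick way to check the end coordinate: BED intervals are 0-based and half-open, so a record covers end minus start bases. The sketch below uses a hypothetical chromosome of 1000 bases and illustrative variable names.

```shell
size=1000   # hypothetical chromosome size, as found in column 2 of a .fai

# Original command: end = size - 1, so the interval covers size - 1 bases
covered_original=$(printf 'chrA\t%s\n' "$size" |
  awk -F '\t' '{bed_start = 0; bed_end = $2 - 1; print bed_end - bed_start}')

# Without the subtraction: end = size, so the interval covers every base
covered_fixed=$(printf 'chrA\t%s\n' "$size" |
  awk -F '\t' '{bed_start = 0; bed_end = $2; print bed_end - bed_start}')

echo "original: $covered_original bases, fixed: $covered_fixed bases"
```

With the subtraction the record is one base short (999 instead of 1000), so the last base of each chromosome would be left out.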

A better way:

awk -F '\t' '{printf("%s\t0\t%s\n",$1,$2);}' in.fai
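
If you prefer to keep OFS, an equivalent variant (a sketch, not the command above) sets both separators in a BEGIN block and lets print insert the tabs. The toy index and its file names are made up for the example.

```shell
# Toy .fai with made-up values (see column meanings in the samtools faidx docs)
tmpdir=$(mktemp -d)
printf 'chrA\t1000\t6\t60\t61\nchrB\t500\t1029\t60\t61\n' > "$tmpdir/toy.fa.fai"

# FS splits the input on tabs; OFS is placed between the comma-separated arguments
awk 'BEGIN { FS = OFS = "\t" } { print $1, 0, $2 }' "$tmpdir/toy.fa.fai" > "$tmpdir/toy.bed"
cat "$tmpdir/toy.bed"
```

Each output line has three tab-separated columns: the sequence name, 0, and the unmodified sequence size.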