Question

How To Convert A Basic Bed File With Only 3 Columns (Chrname, Start, End Site) Into A Bigger Bed With 6 Columns

0

13.7 years ago

Hamilton ▴ 290

Hi,

My BED file has only 3 columns: chromosome name, start, and end. MACS in Galaxy, however, requires a BED file with 6 columns. How can I convert it?

chip-seq macs bed • 9.7k views

updated 2.9 years ago by Ram 45k • written 13.7 years ago by Hamilton ▴ 290

0

The question is not clear. Can you give example input and output, and say which additional columns you want to add?

updated 2.9 years ago by Ram 45k • written 13.7 years ago by Rm 8.3k

Ram · Answer 1 · 2011-12-01

4

13.7 years ago

Gjain 5.8k

Well, if you have no extra information, you can add the name, score, and strand columns (columns 4, 5, and 6) by putting . (dot) in column 4, 0 (zero) in column 5, and + (plus strand) in column 6.

If you can give an example of what kind of data is available to you, I can modify my answer to have the correct name, score and strand.
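A minimal sketch of the placeholder approach, assuming a whitespace-separated three-column file with no track or browser header lines. The function name `bed3_to_bed6` and the file names are made up for illustration:

```shell
# Hypothetical helper: pad a 3-column BED file to 6 columns using
# placeholder name ".", score 0, and strand "+".
bed3_to_bed6() {
    # Lines with fewer than 3 fields (e.g. blank lines) are skipped.
    awk -v OFS='\t' 'NF >= 3 { print $1, $2, $3, ".", 0, "+" }' "$1"
}

# Example usage (file names are assumptions):
# bed3_to_bed6 bed3.bed > bed6.bed
```

Keep in mind that the + in column 6 is only a placeholder here, not real strand information.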

updated 2.9 years ago by Ram 45k • written 13.7 years ago by Gjain 5.8k

Ram · Answer 2 · 2011-12-01

3

13.7 years ago

Wolf ▴ 130

MACS uses strand information (which would be in column 6) for the fragment size model it builds. If you want to use MACS and you expect your peaks to be narrow (i.e. you would want to use the model building step), I think you should try to get the strand information from whatever aligner you used into your bed file. Without knowing more, I can't help you with how to do that.

If you don't have the strand information (i.e. if you made them all + strand), you have to use the --nomodel option.
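One quick way to see whether a BED6 file carries usable strand information is to tally column 6. This is only a sketch, assuming a whitespace-separated six-column file; the function name `strand_counts` is hypothetical:

```shell
# Hypothetical helper: count how often each value occurs in column 6.
strand_counts() {
    awk '{ n[$6]++ } END { for (s in n) print s "\t" n[s] }' "$1" |
        LC_ALL=C sort   # fixed byte order so "+" sorts before "-"
}

# Example usage (file name is an assumption):
# strand_counts reads.bed
```

If only + (or only .) shows up, the strand column is most likely a placeholder rather than real alignment strands.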

updated 2.9 years ago by Ram 45k • written 13.7 years ago by Wolf ▴ 130

0

I got this BED file from the authors of the Wang et al. 2011 PNAS paper, as it is publicly available. It originally has only 3 columns. What if I add . for column 4, 0 for column 5, and + for column 6, as Gjain suggested, given that I have no extra information beyond the basic coordinates? Could this bias the results?

written 13.7 years ago by Hamilton ▴ 290

0

It means that you can't ask MACS to estimate the ChIP fragment size from the data, so you have to use --nomodel. Usually you would use the fragment size to shift/extend plus-strand reads to the right and minus-strand reads to the left, so that they cover the actual binding site that was pulled down. You won't be able to do this, so you should set the shift size to 0. That will reduce your resolution somewhat, but depending on what you are planning to do, it might still be OK.
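To illustrate what strand-aware shifting means, here is a rough sketch that moves each read in a BED6 file by a fixed number of bases according to its strand. It only illustrates the idea and is not how MACS itself is implemented; the function name `shift_reads` and the shift value are assumptions:

```shell
# Hypothetical helper: shift BED6 reads by a fixed number of bases,
# plus-strand reads to the right and minus-strand reads to the left.
# Usage: shift_reads SHIFT file.bed
shift_reads() {
    awk -v OFS='\t' -v s="$1" '
        $6 == "+" { $2 += s; $3 += s }   # plus strand: move right
        $6 == "-" { $2 -= s; $3 -= s }   # minus strand: move left
        $2 < 0    { $2 = 0 }             # never let the start go below 0
        { $1 = $1; print }               # rebuild the line with tabs
    ' "$2"
}

# Example usage (values are assumptions):
# shift_reads 50 reads.bed > shifted.bed
```

With every read marked +, all reads would move in the same direction, which is why shifting makes no sense when the strand column is a placeholder.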

written 13.7 years ago by Wolf ▴ 130

Ram · Answer 3 · 2015-10-05

2

9.8 years ago

Fidel ★ 2.0k

I think this should work:

perl -lane 'print "$F[0]\t$F[1]\t$F[2]\t.\t0\t."' bed3.bed > bed6.bed

updated 2.9 years ago by Ram 45k • written 9.8 years ago by Fidel ★ 2.0k

Ram · Answer 4 · 2015-10-05

0

9.8 years ago

cpad0112 21k

See if this works:

awk -v OFS='\t' '{print $1,$2,$3,".",".","."}' bed3.txt > bed6.txt

If the last three columns are to be empty instead of ".", this may work (note that it prints exactly three empty fields, so the output still has 6 tab-separated columns):

awk -v OFS='\t' '{print $1,$2,$3,"","",""}' bed3.txt > bed6.txt

updated 2.9 years ago by Ram 45k • written 9.8 years ago by cpad0112 21k

0

In theory, the 5th column should be a score from 0 to 1000 (see https://genome.ucsc.edu/FAQ/FAQformat.html#format1), which is why 0 is better than '.'.
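A small check along these lines can flag lines whose score falls outside that range. This is a sketch only, assuming a whitespace-separated six-column file; the function name `bad_scores` is made up:

```shell
# Hypothetical helper: print the line number and content of every line
# whose 5th column is not an integer between 0 and 1000.
bad_scores() {
    awk '$5 !~ /^[0-9]+$/ || $5 + 0 > 1000 { print NR ": " $0 }' "$1"
}

# Example usage (file name is an assumption):
# bad_scores bed6.bed
```

No output means every score looks fine; a '.' in column 5 would be reported.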

updated 2.9 years ago by Ram 45k • written 9.8 years ago by Fidel ★ 2.0k