Hi,
My BED file has only 3 columns: chromosome name, start, and end. MACS in Galaxy requires a BED file with 6 columns. How can I convert it?
Well, if you have no extra information, you can add the name, score and strand columns (columns 4, 5 and 6) by putting . (dot) in column 4, 0 (zero) in column 5 and + (plus strand) in column 6.
If you can give an example of what kind of data is available to you, I can modify my answer to have the correct name, score and strand.
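As a rough sketch of the suggestion above, assuming a tab-separated input: the file names bed3.bed and bed6.bed and the two example intervals are made up for illustration, and the placeholder values carry no real information.

```shell
# Build a tiny 3-column example (hypothetical coordinates, tab-separated).
printf 'chr1\t100\t200\nchr2\t300\t450\n' > bed3.bed

# Append placeholder name ".", score 0 and strand "+" to every interval.
awk 'BEGIN { FS = OFS = "\t" } { print $1, $2, $3, ".", 0, "+" }' bed3.bed > bed6.bed

cat bed6.bed
```

Every interval ends up on the + strand here, so the strand column only satisfies the format and should not be read as real strand information.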
MACS uses strand information (which would be in column 6) for the fragment size model it builds. If you want to use MACS and you expect your peaks to be narrow (i.e. you would want to use the model building step), I think you should try to get the strand information from whatever aligner you used into your bed file. Without knowing more, I can't help you with how to do that.
If you don't have the strand information (i.e. if you made them all + strand), you have to use the --nomodel option.
I got this BED file from the authors of the Wang et al. 2011 PNAS paper, as it is publicly available. Initially it has only 3 columns. What if I add . for col4, 0 for col5 and + for col6, since I don't have any extra information beyond the basics, as Gjain suggested? Could this bias the results?
It means that you can't ask MACS to estimate the ChIP fragment size from the data (i.e. use --nomodel). Usually you would use the fragment size to shift/extend plus strand reads to the right and minus strand reads to the left, so that they cover the actual binding site that was pulled down. You won't be able to do this, so you should set the shift size to 0. That will reduce your resolution somewhat, but depending on what you are planning on doing, it might still be ok.
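To illustrate what strand-aware shifting means, here is a toy sketch. It is not how MACS is implemented; the reads, the file names and the offset of 50 bp are all hypothetical.

```shell
# Two hypothetical reads, one per strand (BED6, tab-separated).
printf 'chr1\t1000\t1036\tr1\t0\t+\nchr1\t1200\t1236\tr2\t0\t-\n' > reads.bed

# offset is a made-up shift in bp, standing in for half the fragment size.
awk -v offset=50 'BEGIN { FS = OFS = "\t" }
{
    if ($6 == "+")      { $2 += offset; $3 += offset }   # plus strand: move right
    else if ($6 == "-") { $2 -= offset; $3 -= offset }   # minus strand: move left
    if ($2 < 0) $2 = 0                                   # clamp at the chromosome start
    print
}' reads.bed > shifted.bed

cat shifted.bed
```

If every strand were set to + as a placeholder, all reads would move in the same direction, which is why a shift of 0 is the safer choice in that case.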
I think this should work
cat bed3.bed | perl -lane 'print "$F[0]\t$F[1]\t$F[2]\t.\t0\t."' > bed6.bed
See if this works:
awk -v OFS='\t' '{print $1,$2,$3,".",".","."}' bed3.txt > bed6.txt
If the last three columns are to be empty instead of ".", this may work (note that some tools may reject empty fields):
awk -v OFS='\t' '{print $1,$2,$3,"","",""}' bed3.txt > bed6.txt
In theory, the 5th column should be a score from 0 to 1000 (see https://genome.ucsc.edu/FAQ/FAQformat.html#format1), which is why 0 is better than '.'
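A quick way to check a converted file against these rules might look like the sketch below. The file names check.bed and problems.txt and the two test lines are hypothetical; the rules checked are 6 tab-separated columns, an integer score in 0-1000, and a strand of +, - or ".".

```shell
# Hypothetical test input: line 1 is fine, line 2 has "." as its score.
printf 'chr1\t100\t200\t.\t0\t+\nchr1\t300\t400\t.\t.\t+\n' > check.bed

# Report each line that breaks one of the three rules.
awk -F'\t' '
NF != 6                             { print "line " NR ": expected 6 columns, found " NF; next }
$5 !~ /^[0-9]+$/ || $5 > 1000       { print "line " NR ": score \"" $5 "\" is not an integer in 0-1000" }
$6 != "+" && $6 != "-" && $6 != "." { print "line " NR ": strand \"" $6 "\" is not +, - or ." }
' check.bed > problems.txt

cat problems.txt
```

An empty problems.txt means every line passed these three checks; it says nothing about whether the coordinates themselves are sensible.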
The question is not clear. Can you give example input and output, and say what additional columns you want to add?