End coordinate bigger than chromosome size when converting bedGraph to bigWig / setting macs2 chromosome size
7.3 years ago
salamandra ▴ 550

I used bedGraphToBigWig to convert the treat_qvalue.bdg file that macs2 outputs into a bigwig file.

But it gives the error:

 End coordinate 92691 bigger than chr19_gl000208_random size of 92689 line 3066 of macs_chipvsinput_FOS_day2_treat_qvalue.bdg

This error normally appears when coordinates fall outside the chromosome bounds, for example when reads are aligned to one version of the genome but the bigWig conversion uses chromosome sizes from another. However, I used hg19 throughout (code below). Some people say that macs2 does not know the chromosome sizes: it extends reads, and reads close to a chromosome end can be extended past it, which is probably why they fall outside the sizes listed in chrom.sizes.

Some suggest clipping the intervals that fall outside the chromosome size. This solves the immediate problem, but now I'm afraid I did something wrong in the analysis. Why doesn't macs2 recognize the chromosome sizes?

# BOWTIE
SUM=path/ChIP_summary_.txt
UNMAP=path/ChIP_unmappedreads.txt
BWTID=path/btw2index/hg19
READS=path/ChIP.fq
OUTPUT=path/ChIP.SAM
bowtie2 -p 24 --un $UNMAP --no-unal -x $BWTID -U $READS -S $OUTPUT 2> $SUM  
# then repeated this for file with Input instead of ChIP

# Convert SAM into BAM
samtools view -bS ChIP.SAM > ChIP.bam
samtools view -bS Input.SAM > Input.bam

# MACS2
TRT=path/ChIP.bam
CTRL=path/Input.bam
OUT=path/macs
NAME=macs_chipvsinput
mkdir $OUT
cd $OUT
macs2 callpeak -t $TRT -c $CTRL -g hs -n $NAME -f BAM -B -q 0.05 # -B writes bedGraph output

# CONVERT TO BIGWIG
# note: fetchChromSizes is the script (retrieved at http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/fetchChromSizes) used to produce the file hg19.chrom.sizes
sh fetchChromSizes hg19 > hg19.chrom.sizes
bedGraphToBigWig macs_chipvsinput_treat_qvalue.bdg hg19.chrom.sizes chipvsinput.bw
ChIP-Seq • 4.8k views

Download the bedClip tool from the UCSC utilities webpage. Run it on your bedGraph file and it should remove the features that run off the ends of your chromosomes.
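A minimal sketch of that workflow, assuming `bedClip` and `bedGraphToBigWig` are both on your PATH and that the chrom.sizes file is the same one used for the conversion. The helper name `clip_and_convert` is made up for illustration. The sort step is there because bedGraphToBigWig expects input sorted by chromosome and start.

```shell
# Hypothetical helper: clip a bedGraph to chromosome bounds, then build a bigWig.
# Arguments: <input.bdg> <chrom.sizes> <output.bw>
clip_and_convert() {
    cc_bdg=$1; cc_sizes=$2; cc_out=$3

    # Fail early if the UCSC tools are not installed.
    for cc_tool in bedClip bedGraphToBigWig; do
        command -v "$cc_tool" >/dev/null 2>&1 || {
            echo "missing tool: $cc_tool" >&2; return 127; }
    done
    [ -r "$cc_bdg" ] && [ -r "$cc_sizes" ] || {
        echo "cannot read input files" >&2; return 1; }

    cc_tmp=$(mktemp -d) || return 1

    # 1) drop intervals that run off the chromosome ends
    # 2) sort by chromosome, then start
    # 3) convert the cleaned bedGraph to bigWig
    bedClip "$cc_bdg" "$cc_sizes" "$cc_tmp/clipped.bdg" &&
        LC_ALL=C sort -k1,1 -k2,2n "$cc_tmp/clipped.bdg" > "$cc_tmp/sorted.bdg" &&
        bedGraphToBigWig "$cc_tmp/sorted.bdg" "$cc_sizes" "$cc_out"
    cc_status=$?

    rm -rf "$cc_tmp"
    return "$cc_status"
}

# Example call, using the file names from the question:
# clip_and_convert macs_chipvsinput_treat_qvalue.bdg hg19.chrom.sizes chipvsinput.bw
```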

7.3 years ago

You answered it yourself - the reads get extended past the end of the chromosome as part of MACS analyses. Plus MACS doesn't know which reference genome you've actually aligned your reads to, nor does it care. Your analysis is fine.
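If you want to see how many intervals are affected and by how much, something like the following should do. It is only a sketch: `list_overshoots` is a made-up helper name, and it assumes a two-column chrom.sizes file and a standard four-column bedGraph.

```shell
# Hypothetical helper: list bedGraph intervals whose end exceeds the chromosome size.
# Arguments: <chrom.sizes> <input.bdg>
# Output columns: chrom, start, end, number of bases past the chromosome end
list_overshoots() {
    awk 'BEGIN { OFS = "\t" }
         # first file: remember the size of each chromosome
         NR == FNR { size[$1] = $2 + 0; next }
         # second file: report intervals that end beyond that size
         ($1 in size) && ($3 + 0 > size[$1]) {
             print $1, $2, $3, $3 - size[$1]
         }' "$1" "$2"
}

# Example call, using the file names from the question:
# list_overshoots hg19.chrom.sizes macs_chipvsinput_treat_qvalue.bdg
```

For the error in the question this would report an overshoot of 2 bases (92691 against a size of 92689), which is consistent with read extension at the contig end and not with a genome mismatch.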

I find that removing the mitochondrial chromosome and the random/haplotype contigs usually helps with this a lot. You can also write a script that runs through your bedGraph files and removes or fixes the elements that go past the end of the chromosome.
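Here is a rough sketch of such a script using awk. The function name `clip_bedgraph` is made up, and the contig patterns (`chrM`, `*_random`, `chrUn_*`, `*_hapN`) are an assumption based on hg19 naming, so adjust them for other assemblies. It truncates overshooting intervals to the chromosome size instead of dropping them.

```shell
# Hypothetical helper: drop unwanted contigs and clip intervals to chromosome bounds.
# Arguments: <chrom.sizes> <input.bdg>; the cleaned bedGraph goes to stdout.
clip_bedgraph() {
    awk 'BEGIN { OFS = "\t" }
         # first file: remember the size of each chromosome
         NR == FNR { size[$1] = $2 + 0; next }
         # skip track lines and contigs missing from chrom.sizes
         !($1 in size) { next }
         # skip mitochondrial, random, unplaced and haplotype contigs (hg19-style names)
         $1 == "chrM" || $1 ~ /_random$/ || $1 ~ /^chrUn_/ || $1 ~ /_hap[0-9]+$/ { next }
         {
             s = $2 + 0; e = $3 + 0
             if (e > size[$1]) e = size[$1]   # truncate at the chromosome end
             if (s < e) print $1, s, e, $4    # drop intervals left empty by clipping
         }' "$1" "$2"
}

# Example call, using the file names from the question:
# clip_bedgraph hg19.chrom.sizes macs_chipvsinput_treat_qvalue.bdg > clipped.bdg
```

Truncating keeps the signal at the very end of each chromosome, whereas dropping the whole interval would leave a small gap there. Either way the affected region is only a few bases wide.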
