Question

Blocky p300 ChIP-seq reads

0

23 months ago

alphaflylizard • 0

I recently performed a p300 ChIP-seq on two different genotypes.

I aligned my reads with STAR and made UCSC bedGraph files, which I loaded into IGV. I noticed that the reads for one of my genotypes, genotype X (red: combined replicates; purple and pink: individual replicates), look very blocky. The reads for the other genotype, genotype Y (black: combined replicates; green and blue: individual replicates), look better.

[Image: IGV screenshot]

Hnrnpa0 is a known target of p300 which is why I chose it as an example.

What could be the reason why the reads look 'blocky'?

How can I troubleshoot this?

Could it be due to duplicate reads, or is it just a very bad ChIP?

I attempted to run samtools markdup. I set up the following commands for one of my genotype replicates:

Convert my sam files to bam format:

samtools view -b genotypeX_R1.sam > genotypeX_R1.bam

Then I followed the samtools markdup steps as follows:

samtools sort genotypeX_R1.bam -o sorted_genotypeX_R1.bam
samtools index sorted_genotypeX_R1.bam
samtools markdup -r sorted_genotypeX_R1.bam marked_duplicates_genotypeX_R1.bam
samtools flagstat marked_duplicates_genotypeX_R1.bam

Are these the right commands to mark duplicates in my files?
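For reference, the workflow in the samtools documentation runs `samtools fixmate -m` on name-grouped reads before `markdup`, and `markdup` itself does not need an index. The following is a minimal sketch, assuming paired-end reads (for single-end reads the commands above may already be sufficient). The function name `markdup_paired` and the intermediate file names are hypothetical.

```shell
# Sketch of a samtools duplicate-marking workflow for paired-end reads.
# Assumptions: paired-end data; function name and intermediate file names are illustrative only.
markdup_paired() (
    in_bam="$1"    # unsorted BAM, e.g. genotypeX_R1.bam
    prefix="$2"    # prefix for output files, e.g. genotypeX_R1

    # 1. Sort by read name so that mates are adjacent.
    samtools sort -n -o "${prefix}.namesorted.bam" "$in_bam" &&
    # 2. Add mate score (ms) and mate CIGAR (MC) tags, which markdup uses.
    samtools fixmate -m "${prefix}.namesorted.bam" "${prefix}.fixmate.bam" &&
    # 3. Sort by coordinate.
    samtools sort -o "${prefix}.sorted.bam" "${prefix}.fixmate.bam" &&
    # 4. Flag duplicates; adding -r would remove them instead of only flagging them.
    samtools markdup "${prefix}.sorted.bam" "${prefix}.markdup.bam" &&
    # 5. Index the result and report how many reads were flagged as duplicates.
    samtools index "${prefix}.markdup.bam" &&
    samtools flagstat "${prefix}.markdup.bam"
)

# Example call with the file name from the question:
# markdup_paired genotypeX_R1.bam genotypeX_R1
```

Flagging duplicates without `-r` keeps them in the file, so the "duplicates" line of `samtools flagstat` shows how large the duplicate fraction is.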

I was introduced to bioinformatics recently and am still struggling to troubleshoot errors that come up in my analysis workflow. 80% of the time I have no idea how to approach or solve issues with my data. :(

Thank you so much for your time and help! Appreciate it!

chip-seq p300chip igv • 1.1k views

ADD COMMENT • link updated 23 months ago by ATpoint 88k • written 23 months ago by alphaflylizard • 0

score 1 · Answer 1 · 2023-08-24

1

23 months ago

ATpoint 88k

If this screenshot is representative of the dataset, then the ChIP failed, as there are no peaks, just noise. Call peaks with macs2: a failed ChIP will manifest as very few to no peaks.
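A minimal sketch of such a peak-calling check, under several assumptions not stated in the thread: an input/control BAM exists (drop `-c` if it does not), the reads are single-end (`-f BAM`; paired-end data would use `-f BAMPE`), and the organism is mouse (`-g mm`). The function name `call_peaks` and the output directory are hypothetical.

```shell
# Sketch: call peaks with MACS2 and count them.
# Assumptions: single-end reads, mouse genome, an input control is available.
call_peaks() (
    chip_bam="$1"    # ChIP alignments (coordinate-sorted BAM)
    input_bam="$2"   # input/control alignments
    name="$3"        # prefix for MACS2 output files

    macs2 callpeak -t "$chip_bam" -c "$input_bam" \
        -f BAM -g mm -n "$name" --outdir macs2_out &&
    # Each line of the narrowPeak file is one called peak.
    wc -l < "macs2_out/${name}_peaks.narrowPeak"
)

# Example call (file names are placeholders):
# call_peaks genotypeX_R1.markdup.bam genotypeX_input.bam genotypeX_R1
```

Comparing the peak count between replicates and genotypes gives a quick, if rough, indication of whether any enrichment was captured.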

ADD COMMENT • link 23 months ago by ATpoint 88k

0

Yes, the screenshot is from my data! Would you say the data for genotype Y is okay?

ADD REPLY • link 23 months ago by alphaflylizard • 0

0

No, from this screenshot it all looks like pure noise.

ADD REPLY • link 23 months ago by ATpoint 88k

score 0 · Answer 2 · 2023-08-24

0

23 months ago

Ming Tommy Tang ★ 4.7k

This is ChIP-seq data, so you should use a DNA aligner. STAR is for RNA-seq, which aligns the reads to the transcriptome rather than the genome.

ADD COMMENT • link 23 months ago by Ming Tommy Tang ★ 4.7k

0

STAR aligns to the genome, optionally with a GTF to guide splicing (supplied during indexing). There is nothing intrinsically wrong with using it here if the index was built without a GTF, although bowtie2 or bwa are indeed more typical choices for these data. @OP, you can try another aligner such as bowtie2 to be sure.
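A minimal sketch of a bowtie2 re-alignment, assuming single-end reads (paired-end data would use `-1` and `-2` instead of `-U`) and an index built beforehand with `bowtie2-build`. The function name `align_bowtie2`, the thread count, and all file names are hypothetical.

```shell
# Sketch: align ChIP-seq reads with bowtie2 and produce a sorted, indexed BAM.
# Assumptions: single-end FASTQ; index prefix created with `bowtie2-build genome.fa <prefix>`.
align_bowtie2() (
    index_prefix="$1"   # bowtie2 index prefix
    fastq="$2"          # reads, e.g. genotypeX_R1.fastq.gz
    out_bam="$3"        # coordinate-sorted output BAM

    # Stream the SAM output straight into samtools sort to avoid a large intermediate file.
    bowtie2 -p 4 -x "$index_prefix" -U "$fastq" |
        samtools sort -o "$out_bam" - &&
    samtools index "$out_bam"
)

# Example call (paths are placeholders):
# align_bowtie2 genome_index genotypeX_R1.fastq.gz genotypeX_R1.bowtie2.bam
```

If the tracks look the same after re-alignment, the aligner can be ruled out as the cause of the blocky signal.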

ADD REPLY • link 23 months ago by ATpoint 88k