Bowtie2 mapping results and a very low number of peaks in ChIP-seq data?
7.4 years ago
Azhar ▴ 50

Hello everyone. I used two TFs to do ChIP in mouse tissues. I know single-end sequencing is usually used for ChIP, but I used paired-end sequencing. When I got the results and mapped them with Bowtie2, I got these strange mapping results:

    7908233 reads; of these:
      7908233 (100%) were paired; of these:
        5096777 (64.45%) aligned concordantly 0 times
        2186477 (27.65%) aligned concordantly exactly 1 time
        624979 (7.90%) aligned concordantly >1 times
        ----
        5096777 pairs aligned concordantly 0 times; of these:
          1284864 (25.21%) aligned discordantly 1 time
        ----
        3811913 pairs aligned 0 times concordantly or discordantly; of these:
          7623826 mates make up the pairs; of these:
            6918974 (90.75%) aligned 0 times
            262375 (3.44%) aligned exactly 1 time
            442477 (5.80%) aligned >1 times
    56.25% overall alignment rate

I know that >70% unique mapping is normal, but I could not understand what these results mean. Is this good mapping or not? Is my sequencing result good according to this mapping?
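As a rough aid to reading the summary above, the sketch below recomputes the overall alignment rate from the reported counts. It assumes that every aligned pair (concordant or discordant) contributes two reads and that mates aligning on their own contribute one read each; the function and variable names are illustrative only.

```python
# Counts copied from the bowtie2 summary in the post.
total_pairs = 7_908_233
concordant_once = 2_186_477
concordant_multi = 624_979
discordant_once = 1_284_864
mates_once = 262_375    # single mates from pairs that did not align as a pair
mates_multi = 442_477


def overall_alignment_rate(total_pairs, concordant, discordant, single_mates):
    """Fraction of all reads (two per pair) that aligned in any way."""
    aligned_reads = 2 * (concordant + discordant) + single_mates
    return aligned_reads / (2 * total_pairs)


rate = overall_alignment_rate(
    total_pairs,
    concordant_once + concordant_multi,
    discordant_once,
    mates_once + mates_multi,
)
unique_concordant = concordant_once / total_pairs

print(f"overall alignment rate: {rate:.2%}")
print(f"pairs aligned concordantly exactly once: {unique_concordant:.2%}")
```

Under these assumptions the first figure matches the 56.25% that bowtie2 reports, while the second line shows that only about 28% of pairs aligned concordantly and uniquely, which is the number to compare against the >70% rule of thumb.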

Peaks: Another thing is the number of peaks. The number I find is very low: 244 for IgG, 222 for TF1, and 170 for TF2, and when I filter them against IgG the number drops even further. My first question: I think the number of peaks for a TF across the genome is usually in the thousands, but here it is in the hundreds? Second, IgG seems to have more peaks than the samples?

ChIP-Seq • 4.1k views

Has the mapping output been truncated? It looks like something is missing. E.g., "3811913 pairs aligned ) times" does not make sense. Did you check the quality of the FASTQ files before mapping them? A good tool for that is FastQC.


It's just a typing mistake. Yes, I did the FastQC check and the basic statistics look okay. If you want, I can share the QC report; it has some cases, but the basic statistics are okay.


Could you correct all the copy/paste mistakes? It's very hard to help without that :-). E.g., "5096777 pairs aligend concordantly XX times".


I have now edited the post and copy-pasted the exact result, so kindly help me out. First, tell me: is >50% alignment okay for peak calling? And why so few peaks?

7.4 years ago

50% alignment is far from ideal. I would be very wary of any peaks called from these alignments. It's hard to say much though, as we really need more information about your methodology, your experimental design, and your expectations.

  1. What commands did you use to align these files?
  2. What peakcaller did you use and with what options?
  3. How many peaks would you expect for these TFs? Do they bind readily across the genome or are they much more focused and limited?
  4. Additionally, are you sure the antibodies used were good for ChIP?
  5. Did you do any PCR with positive/negative control primers from your ChIP DNA to ensure the ChIP actually worked?

Be explicit and provide as much information as possible if you want anyone to put in the effort to help you with your problem.

Typically, yes, one would expect thousands of peaks from a ChIP-seq experiment, but if a given TF was very specific maybe you'd only see <1000 peaks. If these are common TFs that have been used for ChIP-seq before, I'd recommend looking at the data from those papers and seeing how many peaks they might have found. I expect it'd be more than a few hundred.


Thanks a lot. Methodology: I used a protocol validated in our lab. The experimental design is tissues from two conditions, and the expectation is thousands of peaks (proof: one of my lab mates has done ChIP for the same TF with the same protocol).

  1. I used Bowtie2 with the command bowtie2 -x genomepath/genome -1 pathtofile/file -2 pathtofile/file -S pathtofile/samfile
  2. I used Bowtie2 command and options are shown above
  3. My TF is renowned as the master weaver of the genome; it is a genome-wide binding TF.
  4. and 5. I used the antibodies before to check my protocol and validated them by qPCR. The ChIP was working. I prepared the sequencing library and it was okay. The sequencing company sent the first QC report before sequencing and it was okay.


bowtie2 doesn't call peaks unless the software has drastically changed recently.

You might try aligning each file individually and seeing if the alignment % increases much. If it doesn't increase, you would know that something went wrong with your library prep or sequencing itself and the company might be worth contacting. If it does increase, you would know something is wrong with your alignment pipeline.
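One way to set up that per-file check is sketched below. It builds one unpaired bowtie2 run per mate file using the `-U` option; the index path, FASTQ names, and output prefix are placeholders rather than the poster's actual files, and `run_all` is only a hypothetical helper for executing the commands once real paths are filled in.

```python
import shlex
import subprocess


def single_end_commands(index, mate1, mate2, out_prefix):
    """Build one unpaired bowtie2 command per mate file (-U = unpaired reads)."""
    return [
        ["bowtie2", "-x", index, "-U", fastq, "-S", f"{out_prefix}_{label}.sam"]
        for label, fastq in (("R1", mate1), ("R2", mate2))
    ]


def run_all(commands):
    """Run each command and print the alignment summary bowtie2 writes to stderr."""
    for cmd in commands:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        print(shlex.join(cmd))
        print(result.stderr)


# Placeholder paths (assumptions); substitute real ones before calling run_all(commands).
commands = single_end_commands(
    "genomepath/genome", "reads_1.fastq", "reads_2.fastq", "sample"
)
for cmd in commands:
    print(shlex.join(cmd))
```

Comparing the two single-end alignment rates with the paired-end rate shows whether the reads themselves fail to map or whether only the pairing is the problem.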


Of course I know what Bowtie is for; I used it for mapping. My question still stands: how can I explain my mapping results?

And I have 244 peaks for IgG, 100 for S1, and 27 for O1 (the second TF).

Usually the number of peaks for my TF in previously published papers in tissues is several thousand.


Then why did you say that you used it for peak calling as an answer to my second question?

What peakcaller did you use and with what options? I used Bowtie2 command and options are shown above

Regardless, have you tried aligning each file individually? That will tell you if the actual reads are all bad somehow, or if you're running into an issue during alignment. Have you visualized the aligned files at all in IGV or a genome browser? That might help you determine how much enrichment you're actually getting for your positive control regions.

Any chance of contamination in your cells? Maybe you have a large amount of contaminating DNA from a fungal or bacterial infection if it was done from cultured cells.


I think I mentioned that I used tissue, specifically heart tissue. I checked my FastQC report again and there is a red error sign for duplicated reads, which indicates that more than 50% of the reads are duplicated. How can I remove duplicate reads? Can you suggest a tool? I have SAM files; is it better to remove duplicates at the FASTQ level before mapping, or after mapping in the SAM files? And which tools are appropriate in each case?


I'd be careful removing duplicated reads from a TF ChIP-seq experiment, as you ideally will have some amount of duplicate reads if your ChIP is enriching well. 50% does seem pretty high though.

Picard's MarkDuplicates command can be used for removing duplicate reads from a BAM or SAM file very easily (by default it only flags duplicates; set REMOVE_DUPLICATES=true to drop them).

Still, I'd really recommend visualizing your files in IGV to see if there are any peaks at your positive control regions.


Thanks Jared.Andrews07, you are right. I have visualized the IgG and sample TF bedGraph files. IgG has peaks of the same size as the sample, or even larger, and the number of peaks is higher than in the sample. That's strange, isn't it? I also tried to use Picard, but I don't know whether I am using it right, because for the Picard installation you need to clone with git and that command did not work on my system. So I downloaded the jar file, unzipped it, changed into the directory, and ran picard tools -help, but I still have a problem: I ran java -jar picard.jar MarkDuplicates I=file.sam O=file.sam, but after some processing the shell says the SAM file needs to be sorted.


You need to sort your SAM file first with samtools sort -o sorted.sam unsorted.sam. It really seems to me that your ChIP may not have worked, particularly if you're getting the same peaks in both your TF and IgG samples - those peaks are probably artifacts.

If your ChIP qPCR worked as you described, it seems like something went wrong during library prep.
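The sort-then-deduplicate step discussed above could be scripted roughly as follows. This is a sketch under assumptions: the file names and the `picard.jar` location are placeholders, the output is written to a different file than the input, and `M=` names the metrics file that MarkDuplicates requires. The commands are only built and printed here, not executed.

```python
import shlex


def dedup_commands(sam_in, prefix, picard_jar="picard.jar"):
    """Build the coordinate sort and the Picard MarkDuplicates command, in order."""
    sorted_sam = f"{prefix}.sorted.sam"
    sort_cmd = ["samtools", "sort", "-o", sorted_sam, sam_in]
    markdup_cmd = [
        "java", "-jar", picard_jar, "MarkDuplicates",
        f"I={sorted_sam}",
        f"O={prefix}.dedup.sam",         # must differ from the input file
        f"M={prefix}.dup_metrics.txt",   # duplication metrics report
        "REMOVE_DUPLICATES=true",        # drop duplicates instead of only flagging them
    ]
    return [sort_cmd, markdup_cmd]


# Placeholder file names (assumptions), printed so they can be reviewed before running.
for cmd in dedup_commands("file.sam", "file"):
    print(shlex.join(cmd))
```

Leaving out REMOVE_DUPLICATES=true keeps the duplicates in the file but flagged, which may be the safer choice for a TF ChIP where some duplication is expected.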
