5.5 years ago
Assa Yeroslaviz
I have a ChIP-Seq data set with 4 IP and 4 Input samples as well as two IgG samples. When mapping the fastq samples (paired-end samples), I get very good results for the Input samples, but very bad results for the IgG samples and only slightly better ones for the IP samples:
Input1 | 93.61%
Input2 | 97.41%
Input3 | 97.75%
Input4 | 94.08%
IP1 | 54.41%
IP2 | 83.16%
IP3 | 75.63%
IP4 | 63.76%
IP_IgG1 | 24.76%
IP_IgG2 | 52.90%
I would like to understand why the IP and IgG reads weren't mapped as well as the Input. I have searched for contamination, but couldn't see any.
I would appreciate your suggestions as to how to pinpoint this problem.
thanks
Looking only at the percentage of aligned reads, I would say only one sample, i.e. IP_IgG1, has a problem; the rest appear to be fine. The problem could be due to the efficiency of the antibody; you might also look at the total number of reads. How did you search for contamination?
I ran the files with fastq_screen against multiple indexed genomes. It didn't find any disconcerting results. In my experience ~50% is not a good mapping result. I can live with it, but I would like to know why the reads didn't map.
I don't think it is the antibody. Sample IP_IgG1 was also the one with the fewest reads (approx. 5 million), but I don't know whether that explains why 75% of the sequenced reads were not mapped. Antibody specificity would explain the low number of reads in the sample, which is expected to be low in IgG-treated samples in particular. But in my opinion, if the reads were sequenced, why shouldn't they also map? If they didn't map, what did I sequence?
There are two reasons I can think of: either the reads are not mapping because they are of low quality, or there is some contamination. I would try running BLAST on a few of the unmapped reads and see whether I get any hits that point to contamination.
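To check both possibilities at once, something like the following could be used. It is only a rough sketch: it assumes the aligner has already written the unmapped reads to an uncompressed FASTQ file with 4-line records and Phred+33 quality encoding. The function names (`read_fastq`, `mean_phred`, `sample_for_blast`) are made up for this illustration and do not come from any library. The script draws a small random sample of reads, converts it to FASTA for BLAST, and reports the mean base quality of the sample.

```python
import random
import statistics


def read_fastq(lines):
    """Yield (name, sequence, quality) from an iterable of FASTQ lines.

    Assumes plain 4-line records (no wrapped sequences).
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between or after records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, not needed here
        qual = next(it).rstrip("\n")
        # Drop the leading '@' and anything after the first whitespace
        yield header[1:].split()[0], seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of one quality string (Phred+33 assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def sample_for_blast(records, n=20, seed=0):
    """Pick up to n random reads; return (FASTA text, mean Phred of the sample).

    Note: this loads all records into memory, so for very large files
    it is better to subsample the FASTQ first.
    """
    records = list(records)
    rng = random.Random(seed)
    picked = rng.sample(records, min(n, len(records)))
    fasta = "".join(f">{name}\n{seq}\n" for name, seq, _ in picked)
    quals = [mean_phred(q) for _, _, q in picked if q]
    return fasta, (statistics.mean(quals) if quals else float("nan"))


# Tiny inline example instead of a real file (toy data, not from the thread)
demo = "@r1 extra\nACGT\n+\nIIII\n@r2\nGGCC\n+\n!!!!\n"
fasta, mean_q = sample_for_blast(read_fastq(demo.splitlines()), n=2)
print(fasta, end="")
print(f"mean Phred of sampled reads: {mean_q:.1f}")
```

For a real sample, `read_fastq` can be given an open file handle instead of the demo string, and the resulting FASTA text can be pasted into the NCBI BLAST web form. A very low mean quality in the sample would point towards the first explanation; consistent hits to an unexpected organism or to adapter/vector sequences would point towards the second.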