Hi everyone,
I have just gone through my first ChIP-seq analysis with Galaxy and MACS2. Before asking, I read the MACS2 and MACS 1.4.2 READMEs over and over and did many Google and forum searches, but I'm still not clear on a couple of points and would like some confirmation so that I'm not making the wrong assumptions.
After calling peaks with MACS2, I get a BED file and an xls file for my peaks. The xls file has 10 columns, with these headers and first few rows:
chr start end length abs_summit pileup #NAME? fold_enrichment #NAME? name
chr1 860036 860236 201 860135 25 16.6324 7.08525 11.87312 MACS2_peak_1
chr1 879295 879531 237 879459 19 10.08018 5.00001 6.17729 MACS2_peak_2
chr1 895015 895266 252 895104 20 10.92848 5.25001 6.86955 MACS2_peak_3
chr1 933521 933733 213 933677 15 9.8888 5.66653 6.02773 MACS2_peak_4
chr1 949415 949781 367 949584 23 14.15974 6.28273 9.6336 MACS2_peak_5
I have to say, I don't find the MACS documentation clear or all that useful for beginners, especially when it says things like "Information include: chromosome name, start position of peak, end position of peak etc etc". My table clearly includes the column headers described in the MACS README files, but it also has 2 extra columns that aren't described or labelled, and I'd like to know what they are. I don't want to assume the wrong things. Can anyone tell me what the two "#NAME?" columns are? My BED file seems to lack column headers for this information too.
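In case it helps anyone reproduce this: "#NAME?" looks like the error value a spreadsheet program shows when it tries to evaluate a cell as a formula (a header starting with "-" or "=" is a common trigger), so the real header text may still be in the file. Here is a minimal sketch for printing the header line as stored, assuming the xls output is really tab-separated text with "#" comment lines at the top. The function name and the sample text are made up for illustration; the sample headers are placeholders, not the real MACS2 names.

```python
import io

def read_peak_header(handle):
    """Return the first non-comment, non-blank line, split on tabs."""
    for line in handle:
        # Skip '#' comment lines and empty lines (assumed layout of the file).
        if line.startswith("#") or not line.strip():
            continue
        return line.rstrip("\n").split("\t")
    return []

# Made-up stand-in for the file contents; "-placeholder_header" is hypothetical.
sample = (
    "# This line stands in for a comment block\n"
    "\n"
    "chr\tstart\tend\t-placeholder_header\tname\n"
    "chr1\t860036\t860236\t16.6324\tMACS2_peak_1\n"
)

header = read_peak_header(io.StringIO(sample))
print(header)
```

With a real file, `open("peaks.xls")` (hypothetical file name) could be passed in place of the `io.StringIO` object.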
Also, can someone help me understand the pileup column? Is it the number of reads at the summit location? And is this number normalised to reads per million, or, if I want that information, do I divide these numbers by the library size myself after running MACS2?
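To make the arithmetic in that last question concrete, here is a minimal sketch of what I mean by scaling to reads per million. It assumes the pileup values are raw, unscaled counts, which is exactly the thing I'm unsure about, and the library size is a made-up number, not my real one.

```python
def per_million(pileup, library_size):
    """Scale a raw count to a per-million-reads value."""
    return pileup * 1_000_000 / library_size

# Pileup values from the five peaks shown above.
pileups = [25, 19, 20, 15, 23]

# Hypothetical total number of reads in the ChIP library.
library_size = 20_000_000

rpm = [per_million(p, library_size) for p in pileups]
for p, value in zip(pileups, rpm):
    print(p, value)
```

If MACS2 has already scaled the pileup, doing this would normalise twice, which is why I'd like to confirm first.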
Thanks!
Claire (a frustrated tired newbie)
Thanks Joseph!