Question

Explaining PCA plots made from BigWig coverage files in DeepTools

6.1 years ago

a.rex ▴ 350

I have 3 ATAC-seq conditions, with 2 replicates each, and I want to compare how similar they are to each other. I mapped the reads with BWA, then converted the output BAM files to bigWig format using bamCoverage with RPKM normalisation.

I then used multiBigwigSummary as follows:

./multiBigwigSummary bins -b cellpop1_rep1 cellpop1_rep2 cellpop2_rep1 cellpop2_rep2 cellpop3_1 cellpop3_2 -out cell123.npz --binSize=100

The PCA plot of the resultant count summary looks like this: [image: PCA plot]

Am I right in thinking that my replicates cluster well together, meaning they are quite similar? There are differences along PC2, but the eigenvalue of this second component is minimal. Does that mean the differences between the samples are small, but do exist?

deeptools atac-seq • 3.3k views

updated 6.1 years ago by Ram 44k • written 6.1 years ago by a.rex ▴ 350

Rerun plotPCA with --transpose.

6.1 years ago by Devon Ryan 105k

What does this do? I don't understand what transposing does to the information.

6.1 years ago by a.rex ▴ 350

It ... transposes the data, turning rows into columns and columns into rows. Where rows and columns differ in meaning, as they do in data.frames and in the input to PCA (observations versus variables), this can significantly change what the output means. That is unlike a plain 2D matrix of numbers, where the two axes carry no particular meaning.

6.1 years ago by Ram 44k

Basically it does what Ram indicated. The gist is that at the moment PC1 reflects genomic-position-level changes, which tend to be huge for ATAC data. That ends up masking what you're actually interested in, namely how well your samples cluster together. Transposing the matrix makes the PCA look more at that (this is the standard in things like RNA-seq).

6.1 years ago by Devon Ryan 105k
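
To make the effect of transposing concrete, here is a small NumPy sketch on simulated data. It is not how deepTools computes its PCA, and the numbers are invented: it assumes a strong coverage profile shared by all samples, a weaker condition-specific effect, and small replicate noise. The `pca` helper and all variable names are hypothetical. It runs the same PCA on a bins-by-samples matrix and on its transpose so the two outputs can be compared.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins = 2000

# Simulated coverage (assumption, not real ATAC-seq data): a strong profile
# shared by every sample, a weaker per-condition effect, small replicate noise.
shared = rng.normal(0.0, 5.0, size=n_bins)
samples = []
for _ in range(3):                       # 3 conditions
    condition = rng.normal(0.0, 1.0, size=n_bins)
    for _ in range(2):                   # 2 replicates per condition
        samples.append(shared + condition + rng.normal(0.0, 0.1, size=n_bins))
bins_by_samples = np.column_stack(samples)   # shape (n_bins, 6)


def pca(matrix):
    """PCA via SVD, treating rows as observations and columns as variables."""
    centered = matrix - matrix.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return u * s, vt, explained          # scores, principal axes, variance ratios


# Bins as observations: each sample gets a loading on each component.
_, axes, var_bins = pca(bins_by_samples)
pc1_loadings = axes[0]                   # one value per sample

# Samples as observations (transposed matrix): each sample gets a score.
scores, _, var_samples = pca(bins_by_samples.T)
sample_xy = scores[:, :2]                # PC1/PC2 coordinates per sample

# Distance between two replicates vs. between two different conditions.
within = np.linalg.norm(sample_xy[0] - sample_xy[1])
between = np.linalg.norm(sample_xy[0] - sample_xy[2])

print("PC1 variance share, bins as rows:", round(float(var_bins[0]), 3))
print("PC1 loadings per sample:", np.round(pc1_loadings, 2))
print("replicate distance:", round(float(within), 2))
print("condition distance:", round(float(between), 2))
```

In the untransposed case, PC1 is dominated by the profile that all samples share, so every sample loads on it almost equally and the sample-to-sample differences are pushed into the minor components. In the transposed case, centering removes that shared profile and the leading components describe how the samples differ from each other, which is the question being asked here.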