PCA in two similar conditions

11 months ago

Meghan.T ▴ 10

I have RNA-seq data from two very similar conditions (the monomer and dimer forms of a treatment). Other in vitro tests show some differences between them, but when I perform PCA, the samples do not cluster by condition. The result looks like: (PCA plot image not available)

I looked through the top 20 genes contributing to PC1, but they did not look relevant. Do you have any suggestions for improving this plot? Should I remove some genes? If so, should I also remove them for the DGE analysis?

RNA-seq PCA • 931 views

updated 11 months ago by jared.andrews07 ★ 19k • written 11 months ago by Meghan.T ▴ 10

Was the experiment performed in two batches? PC1 captures unwanted variation here, so it could simply be a batch effect.

11 months ago by ATpoint 89k

Unfortunately, all of the samples were processed in the same batch. However, we used Takara's TCR RNA-seq library prep kit. Essentially, it collects 10,000 T cells, PCR-amplifies them, and then sequences them. Since T cells are very heterogeneous, this could be the reason. In that case, do you have any suggestions?

11 months ago by Meghan.T ▴ 10

Still, PC1 captures unwanted variation, for whatever reason. Define all samples left of PC1=0 as batch1 and those to the right as batch2, and include that factor in your design. You can use limma's removeBatchEffect on the log2-scale normalized counts to explore the effect of including this batch information in the design.
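To make the idea concrete, here is a rough NumPy sketch of the same logic on simulated data. It is not limma's implementation, just the underlying linear-model idea: split samples by the sign of their PC1 score, fit intercept + condition + batch per gene, and subtract only the batch term. The matrix `logexpr`, the factor `hidden`, the effect sizes, and both function names are made up for illustration.

```python
import numpy as np

def pc_scores(logexpr):
    """Per-sample PC scores for a genes x samples log-expression matrix."""
    centered = logexpr - logexpr.mean(axis=1, keepdims=True)  # center each gene
    u, s, _ = np.linalg.svd(centered.T, full_matrices=False)  # samples x genes
    return u * s  # column k holds the scores on PC(k+1)

def remove_surrogate_batch(logexpr, condition, batch):
    """Subtract a two-level batch effect while keeping the condition term.

    condition and batch are 0/1 arrays with one entry per sample.
    """
    n_samples = logexpr.shape[1]
    batch_centered = batch - batch.mean()  # centering preserves each gene's mean
    design = np.column_stack([np.ones(n_samples), condition, batch_centered])
    coef, *_ = np.linalg.lstsq(design, logexpr.T, rcond=None)  # 3 x genes
    return logexpr - np.outer(coef[2], batch_centered)

# Simulated data (assumption): 200 genes, 8 samples, a strong unknown factor
# on 100 genes and a weaker condition effect on 20 other genes.
rng = np.random.default_rng(0)
condition = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
hidden = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)
logexpr = rng.normal(0.0, 0.1, size=(200, 8))
logexpr[:100] += 3.0 * hidden
logexpr[100:120] += 1.0 * condition

# Samples left of PC1 = 0 become batch 0, samples to the right batch 1.
batch = (pc_scores(logexpr)[:, 0] > 0).astype(float)
corrected = remove_surrogate_batch(logexpr, condition, batch)
pc1_after = pc_scores(corrected)[:, 0]

print("surrogate batch:", batch)
print("PC1 after correction:", np.round(pc1_after, 2))
```

The corrected matrix is only meant for exploration and plotting. For the DGE analysis itself, the batch factor would go into the design rather than being subtracted from the counts beforehand.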

11 months ago by ATpoint 89k

I'd also look at PC3 and onwards to see if any of those capture your expected variation between groups.
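One way to check this systematically, sketched in NumPy on simulated data: for each PC, compute the fraction of total variance it explains and the share of its score variance that lies between the two groups. A PC with a between-group share near 1 separates the conditions, even if it explains little of the total variance. The data, the nuisance factors `nuisance1` and `nuisance2`, and the function name are hypothetical.

```python
import numpy as np

def group_signal_per_pc(logexpr, group):
    """Return (variance explained, between-group share of variance) per PC.

    logexpr is genes x samples; group is a 0/1 array with one entry per sample.
    """
    centered = logexpr - logexpr.mean(axis=1, keepdims=True)  # center each gene
    u, s, _ = np.linalg.svd(centered.T, full_matrices=False)
    keep = logexpr.shape[1] - 1  # centering leaves at most n_samples - 1 PCs
    scores = (u * s)[:, :keep]
    var_explained = s[:keep] ** 2 / np.sum(s ** 2)
    in_b = group.astype(bool)
    n_a, n_b = np.sum(~in_b), np.sum(in_b)
    # Scores have zero mean per PC, so between-group SS uses the group means only.
    between = (n_a * scores[~in_b].mean(axis=0) ** 2
               + n_b * scores[in_b].mean(axis=0) ** 2)
    r2 = between / np.sum(scores ** 2, axis=0)
    return var_explained, r2

# Simulated data (assumption): two nuisance factors dominate PC1 and PC2,
# and the condition effect is weaker and affects fewer genes.
rng = np.random.default_rng(1)
condition = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
nuisance1 = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)
nuisance2 = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
logexpr = rng.normal(0.0, 0.1, size=(200, 8))
logexpr[:100] += 3.0 * nuisance1
logexpr[100:160] += 2.0 * nuisance2
logexpr[160:180] += 1.0 * condition

var_explained, r2 = group_signal_per_pc(logexpr, condition)
best_pc = int(np.argmax(r2)) + 1  # 1-based PC index

for k, (v, r) in enumerate(zip(var_explained, r2), start=1):
    print(f"PC{k}: variance explained {v:.3f}, between-group share {r:.3f}")
print("PC that best separates the groups:", best_pc)
```

In this toy setup the condition only shows up on a later PC, which is the situation worth ruling out before removing genes.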

11 months ago by jared.andrews07 ★ 19k
