Question

Determine Condition using LogFC values

0

8.9 years ago

AB ▴ 360

Hi Everyone,

I am analyzing RNA-Seq data from 22 samples in 3 batches for differential gene expression. The comparison I am testing is defective versus normal phenotype. For one of the samples, the condition is indeterminate: from the slides it appears to be slightly defective, but it might not be. Is there any way I can determine how to label it? Would it help to compare the logFC values I get when I label it first as defective and then as normal?

Thanks

RNA-Seq fold-changes • 2.0k views

updated 2.3 years ago by Ram 44k • written 8.9 years ago by AB ▴ 360

0

If you're unsure what the sample is, it is best to exclude it (for the sake of anyone else who relies on the results, not just for your own analysis). On the other hand, if you want to see how similar the replicates are, you can use simple Pearson correlation values or plots to help make the decision.
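A minimal sketch of that correlation check, written in Python with NumPy rather than any RNA-Seq package. Everything here is an assumption for illustration: the helper name `mean_correlation_by_group`, the label strings, and the expression matrix, which is synthetic (500 simulated genes, 4 normal, 4 defective and 1 unlabelled sample) and stands in for real log-scale expression values such as log-CPM.

```python
import numpy as np

def mean_correlation_by_group(log_expr, labels, unknown_label="unknown"):
    """Average Pearson correlation of the unlabelled sample(s) with each labelled group.

    log_expr : genes x samples array of log-scale expression values
    labels   : one label per column; columns tagged unknown_label are the query samples
    """
    labels = np.asarray(labels)
    corr = np.corrcoef(log_expr, rowvar=False)  # samples x samples correlation matrix
    unknown = np.flatnonzero(labels == unknown_label)
    groups = sorted(set(labels.tolist()) - {unknown_label})
    return {
        g: float(corr[np.ix_(unknown, np.flatnonzero(labels == g))].mean())
        for g in groups
    }

# Synthetic data only (hypothetical, not from the thread).
rng = np.random.default_rng(0)
n_genes = 500
baseline = rng.normal(5.0, 2.0, n_genes)  # per-gene baseline log expression
effect = np.zeros(n_genes)
effect[:50] = 3.0                         # 50 genes shifted in "defective" samples

labels = ["normal"] * 4 + ["defective"] * 4 + ["unknown"]
strength = [0.0] * 4 + [1.0] * 4 + [0.8]  # the unknown sample is simulated as mostly defective
log_expr = np.column_stack(
    [baseline + s * effect + rng.normal(0.0, 0.3, n_genes) for s in strength]
)

means = mean_correlation_by_group(log_expr, labels)
for group, value in means.items():
    print(f"mean Pearson r with {group}: {value:.3f}")
```

Correlations between RNA-Seq samples tend to be high across the board because most genes do not change, so it is the difference between the two group means that matters, not their absolute size. Restricting the matrix to the most variable genes is a common way to make that difference easier to see.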

updated 5.0 years ago by Ram 44k • written 8.9 years ago by arnstrm ★ 1.9k

Ram · Answer 1 · 2015-12-21

1

8.9 years ago

dariober 15k

Principal components analysis (PCA) is sometimes applied to expression levels from RNA-Seq data to spot outliers or otherwise unexpected sample behaviour. You could apply it to your case and first see whether the samples cluster neatly by condition, then see which cluster your undetermined sample fits best. Having said that, I would be careful about labelling this sample as one group or the other just on the basis of PCA.

If you are using edgeR, look at the function plotMDS; I think DESeq has a similar function.
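To make the idea concrete, here is a rough sketch of PCA on a genes x samples matrix, written in Python with NumPy. It is plain PCA via SVD, not a reimplementation of edgeR's plotMDS, and the helper name `pca_scores`, the label strings and the data are all assumptions: the matrix is synthetic (4 normal, 4 defective and 1 unlabelled sample) and stands in for real log-scale expression values.

```python
import numpy as np

def pca_scores(log_expr, n_components=3):
    """PCA of samples via SVD.

    log_expr : genes x samples array of log-scale expression values
    Returns (scores, explained): sample coordinates on the leading components
    and the fraction of total variance each of those components explains.
    """
    x = np.asarray(log_expr, dtype=float).T  # samples x genes
    x = x - x.mean(axis=0)                   # centre each gene across samples
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, explained

# Synthetic data only (hypothetical, not from the thread).
rng = np.random.default_rng(1)
n_genes = 500
baseline = rng.normal(5.0, 2.0, n_genes)
effect = np.zeros(n_genes)
effect[:50] = 3.0                            # 50 genes shifted in "defective" samples

labels = np.array(["normal"] * 4 + ["defective"] * 4 + ["unknown"])
strength = [0.0] * 4 + [1.0] * 4 + [0.8]     # the unknown sample is simulated as mostly defective
log_expr = np.column_stack(
    [baseline + s * effect + rng.normal(0.0, 0.3, n_genes) for s in strength]
)

scores, explained = pca_scores(log_expr)
pc1 = scores[:, 0]
unknown_pc1 = pc1[labels == "unknown"][0]

# Distance along PC1 from the unknown sample to each group's centroid.
dist = {
    g: float(abs(unknown_pc1 - pc1[labels == g].mean()))
    for g in ("normal", "defective")
}
print("variance explained by PC1-3:", np.round(explained, 3))
print("PC1 distance to group centroids:", dist)
```

A smaller distance to one centroid is only a hint, in line with the caution above. The remaining columns of `scores` (PC2, PC3) can be inspected the same way to see whether the sample sits apart from both groups on a later component.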

updated 5.0 years ago by Ram 44k • written 8.9 years ago by dariober 15k

0

PCs on all genes will classify the unknown sample correctly, but I bet you $100 the anomalous sample lies somewhere between the two groups and is set apart from them on the third PC.

written 8.9 years ago by karl.stamm 4.1k