Hello all,
I am applying voom normalization to raw RNA-Seq count data obtained from TCGA. I have constructed a matrix of ~20,000 rows and 341 columns, with the first column being the Gene_id.
I am using the voom() method to normalise the data, with the following code.
## Libraries
library(limma)
library(edgeR)
## Matrix file: first column holds the gene IDs, the remaining 340 columns are samples
raw.data <- read.delim("Combined_matrix_340.txt")
names(raw.data)
d <- raw.data[, 2:341]
rownames(d) <- raw.data[, 1]
## Pheno data file
pheno <- read.table("pheno_data_BRCA.txt", header=TRUE, sep="\t")
## Design matrix
Group <- factor(pheno$Status)
design <- model.matrix(~0+Group)
## Rename the columns to the group names so the contrasts below can refer to them
colnames(design) <- levels(Group)
## Normalisation
y <- voom(d, design, plot=TRUE)
fit <- lmFit(y, design)
## Contrast matrix for group differentiation
cont.wt <- makeContrasts("Metastatic-Normal_Control",
                         "ERPositive-Normal_Control",
                         "PRPositive-Normal_Control",
                         "HER2Positive-Normal_Control",
                         "ER_PR_HER2_Neg-Normal_Control",
                         levels=design)
fit2 <- contrasts.fit(fit, cont.wt)
fit3 <- eBayes(fit2)
DE <- topTable(fit3, coef=2)
After running this, the output is as follows:
Gene_ID         logFC     AveExpr   t         P.Value        adj.P.Val      B
ACTB|60         12.59366  12.54151  202.8138  0.000000e+00   0.000000e+00   806.7855
EEF1A1|1915     12.06986  12.51399  187.5779  0.000000e+00   0.000000e+00   781.7838
ACTG1|71        11.93940  12.03115  179.5847  0.000000e+00   0.000000e+00   767.7521
UBC|7316        10.71139  11.15274  176.8877  0.000000e+00   0.000000e+00   761.7751
TPT1|7178       10.99882  11.58788  159.5321  0.000000e+00   0.000000e+00   728.9007
HSP90AB1|3326   11.00446  11.12925  157.1734  9.881313e-323  3.381237e-319  724.0502
FTH1|2495       10.98239  11.26717  153.0514  8.557019e-319  2.509774e-315  715.3888
EEF2|1938       10.82150  11.46502  151.3403  3.886332e-317  9.973786e-314  711.5412
PSAP|5660       10.71044  11.06326  147.8964  9.572234e-314  2.183639e-310  703.8942
HSP90AA1|3320   10.74747  10.94257  144.5401  2.294330e-310  4.710489e-307  696.4700
My question: I am getting a list of only 10 genes and am not able to pull the full list. I would also like someone to validate my code and the method I followed. Please note that I am a novice in coding/bioinformatics, so please let me know whether my code is correct or whether I should modify it.
Thanks a lot for your help.
-Ateeq Khaliq
gives all genes. By default, topTable outputs only the top ten genes.
Thanks a lot, poisonAlien. Could you please validate my code?
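To illustrate the point about topTable's default of ten rows, here is a minimal sketch on simulated counts (not the TCGA data). The group names, matrix size, and variable names are made up for the example. The DGEList/calcNormFactors step is a commonly used addition before voom and is an assumption here, not something taken from the original post. Setting number = Inf in topTable returns every gene, ranked.

```r
library(limma)
library(edgeR)

set.seed(1)

# Simulated counts (hypothetical stand-in for the real matrix): 200 genes x 12 samples
n_genes <- 200
groups <- rep(c("Normal_Control", "Metastatic", "ERPositive"), each = 4)
counts <- matrix(rnbinom(n_genes * length(groups), mu = 100, size = 5),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("sample", seq_along(groups))))

# Design matrix with one column per group, renamed to the plain group names
Group <- factor(groups)
design <- model.matrix(~ 0 + Group)
colnames(design) <- levels(Group)

# TMM normalization factors before voom (assumed step, not in the original post)
dge <- DGEList(counts = counts)
dge <- calcNormFactors(dge)
v <- voom(dge, design, plot = FALSE)

# Fit the model and test each group against Normal_Control
fit <- lmFit(v, design)
cont <- makeContrasts(Metastatic - Normal_Control,
                      ERPositive - Normal_Control,
                      levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont))

top10 <- topTable(fit2, coef = 2)                     # default: number = 10
all_genes <- topTable(fit2, coef = 2, number = Inf)   # every gene, ranked

nrow(top10)
nrow(all_genes)
```

The full table can then be written out with write.table or filtered on adj.P.Val as needed.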
David, could you please help me and tell me how you constructed the matrix?