9.7 years ago
gustavoborin01
Hi,
I'm trying to filter out low-count features from my RNA-Seq data with the noiseqbio function of the NOISeq package before running the WGCNA package to build a co-regulatory network, but I get the error below when I try. Can anyone help me solve this?
# rpkm = matrix with more than 9,000 genes and 7 conditions (2 biological replicates)
rpkm<-read.csv("rpkm_all.csv")
head(rpkm)
F24h_1 F24h_2 C6h_1 ....
e_gw1.1.1022.1 10.6933092 8.91526912 7.24161321 ....
e_gw1.1.104.1 0.0000000 0.02118639 0.02090429 ....
e_gw1.1.1046.1 0.1131807 0.15213278 0.16165381 ....
myfactors=data.frame(condicao=c("F24h","F24h","C6h","C6h","C12h","C12h","C24h","C24h","B6h","B6h","B12h","B12h","B24h","B24h"),replicas= c("F24h_1","F24h_2","C6h_1","C6h_2","C12h_1","C12h_2","C24h_1","C24h_2","B6h_1","B6h_2","B12h_1","B12h_2","B24h_1","B24h_2"))
head(myfactors)
condicao replicas
1 F24h F24h_1
2 F24h F24h_2
3 C6h C6h_1
4 C6h C6h_2
5 C12h C12h_1
6 C12h C12h_2
mydata<-readData(data=rpkm, factors=myfactors,length = NULL,biotype = NULL,chromosome = NULL,gc = NULL)
mydata
ExpressionSet (storageMode: lockedEnvironment)
assayData: 9852 features, 14 samples
element names: exprs
protocolData: none
phenoData
sampleNames: F24h_1 F24h_2 ... B24h_2 (14 total)
varLabels: condicao replicas
varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation:
mynoiseqbio=noiseqbio(mydata,k=0.5,norm="rpkm",factor=myfactors$condicao, lc=0, r=50, adj=1.5, plot=TRUE, a0per=0.9, random.seed=12345,filter=1)
Error in `[.data.frame`(input@phenoData@data, , factor) :
undefined columns selected
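A note on what the message seems to mean: the call shown in the error, `[.data.frame`(input@phenoData@data, , factor), suggests that the factor argument is used to pick a column of the phenoData table, so it probably expects a column name such as "condicao" rather than the column's values. The base R sketch below reproduces the same kind of error on a toy table (pheno is an illustrative stand-in, not an object from NOISeq).

```r
# Toy stand-in for the phenoData table; values copied from the question.
pheno <- data.frame(
  condicao = c("F24h", "F24h", "C6h", "C6h"),
  replicas = c("F24h_1", "F24h_2", "C6h_1", "C6h_2"),
  stringsAsFactors = FALSE
)

# Passing the column's VALUES asks for columns named "F24h", "C6h", ...
# which do not exist, so the subsetting fails.
by_values <- tryCatch(
  pheno[, pheno$condicao],
  error = function(e) conditionMessage(e)
)
print(by_values)

# Passing the column's NAME selects the column itself.
by_name <- pheno[, "condicao"]
print(by_name)
```

If this reading is right, factor="condicao" would be the form to try in the noiseqbio call.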
Thanks for your answer, Komal, but when I type this I get another error message:
I have also tried the options below, but I got other error messages.
So, do you have another suggestion, Komal? Thank you again.
I have updated my answer. Like the error says, you need to specify which conditions you want to compare. You can do that in the conditions parameter. It should be "a vector containing the two conditions to be compared by the differential expression algorithm (needed when the factor contains more than 2 different conditions)". As an example, I have specified F24h and C6h as the conditions to be compared.
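As a rough sketch of that advice (not the exact code from the answer): the snippet below rebuilds myfactors from the question, checks that the chosen factor column and the two conditions actually exist, and only then calls noiseqbio. The names factor_name and conds are illustrative, and the call is skipped unless NOISeq is installed and mydata from the question is present in the session.

```r
# Design table rebuilt from the question (7 conditions x 2 replicates).
myfactors <- data.frame(
  condicao = rep(c("F24h", "C6h", "C12h", "C24h", "B6h", "B12h", "B24h"), each = 2),
  stringsAsFactors = FALSE
)
myfactors$replicas <- paste0(myfactors$condicao, "_", 1:2)

factor_name <- "condicao"       # NAME of the phenoData column
conds <- c("F24h", "C6h")       # the two conditions to compare

# Cheap sanity checks before the long-running call.
stopifnot(factor_name %in% colnames(myfactors))
stopifnot(all(conds %in% myfactors[[factor_name]]))

# Runs only if NOISeq is installed and `mydata` (from readData) exists.
if (requireNamespace("NOISeq", quietly = TRUE) && exists("mydata")) {
  mynoiseqbio <- NOISeq::noiseqbio(
    mydata,
    k = 0.5,
    norm = "rpkm",  # as in the question; "n" is the documented choice for already-normalized data
    factor = factor_name,
    conditions = conds,
    lc = 0, r = 50, adj = 1.5, plot = FALSE,
    a0per = 0.9, random.seed = 12345, filter = 1
  )
}
```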
Sorry for my inexperience, Komal, but it still doesn't work.
So I tried this, but without success.
Wait, you do have C6h in your conditions, right?
Right.
This is what I did and it is working:
Komal, I was rereading the NOISeq tutorial and wondering whether it is really necessary to apply this function, because there is also a filtered.data function that seems to do the same thing as noiseqbio, or something similar. Have you ever used it?
Umm, I thought your aim was to compute differential expression. There is a difference between the two functions: noiseqbio computes differential expression in addition to filtering out low-count features, whereas filtered.data only filters out the low-count features. If you just want to filter out low-count features and then move on to some other method for differential expression, you can use the filtered.data function instead of noiseqbio.
Thank you so much, Komal. Your script has now worked for me. I really appreciate your answers. I was wondering whether I will have to run this script for each pair of biological replicates to exclude the low counts. If so, I think the filtered.data function is more appropriate, don't you agree?
You could use filtered.data first to remove low count features across all samples, and then use noiseqbio with the argument filter = 0 so that it does not perform any filtering.
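A minimal sketch of that two-step approach, assuming the objects from the question (rpkm, myfactors). The two helper functions are hypothetical wrappers written for illustration, not part of NOISeq; the thresholds (method = 1, cpm = 1, cv.cutoff = 100) are the package defaults as far as I know and may need tuning for RPKM values.

```r
# Step 1 (hypothetical helper): remove low-count features once, across all samples.
# `condition_vector` holds one condition label per sample, e.g. myfactors$condicao.
filter_low_counts <- function(expr, condition_vector) {
  NOISeq::filtered.data(
    expr,
    factor = condition_vector,
    norm = TRUE,   # the matrix is described as already normalized (RPKM)
    method = 1,    # CPM-style threshold
    cv.cutoff = 100,
    cpm = 1
  )
}

# Step 2 (hypothetical helper): compare one pair of conditions on the
# pre-filtered matrix, with noiseqbio's own filtering switched off.
compare_pair <- function(expr_filt, factors, factor_name, conditions) {
  eset <- NOISeq::readData(data = expr_filt, factors = factors)
  NOISeq::noiseqbio(
    eset,
    k = 0.5,
    norm = "rpkm",  # as in the question
    factor = factor_name,
    conditions = conditions,
    lc = 0, r = 50, adj = 1.5, plot = FALSE,
    a0per = 0.9, random.seed = 12345,
    filter = 0      # no second round of filtering
  )
}
```

With NOISeq loaded, usage would look like rpkm_filt <- filter_low_counts(rpkm, myfactors$condicao), followed by compare_pair(rpkm_filt, myfactors, "condicao", c("F24h", "C6h")) for each pair of interest. If the goal is only WGCNA input, the filtered matrix from step 1 may be all that is needed.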