10.1 years ago
Parham
Hello,
I am running goseq and have two vectors that go into one line of code. The de.genes vector has about 250 entries and lengthData has about 7000. As far as I understand from the error below, I have to make lengthData match up with de.genes. But since I am not an expert in coding, I cannot figure out how to do it. Can someone help me with that? Much appreciated!
> gene_pwf = nullp(de.genes, bias.data=lengthData)
Error in nullp(de.genes, bias.data = lengthData) :
bias.data vector must have the same length as DEgenes vector!
You are right Devon, I should have been more specific. I prepared de.genes the same way as you showed me here, from the deseq2res output. Then for lengthData I did as follows, as the goseq workflow suggests:
Maybe I am misinterpreting the source of the problem. If you need any other information, please write.
Sorry, I missed the reply. Something along the lines of
de.lengths <- lengthData[which(d$padj<0.05)]
should solve the problem. Just use de.lengths then.
No worries! This worked, but the two objects do not have the same data type, and I get this error:
"Error in sum(y[ww][1:size]) : invalid 'type' (character) of argument"
Can one be converted to the other?
According to help(nullp), de.genes should be "A named binary vector where 1 represents DE, 0 not DE and the names are gene IDs." I recall that being different in a previous version of goseq, though perhaps I'm misremembering. So, something like:
Then just use lengthData and deGenes as is (as long as they have the same order).
Thanks Devon, this looks like it will work, apart from a minor problem in the last line: it gives an error that I don't know what to do with. I also have a question. What's the difference between the second line you wrote and
degenes <- rep(0, nrow(d))
? I created both, and they seem to contain the same data! Thanks again for your help.
Try instead
names(deGenes) <- row.names(deseqres)
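For readers following along, here is a minimal sketch of how the named binary vector described in help(nullp) might be built. It uses only base R; the deseqres data frame below is a toy stand-in with made-up values, not real DESeq2 output, and the 0.05 cutoff is taken from the thread.

```r
# Toy stand-in for a DESeq2 results table (hypothetical values).
deseqres <- data.frame(
  padj = c(0.01, 0.20, NA, 0.03, 0.90),
  row.names = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

# Start with every gene marked as not DE (0) ...
deGenes <- rep(0, nrow(deseqres))
# ... then flag the significant ones; which() drops NA padj values.
deGenes[which(deseqres$padj < 0.05)] <- 1
# Name the vector by gene ID so it can be matched against the length data.
names(deGenes) <- row.names(deseqres)

print(deGenes)
```

Because the vector is numeric (0/1) rather than a character vector of gene names, it should avoid the "invalid 'type' (character)" error quoted above, assuming that error came from passing gene names instead of an indicator.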
The problem I had in the beginning is back!
de.lengths <- lengthData[which(deseqres$padj<0.05)]
Its length is 237, while deGenes has length 6089! I guess we should tell de.lengths to filter from deGenes. Is that correct?
Please reread my comment from 13 hours ago. Apparently in the most recent versions of goseq one doesn't subset things.
Right, now I understand! Sorry for asking some things twice. I am learning, and it is not easy for me to think of all aspects at once.
However, the lengthData that I create from txdb holds the full gene list, with a length of 7019, but deGenes, which is created from deseqres, holds 6089, since deseq removes the rows that have a sum of zero during its calculations! So I have to remove the rows that are not present in deGenes from lengthData to make them the same length. Is that right?
Correct, you'll need to use %in% to see which of the rows of txsByGene are in deseqres.
Can I just remove the rows in lengthData that are not present in deGenes? Could you show how?
Whether you subset lengthData or txsByGene is up to you. You'll need to use %in% either way. You should be able to figure out how to do this yourself.
Ok, it took a long time until I could come up with something that might do the job. However, I would like to check with you whether it is correct, if you could have a glance. First I make a vector with all the genes present in both lengthData and deseqres, then I subset lengthData into a new_lengthData with them. I can't even express myself very well, but here is how I did it:
Looks correct.
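The %in% subsetting discussed above could look roughly like the sketch below. It assumes lengthData is a named numeric vector keyed by gene ID (as in the goseq workflow) and uses toy values; the name new_lengthData comes from the thread, everything else is illustrative.

```r
# Toy inputs (hypothetical): gene lengths for the full annotation, and a
# DE indicator covering only the genes that survived filtering.
lengthData <- c(geneA = 1200, geneB = 800, geneC = 1500,
                geneD = 950, geneE = 2000)
deGenes <- c(geneD = 1, geneA = 0, geneB = 1)

# Keep only the lengths for genes that are present in deGenes.
keep <- names(lengthData) %in% names(deGenes)
new_lengthData <- lengthData[keep]

# Reorder to the order of deGenes so the two vectors line up
# element by element.
new_lengthData <- new_lengthData[names(deGenes)]

# Both vectors now have the same length and the same gene order.
stopifnot(length(new_lengthData) == length(deGenes),
          identical(names(new_lengthData), names(deGenes)))
```

The reordering step matters because, as noted earlier in the thread, the two vectors are expected to be in the same order, and %in% alone only guarantees the same set of genes.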
Devon, did you see my reply here? I would appreciate any help you can give.
Thanks!