Question

How to equalize two vector lengths?

2

10.1 years ago

Parham ★ 1.6k

Hello,

I am running goseq and one call takes two vectors. The de.genes vector has about 250 entries and lengthData has about 7000. As far as I understand from the error below, I have to make lengthData match de.genes. Since I am not experienced with coding, I cannot figure out how to do it. Can someone help me with that? Much appreciated!

> gene_pwf = nullp(de.genes, bias.data=lengthData)
Error in nullp(de.genes, bias.data = lengthData) : 
  bias.data vector must have the same length as DEgenes vector!

GO vector length • 5.1k views

ADD COMMENT • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Parham ★ 1.6k

Ram · Answer 1 · 2014-11-01

4

10.1 years ago

Devon Ryan 105k

Presuming that de.genes is a subset of the original length-7000 genes vector, just subset lengthData in exactly the same way as you did genes. We can't know exactly how you did that, since you didn't show us.

ADD COMMENT • link 10.1 years ago by Devon Ryan 105k

0

You are right Devon, I should have been more specific. I prepared de.genes the same way you showed me here, from the deseq2res output. Then for lengthData I did the following, as the goseq workflow suggests:

txdb <- makeTranscriptDbFromBiomart(biomart="fungal_mart", dataset="spombe_eg_gene", host="fungi.ensembl.org")
txsByGene=transcriptsBy(txdb, "gene")
lengthData=median(width(txsByGene))

Maybe I am wrong in interpreting the source of the problem. If you need any other information, please write.

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Parham ★ 1.6k

1

Sorry, I missed the reply. Something along the lines of de.lengths <- lengthData[which(d$padj<0.05)] should solve the problem. Just use de.lengths then.

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Devon Ryan 105k

0

No worries! This worked, but the data types of these two objects are not the same and I get this error: "Error in sum(y[ww][1:size]) : invalid 'type' (character) of argument". Can one be converted to the other?

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Parham ★ 1.6k

1

According to help(nullp), de.genes should be "A named binary vector where 1 represents DE, 0 not DE and the names are gene IDs." I recall that being different in a previous version of goseq, though perhaps I'm misremembering. So, something like:

d <- read.csv("deseq2res.csv", header=T, row.names=1)
deGenes <- c(rep(0, nrow(d)))
deGenes[which(d$padj<0.05)] <- 1
row.names(deGenes) <- row.names(d)

Then just use lengthData and deGenes as is (as long as they have the same order).

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Devon Ryan 105k

0

Thanks Devon, this looks like it will work, apart from a minor thing in the last line. It gives an error that I don't know what to do with. I also have a question: what's the difference between the second line you wrote and degenes <- rep(0, nrow(d))? I created both and they seem to contain the same data! Thanks again for your help.

> row.names(deGenes) <- row.names(deseqres)
Error in `rownames<-`(x, value) : 
  attempt to set 'rownames' on an object with no dimensions

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Parham ★ 1.6k

1

Try instead names(deGenes) <- row.names(deseqres)

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Devon Ryan 105k
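
For readers following along, here is a minimal sketch of why the row.names assignment fails and the names assignment works: a plain vector has no dimensions, so it carries names rather than row names. The data frame d, its gene IDs, and the padj values below are invented for illustration and are not the poster's data. The last line also suggests an answer to the question about c(rep(...)) versus rep(...): wrapping a single vector in c() returns the same vector.

```r
# Toy stand-in for a DESeq2 results table (gene IDs and padj values are invented).
d <- data.frame(padj = c(0.01, 0.20, NA, 0.03),
                row.names = c("geneA", "geneB", "geneC", "geneD"))

# Start with all zeros, then flag the significant genes.
# which() drops NA comparisons, so genes with padj = NA stay 0.
deGenes <- rep(0, nrow(d))
deGenes[which(d$padj < 0.05)] <- 1

# A plain vector has no dim attribute, so it carries names, not row names.
names(deGenes) <- row.names(d)
print(deGenes)

# c() around a single vector returns the same vector.
print(identical(c(rep(0, nrow(d))), rep(0, nrow(d))))
```

The result is a named 0/1 vector of the kind help(nullp) describes, with one entry per gene in the results table.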

0

The problem I had in the beginning is back! The length of de.lengths <- lengthData[which(deseqres$padj<0.05)] is 237 and that of deGenes is 6089! I guess we should tell de.lengths to filter out from deGenes. Is that correct?

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Parham ★ 1.6k

1

Please reread my comment from 13 hours ago. Apparently in the most recent versions of goseq one doesn't subset things.

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Devon Ryan 105k

0

Right, now I understand! Sorry for asking some things twice. I am learning, and it is not easy for me to think of all aspects at once.

However, the lengthData that I create from txdb holds the full gene list, with a length of 7019, but deGenes, which is created from deseqres, holds 6089, since deseq removes the rows that have a sum of zero during calculations! So I have to remove those rows that are not present in lengthData to make them the same length. Is what I think right?

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Parham ★ 1.6k

1

Correct, you'll need to use %in% to see which of the rows of txsByGene are in deseqres.

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Devon Ryan 105k

0

Can I just remove the rows in lengthData that are not present in deGenes? Could you show how?

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Parham ★ 1.6k

1

Whether you subset lengthData or txsByGene is up to you. You'll need to use %in% either way. You should be able to figure out how to do this yourself.

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Devon Ryan 105k

0

OK, it took a long time until I could come up with something that might do the job. I would like to check with you whether it is correct, if you could have a glance. First I make a vector with all the genes present in both lengthData and deseqres, then I subset lengthData into new_lengthData with them. I can't even express myself very well, but here is what I did:

> select_genes <- as.vector(names(lengthData)%in%row.names(deseqres))
> new_lengthData <- lengthData[select_genes]

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Parham ★ 1.6k

1

Looks correct.

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Devon Ryan 105k
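
An earlier reply notes that lengthData and deGenes can be used as is only as long as they have the same order. Subsetting with %in% keeps the order of lengthData, so when the two sources are ordered differently, indexing by name is one way to line them up. The sketch below uses invented gene IDs and lengths, and it assumes every gene in deGenes has an entry in lengthData; a gene without one would come back as NA.

```r
# Toy gene lengths from the annotation (names and values are invented).
lengthData <- c(geneA = 1200, geneB = 800, geneC = 1500, geneD = 950, geneE = 300)

# Toy DE vector covering fewer genes, in a different order.
deGenes <- c(geneD = 1, geneA = 0, geneC = 1)

# Keep only the lengths whose gene ID appears in deGenes.
new_lengthData <- lengthData[names(lengthData) %in% names(deGenes)]

# Index by name so the lengths follow the order of deGenes.
new_lengthData <- new_lengthData[names(deGenes)]

# Both checks should pass before the vectors are handed to nullp().
stopifnot(length(new_lengthData) == length(deGenes),
          identical(names(new_lengthData), names(deGenes)))
print(new_lengthData)
```

Checking both the length and the names before calling nullp catches the mismatch reported at the top of the thread, as well as a silent misordering that would not raise an error.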

0

Devon, did you see my reply here? I would appreciate it if you could give some help there.

Thanks!

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Parham ★ 1.6k