Hi! Can anyone point me in the right direction?
I'm trying to get the up- and down-regulated genes between 2 groups (4 species in each) resulting from a knockdown experiment. I've managed to run the grape-nf pipeline and obtained expected counts for each sample.
What is the best way to get from expected counts (I have 8 gene count files in total) to differentially expressed genes? I've tried DESeq2, but I can't get my data into the right format for it.
As input I have 8 files generated by RSEM. First I do:
where files is an object holding my 8 files.
Then, when I do the final transformation to a DESeq2 data object:
I get an error:
which is self-explanatory, but I can't really fix it, because all the lengths were obtained automatically from RSEM. So I'd appreciate it if someone could explain the template for the DESeq2 data object input, so that I could generate it myself with different scripting.
Often, when running complex packages that someone else wrote, you come across an error that stumps you. The first thing to do is Google the error! These packages have been around for years, so someone will have had the same problem before you.
Google results
From the first Google result, here is the reply of the author of DESeq2 on the Bioconductor forum I mentioned above: "This error is because the modelling cannot have transcripts of length zero included in the offset calculation. What does it mean to have a gene of length 0 after all? I don't know the best solution here, the zeros are being produced by the upstream software, but I've recommended to others to just insert a length of 1 instead for these 0's, before moving on to DESeq2."
And from the second Google result, again from the author: "You can edit the 0 lengths to be 1, by editing the length matrix in txi, before starting with DESeq2. We're just taking the gene effective lengths as reported by RSEM, whereas for summarizing from transcript level, tximport ensures the lengths are nonzero."
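Since you mentioned doing this with different scripting, here is a minimal Python sketch of the edit the author describes: set every zero entry in a gene-by-sample length matrix to 1. The gene names, the numbers and the helper replace_zero_lengths are all made up for illustration; in R you would apply the same edit to the length matrix in txi before building the DESeq2 object.

```python
# Hypothetical gene-by-sample effective lengths (made-up values), standing in
# for the per-gene lengths that RSEM reports for each of the samples.
lengths = {
    "geneA": [1500.0, 1498.2, 0.0, 1501.7],
    "geneB": [0.0, 0.0, 0.0, 0.0],
    "geneC": [820.5, 815.0, 822.3, 819.9],
}


def replace_zero_lengths(length_matrix, floor=1.0):
    """Return a copy of the matrix with every non-positive length set to `floor`."""
    return {
        gene: [value if value > 0 else floor for value in row]
        for gene, row in length_matrix.items()
    }


fixed = replace_zero_lengths(lengths)

# Report which genes were touched, so the edit does not go unnoticed.
for gene, row in lengths.items():
    n_zero = sum(1 for value in row if value <= 0)
    if n_zero:
        print(f"{gene}: replaced {n_zero} zero length(s) with 1")
```

Only the lengths change here; the counts are left alone. It is worth looking at the genes that get reported, because a length of 0 in every sample usually means the gene was not expressed anywhere.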
Welcome to "problem solving for bioinformatics 101".