I have imported HTSeq-count files into R with the 'DESeqDataSetFromHTSeqCount' function, in order to do differential gene expression analysis with the 'DESeq2' package. However, I read online that 'DESeq2' does not normalize for gene length by default, and that to normalize for gene length we need to import the counts with the tximport() function.
Reading the tximport() manual left me confused: does tximport() itself perform this gene length correction, or does it only apply when counts are imported from 'Salmon', 'Sailfish', 'kallisto', 'RSEM' or 'StringTie'? In other words: does importing HTSeq counts with tximport() correct for length, or do we need to use tximport() with counts produced by the tools listed above?
Note that tximport imports transcript-level estimates and aggregates them to the gene level. By default, featureCounts and HTSeq already quantify counts at the gene level, so using tximport does not make sense here anyway. DESeq2 does not take gene length into account, and it does not need to: differential expression compares the same gene across samples, and the length of that gene is the same in every sample. Length correction would only be needed when comparing different genes within one sample (leaving aside whether that makes sense). So if you load your count matrix into R/DESeq2 and stick to the manual, you are OK.
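To make the "length cancels out" argument concrete, here is a minimal base-R sketch. The counts and gene lengths are made-up numbers, the gene and sample names are hypothetical, and library-size normalization (which DESeq2 handles separately through size factors) is deliberately left out. It only illustrates the arithmetic and is not what DESeq2 does internally.

```r
# Hypothetical raw counts for two genes in two samples (made-up numbers).
counts <- matrix(
  c(100, 400,   # sample_A: short_gene, long_gene
    300, 400),  # sample_B: short_gene, long_gene
  nrow = 2,
  dimnames = list(c("short_gene", "long_gene"), c("sample_A", "sample_B"))
)

# Hypothetical gene lengths in kilobases; a gene's length is the same in every sample.
gene_length_kb <- c(short_gene = 1, long_gene = 4)

# Between-sample log2 fold change per gene, computed on the raw counts.
lfc_raw <- log2(counts[, "sample_B"] / counts[, "sample_A"])

# The same fold change after dividing each gene's counts by its length.
# The length vector is recycled down the rows, so each row is divided by its own length.
per_kb <- counts / gene_length_kb
lfc_per_kb <- log2(per_kb[, "sample_B"] / per_kb[, "sample_A"])

# The length appears in both numerator and denominator, so it cancels.
print(all.equal(lfc_raw, lfc_per_kb))

# Within one sample, length does matter when comparing different genes:
print(counts[, "sample_A"])  # raw counts differ between the two genes
print(per_kb[, "sample_A"])  # counts per kilobase for the same sample
```

In this toy data the long gene has four times the raw counts of the short gene in sample_A, but the same counts per kilobase. That is the within-sample situation where length correction changes the conclusion. The between-sample fold changes are identical with and without it.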