Hi, I'm a newbie to bioinformatics research. I performed de novo assembly on ~100 RNA-Seq data sets from one study with different experimental setups. I got the assembled transcripts and removed the redundant ones using a clustering algorithm (CD-HIT). Should I use the non-redundant transcripts or the original redundant transcripts for the quantification step? Please let me know. Once quantification is done, the quantified transcripts will go into downstream analysis with edgeR to find differentially expressed genes.
Thanks
Hi, thanks for the useful information. I started the analysis with a pilot study, for which I chose samples 1 - 32.
I have generated the gene map and matrix files for each sample individually, and the results look fair enough. Is the next step to perform cross normalization for each experiment? And is it right to combine the gene_trans_map files and feed the quant.sf files in as input to generate a combined abundance_estimates matrix? Please correct me if I'm wrong.
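For readers following along, the gene-level step behind that question could be sketched roughly as below. This is only an illustration under a few assumptions: the quant.sf files are Salmon output with `Name` and `NumReads` columns, the gene_trans_map is a two-column, tab-separated file (gene id, then transcript id), and every sample was quantified against the same transcript set so the ids are comparable. The function names and the toy data are made up for the example, and the plain sum shown here is simpler than the length-aware summarization tximport can do.

```python
import csv
import io
from collections import defaultdict


def read_gene_trans_map(handle):
    """Return {transcript_id: gene_id} from a tab-separated map.

    Assumed layout (hypothetical for this sketch): gene_id <TAB> transcript_id.
    """
    tx_to_gene = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2:
            gene_id, transcript_id = row[0], row[1]
            tx_to_gene[transcript_id] = gene_id
    return tx_to_gene


def gene_counts_from_quant(handle, tx_to_gene):
    """Sum the NumReads column of one quant.sf over each gene."""
    counts = defaultdict(float)
    for row in csv.DictReader(handle, delimiter="\t"):
        # Transcripts missing from the map are kept under their own id.
        gene_id = tx_to_gene.get(row["Name"], row["Name"])
        counts[gene_id] += float(row["NumReads"])
    return dict(counts)


# Toy inputs standing in for real files (illustrative values only).
toy_map = "g1\tg1_i1\ng1\tg1_i2\ng2\tg2_i1\n"
toy_quant = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "g1_i1\t500\t350.0\t10.0\t20.0\n"
    "g1_i2\t700\t550.0\t5.0\t15.0\n"
    "g2_i1\t900\t750.0\t2.0\t8.0\n"
)

tx_to_gene = read_gene_trans_map(io.StringIO(toy_map))
gene_counts = gene_counts_from_quant(io.StringIO(toy_quant), tx_to_gene)
print(gene_counts)
```

With real data, `io.StringIO(...)` would be replaced by `open(path, newline="")` for each sample's files.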
Hi, I made a gene counts matrix for each of the ~32 samples individually. How do I combine them into one matrix as input for DESeq2? I'm unable to understand tximport, so is there a much simpler way? Suggestions welcome! gene.counts.matrix for one sample
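One possible way to do the merge without tximport is sketched below, in plain Python. It assumes each per-sample file is a tab-separated table with one header line followed by gene id and count on each row, and that gene ids mean the same thing in every sample (that is, all samples were quantified against the same assembly). The function names, sample names and values are hypothetical. Counts are rounded because DESeq2 expects integer counts, and genes missing from a sample are filled with 0.

```python
import csv
import io


def read_single_sample_matrix(handle):
    """Read a two-column table (gene id, count) that has one header line."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header line
    return {row[0]: float(row[1]) for row in reader if len(row) >= 2}


def merge_counts(per_sample):
    """Combine {sample: {gene: count}} into a header plus one row per gene."""
    samples = list(per_sample)
    genes = sorted(set().union(*per_sample.values()))
    header = ["gene_id"] + samples
    rows = [
        # Missing genes become 0; estimated counts are rounded to integers.
        [gene] + [round(per_sample[s].get(gene, 0.0)) for s in samples]
        for gene in genes
    ]
    return header, rows


def write_matrix(header, rows, handle):
    """Write the combined matrix as tab-separated text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


# Toy per-sample tables standing in for real files (illustrative values only).
toy_files = {
    "sample_1": "\tsample_1\ngeneA\t10.0\ngeneB\t3.6\n",
    "sample_2": "\tsample_2\ngeneA\t7.2\ngeneC\t1.0\n",
}

per_sample = {
    name: read_single_sample_matrix(io.StringIO(text))
    for name, text in toy_files.items()
}
header, rows = merge_counts(per_sample)

out = io.StringIO()
write_matrix(header, rows, out)
print(out.getvalue())
```

For real files, replace the `io.StringIO(text)` handles with `open(path, newline="")` and write the result to a file; that table can then be read into R as the count matrix.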