Question

Output file conversion from featureCounts to DESeq2: no replicates for samples

2.3 years ago

Angelina_G ▴ 10

Hello, I ran a bulk RNA-seq experiment and now have a gene count file. The pipeline was hisat2, then samtools sort, then featureCounts, all on the Linux command line. The featureCounts call was:

featureCounts -s 0 -p -P -d 0 -D 1000 -B --primary -t exon -g gene_name -a gtf -T 6 -o output bam1 bam2 bam3

The three BAM files correspond to three cell lines, and I want to run a differential expression analysis to see which genes are expressed at higher levels in which cell line.

The problem is that I did not include any replicates, so I have only one sample per cell line. When I run dds1 <- DESeq(dds1), it tells me:

The design matrix has the same number of samples and coefficients to fit,
  so estimation of dispersion is not possible. Treating samples
  as replicates was deprecated in v1.20 and no longer supported since v1.22.

What should I do if I want to compare them and find out which genes are expressed at a higher level in one cell line than in another?

My data look like this:

            H1_LIM9 NiPSC_LIM9 RUESC_LIM9
DDX11L1           0          0          0
WASH7P          217        209        116
MIR6859-1         1          0          0
MIR1302-2HG       0          0          2
MIR1302-2         0          0          0

Currently I'm importing them into dds by splitting the count matrix into three files, each containing two cell lines, so that each cell line is compared with only one other. Is there a better way?

Thank you!
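For the import step described above, a single count matrix holding all three samples is usually easier to handle than three pairwise files. The sketch below is one possible way to get there, not the only one. It assumes the standard featureCounts layout (a leading `#` comment line, then Geneid, Chr, Start, End, Strand, Length, then one count column per BAM file). The file names `demo_counts.txt` and `count_matrix.tsv` are hypothetical, and the annotation values in the demo rows are dummies.

```shell
# Build a tiny stand-in for a featureCounts output file (hypothetical name).
# Annotation columns (Chr..Length) hold dummy values; counts are from the post.
printf '# Program:featureCounts (comment line)\n' > demo_counts.txt
printf 'Geneid\tChr\tStart\tEnd\tStrand\tLength\tbam1\tbam2\tbam3\n' >> demo_counts.txt
printf 'WASH7P\tchr1\t1\t100\t-\t100\t217\t209\t116\n' >> demo_counts.txt
printf 'MIR1302-2HG\tchr1\t1\t100\t+\t100\t0\t0\t2\n' >> demo_counts.txt

# Drop the comment line, then keep the gene ID (column 1)
# and every count column (column 7 onward).
grep -v '^#' demo_counts.txt | cut -f1,7- > count_matrix.tsv

cat count_matrix.tsv
```

The resulting table can be read in R with `read.delim("count_matrix.tsv", row.names = 1)` and passed as `countData` to `DESeqDataSetFromMatrix()`. This only tidies the import; it does not make `DESeq()` work without replicates.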

FeatureCount differential DESeq2 R analysis • 1.0k views

Cross-posted: https://support.bioconductor.org/p/9146279

ADD REPLY • link 2.3 years ago by ATpoint 86k

score 3 · Accepted Answer · 2022-08-31

2.3 years ago

ATpoint 86k

Without replicates, DEG analysis in DESeq2 is not possible, simple as that. Please google "differential analysis without replicates"; it has been asked many times before. I am also afraid to say that this is a poor experimental design, which is why you should read up on the analysis or talk to an analyst before conducting an experiment.
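What remains possible with one sample per cell line is a purely descriptive comparison. The sketch below is not DESeq2's method and produces no p-values. It scales each library to counts per million (CPM) and prints a log2 ratio between the first two samples. The CPM scaling, the pseudocount of 1, and the file names are assumptions chosen for illustration, and on the five genes from the question the numbers are only a toy.

```shell
# Counts copied from the table in the question (tab-separated, header first).
printf 'gene\tH1_LIM9\tNiPSC_LIM9\tRUESC_LIM9\n' > counts.tsv
printf 'DDX11L1\t0\t0\t0\n' >> counts.tsv
printf 'WASH7P\t217\t209\t116\n' >> counts.tsv
printf 'MIR6859-1\t1\t0\t0\n' >> counts.tsv
printf 'MIR1302-2HG\t0\t0\t2\n' >> counts.tsv
printf 'MIR1302-2\t0\t0\t0\n' >> counts.tsv

# The file is read twice: pass 1 sums the two libraries,
# pass 2 prints log2((CPM_a + 1) / (CPM_b + 1)) for columns 2 and 3.
awk -F'\t' '
  NR == FNR {
    if (FNR > 1) { a += $2; b += $3 }
    next
  }
  FNR == 1 {
    print "gene\tlog2_ratio_" $2 "_vs_" $3
    next
  }
  {
    cpm_a = $2 / a * 1e6
    cpm_b = $3 / b * 1e6
    printf "%s\t%.3f\n", $1, log((cpm_a + 1) / (cpm_b + 1)) / log(2)
  }
' counts.tsv counts.tsv > log2_ratios.tsv

cat log2_ratios.tsv
```

A ranking like this can suggest candidate genes for a follow-up experiment with replicates, but it cannot separate real differences from sample-to-sample noise.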

Thank you so much! It was a pilot test, so we only had one sample per cell line. I am new to bioinformatics and was copying others' pipelines for our own dataset, and I likely searched Google with the wrong keywords without considering replicates. Thank you so much!

ADD REPLY • link 2.3 years ago by Angelina_G ▴ 10