How can I merge RNA-Seq biological replicates?
10.1 years ago
hana ▴ 190

Hi

I have 6 RNA-Seq samples (biological replicates). I would like to choose an FPKM cut-off to identify highly and lowly expressed genes. How can I merge all the biological replicates and then find the best FPKM cut-off threshold?

Can I use cuffmerge and then cuffdiff?

Thank you

RNA-Seq • 8.3k views

My samples are all from the same condition.

10.1 years ago

I would recommend using Cuffnorm to merge your samples (it also runs on BAM files, see here). You can then plot histograms of transcript detection (the number of samples in which a transcript is detected at a given threshold, divided by 6) for a set of specified FPKM thresholds. After that you can define the low- and high-expression thresholds, for example as the threshold above (below) which 95% of genes are expressed in 5-6 (0-1) samples.

I don't think it is a good idea to merge all the .bam files first and then perform the analysis, as you'll lose the information on expression variability between your biological samples. The rationale for using Cuffnorm is that it accounts for effects such as different sequencing depths across samples.
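To make the detection histogram idea concrete, here is a minimal sketch in plain Python. It assumes the FPKM table is tab-separated with a gene ID in the first column and one FPKM column per sample (which is how I would expect genes.fpkm_table to look, but check your own file). The function names, the toy data and the example thresholds are all hypothetical, and "detected" is taken to mean FPKM at or above the threshold.

```python
import csv
import io


def read_fpkm_table(handle):
    """Parse a tab-separated table: gene ID first, then one FPKM column per sample."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]
    table = {row[0]: [float(x) for x in row[1:]] for row in reader if row}
    return samples, table


def detection_counts(table, threshold):
    """For each gene, count the samples whose FPKM is at or above the threshold."""
    return {gene: sum(v >= threshold for v in values)
            for gene, values in table.items()}


def detection_histogram(table, threshold, n_samples):
    """Number of genes detected in exactly k samples, for k = 0..n_samples."""
    hist = [0] * (n_samples + 1)
    for k in detection_counts(table, threshold).values():
        hist[k] += 1
    return hist


# Toy data standing in for a real table (hypothetical values, 3 samples).
toy = (
    "tracking_id\ts1\ts2\ts3\n"
    "geneA\t0.0\t0.2\t0.1\n"
    "geneB\t5.0\t6.0\t0.5\n"
    "geneC\t50.0\t40.0\t60.0\n"
)
samples, table = read_fpkm_table(io.StringIO(toy))

# Candidate thresholds are arbitrary examples, not recommendations.
for thr in (0.1, 1.0, 10.0):
    print(thr, detection_histogram(table, thr, len(samples)))
```

For real data you would pass `open("genes.fpkm_table")` instead of the toy string and look for the threshold at which most genes fall into the 0-1 or 5-6 sample bins.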


I second the concern about merging biological replicates. The purpose of biological replicates is to measure the variability between your samples. You need to at least show that the variability between replicates is reasonably low before merging them. One way to do this is to calculate Pearson or Spearman correlation coefficients between replicates.
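As a rough illustration of the replicate check, the sketch below computes pairwise Pearson and Spearman correlations using only the standard library. The replicate names and FPKM values are made up, and the log2(FPKM + 1) transform before correlating is my own assumption (a common choice, not something required by the method).

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-gene FPKM values for three replicates (same gene order).
replicates = {
    "rep1": [0.0, 1.0, 3.0, 7.0, 15.0],
    "rep2": [0.0, 1.0, 3.0, 7.0, 15.0],
    "rep3": [15.0, 7.0, 3.0, 1.0, 0.0],
}

# Assumed transform: log2(FPKM + 1) to damp the influence of very high values.
logged = {name: [math.log2(v + 1) for v in vals]
          for name, vals in replicates.items()}

for a, b in combinations(logged, 2):
    print(a, b,
          round(pearson(logged[a], logged[b]), 3),
          round(spearman(logged[a], logged[b]), 3))
```

What counts as "reasonably low variability" is a judgment call; a replicate that correlates much worse with the others than they do with each other is worth inspecting before any averaging.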


Thank you for your comment. I have already run Cuffnorm. Could you please tell me which file I should use to make the graph, genes.fpkm_table or genes.count_table, and how I can choose the FPKM thresholds based on this file?

I am very new to RNA-seq analysis and would really appreciate any suggestions.

Thank you

10.1 years ago
Chirag Nepal ★ 2.4k

To merge replicate samples, use samtools merge:

Usage:   samtools merge [-nr] [-h inh.sam] <out.bam> <in1.bam> <in2.bam> [...]

Options: -n       sort by read names
         -r       attach RG tag (inferred from file names)
         -u       uncompressed BAM output
         -f       overwrite the output BAM if exist
         -1       compress level 1
         -R STR   merge file in the specified region STR [all]
         -h FILE  copy the header in FILE to <out.bam> [in1.bam]

Note: Samtools' merge does not reconstruct the @RG dictionary in the header. Users
  must provide the correct header with -h, or uses Picard which properly maintains
  the header dictionary in merging.

I have run TopHat and Cufflinks on each of my 6 samples. How do I get an 'average' gene expression level for these replicates? Can I run cuffmerge on the assembly files (transcripts.gtf), or do I first have to merge the accepted_hits.bam files of all replicates with samtools and then run Cufflinks on the merged file?
