Question

RNA-seq analysis between three conditions

6.8 years ago

pagetbailly.philippe • 0

Hi everyone,

My problem isn't strictly a bioinformatics problem, but I hit a hard nut to crack while analyzing my RNA-seq results. I had 3 conditions in triplicate:

- 1 = control
- 2 = expressing 2 isoforms of a protein
- 3 = expressing 1 isoform of the protein

Sequencing generated about 50 million 75 bp single-end reads per sample.

The DE analysis I performed with cuffdiff gave the following numbers of differentially expressed genes (DEG):

- 1 vs. 2 = 400 DEG
- 1 vs. 3 = 50 DEG
- 2 vs. 3 = 800 DEG

The same .BAM files analyzed with DESeq2 gave more "marked" results:

- 1 vs. 2 = 700 DEG
- 1 vs. 3 = 10 DEG
- 2 vs. 3 = 2300 DEG

Both analyses show very few DEG between conditions 1 and 3. I know that the purpose of DE analysis is to highlight significantly deregulated genes, but having so many DEG in 1 vs. 2 and in 2 vs. 3 means there is something going on in my third condition, right? Even though there are few DEG between the control and the third condition? I don't know how to put these results into words.
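One way to start putting this pattern into words is to check whether the genes called in 1 vs. 2 are largely the same genes called in 2 vs. 3. A large overlap would be consistent with one condition standing apart from the other two. The following is a minimal Python sketch, assuming the significant gene IDs from each comparison have been exported as plain lists. The function name and the gene IDs are hypothetical toy values, not results from this dataset.

```python
def summarize_deg_overlap(deg_1v2, deg_1v3, deg_2v3):
    """Count how the DEG lists from three pairwise comparisons overlap."""
    a, b, c = set(deg_1v2), set(deg_1v3), set(deg_2v3)
    return {
        "shared_1v2_and_2v3": len(a & c),  # genes called in both large comparisons
        "only_1v2": len(a - b - c),        # genes called only in 1 vs. 2
        "only_2v3": len(c - a - b),        # genes called only in 2 vs. 3
        "in_1v3": len(b),                  # size of the small 1 vs. 3 list
        "in_all_three": len(a & b & c),    # genes called in every comparison
    }


# Hypothetical toy gene IDs for illustration only (not real results).
toy_1v2 = {"geneA", "geneB", "geneC", "geneD"}
toy_1v3 = {"geneD"}
toy_2v3 = {"geneA", "geneB", "geneC", "geneE", "geneF"}

overlap = summarize_deg_overlap(toy_1v2, toy_1v3, toy_2v3)
print(overlap)
```

With real lists, comparing `shared_1v2_and_2v3` against the sizes of the two large lists gives a rough sense of how much of the signal is common to both comparisons. It does not replace a proper statistical test.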

Has anyone encountered a similar situation?

Thanks for any help you can provide!

RNA-Seq cuffdiff DE • 2.5k views

ADD COMMENT • link 6.8 years ago by pagetbailly.philippe • 0

Hi,

Is the isoform in your third condition the same as one of the isoforms in your second condition?

And regardless of my first question, why are two isoforms combined in your second condition? The 1 vs. 2 result (400 DEG) might have been different if the isoforms had been kept as two separate conditions.

ADD REPLY • link 6.8 years ago by vinayjrao ▴ 260

Hi, indeed the isoform in the third condition is the same as one of the isoforms in the second condition. Long story short, the two isoforms arise from alternative splicing. Isoform2, from the third condition, has been reported to have elusive effects in the literature, yet it accounts for 90% of the mRNA from the gene. In contrast, isoform1 and its effects are very well characterized. We were unable to generate cellular clones expressing only the first, unspliced isoform, because inhibiting splicing increases expression of the unspliced isoform 10-fold, which is unfortunately lethal for the cells.

As you suggest, the best experimental setup would be one condition for each isoform. But we had to choose this one, so we have:

- control
- 10% isoform1 / 90% isoform2 (closest to in vivo)
- isoform2 (our interest)

ADD REPLY • link 6.8 years ago by pagetbailly.philippe • 0

Hi,

I am not entirely convinced by the method employed, but I am an amateur in the field myself. Looking at the number of DEG across your conditions, I would like to know what quality filter (Phred score) was applied when selecting the reads, and whether the numbers of reads obtained from all three sets are similar.

Edit: Another option would have been to add a 4th condition - isoform 1

ADD REPLY • link 6.8 years ago by vinayjrao ▴ 260

Hi, the average Phred score was 33 (40 million reads scored above 32). We obtained about 50 million reads per replicate (minimum 46 million, maximum 57 million).

This fourth condition would have been nice indeed.

ADD REPLY • link 6.8 years ago by pagetbailly.philippe • 0

Hi,

The numbers of reads in the three sets are similar, and the Phred score cut-off is good. I'm sorry, but I can't think of a conclusive reason for your results. I would suggest repeating your analysis with another pipeline.
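The statement that the read numbers are similar can be checked with quick arithmetic on the range reported earlier in the thread (46 to 57 million reads per replicate). This is only a rough sketch: it uses the two extremes because the per-replicate counts were not posted, and the variable names are illustrative.

```python
# Extremes reported in the thread, in millions of reads per replicate.
min_reads_millions = 46
max_reads_millions = 57

# Fold difference between the deepest and the shallowest library.
depth_ratio = max_reads_millions / min_reads_millions
print(f"largest / smallest library: {depth_ratio:.2f}x")
```

A spread of roughly 1.24-fold is modest, and library-size normalization (such as the size factors DESeq2 estimates) is meant to account for differences of this kind.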

ADD REPLY • link 6.8 years ago by vinayjrao ▴ 260

I'm starting to think that DEG analysis can't answer the question I'm asking.

Thank you very much for your time!

ADD REPLY • link 6.8 years ago by pagetbailly.philippe • 0

Maybe not, but now that you have already invested time in it, you should try the analysis with another pipeline. You would at least know if there was an error in the pipeline, or in the analysis.

ADD REPLY • link 6.8 years ago by vinayjrao ▴ 260

Can you give me any recommendations? I'm fairly new to bioinformatics :)

ADD REPLY • link 6.8 years ago by pagetbailly.philippe • 0

Sure. You could try the HISAT2 protocol (the new Tuxedo protocol), and also other aligners and mergers.

https://www.nature.com/articles/nprot.2016.095

https://www.nature.com/articles/nprot.2013.099#procedure

These are two established pipelines, and you could try analyzing the example data to be sure you understand what's going on.

Good luck :)

ADD REPLY • link 6.8 years ago by vinayjrao ▴ 260

I will try these. Thanks again!

ADD REPLY • link 6.8 years ago by pagetbailly.philippe • 0