Difference of results with the same input [RNAseq analysis]
2.8 years ago

Hello!

I am trying to speed up the processing of some RNA-seq files by splitting the input reads into several files. I am comparing the results obtained with:

  • the reads input as one file
  • the reads split into several files and processed in parallel; I merge the SAM files after alignment.

I align with STAR, then assemble the transcriptome with Cufflinks.

On one sample (paired-end, around 2 Gb per file), I get the following FPKM differences for this gene (left value is the FPKM from the whole file, right value is from the split files):

  • Inpp4a|XM_006496019.3: 11.08, 9.37

  • Inpp4a|NM_030266.4: 5.11, 3.67

  • Inpp4a|NM_001374630.1: 1.06, 4.18
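
To put numbers on the discrepancy, here is a minimal sketch that tabulates the per-isoform difference and fold change, plus the sum over the three isoforms. The values are copied from the list above; the function name `summarize` is just an illustrative choice.

```python
# FPKM values copied from the list above: (whole-file run, split-and-merged run)
fpkm = {
    "XM_006496019.3": (11.08, 9.37),
    "NM_030266.4": (5.11, 3.67),
    "NM_001374630.1": (1.06, 4.18),
}


def summarize(values):
    """Return per-isoform (difference, fold change) and the summed FPKM per run."""
    per_isoform = {
        name: (round(split - whole, 2), round(split / whole, 2))
        for name, (whole, split) in values.items()
    }
    total_whole = round(sum(whole for whole, _ in values.values()), 2)
    total_split = round(sum(split for _, split in values.values()), 2)
    return per_isoform, total_whole, total_split


per_isoform, total_whole, total_split = summarize(fpkm)
for name, (diff, fold) in per_isoform.items():
    print(f"{name}: difference {diff:+.2f}, fold change {fold:.2f}")
print(f"Sum over isoforms: whole {total_whole:.2f}, split {total_split:.2f}")
```

The sums come out at 17.25 and 17.22, so the total over these three isoforms is almost unchanged. That may mean the discrepancy lies mostly in how reads are apportioned among the isoforms rather than in the overall signal for the gene.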

I used bamCompare from deepTools to examine the difference between the two alignments over this gene (NC_000067.7, 37338000 to 37450000); with --operation subtract, the difference is less than 0.05 across this region.

  • From experience, would you consider these FPKM values different? I consider them different because Cufflinks reports a confidence interval for each FPKM, and the second value falls outside it.

  • I would like help understanding which factors could cause this difference and what could be done to fix it.
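
For anyone who wants to repeat the confidence-interval check, here is a minimal sketch of how it can be done. It assumes the standard tab-separated Cufflinks `isoforms.fpkm_tracking` output with the columns `tracking_id`, `FPKM`, `FPKM_conf_lo` and `FPKM_conf_hi`. The function names and the file paths in the usage comment are hypothetical.

```python
import csv


def read_tracking(handle):
    """Map tracking_id -> (FPKM, conf_lo, conf_hi) from an fpkm_tracking file."""
    rows = csv.DictReader(handle, delimiter="\t")
    return {
        row["tracking_id"]: (
            float(row["FPKM"]),
            float(row["FPKM_conf_lo"]),
            float(row["FPKM_conf_hi"]),
        )
        for row in rows
    }


def outside_interval(reference, other):
    """List isoforms whose FPKM in `other` lies outside the interval in `reference`."""
    flagged = []
    for tracking_id, (_, lo, hi) in reference.items():
        if tracking_id not in other:
            continue  # isoform absent from the second run; nothing to compare
        fpkm = other[tracking_id][0]
        if fpkm < lo or fpkm > hi:
            flagged.append((tracking_id, fpkm, lo, hi))
    return flagged


# Example usage (hypothetical paths):
# with open("whole/isoforms.fpkm_tracking") as a, open("split/isoforms.fpkm_tracking") as b:
#     for hit in outside_interval(read_tracking(a), read_tracking(b)):
#         print(hit)
```

This only tests whether the split-run estimate lies inside the whole-file interval. It is not a formal test of differential expression.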

Any leads or references would be highly appreciated!

Thank you very much!

RNAseq split cufflinks STAR • 416 views