Question

How To Resolve A Dispute Over Sequencing Data

5

12.3 years ago

waldojoe ▴ 50

We have a problem with our RNA-seq data. My boss thinks the sequencing people messed up, but they think we messed up the library prep. It has turned into a big fight and people are mad! Only 5% of the data maps to the genome, and 70% of the reads are duplicates. If I trim the reads back to 70 bases, I can get a mapping rate of 20%.

How can we settle who is right?

sequencing rna-seq • 2.8k views

ADD COMMENT • link updated 12.3 years ago by Rm 8.3k • written 12.3 years ago by waldojoe ▴ 50

3

Did you guys do the entire library prep, or did you just give them an extracted RNA sample? If you did the entire library prep and they just plated it and stuck it in the machine, chances are you guys screwed up making the library.

It sounds like bad fragmentation, over-amplification, or just bad first-strand synthesis from the RNA. It could be many, many things. Did you guys QC the library?

If they screwed up plating your library, it should be pretty easily detectable. How many monoclonals vs polyclonals? Did a lot of beads break?

ADD REPLY • link 12.3 years ago by Damian Kao 16k

0

Thank you for all the answers! I am the new compute guy in the lab, so I will ask my labmates for this info.

ADD REPLY • link 12.3 years ago by waldojoe ▴ 50

0

You did not specify a lot of things: sequencing technology, sample prep and library prep protocol, organism, read depth, what the duplicated reads are, how you mapped, the quality of the reads, etc.
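On the read quality point: since trimming to 70 bases raised the mapping rate, it may be worth seeing where along the read the base qualities drop. Below is a minimal sketch, assuming uncompressed 4-line FASTQ records and Phred+33 quality encoding (older pipelines may use Phred+64, in which case pass `offset=64`). The function name is only illustrative.

```python
import io


def mean_quality_by_position(handle, offset=33):
    """Return the mean Phred quality at each read position.

    Assumes 4-line FASTQ records; `offset` is the ASCII offset of the
    quality encoding (33 assumed here, which may not match your data).
    """
    sums = []    # running sum of quality scores per position
    counts = []  # number of reads long enough to cover each position
    while True:
        header = handle.readline()
        if not header:
            break
        handle.readline()  # sequence line (not needed here)
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        for i, ch in enumerate(qual):
            if i == len(sums):
                sums.append(0)
                counts.append(0)
            sums[i] += ord(ch) - offset
            counts[i] += 1
    return [s / c for s, c in zip(sums, counts)]


# Tiny made-up example: 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
example = io.StringIO("@r1\nACG\n+\nII5\n@r2\nAC\n+\nI5\n")
print(mean_quality_by_position(example))  # [40.0, 30.0, 20.0]
```

A sharp drop towards the 3' end would be consistent with the trimming result, but it does not by itself say whether library prep or the run is at fault.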

ADD REPLY • link 12.3 years ago by Ido Tamir 5.2k

score 7 · Answer 1 · 2012-09-08

I have been on both sides of this argument. While I was working at a genomics facility that did sequencing for many universities, we would get calls every day from people saying that we had messed up their sequencing when it was obvious their samples had not been cleaned up properly. I've also had a number of instances where my own samples were lost, mixed up, or produced really terrible results that were (seemingly) out of my control. The solution is to document everything (spec your DNA/RNA as many ways as possible, check for fragmentation, check for protein contamination, use high-quality tissue, etc.) and present a good case to your boss and the sequencing provider.

I am in the middle of solving such a dispute right now with some gene capture data that is not mapping to the genome at all. If you keep good records and are polite, I've found that people will do another sequencing run gratis and fix the issue, though it is inconvenient. Also, I wouldn't spend months trying to figure it out. In some cases, I've found that I have sequence data from another species, but in other cases I was never able to figure out what was going on. Save your time and just do some more sequencing, if that is possible.

score 3 · Answer 2 · 2012-09-08

12.3 years ago

Sean Davis 27k

Check for:

1. adapter contamination
2. insert size (too short, in particular)
3. duplication metrics

Problems with any of these are likely to signify a problem with library or sample prep or quality. For a quick look at 1 and 3, try running FastQC on the FASTQ files.
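If you want a rough number for adapter contamination and duplication before running a proper QC tool, a minimal sketch might look like the following. It assumes uncompressed 4-line FASTQ and uses the common Illumina adapter prefix `AGATCGGAAGAGC` purely as an illustration; the thread does not say which platform or adapter was used, so substitute your own. It only counts exact-duplicate sequences, which is cruder than what a dedicated tool reports.

```python
import io
from collections import Counter


def read_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    while True:
        header = handle.readline()
        if not header:
            break
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line
        yield seq


def library_metrics(handle, adapter="AGATCGGAAGAGC"):
    """Return read count, exact-duplicate fraction and adapter fraction.

    `adapter` is an assumed example sequence, not taken from the thread.
    """
    counts = Counter(read_sequences(handle))
    total = sum(counts.values())
    if total == 0:
        return {"reads": 0, "duplicate_fraction": 0.0, "adapter_fraction": 0.0}
    # Every copy of a sequence beyond the first counts as a duplicate.
    duplicates = total - len(counts)
    with_adapter = sum(n for seq, n in counts.items() if adapter in seq)
    return {
        "reads": total,
        "duplicate_fraction": duplicates / total,
        "adapter_fraction": with_adapter / total,
    }


# Tiny made-up example: r1 and r2 are identical, r3 contains the adapter.
example = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACGT\n+\nIIII\n"
    "@r3\nTTAGATCGGAAGAGCAA\n+\nIIIIIIIIIIIIIIIII\n"
    "@r4\nGGGG\n+\nIIII\n"
)
print(library_metrics(example))
```

This holds every distinct sequence in memory, so for a full lane you would probably want to run it on a subsample of reads.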

score 2 · Answer 3 · 2012-09-08

12.3 years ago

Ryan Thompson ★ 3.6k

One thing I've seen done in the past is to either do a full lane or a spike-in of phiX DNA with the run. If the phiX reads don't map, then the problem was in the sequencing.

It's too late for you to do that with these samples, but it's something to consider in the future.
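For a future run with a spike-in, the check could be sketched roughly as below. This assumes the reads were aligned to a combined reference (your genome plus the phiX genome) and written as plain-text SAM, and that the phiX contig is named `phiX174`; that name is hypothetical and depends on your reference FASTA. The fraction of reads landing on phiX can then be compared with the amount that was spiked in.

```python
from collections import Counter

# Standard SAM FLAG bits.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def reads_per_reference(sam_lines):
    """Count primary alignment records per reference name.

    Unmapped records are counted under the key '*unmapped*'.
    For paired-end data each mate is counted as its own record.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count each read once, via its primary record
        if flag & UNMAPPED:
            counts["*unmapped*"] += 1
        else:
            counts[fields[2]] += 1
    return counts


def phix_fraction(counts, phix_name="phiX174"):
    """Fraction of all counted records mapped to the (assumed) phiX contig."""
    total = sum(counts.values())
    return counts[phix_name] / total if total else 0.0
```

Usage would be something like `phix_fraction(reads_per_reference(open("run.sam")))`, where `run.sam` is a placeholder file name. If the phiX fraction is close to what was spiked in while your own reads mostly fail to map, that points away from the sequencing run itself.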

score 1 · Answer 4 · 2012-09-10

Check whether you did QC at the various steps of the library preparation. Sometimes even the batch of kits used may be bad or expired (check with the vendor to see whether they belong to a bad lot). To present your case, show all the QC steps you followed, with gel images etc. We experienced a similar case with a few flowcells failing to map well. After long debates back and forth, we did a marathon backtrace of each and every step and found that some of the initial sample preparation steps were bad.